ansel 2.0.0

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: ec23fbc485516d81eaf02b0439e5ca9178417035
+   data.tar.gz: 569ffd81d040ba067f667f2c7d944f80af9cc71c
+ SHA512:
+   metadata.gz: 4c81372c81cda650f3a65ee15924b791fb9deb907651693e18a2e46196da6e3c5754f74711fd23c08f363619b28f0099157d20448ba821f9e76488b0a5f173f2
+   data.tar.gz: 15642df832bcaa46f24d6de3cf9ef0866b7d5fddb3a0d6461f70f2bf5980e20db059eb516115ad83879cee3d35d5c8786d72e9dd509c3b665d921b0e6bc90177
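The hunk above records SHA1 and SHA512 digests for the two archives packed inside the gem. As a minimal sketch of how such a value could be checked by hand, the helper below compares a file's digest with an expected hex string using Ruby's standard `digest` library. The name `digest_matches?` is hypothetical and is not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems): returns true when the digest
# of the file at `path` equals `expected_hex`. The algorithm defaults to
# SHA-512; pass Digest::SHA1 to check the SHA1 entries instead.
def digest_matches?(path, expected_hex, algorithm = Digest::SHA512)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end
```

It would be called with the path to an unpacked `metadata.gz` or `data.tar.gz` and the corresponding value listed above.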
@@ -0,0 +1,38 @@
+ ## 2.0.0
+
+ - Remove Iconv dependency (requires Ruby 1.9+)
+
+ ## 1.1.6
+
+ - Remove dependency on activesupport
+ - Remove dependency on jeweler
+ - Migrate test suite to Rspec2
+
+ ## 1.1.4
+
+ - New gemspec and Rakefile
+ - Rename History.txt to CHANGELOG.md
+
+ ## 1.1.3
+
+ - MIT license
+
+ ## 1.1.2
+
+ - Speed up conversion
+
+ ## 1.1.0
+
+ - Ruby 1.9 compatibility
+
+ ## 1.0.5
+
+ - Requires activesupport 2.3.5 and works when 3.0 is installed
+
+ ## 1.0.3
+
+ - Fix ActiveSupport deprecation warning
+
+ ## 1.0.0
+
+ - Initial public release
data/Gemfile ADDED
@@ -0,0 +1,7 @@
+ source 'https://rubygems.org'
+
+ gemspec
+
+ group :development, :test do
+   gem 'rspec'
+ end
@@ -0,0 +1,28 @@
+ PATH
+   remote: .
+   specs:
+     ansel (2.0.0)
+
+ GEM
+   remote: https://rubygems.org/
+   specs:
+     diff-lcs (1.2.5)
+     rspec (3.1.0)
+       rspec-core (~> 3.1.0)
+       rspec-expectations (~> 3.1.0)
+       rspec-mocks (~> 3.1.0)
+     rspec-core (3.1.7)
+       rspec-support (~> 3.1.0)
+     rspec-expectations (3.1.2)
+       diff-lcs (>= 1.2.0, < 2.0)
+       rspec-support (~> 3.1.0)
+     rspec-mocks (3.1.3)
+       rspec-support (~> 3.1.0)
+     rspec-support (3.1.2)
+
+ PLATFORMS
+   ruby
+
+ DEPENDENCIES
+   ansel!
+   rspec
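The lock file above pins resolved versions against constraints such as `~> 3.1.0` and `>= 1.2.0, < 2.0`. As a small illustration of what those constraints accept, the sketch below evaluates them with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The constant names are only for this example.

```ruby
require 'rubygems'

# "~> 3.1.0" is the pessimistic operator: >= 3.1.0 and < 3.2.
RSPEC_SUPPORT = Gem::Requirement.new('~> 3.1.0')
RSPEC_SUPPORT.satisfied_by?(Gem::Version.new('3.1.2')) # => true
RSPEC_SUPPORT.satisfied_by?(Gem::Version.new('3.2.0')) # => false

# Several constraints can be combined, as for diff-lcs above.
DIFF_LCS = Gem::Requirement.new('>= 1.2.0', '< 2.0')
DIFF_LCS.satisfied_by?(Gem::Version.new('1.2.5'))      # => true
```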
@@ -0,0 +1,20 @@
+ Copyright (c) 2009-2012 Keith Morrison <keithm@infused.org>
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,81 @@
+ # ANSEL
+
+ [![Version](http://img.shields.io/gem/v/ansel.svg?style=flat)](https://rubygems.org/gems/ansel)
+ [![Build Status](http://img.shields.io/travis/infused/ansel/master.svg?style=flat)](http://travis-ci.org/infused/ansel)
+ [![Code Quality](http://img.shields.io/codeclimate/github/infused/ansel.svg?style=flat)](https://codeclimate.com/github/infused/ansel)
+
+ ANSEL provides character set conversion from ANSEL to UTF-8.
+
+ Copyright (c) 2006-2015 Keith Morrison <mailto:keithm@infused.org>, <http://www.infused.org>
+
+ - Project page: <http://github.com/infused/ansel>
+ - API Documentation: <http://rubydoc.info/github/infused/ansel/frames>
+ - Report bugs: <http://github.com/infused/ansel/issues>
+ - Questions? Email [keithm@infused.org](mailto:keithm@infused.org?subject=ANSEL)
+ with ANSEL in the subject line
+
+ ## Compatibility
+
+ ANSEL is [tested](https://travis-ci.org/infused/ansel) to be compatible with the following Rubies:
+
+ * 1.9.2
+ * 1.9.3
+ * 2.0.0
+ * 2.1.0
+ * 2.1.1
+ * 2.1.2
+ * 2.1.3
+ * 2.1.4
+ * 2.1.5
+ * 2.2.0
+ * jruby 1.7+
+
+
+ If you need ANSEL conversion in Ruby 1.8, see my [ansel_iconv](http://github.com/infused/ansel_iconv) project.
+
+ ## Installation
+
+     gem install ansel
+
+ ## Basic Usage
+
+ Conversion from ANSEL to UTF-8 is fully supported.
+
+     require 'ansel'
+
+     converter = ANSEL::Converter.new
+     converter.convert("\xB9\x004.59") # => "£4.59"
+
+
+ ## About the ANSEL character set
+
+ [ANSI/NISO
+ Z39.47](http://www.niso.org/kst/reports/standards?step=2&gid%3Austring%3Aiso-8859-1=&project_key%3Austring%3Aiso-8859-1=0b5d2bd7b690b60fcc75cde9256ed9f9e526e531),
+ also known as ANSEL, is a character set encoding used primarily for
+ bibliographic and genealogical data. It is one of the official character
+ encodings supported by the [Gedcom
+ 5.5](http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gctoc.htm)
+ standard.
+
+ ## LICENSE:
+
+ Copyright (c) 2006-2015 Keith Morrison <keithm@infused.org>
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ 'Software'), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
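The README describes ANSEL-to-UTF-8 conversion without showing how a lookup table can drive it. The sketch below is one possible illustration, not the gem's actual implementation. It assumes a tiny hypothetical table (`SAMPLE_MAP`) whose keys are hex byte codes and whose values are big-endian two-byte code units, the same layout used by the character map later in this diff. It also relies on the ANSEL convention that a combining diacritic byte precedes its base letter. The names `ucs2_to_utf8` and `convert_sketch` are made up for this example.

```ruby
# Hypothetical three-entry excerpt of a character map. Values are
# big-endian two-byte code units, e.g. "\x00\xA3" for U+00A3.
SAMPLE_MAP = {
  'B9'    => "\x00\xA3", # POUND SIGN
  'E2+65' => "\x00\xE9", # LATIN SMALL LETTER E WITH ACUTE
  'E2'    => "\x03\x01"  # COMBINING ACUTE ACCENT
}.freeze

# Turn big-endian two-byte code units into a UTF-8 string.
def ucs2_to_utf8(bytes)
  bytes.unpack('n*').pack('U*')
end

# Illustrative converter: ASCII passes through, a diacritic byte (0xE0 and
# above) is looked up together with the byte that follows it, and anything
# unknown becomes U+FFFD.
def convert_sketch(ansel, map = SAMPLE_MAP)
  bytes = ansel.bytes
  out = +''
  i = 0
  while i < bytes.length
    byte = bytes[i]
    if byte < 0x80
      out << byte.chr
      i += 1
    elsif byte >= 0xE0 && i + 1 < bytes.length
      hit = map[format('%02X+%02X', byte, bytes[i + 1])]
      out << (hit ? ucs2_to_utf8(hit) : "\uFFFD")
      i += 2
    else
      hit = map[format('%02X', byte)]
      out << (hit ? ucs2_to_utf8(hit) : "\uFFFD")
      i += 1
    end
  end
  out
end

convert_sketch("\xB9" + '4.59')   # => "£4.59"
convert_sketch("caf\xE2" + 'e')   # => "café"
```

A real converter would also need to handle stacked diacritics (keys such as `E2+E3+41`) and fall back to emitting the base letter followed by the combining mark when no precomposed entry exists. The sketch leaves both out for brevity.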
@@ -0,0 +1,17 @@
+ # encoding: ascii-8bit
+
+ require 'rubygems'
+ require 'bundler/setup'
+ Bundler.setup(:default, :development)
+
+ require 'rspec/core/rake_task'
+ RSpec::Core::RakeTask.new :spec do |t|
+   t.rspec_opts = %w(--color)
+ end
+
+ task :default => :spec
+
+ desc "Open an irb session preloaded with this library"
+ task :console do
+   sh "irb -rubygems -I lib -r ansel.rb"
+ end
@@ -0,0 +1,23 @@
+ # encoding: ascii-8bit
+
+ lib = File.expand_path('../lib/', __FILE__)
+ $:.unshift lib unless $:.include?(lib)
+ require 'ansel/version'
+
+ Gem::Specification.new do |s|
+   s.name = 'ansel'
+   s.version = ANSEL::VERSION
+   s.authors = ["Keith Morrison"]
+   s.email = 'keithm@infused.org'
+   s.homepage = 'http://github.com/infused/ansel'
+   s.summary = 'Convert ANSEL encoded text to UTF-8'
+   s.description = 'Convert ANSEL encoded text to UTF-8'
+
+   s.rdoc_options = ['--charset=UTF-8']
+   s.extra_rdoc_files = ['README.md', 'CHANGELOG.md', 'MIT-LICENSE']
+   s.files = Dir['[A-Z]*', '{lib,test}/**/*', 'ansel.gemspec']
+   s.test_files = Dir.glob('test/**/*_test.rb')
+   s.require_paths = ['lib']
+
+   s.required_rubygems_version = '>= 1.3.0'
+ end
@@ -0,0 +1,4 @@
+ # encoding: ascii-8bit
+
+ require 'ansel/character_map'
+ require 'ansel/converter'
@@ -0,0 +1,568 @@
+ # encoding: ascii-8bit
+
+ module ANSEL
+   module CharacterMap
+     @@non_combining = {
+       "ERR" => "\xFF\xFD", # � - REPLACEMENT CHARACTER
+       "88" => "", # NON-SORT BEGIN / START OF STRING
+       "89" => "", # NON-SORT END / STRING TERMINATOR
+       "8D" => "", # JOINER / ZERO WIDTH JOINER
+       "8E" => "", # NON-JOINER / ZERO WIDTH NON-JOINER
+       "A1" => "\x01\x41", # Ł - UPPERCASE POLISH L / LATIN CAPITAL LETTER L WITH STROKE
+       "A2" => "\x00\xD8", # Ø - UPPERCASE SCANDINAVIAN O / LATIN CAPITAL LETTER O WITH STROKE
+       "A3" => "\x01\x10", # Đ - UPPERCASE D WITH CROSSBAR / LATIN CAPITAL LETTER D WITH STROKE
+       "A4" => "\x00\xDE", # Þ - UPPERCASE ICELANDIC THORN / LATIN CAPITAL LETTER THORN (Icelandic)
+       "A5" => "\x00\xC6", # Æ - UPPERCASE DIGRAPH AE / LATIN CAPITAL LIGATURE AE
+       "A6" => "\x01\x52", # Œ - UPPERCASE DIGRAPH OE / LATIN CAPITAL LIGATURE OE
+       "A7" => "\x02\xB9", # ʹ - SOFT SIGN, PRIME / MODIFIER LETTER PRIME
+       "A8" => "\x00\xB7", # · - MIDDLE DOT
+       "A9" => "\x26\x6D", # ♭ - MUSIC FLAT SIGN
+       "AA" => "\x00\xAE", # ® - PATENT MARK / REGISTERED SIGN
+       "AB" => "\x00\xB1", # ± - PLUS OR MINUS / PLUS-MINUS SIGN
+       "AC" => "\x01\xA0", # Ơ - UPPERCASE O-HOOK / LATIN CAPITAL LETTER O WITH HORN
+       "AD" => "\x01\xAF", # Ư - UPPERCASE U-HOOK / LATIN CAPITAL LETTER U WITH HORN
+       "AE" => "\x02\xBC", # ʼ - ALIF / MODIFIER LETTER APOSTROPHE
+       "B0" => "\x02\xBB", # ʻ - AYN / MODIFIER LETTER TURNED COMMA
+       "B1" => "\x01\x42", # ł - LOWERCASE POLISH L / LATIN SMALL LETTER L WITH STROKE
+       "B2" => "\x00\xF8", # ø - LOWERCASE SCANDINAVIAN O / LATIN SMALL LETTER O WITH STROKE
+       "B3" => "\x01\x11", # đ - LOWERCASE D WITH CROSSBAR / LATIN SMALL LETTER D WITH STROKE
+       "B4" => "\x00\xFE", # þ - LOWERCASE ICELANDIC THORN / LATIN SMALL LETTER THORN (Icelandic)
+       "B5" => "\x00\xE6", # æ - LOWERCASE DIGRAPH AE / LATIN SMALL LIGATURE AE
+       "B6" => "\x01\x53", # œ - LOWERCASE DIGRAPH OE / LATIN SMALL LIGATURE OE
+       "B7" => "\x02\xBA", # ʺ - HARD SIGN, DOUBLE PRIME / MODIFIER LETTER DOUBLE PRIME
+       "B8" => "\x01\x31", # ı - LOWERCASE TURKISH I / LATIN SMALL LETTER DOTLESS I
+       "B9" => "\x00\xA3", # £ - BRITISH POUND / POUND SIGN
+       "BA" => "\x00\xF0", # ð - LOWERCASE ETH / LATIN SMALL LETTER ETH (Icelandic)
+       "BC" => "\x01\xA1", # ơ - LOWERCASE O-HOOK / LATIN SMALL LETTER O WITH HORN
+       "BD" => "\x01\xB0", # ư - LOWERCASE U-HOOK / LATIN SMALL LETTER U WITH HORN
+       "C0" => "\x00\xB0", # ° - DEGREE SIGN
+       "C1" => "\x21\x13", # ℓ - SCRIPT SMALL L
+       "C2" => "\x21\x17", # ℗ - SOUND RECORDING COPYRIGHT
+       "C3" => "\x00\xA9", # © - COPYRIGHT SIGN
+       "C4" => "\x26\x6F", # ♯ - MUSIC SHARP SIGN
+       "C5" => "\x00\xBF", # ¿ - INVERTED QUESTION MARK
+       "C6" => "\x00\xA1", # ¡ - INVERTED EXCLAMATION MARK
+       "C7" => "\x00\xDF", # ß - ESZETT SYMBOL
+       "C8" => "\x20\xAC" # € - EURO SIGN
+     }
+
+     @@combining = {
+       "E0+41" => "\x1E\xA2", # Ả - LATIN CAPITAL LETTER A WITH HOOK ABOVE
+       "E0+45" => "\x1E\xBA", # LATIN CAPITAL LETTER E WITH HOOK ABOVE
+       "E0+49" => "\x1E\xC8", # LATIN CAPITAL LETTER I WITH HOOK ABOVE
+       "E0+4F" => "\x1E\xCE", # LATIN CAPITAL LETTER O WITH HOOK ABOVE
+       "E0+55" => "\x1E\xE6", # LATIN CAPITAL LETTER U WITH HOOK ABOVE
+       "E0+59" => "\x1E\xF6", # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
+       "E0+61" => "\x1E\xA3", # LATIN SMALL LETTER A WITH HOOK ABOVE
+       "E0+65" => "\x1E\xBB", # LATIN SMALL LETTER E WITH HOOK ABOVE
+       "E0+69" => "\x1E\xC9", # LATIN SMALL LETTER I WITH HOOK ABOVE
+       "E0+6F" => "\x1E\xCF", # LATIN SMALL LETTER O WITH HOOK ABOVE
+       "E0+75" => "\x1E\xE7", # LATIN SMALL LETTER U WITH HOOK ABOVE
+       "E0+79" => "\x1E\xF7", # LATIN SMALL LETTER Y WITH HOOK ABOVE
+       "E0+E3+41" => "\x1E\xA8", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E3+45" => "\x1E\xC2", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E3+4F" => "\x1E\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E3+61" => "\x1E\xA9", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E3+65" => "\x1E\xC3", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E3+6F" => "\x1E\xD5", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+       "E0+E6+41" => "\x1E\xB2", # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+       "E0+E6+61" => "\x1E\xB3", # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+       "E0" => "\x03\x09", # COMBINING HOOK ABOVE
+       "E1+41" => "\x00\xC0", # LATIN CAPITAL LETTER A WITH GRAVE
+       "E1+45" => "\x00\xC8", # LATIN CAPITAL LETTER E WITH GRAVE
+       "E1+49" => "\x00\xCC", # LATIN CAPITAL LETTER I WITH GRAVE
+       "E1+4F" => "\x00\xD2", # LATIN CAPITAL LETTER O WITH GRAVE
+       "E1+55" => "\x00\xD9", # LATIN CAPITAL LETTER U WITH GRAVE
+       "E1+57" => "\x1E\x80", # LATIN CAPITAL LETTER W WITH GRAVE
+       "E1+59" => "\x1E\xF2", # LATIN CAPITAL LETTER Y WITH GRAVE
+       "E1+61" => "\x00\xE0", # LATIN SMALL LETTER A WITH GRAVE
+       "E1+65" => "\x00\xE8", # LATIN SMALL LETTER E WITH GRAVE
+       "E1+69" => "\x00\xEC", # LATIN SMALL LETTER I WITH GRAVE
+       "E1+6F" => "\x00\xF2", # LATIN SMALL LETTER O WITH GRAVE
+       "E1+75" => "\x00\xF9", # LATIN SMALL LETTER U WITH GRAVE
+       "E1+77" => "\x1E\x81", # LATIN SMALL LETTER W WITH GRAVE
+       "E1+79" => "\x1E\xF3", # LATIN SMALL LETTER Y WITH GRAVE
+       "E1+E3+41" => "\x1E\xA6", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+       "E1+E3+45" => "\x1E\xC0", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+       "E1+E3+4F" => "\x1E\xD2", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+       "E1+E3+61" => "\x1E\xA7", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+       "E1+E3+65" => "\x1E\xC1", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+       "E1+E3+6F" => "\x1E\xD3", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+       "E1+E5+45" => "\x1E\x14", # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+       "E1+E5+4F" => "\x1E\x50", # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+       "E1+E5+65" => "\x1E\x15", # LATIN SMALL LETTER E WITH MACRON AND GRAVE
+       "E1+E5+6F" => "\x1E\x51", # LATIN SMALL LETTER O WITH MACRON AND GRAVE
+       "E1+E6+41" => "\x1E\xB0", # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+       "E1+E6+61" => "\x1E\xB1", # LATIN SMALL LETTER A WITH BREVE AND GRAVE
+       "E1+E8+55" => "\x01\xDB", # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
+       "E1+E8+75" => "\x01\xDC", # LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
+       "E1" => "\x03\x00", # COMBINING GRAVE ACCENT
+       "E2+41" => "\x00\xC1", # LATIN CAPITAL LETTER A WITH ACUTE
+       "E2+43" => "\x01\x06", # LATIN CAPITAL LETTER C WITH ACUTE
+       "E2+45" => "\x00\xC9", # LATIN CAPITAL LETTER E WITH ACUTE
+       "E2+47" => "\x01\xF4", # LATIN CAPITAL LETTER G WITH ACUTE
+       "E2+49" => "\x00\xCD", # LATIN CAPITAL LETTER I WITH ACUTE
+       "E2+4B" => "\x1E\x30", # LATIN CAPITAL LETTER K WITH ACUTE
+       "E2+4C" => "\x01\x39", # LATIN CAPITAL LETTER L WITH ACUTE
+       "E2+4D" => "\x1E\x3E", # LATIN CAPITAL LETTER M WITH ACUTE
+       "E2+4E" => "\x01\x43", # LATIN CAPITAL LETTER N WITH ACUTE
+       "E2+4F" => "\x00\xD3", # LATIN CAPITAL LETTER O WITH ACUTE
+       "E2+50" => "\x1E\x54", # LATIN CAPITAL LETTER P WITH ACUTE
+       "E2+52" => "\x01\x54", # LATIN CAPITAL LETTER R WITH ACUTE
+       "E2+53" => "\x01\x5A", # LATIN CAPITAL LETTER S WITH ACUTE
+       "E2+55" => "\x00\xDA", # LATIN CAPITAL LETTER U WITH ACUTE
+       "E2+57" => "\x1E\x82", # LATIN CAPITAL LETTER W WITH ACUTE
+       "E2+59" => "\x00\xDD", # LATIN CAPITAL LETTER Y WITH ACUTE
+       "E2+5A" => "\x01\x79", # LATIN CAPITAL LETTER Z WITH ACUTE
+       "E2+61" => "\x00\xE1", # LATIN SMALL LETTER A WITH ACUTE
+       "E2+63" => "\x01\x07", # LATIN SMALL LETTER C WITH ACUTE
+       "E2+65" => "\x00\xE9", # LATIN SMALL LETTER E WITH ACUTE
+       "E2+67" => "\x01\xF5", # LATIN SMALL LETTER G WITH ACUTE
+       "E2+69" => "\x00\xED", # LATIN SMALL LETTER I WITH ACUTE
+       "E2+6B" => "\x1E\x31", # LATIN SMALL LETTER K WITH ACUTE
+       "E2+6C" => "\x01\x3A", # LATIN SMALL LETTER L WITH ACUTE
+       "E2+6D" => "\x1E\x3F", # LATIN SMALL LETTER M WITH ACUTE
+       "E2+6E" => "\x01\x44", # LATIN SMALL LETTER N WITH ACUTE
+       "E2+6F" => "\x00\xF3", # LATIN SMALL LETTER O WITH ACUTE
+       "E2+70" => "\x1E\x55", # LATIN SMALL LETTER P WITH ACUTE
+       "E2+72" => "\x01\x55", # LATIN SMALL LETTER R WITH ACUTE
+       "E2+73" => "\x01\x5B", # LATIN SMALL LETTER S WITH ACUTE
+       "E2+75" => "\x00\xFA", # LATIN SMALL LETTER U WITH ACUTE
+       "E2+77" => "\x1E\x83", # LATIN SMALL LETTER W WITH ACUTE
+       "E2+79" => "\x00\xFD", # LATIN SMALL LETTER Y WITH ACUTE
+       "E2+7A" => "\x01\x7A", # LATIN SMALL LETTER Z WITH ACUTE
+       "E2+A5" => "\x01\xFC", # LATIN CAPITAL LETTER AE WITH ACUTE
+       "E2+B5" => "\x01\xFD", # LATIN SMALL LETTER AE WITH ACUTE
+       "E2+E3+41" => "\x1E\xA4", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+       "E2+E3+45" => "\x1E\xBE", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+       "E2+E3+4F" => "\x1E\xD0", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+       "E2+E3+61" => "\x1E\xA5", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+       "E2+E3+65" => "\x1E\xBF", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+       "E2+E3+6F" => "\x1E\xD1", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+       "E2+E4+4F" => "\x1E\x4C", # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+       "E2+E4+55" => "\x1E\x78", # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+       "E2+E4+6F" => "\x1E\x4D", # LATIN SMALL LETTER O WITH TILDE AND ACUTE
+       "E2+E4+75" => "\x1E\x79", # LATIN SMALL LETTER U WITH TILDE AND ACUTE
+       "E2+E5+45" => "\x1E\x16", # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+       "E2+E5+4F" => "\x1E\x52", # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+       "E2+E5+65" => "\x1E\x17", # LATIN SMALL LETTER E WITH MACRON AND ACUTE
+       "E2+E5+6F" => "\x1E\x53", # LATIN SMALL LETTER O WITH MACRON AND ACUTE
+       "E2+E6+41" => "\x1E\xAE", # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+       "E2+E6+61" => "\x1E\xAF", # LATIN SMALL LETTER A WITH BREVE AND ACUTE
+       "E2+E7+53" => "\x1E\x64", # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+       "E2+E7+73" => "\x1E\x65", # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
+       "E2+E8+49" => "\x1E\x2E", # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
+       "E2+E8+55" => "\x01\xD7", # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
+       "E2+E8+69" => "\x1E\x2F", # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
+       "E2+E8+75" => "\x01\xD8", # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
+       "E2+EA+41" => "\x01\xFA", # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
+       "E2+EA+61" => "\x01\xFB", # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
+       "E2+F0+43" => "\x1E\x08", # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
+       "E2+F0+63" => "\x1E\x09", # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
+       "E2" => "\x03\x01", # COMBINING ACUTE ACCENT
+       "E3+41" => "\x00\xC2", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+       "E3+43" => "\x01\x08", # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+       "E3+45" => "\x00\xCA", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+       "E3+47" => "\x01\x1C", # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+       "E3+48" => "\x01\x24", # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+       "E3+49" => "\x00\xCE", # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+       "E3+4A" => "\x01\x34", # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+       "E3+4F" => "\x00\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+       "E3+53" => "\x01\x5C", # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+       "E3+55" => "\x00\xDB", # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+       "E3+57" => "\x01\x74", # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+       "E3+59" => "\x01\x76", # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+       "E3+5A" => "\x1E\x90", # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
+       "E3+61" => "\x00\xE2", # LATIN SMALL LETTER A WITH CIRCUMFLEX
+       "E3+63" => "\x01\x09", # LATIN SMALL LETTER C WITH CIRCUMFLEX
+       "E3+65" => "\x00\xEA", # LATIN SMALL LETTER E WITH CIRCUMFLEX
+       "E3+67" => "\x01\x1D", # LATIN SMALL LETTER G WITH CIRCUMFLEX
+       "E3+68" => "\x01\x25", # LATIN SMALL LETTER H WITH CIRCUMFLEX
+       "E3+69" => "\x00\xEE", # LATIN SMALL LETTER I WITH CIRCUMFLEX
+       "E3+6A" => "\x01\x35", # LATIN SMALL LETTER J WITH CIRCUMFLEX
+       "E3+6F" => "\x00\xF4", # LATIN SMALL LETTER O WITH CIRCUMFLEX
+       "E3+73" => "\x01\x5D", # LATIN SMALL LETTER S WITH CIRCUMFLEX
+       "E3+75" => "\x00\xFB", # LATIN SMALL LETTER U WITH CIRCUMFLEX
+       "E3+77" => "\x01\x75", # LATIN SMALL LETTER W WITH CIRCUMFLEX
+       "E3+79" => "\x01\x77", # LATIN SMALL LETTER Y WITH CIRCUMFLEX
+       "E3+7A" => "\x1E\x91", # LATIN SMALL LETTER Z WITH CIRCUMFLEX
+       "E3+E0+41" => "\x1E\xA8", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E0+45" => "\x1E\xC2", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E0+4F" => "\x1E\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E0+61" => "\x1E\xA9", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E0+65" => "\x1E\xC3", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E0+6F" => "\x1E\xD5", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+       "E3+E1+41" => "\x1E\xA6", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+       "E3+E1+45" => "\x1E\xC0", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+       "E3+E1+4F" => "\x1E\xD2", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+       "E3+E1+61" => "\x1E\xA7", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+       "E3+E1+65" => "\x1E\xC1", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+       "E3+E1+6F" => "\x1E\xD3", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+       "E3+E2+41" => "\x1E\xA4", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+       "E3+E2+45" => "\x1E\xBE", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+       "E3+E2+4F" => "\x1E\xD0", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+       "E3+E2+61" => "\x1E\xA5", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+       "E3+E2+65" => "\x1E\xBF", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+       "E3+E2+6F" => "\x1E\xD1", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+       "E3+E4+41" => "\x1E\xAA", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+       "E3+E4+45" => "\x1E\xC4", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+       "E3+E4+4F" => "\x1E\xD6", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+       "E3+E4+61" => "\x1E\xAB", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+       "E3+E4+65" => "\x1E\xC5", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+       "E3+E4+6F" => "\x1E\xD7", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+       "E3+F2+41" => "\x1E\xAC", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+       "E3+F2+45" => "\x1E\xC6", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+       "E3+F2+4F" => "\x1E\xD8", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+       "E3+F2+61" => "\x1E\xAD", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+       "E3+F2+65" => "\x1E\xC7", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+       "E3+F2+6F" => "\x1E\xD9", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+       "E3" => "\x03\x02", # COMBINING CIRCUMFLEX ACCENT
+       "E4+41" => "\x00\xC3", # LATIN CAPITAL LETTER A WITH TILDE
+       "E4+45" => "\x1E\xBC", # LATIN CAPITAL LETTER E WITH TILDE
+       "E4+49" => "\x01\x28", # LATIN CAPITAL LETTER I WITH TILDE
+       "E4+4E" => "\x00\xD1", # LATIN CAPITAL LETTER N WITH TILDE
+       "E4+4F" => "\x00\xD5", # LATIN CAPITAL LETTER O WITH TILDE
+       "E4+55" => "\x01\x68", # LATIN CAPITAL LETTER U WITH TILDE
+       "E4+56" => "\x1E\x7C", # LATIN CAPITAL LETTER V WITH TILDE
+       "E4+59" => "\x1E\xF8", # LATIN CAPITAL LETTER Y WITH TILDE
+       "E4+61" => "\x00\xE3", # LATIN SMALL LETTER A WITH TILDE
+       "E4+65" => "\x1E\xBD", # LATIN SMALL LETTER E WITH TILDE
+       "E4+69" => "\x01\x29", # LATIN SMALL LETTER I WITH TILDE
+       "E4+6E" => "\x00\xF1", # LATIN SMALL LETTER N WITH TILDE
+       "E4+6F" => "\x00\xF5", # LATIN SMALL LETTER O WITH TILDE
+       "E4+75" => "\x01\x69", # LATIN SMALL LETTER U WITH TILDE
+       "E4+76" => "\x1E\x7D", # LATIN SMALL LETTER V WITH TILDE
+       "E4+79" => "\x1E\xF9", # LATIN SMALL LETTER Y WITH TILDE
+       "E4+E2+4F" => "\x1E\x4C", # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+       "E4+E2+55" => "\x1E\x78", # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+       "E4+E2+6F" => "\x1E\x4D", # LATIN SMALL LETTER O WITH TILDE AND ACUTE
+       "E4+E2+75" => "\x1E\x79", # LATIN SMALL LETTER U WITH TILDE AND ACUTE
+       "E4+E3+41" => "\x1E\xAA", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+       "E4+E3+45" => "\x1E\xC4", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+       "E4+E3+4F" => "\x1E\xD6", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+       "E4+E3+61" => "\x1E\xAB", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+       "E4+E3+65" => "\x1E\xC5", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+       "E4+E3+6F" => "\x1E\xD7", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+       "E4+E6+41" => "\x1E\xB4", # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+       "E4+E6+61" => "\x1E\xB5", # LATIN SMALL LETTER A WITH BREVE AND TILDE
+       "E4+E8+4F" => "\x1E\x4E", # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
+       "E4+E8+6F" => "\x1E\x4F", # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
+       "E4" => "\x03\x03", # COMBINING TILDE
+       "E5+41" => "\x01\x00", # LATIN CAPITAL LETTER A WITH MACRON
+       "E5+45" => "\x01\x12", # LATIN CAPITAL LETTER E WITH MACRON
+       "E5+47" => "\x1E\x20", # LATIN CAPITAL LETTER G WITH MACRON
+       "E5+49" => "\x01\x2A", # LATIN CAPITAL LETTER I WITH MACRON
+       "E5+4F" => "\x01\x4C", # LATIN CAPITAL LETTER O WITH MACRON
+       "E5+55" => "\x01\x6A", # LATIN CAPITAL LETTER U WITH MACRON
+       "E5+61" => "\x01\x01", # LATIN SMALL LETTER A WITH MACRON
+       "E5+65" => "\x01\x13", # LATIN SMALL LETTER E WITH MACRON
+       "E5+67" => "\x1E\x21", # LATIN SMALL LETTER G WITH MACRON
+       "E5+69" => "\x01\x2B", # LATIN SMALL LETTER I WITH MACRON
+       "E5+6F" => "\x01\x4D", # LATIN SMALL LETTER O WITH MACRON
+       "E5+75" => "\x01\x6B", # LATIN SMALL LETTER U WITH MACRON
+       "E5+A5" => "\x01\xE2", # LATIN CAPITAL LETTER AE WITH MACRON
+       "E5+B5" => "\x01\xE3", # LATIN SMALL LETTER AE WITH MACRON
+       "E5+E1+45" => "\x1E\x14", # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+       "E5+E1+4F" => "\x1E\x50", # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+       "E5+E1+65" => "\x1E\x15", # LATIN SMALL LETTER E WITH MACRON AND GRAVE
+       "E5+E1+6F" => "\x1E\x51", # LATIN SMALL LETTER O WITH MACRON AND GRAVE
+       "E5+E2+45" => "\x1E\x16", # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+       "E5+E2+4F" => "\x1E\x52", # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+       "E5+E2+65" => "\x1E\x17", # LATIN SMALL LETTER E WITH MACRON AND ACUTE
+       "E5+E2+6F" => "\x1E\x53", # LATIN SMALL LETTER O WITH MACRON AND ACUTE
+       "E5+E7+41" => "\x01\xE0", # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+       "E5+E7+61" => "\x01\xE1", # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+       "E5+E8+41" => "\x01\xDE", # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
+       "E5+E8+55" => "\x1E\x7A", # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
+       "E5+E8+61" => "\x01\xDF", # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
+       "E5+E8+75" => "\x1E\x7B", # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
+       "E5+F1+4F" => "\x01\xEC", # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
+       "E5+F1+6F" => "\x01\xED", # LATIN SMALL LETTER O WITH OGONEK AND MACRON
+       "E5+F2+4C" => "\x1E\x38", # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
+       "E5+F2+52" => "\x1E\x5C", # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
+       "E5+F2+6C" => "\x1E\x39", # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
+       "E5+F2+72" => "\x1E\x5D", # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
+       "E5" => "\x03\x04", # COMBINING MACRON
+       "E6+41" => "\x01\x02", # LATIN CAPITAL LETTER A WITH BREVE
+       "E6+45" => "\x01\x14", # LATIN CAPITAL LETTER E WITH BREVE
+       "E6+47" => "\x01\x1E", # LATIN CAPITAL LETTER G WITH BREVE
+       "E6+49" => "\x01\x2C", # LATIN CAPITAL LETTER I WITH BREVE
+       "E6+4F" => "\x01\x4E", # LATIN CAPITAL LETTER O WITH BREVE
+       "E6+55" => "\x01\x6C", # LATIN CAPITAL LETTER U WITH BREVE
+       "E6+61" => "\x01\x03", # LATIN SMALL LETTER A WITH BREVE
+       "E6+65" => "\x01\x15", # LATIN SMALL LETTER E WITH BREVE
+       "E6+67" => "\x01\x1F", # LATIN SMALL LETTER G WITH BREVE
+       "E6+69" => "\x01\x2D", # LATIN SMALL LETTER I WITH BREVE
+       "E6+6F" => "\x01\x4F", # LATIN SMALL LETTER O WITH BREVE
+       "E6+75" => "\x01\x6D", # LATIN SMALL LETTER U WITH BREVE
+       "E6+E0+41" => "\x1E\xB2", # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+       "E6+E0+61" => "\x1E\xB3", # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+       "E6+E1+41" => "\x1E\xB0", # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+       "E6+E1+61" => "\x1E\xB1", # LATIN SMALL LETTER A WITH BREVE AND GRAVE
+       "E6+E2+41" => "\x1E\xAE", # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+       "E6+E2+61" => "\x1E\xAF", # LATIN SMALL LETTER A WITH BREVE AND ACUTE
+       "E6+E4+41" => "\x1E\xB4", # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+       "E6+E4+61" => "\x1E\xB5", # LATIN SMALL LETTER A WITH BREVE AND TILDE
+       "E6+F0+45" => "\x1E\x1C", # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
+       "E6+F0+65" => "\x1E\x1D", # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
+       "E6+F2+41" => "\x1E\xB6", # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
+       "E6+F2+61" => "\x1E\xB7", # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
+       "E6" => "\x03\x06", # COMBINING BREVE
+       "E7+42" => "\x1E\x02", # LATIN CAPITAL LETTER B WITH DOT ABOVE
+       "E7+43" => "\x01\x0A", # LATIN CAPITAL LETTER C WITH DOT ABOVE
+       "E7+44" => "\x1E\x0A", # LATIN CAPITAL LETTER D WITH DOT ABOVE
+       "E7+45" => "\x01\x16", # LATIN CAPITAL LETTER E WITH DOT ABOVE
+       "E7+46" => "\x1E\x1E", # LATIN CAPITAL LETTER F WITH DOT ABOVE
+       "E7+47" => "\x01\x20", # LATIN CAPITAL LETTER G WITH DOT ABOVE
+       "E7+48" => "\x1E\x22", # LATIN CAPITAL LETTER H WITH DOT ABOVE
+       "E7+49" => "\x01\x30", # LATIN CAPITAL LETTER I WITH DOT ABOVE
+       "E7+4D" => "\x1E\x40", # LATIN CAPITAL LETTER M WITH DOT ABOVE
+       "E7+4E" => "\x1E\x44", # LATIN CAPITAL LETTER N WITH DOT ABOVE
+       "E7+50" => "\x1E\x56", # LATIN CAPITAL LETTER P WITH DOT ABOVE
+       "E7+52" => "\x1E\x58", # LATIN CAPITAL LETTER R WITH DOT ABOVE
+       "E7+53" => "\x1E\x60", # LATIN CAPITAL LETTER S WITH DOT ABOVE
+       "E7+54" => "\x1E\x6A", # LATIN CAPITAL LETTER T WITH DOT ABOVE
+       "E7+57" => "\x1E\x86", # LATIN CAPITAL LETTER W WITH DOT ABOVE
+       "E7+58" => "\x1E\x8A", # LATIN CAPITAL LETTER X WITH DOT ABOVE
+       "E7+59" => "\x1E\x8E", # LATIN CAPITAL LETTER Y WITH DOT ABOVE
+       "E7+5A" => "\x01\x7B", # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+       "E7+62" => "\x1E\x03", # LATIN SMALL LETTER B WITH DOT ABOVE
+       "E7+63" => "\x01\x0B", # LATIN SMALL LETTER C WITH DOT ABOVE
+       "E7+64" => "\x1E\x0B", # LATIN SMALL LETTER D WITH DOT ABOVE
+       "E7+65" => "\x01\x17", # LATIN SMALL LETTER E WITH DOT ABOVE
+       "E7+66" => "\x1E\x1F", # LATIN SMALL LETTER F WITH DOT ABOVE
+       "E7+67" => "\x01\x21", # LATIN SMALL LETTER G WITH DOT ABOVE
+       "E7+68" => "\x1E\x23", # LATIN SMALL LETTER H WITH DOT ABOVE
+       "E7+6D" => "\x1E\x41", # LATIN SMALL LETTER M WITH DOT ABOVE
+       "E7+6E" => "\x1E\x45", # LATIN SMALL LETTER N WITH DOT ABOVE
+       "E7+70" => "\x1E\x57", # LATIN SMALL LETTER P WITH DOT ABOVE
+       "E7+72" => "\x1E\x59", # LATIN SMALL LETTER R WITH DOT ABOVE
+       "E7+73" => "\x1E\x61", # LATIN SMALL LETTER S WITH DOT ABOVE
+       "E7+74" => "\x1E\x6B", # LATIN SMALL LETTER T WITH DOT ABOVE
+       "E7+77" => "\x1E\x87", # LATIN SMALL LETTER W WITH DOT ABOVE
+       "E7+78" => "\x1E\x8B", # LATIN SMALL LETTER X WITH DOT ABOVE
+       "E7+79" => "\x1E\x8F", # LATIN SMALL LETTER Y WITH DOT ABOVE
+       "E7+7A" => "\x01\x7C", # LATIN SMALL LETTER Z WITH DOT ABOVE
+       "E7+E2+53" => "\x1E\x64", # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+       "E7+E2+73" => "\x1E\x65", # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
+       "E7+E5+41" => "\x01\xE0", # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+       "E7+E5+61" => "\x01\xE1", # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+       "E7+E9+53" => "\x1E\x66", # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
+       "E7+E9+73" => "\x1E\x67", # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
+       "E7+F2+53" => "\x1E\x68", # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
+       "E7+F2+73" => "\x1E\x69", # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
+       "E7" => "\x03\x07", # COMBINING DOT ABOVE
355
+ "E8+41" => "\x00\xC4", # LATIN CAPITAL LETTER A WITH DIAERESIS
356
+ "E8+45" => "\x00\xCB", # LATIN CAPITAL LETTER E WITH DIAERESIS
357
+ "E8+48" => "\x1E\x26", # LATIN CAPITAL LETTER H WITH DIAERESIS
358
+ "E8+49" => "\x00\xCF", # LATIN CAPITAL LETTER I WITH DIAERESIS
359
+ "E8+4F" => "\x00\xD6", # LATIN CAPITAL LETTER O WITH DIAERESIS
360
+ "E8+55" => "\x00\xDC", # LATIN CAPITAL LETTER U WITH DIAERESIS
361
+ "E8+57" => "\x1E\x84", # LATIN CAPITAL LETTER W WITH DIAERESIS
362
+ "E8+58" => "\x1E\x8C", # LATIN CAPITAL LETTER X WITH DIAERESIS
363
+ "E8+59" => "\x01\x78", # LATIN CAPITAL LETTER Y WITH DIAERESIS
364
+ "E8+61" => "\x00\xE4", # LATIN SMALL LETTER A WITH DIAERESIS
365
+ "E8+65" => "\x00\xEB", # LATIN SMALL LETTER E WITH DIAERESIS
366
+ "E8+68" => "\x1E\x27", # LATIN SMALL LETTER H WITH DIAERESIS
367
+ "E8+69" => "\x00\xEF", # LATIN SMALL LETTER I WITH DIAERESIS
368
+ "E8+6F" => "\x00\xF6", # LATIN SMALL LETTER O WITH DIAERESIS
369
+ "E8+74" => "\x1E\x97", # LATIN SMALL LETTER T WITH DIAERESIS
370
+ "E8+75" => "\x00\xFC", # LATIN SMALL LETTER U WITH DIAERESIS
371
+ "E8+77" => "\x1E\x85", # LATIN SMALL LETTER W WITH DIAERESIS
372
+ "E8+78" => "\x1E\x8D", # LATIN SMALL LETTER X WITH DIAERESIS
373
+ "E8+79" => "\x00\xFF", # LATIN SMALL LETTER Y WITH DIAERESIS
374
+ "E8+E1+55" => "\x01\xDB", # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
375
+ "E8+E1+75" => "\x01\xDC", # LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
376
+ "E8+E2+49" => "\x1E\x2E", # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
377
+ "E8+E2+55" => "\x01\xD7", # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
378
+ "E8+E2+69" => "\x1E\x2F", # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
379
+ "E8+E2+75" => "\x01\xD8", # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
380
+ "E8+E4+4F" => "\x1E\x4E", # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
381
+ "E8+E4+6F" => "\x1E\x4F", # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
382
+ "E8+E5+41" => "\x01\xDE", # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
383
+ "E8+E5+55" => "\x1E\x7A", # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
384
+ "E8+E5+61" => "\x01\xDF", # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
385
+ "E8+E5+75" => "\x1E\x7B", # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
386
+ "E8+E9+55" => "\x01\xD9", # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
387
+ "E8+E9+75" => "\x01\xDA", # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
388
+ "E8" => "\x03\x08", # COMBINING DIAERESIS
389
+ "E9+41" => "\x01\xCD", # LATIN CAPITAL LETTER A WITH CARON
390
+ "E9+43" => "\x01\x0C", # LATIN CAPITAL LETTER C WITH CARON
391
+ "E9+44" => "\x01\x0E", # LATIN CAPITAL LETTER D WITH CARON
392
+ "E9+45" => "\x01\x1A", # LATIN CAPITAL LETTER E WITH CARON
393
+ "E9+47" => "\x01\xE6", # LATIN CAPITAL LETTER G WITH CARON
394
+ "E9+49" => "\x01\xCF", # LATIN CAPITAL LETTER I WITH CARON
395
+ "E9+4B" => "\x01\xE8", # LATIN CAPITAL LETTER K WITH CARON
396
+ "E9+4C" => "\x01\x3D", # LATIN CAPITAL LETTER L WITH CARON
397
+ "E9+4E" => "\x01\x47", # LATIN CAPITAL LETTER N WITH CARON
398
+ "E9+4F" => "\x01\xD1", # LATIN CAPITAL LETTER O WITH CARON
399
+ "E9+52" => "\x01\x58", # LATIN CAPITAL LETTER R WITH CARON
400
+ "E9+53" => "\x01\x60", # LATIN CAPITAL LETTER S WITH CARON
401
+ "E9+54" => "\x01\x64", # LATIN CAPITAL LETTER T WITH CARON
402
+ "E9+55" => "\x01\xD3", # LATIN CAPITAL LETTER U WITH CARON
403
+ "E9+5A" => "\x01\x7D", # LATIN CAPITAL LETTER Z WITH CARON
404
+ "E9+61" => "\x01\xCE", # LATIN SMALL LETTER A WITH CARON
405
+ "E9+63" => "\x01\x0D", # LATIN SMALL LETTER C WITH CARON
406
+ "E9+64" => "\x01\x0F", # LATIN SMALL LETTER D WITH CARON
407
+ "E9+65" => "\x01\x1B", # LATIN SMALL LETTER E WITH CARON
408
+ "E9+67" => "\x01\xE7", # LATIN SMALL LETTER G WITH CARON
409
+ "E9+69" => "\x01\xD0", # LATIN SMALL LETTER I WITH CARON
410
+ "E9+6A" => "\x01\xF0", # LATIN SMALL LETTER J WITH CARON
411
+ "E9+6B" => "\x01\xE9", # LATIN SMALL LETTER K WITH CARON
412
+ "E9+6C" => "\x01\x3E", # LATIN SMALL LETTER L WITH CARON
413
+ "E9+6E" => "\x01\x48", # LATIN SMALL LETTER N WITH CARON
414
+ "E9+6F" => "\x01\xD2", # LATIN SMALL LETTER O WITH CARON
415
+ "E9+72" => "\x01\x59", # LATIN SMALL LETTER R WITH CARON
416
+ "E9+73" => "\x01\x61", # LATIN SMALL LETTER S WITH CARON
417
+ "E9+74" => "\x01\x65", # LATIN SMALL LETTER T WITH CARON
418
+ "E9+75" => "\x01\xD4", # LATIN SMALL LETTER U WITH CARON
419
+ "E9+7A" => "\x01\x7E", # LATIN SMALL LETTER Z WITH CARON
420
+ "E9+E7+53" => "\x1E\x66", # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
421
+ "E9+E7+73" => "\x1E\x67", # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
422
+ "E9+E8+55" => "\x01\xD9", # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
423
+ "E9+E8+75" => "\x01\xDA", # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
424
+ "E9" => "\x03\x0C", # COMBINING CARON
425
+ "EA+41" => "\x00\xC5", # LATIN CAPITAL LETTER A WITH RING ABOVE
426
+ "EA+55" => "\x01\x6E", # LATIN CAPITAL LETTER U WITH RING ABOVE
427
+ "EA+61" => "\x00\xE5", # LATIN SMALL LETTER A WITH RING ABOVE
428
+ "EA+75" => "\x01\x6F", # LATIN SMALL LETTER U WITH RING ABOVE
429
+ "EA+77" => "\x1E\x98", # LATIN SMALL LETTER W WITH RING ABOVE
430
+ "EA+79" => "\x1E\x99", # LATIN SMALL LETTER Y WITH RING ABOVE
431
+ "EA+E2+41" => "\x01\xFA", # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
432
+ "EA+E2+61" => "\x01\xFB", # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
433
+ "EA" => "\x03\x0A", # COMBINING RING ABOVE
434
+ "EB" => "\xFE\x20", # COMBINING LIGATURE LEFT HALF
435
+ "EC" => "\xFE\x21", # COMBINING LIGATURE RIGHT HALF
436
+ "ED" => "\x03\x15", # COMBINING COMMA ABOVE RIGHT
437
+ "EE+4F" => "\x01\x50", # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
438
+ "EE+55" => "\x01\x70", # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
439
+ "EE+6F" => "\x01\x51", # LATIN SMALL LETTER O WITH DOUBLE ACUTE
440
+ "EE+75" => "\x01\x71", # LATIN SMALL LETTER U WITH DOUBLE ACUTE
441
+ "EE" => "\x03\x0B", # COMBINING DOUBLE ACUTE ACCENT
442
+ "EF" => "\x03\x10", # COMBINING CANDRABINDU
443
+ "F0+43" => "\x00\xC7", # LATIN CAPITAL LETTER C WITH CEDILLA
444
+ "F0+44" => "\x1E\x10", # LATIN CAPITAL LETTER D WITH CEDILLA
445
+ "F0+47" => "\x01\x22", # LATIN CAPITAL LETTER G WITH CEDILLA
446
+ "F0+48" => "\x1E\x28", # LATIN CAPITAL LETTER H WITH CEDILLA
447
+ "F0+4B" => "\x01\x36", # LATIN CAPITAL LETTER K WITH CEDILLA
448
+ "F0+4C" => "\x01\x3B", # LATIN CAPITAL LETTER L WITH CEDILLA
449
+ "F0+4E" => "\x01\x45", # LATIN CAPITAL LETTER N WITH CEDILLA
450
+ "F0+52" => "\x01\x56", # LATIN CAPITAL LETTER R WITH CEDILLA
451
+ "F0+53" => "\x01\x5E", # LATIN CAPITAL LETTER S WITH CEDILLA
452
+ "F0+54" => "\x01\x62", # LATIN CAPITAL LETTER T WITH CEDILLA
453
+ "F0+63" => "\x00\xE7", # LATIN SMALL LETTER C WITH CEDILLA
454
+ "F0+64" => "\x1E\x11", # LATIN SMALL LETTER D WITH CEDILLA
455
+ "F0+67" => "\x01\x23", # LATIN SMALL LETTER G WITH CEDILLA
456
+ "F0+68" => "\x1E\x29", # LATIN SMALL LETTER H WITH CEDILLA
457
+ "F0+6B" => "\x01\x37", # LATIN SMALL LETTER K WITH CEDILLA
458
+ "F0+6C" => "\x01\x3C", # LATIN SMALL LETTER L WITH CEDILLA
459
+ "F0+6E" => "\x01\x46", # LATIN SMALL LETTER N WITH CEDILLA
460
+ "F0+72" => "\x01\x57", # LATIN SMALL LETTER R WITH CEDILLA
461
+ "F0+73" => "\x01\x5F", # LATIN SMALL LETTER S WITH CEDILLA
462
+ "F0+74" => "\x01\x63", # LATIN SMALL LETTER T WITH CEDILLA
463
+ "F0+E2+43" => "\x1E\x08", # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
464
+ "F0+E2+63" => "\x1E\x09", # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
465
+ "F0+E6+45" => "\x1E\x1C", # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
466
+ "F0+E6+65" => "\x1E\x1D", # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
467
+ "F0" => "\x03\x27", # COMBINING CEDILLA
468
+ "F1+41" => "\x01\x04", # LATIN CAPITAL LETTER A WITH OGONEK
469
+ "F1+45" => "\x01\x18", # LATIN CAPITAL LETTER E WITH OGONEK
470
+ "F1+49" => "\x01\x2E", # LATIN CAPITAL LETTER I WITH OGONEK
471
+ "F1+4F" => "\x01\xEA", # LATIN CAPITAL LETTER O WITH OGONEK
472
+ "F1+55" => "\x01\x72", # LATIN CAPITAL LETTER U WITH OGONEK
473
+ "F1+61" => "\x01\x05", # LATIN SMALL LETTER A WITH OGONEK
474
+ "F1+65" => "\x01\x19", # LATIN SMALL LETTER E WITH OGONEK
475
+ "F1+69" => "\x01\x2F", # LATIN SMALL LETTER I WITH OGONEK
476
+ "F1+6F" => "\x01\xEB", # LATIN SMALL LETTER O WITH OGONEK
477
+ "F1+75" => "\x01\x73", # LATIN SMALL LETTER U WITH OGONEK
478
+ "F1+E5+4F" => "\x01\xEC", # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
479
+ "F1+E5+6F" => "\x01\xED", # LATIN SMALL LETTER O WITH OGONEK AND MACRON
480
+ "F1" => "\x03\x28", # COMBINING OGONEK
481
+ "F2+41" => "\x1E\xA0", # LATIN CAPITAL LETTER A WITH DOT BELOW
482
+ "F2+42" => "\x1E\x04", # LATIN CAPITAL LETTER B WITH DOT BELOW
483
+ "F2+44" => "\x1E\x0C", # LATIN CAPITAL LETTER D WITH DOT BELOW
484
+ "F2+45" => "\x1E\xB8", # LATIN CAPITAL LETTER E WITH DOT BELOW
485
+ "F2+48" => "\x1E\x24", # LATIN CAPITAL LETTER H WITH DOT BELOW
486
+ "F2+49" => "\x1E\xCA", # LATIN CAPITAL LETTER I WITH DOT BELOW
487
+ "F2+4B" => "\x1E\x32", # LATIN CAPITAL LETTER K WITH DOT BELOW
488
+ "F2+4C" => "\x1E\x36", # LATIN CAPITAL LETTER L WITH DOT BELOW
489
+ "F2+4D" => "\x1E\x42", # LATIN CAPITAL LETTER M WITH DOT BELOW
490
+ "F2+4E" => "\x1E\x46", # LATIN CAPITAL LETTER N WITH DOT BELOW
491
+ "F2+4F" => "\x1E\xCC", # LATIN CAPITAL LETTER O WITH DOT BELOW
492
+ "F2+52" => "\x1E\x5A", # LATIN CAPITAL LETTER R WITH DOT BELOW
493
+ "F2+53" => "\x1E\x62", # LATIN CAPITAL LETTER S WITH DOT BELOW
494
+ "F2+54" => "\x1E\x6C", # LATIN CAPITAL LETTER T WITH DOT BELOW
495
+ "F2+55" => "\x1E\xE4", # LATIN CAPITAL LETTER U WITH DOT BELOW
496
+ "F2+56" => "\x1E\x7E", # LATIN CAPITAL LETTER V WITH DOT BELOW
497
+ "F2+57" => "\x1E\x88", # LATIN CAPITAL LETTER W WITH DOT BELOW
498
+ "F2+59" => "\x1E\xF4", # LATIN CAPITAL LETTER Y WITH DOT BELOW
499
+ "F2+5A" => "\x1E\x92", # LATIN CAPITAL LETTER Z WITH DOT BELOW
500
+ "F2+61" => "\x1E\xA1", # LATIN SMALL LETTER A WITH DOT BELOW
501
+ "F2+62" => "\x1E\x05", # LATIN SMALL LETTER B WITH DOT BELOW
502
+ "F2+64" => "\x1E\x0D", # LATIN SMALL LETTER D WITH DOT BELOW
503
+ "F2+65" => "\x1E\xB9", # LATIN SMALL LETTER E WITH DOT BELOW
504
+ "F2+68" => "\x1E\x25", # LATIN SMALL LETTER H WITH DOT BELOW
505
+ "F2+69" => "\x1E\xCB", # LATIN SMALL LETTER I WITH DOT BELOW
506
+ "F2+6B" => "\x1E\x33", # LATIN SMALL LETTER K WITH DOT BELOW
507
+ "F2+6C" => "\x1E\x37", # LATIN SMALL LETTER L WITH DOT BELOW
508
+ "F2+6D" => "\x1E\x43", # LATIN SMALL LETTER M WITH DOT BELOW
509
+ "F2+6E" => "\x1E\x47", # LATIN SMALL LETTER N WITH DOT BELOW
510
+ "F2+6F" => "\x1E\xCD", # LATIN SMALL LETTER O WITH DOT BELOW
511
+ "F2+72" => "\x1E\x5B", # LATIN SMALL LETTER R WITH DOT BELOW
512
+ "F2+73" => "\x1E\x63", # LATIN SMALL LETTER S WITH DOT BELOW
513
+ "F2+74" => "\x1E\x6D", # LATIN SMALL LETTER T WITH DOT BELOW
514
+ "F2+75" => "\x1E\xE5", # LATIN SMALL LETTER U WITH DOT BELOW
515
+ "F2+76" => "\x1E\x7F", # LATIN SMALL LETTER V WITH DOT BELOW
516
+ "F2+77" => "\x1E\x89", # LATIN SMALL LETTER W WITH DOT BELOW
517
+ "F2+79" => "\x1E\xF5", # LATIN SMALL LETTER Y WITH DOT BELOW
518
+ "F2+7A" => "\x1E\x93", # LATIN SMALL LETTER Z WITH DOT BELOW
519
+ "F2+E3+41" => "\x1E\xAC", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
520
+ "F2+E3+45" => "\x1E\xC6", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
521
+ "F2+E3+4F" => "\x1E\xD8", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
522
+ "F2+E3+61" => "\x1E\xAD", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
523
+ "F2+E3+65" => "\x1E\xC7", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
524
+ "F2+E3+6F" => "\x1E\xD9", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
525
+ "F2+E5+4C" => "\x1E\x38", # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
526
+ "F2+E5+52" => "\x1E\x5C", # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
527
+ "F2+E5+6C" => "\x1E\x39", # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
528
+ "F2+E5+72" => "\x1E\x5D", # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
529
+ "F2+E6+41" => "\x1E\xB6", # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
530
+ "F2+E6+61" => "\x1E\xB7", # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
531
+ "F2+E7+53" => "\x1E\x68", # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
532
+ "F2+E7+73" => "\x1E\x69", # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
533
+ "F2" => "\x03\x23", # COMBINING DOT BELOW
534
+ "F3+55" => "\x1E\x72", # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
535
+ "F3+75" => "\x1E\x73", # LATIN SMALL LETTER U WITH DIAERESIS BELOW
536
+ "F3" => "\x03\x24", # COMBINING DIAERESIS BELOW
537
+ "F4+41" => "\x1E\x00", # LATIN CAPITAL LETTER A WITH RING BELOW
538
+ "F4+61" => "\x1E\x01", # LATIN SMALL LETTER A WITH RING BELOW
539
+ "F4" => "\x03\x25", # COMBINING RING BELOW
540
+ "F5" => "\x03\x33", # COMBINING DOUBLE LOW LINE
541
+ "F6+42" => "\x1E\x06", # LATIN CAPITAL LETTER B WITH LINE BELOW
542
+ "F6+44" => "\x1E\x0E", # LATIN CAPITAL LETTER D WITH LINE BELOW
543
+ "F6+4B" => "\x1E\x34", # LATIN CAPITAL LETTER K WITH LINE BELOW
544
+ "F6+4C" => "\x1E\x3A", # LATIN CAPITAL LETTER L WITH LINE BELOW
545
+ "F6+4E" => "\x1E\x48", # LATIN CAPITAL LETTER N WITH LINE BELOW
546
+ "F6+52" => "\x1E\x5E", # LATIN CAPITAL LETTER R WITH LINE BELOW
547
+ "F6+54" => "\x1E\x6E", # LATIN CAPITAL LETTER T WITH LINE BELOW
548
+ "F6+5A" => "\x1E\x94", # LATIN CAPITAL LETTER Z WITH LINE BELOW
549
+ "F6+62" => "\x1E\x07", # LATIN SMALL LETTER B WITH LINE BELOW
550
+ "F6+64" => "\x1E\x0F", # LATIN SMALL LETTER D WITH LINE BELOW
551
+ "F6+68" => "\x1E\x96", # LATIN SMALL LETTER H WITH LINE BELOW
552
+ "F6+6B" => "\x1E\x35", # LATIN SMALL LETTER K WITH LINE BELOW
553
+ "F6+6C" => "\x1E\x3B", # LATIN SMALL LETTER L WITH LINE BELOW
554
+ "F6+6E" => "\x1E\x49", # LATIN SMALL LETTER N WITH LINE BELOW
555
+ "F6+72" => "\x1E\x5F", # LATIN SMALL LETTER R WITH LINE BELOW
556
+ "F6+74" => "\x1E\x6F", # LATIN SMALL LETTER T WITH LINE BELOW
557
+ "F6+7A" => "\x1E\x95", # LATIN SMALL LETTER Z WITH LINE BELOW
558
+ "F6" => "\x03\x32", # COMBINING LOW LINE
559
+ "F7" => "\x03\x26", # COMBINING COMMA BELOW
560
+ "F8" => "\x03\x21", # COMBINING PALATALIZED HOOK BELOW
561
+ "F9+48" => "\x1E\x2A", # LATIN CAPITAL LETTER H WITH BREVE BELOW
562
+ "F9+68" => "\x1E\x2B", # LATIN SMALL LETTER H WITH BREVE BELOW
563
+ "F9" => "\x03\x2E", # COMBINING BREVE BELOW
564
+ "FA" => "\xFE\x22", # COMBINING DOUBLE TILDE LEFT HALF
565
+ "FB" => "\xFE\x23" # COMBINING DOUBLE TILDE RIGHT HALF
566
+ }
567
+ end
568
+ end
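The keys in the map above are ANSEL bytes written as uppercase hex and joined with `+`, and the values appear to be UTF-16BE code units stored as raw bytes. A minimal sketch of how one entry could be decoded, assuming a hypothetical `SAMPLE_MAP` (three rows copied from the table) and a hypothetical helper `decode_entry`, neither of which is part of the gem:

```ruby
# Hypothetical excerpt of the combining map; the real table lives in
# lib/ansel/character_map.rb. Values are UTF-16BE bytes.
SAMPLE_MAP = {
  "E8+41" => "\x00\xC4".b, # LATIN CAPITAL LETTER A WITH DIAERESIS
  "E9+73" => "\x01\x61".b, # LATIN SMALL LETTER S WITH CARON
  "E8"    => "\x03\x08".b  # COMBINING DIAERESIS
}.freeze

# Hypothetical helper: reinterpret the stored bytes as UTF-16BE, then
# transcode to UTF-8. The dup avoids mutating the map's own string.
def decode_entry(map, key)
  map.fetch(key).dup.force_encoding('UTF-16BE').encode('UTF-8')
end

puts decode_entry(SAMPLE_MAP, "E8+41") # U+00C4
```

Storing the values as raw bytes keeps the table independent of the source file's encoding; the cost is the `force_encoding` plus `encode` step on every lookup.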
@@ -0,0 +1,50 @@
1
+ # encoding: ascii-8bit
2
+
3
+ module ANSEL
4
+ class Converter
5
+ include ANSEL::CharacterMap
6
+
7
+ def initialize(to_charset = 'UTF-8')
8
+ @to_charset = to_charset
9
+ end
10
+
11
+ def ansi_to_utf16
12
+ @ansi_to_utf16 ||= @@non_combining.merge(@@combining)
13
+ end
14
+
15
+ def convert(string)
16
+ output = ''
17
+ scanner = StringScanner.new(string)
18
+ until scanner.eos? do
19
+ byte = scanner.get_byte
20
+ char = byte.unpack('C')[0]
21
+
22
+ case char
23
+ when 0x00..0x7F
24
+ output << byte.force_encoding('UTF-8')
25
+ when 0x88..0xC8
26
+ hex_key = char.to_s(16).upcase
27
+ output << (ansi_to_utf16[hex_key] || ansi_to_utf16['ERR']).force_encoding('UTF-16BE').encode('UTF-8')
28
+ scanner.get_byte # ignore the next byte
29
+ when 0xE0..0xFB
30
+ [2, 1, 0].each do |n| # try 3 bytes, then 2 bytes, then 1 byte
31
+ bytes = [char.to_s(16).upcase]
32
+ scanner.peek(n).each_byte {|b| bytes << b.to_s(16).upcase}
33
+ hex_key = bytes.join('+')
34
+ if ansi_to_utf16.has_key?(hex_key)
35
+ output << ansi_to_utf16[hex_key].force_encoding('UTF-16BE').encode('UTF-8')
36
+ n.times {scanner.get_byte}
37
+ break
38
+ end
39
+ end
40
+ else
41
+ output << ansi_to_utf16['ERR'].force_encoding('UTF-16BE').encode('UTF-8')
42
+ scanner.get_byte if scanner.get_byte.to_s.unpack('C')[0].to_i >= 0xE0 # skip the next byte, plus one more if it is >= 0xE0; nil-safe at end of input
43
+ end
44
+ end
45
+
46
+ output.force_encoding('UTF-8')
47
+ end
48
+
49
+ end
50
+ end
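The `0xE0..0xFB` branch of `convert` tries the longest key first: the diacritic byte plus two following bytes, then plus one, then the diacritic alone. A self-contained sketch of that lookup, assuming a hypothetical three-entry `TINY_MAP` and a hypothetical helper `longest_match` (illustrative names, not the gem's API):

```ruby
require 'strscan'

# Hypothetical miniature of the gem's map (values are UTF-16BE bytes).
TINY_MAP = {
  "E8+E5+61" => "\x01\xDF".b, # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
  "E8+61"    => "\x00\xE4".b, # LATIN SMALL LETTER A WITH DIAERESIS
  "E8"       => "\x03\x08".b  # COMBINING DIAERESIS
}.freeze

# Hypothetical helper: first_byte has already been consumed from the scanner.
# Peek at up to two more bytes, build the hex key, and consume only the
# bytes that took part in the match.
def longest_match(scanner, first_byte, map)
  [2, 1, 0].each do |n|
    bytes = [first_byte] + scanner.peek(n).bytes
    key = bytes.map { |b| b.to_s(16).upcase }.join('+')
    next unless map.key?(key)

    n.times { scanner.get_byte }
    return map[key].dup.force_encoding('UTF-16BE').encode('UTF-8')
  end
  nil # no entry at all for this diacritic byte
end

scanner = StringScanner.new("\xE8abc".b)
first = scanner.get_byte.unpack('C')[0]
result = longest_match(scanner, first, TINY_MAP)
```

Here the three-byte key `E8+61+62` misses and the two-byte key `E8+61` matches, so one extra byte is consumed and `bc` stays in the scanner. In the gem itself the entry point shown in the diff is `ANSEL::Converter.new.convert(string)`.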
@@ -0,0 +1,5 @@
1
+ # encoding: ascii-8bit
2
+
3
+ module ANSEL
4
+ VERSION = '2.0.0'
5
+ end
metadata ADDED
@@ -0,0 +1,57 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ansel
3
+ version: !ruby/object:Gem::Version
4
+ version: 2.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Keith Morrison
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2015-01-31 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: Convert ANSEL encoded text to UTF-8
14
+ email: keithm@infused.org
15
+ executables: []
16
+ extensions: []
17
+ extra_rdoc_files:
18
+ - README.md
19
+ - CHANGELOG.md
20
+ - MIT-LICENSE
21
+ files:
22
+ - CHANGELOG.md
23
+ - Gemfile
24
+ - Gemfile.lock
25
+ - MIT-LICENSE
26
+ - README.md
27
+ - Rakefile
28
+ - ansel.gemspec
29
+ - lib/ansel.rb
30
+ - lib/ansel/character_map.rb
31
+ - lib/ansel/converter.rb
32
+ - lib/ansel/version.rb
33
+ homepage: http://github.com/infused/ansel
34
+ licenses: []
35
+ metadata: {}
36
+ post_install_message:
37
+ rdoc_options:
38
+ - "--charset=UTF-8"
39
+ require_paths:
40
+ - lib
41
+ required_ruby_version: !ruby/object:Gem::Requirement
42
+ requirements:
43
+ - - ">="
44
+ - !ruby/object:Gem::Version
45
+ version: '0'
46
+ required_rubygems_version: !ruby/object:Gem::Requirement
47
+ requirements:
48
+ - - ">="
49
+ - !ruby/object:Gem::Version
50
+ version: 1.3.0
51
+ requirements: []
52
+ rubyforge_project:
53
+ rubygems_version: 2.4.3
54
+ signing_key:
55
+ specification_version: 4
56
+ summary: Convert ANSEL encoded text to UTF-8
57
+ test_files: []