ansel 2.0.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +38 -0
- data/Gemfile +7 -0
- data/Gemfile.lock +28 -0
- data/MIT-LICENSE +20 -0
- data/README.md +81 -0
- data/Rakefile +17 -0
- data/ansel.gemspec +23 -0
- data/lib/ansel.rb +4 -0
- data/lib/ansel/character_map.rb +568 -0
- data/lib/ansel/converter.rb +50 -0
- data/lib/ansel/version.rb +5 -0
- metadata +57 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: ec23fbc485516d81eaf02b0439e5ca9178417035
+  data.tar.gz: 569ffd81d040ba067f667f2c7d944f80af9cc71c
+SHA512:
+  metadata.gz: 4c81372c81cda650f3a65ee15924b791fb9deb907651693e18a2e46196da6e3c5754f74711fd23c08f363619b28f0099157d20448ba821f9e76488b0a5f173f2
+  data.tar.gz: 15642df832bcaa46f24d6de3cf9ef0866b7d5fddb3a0d6461f70f2bf5980e20db059eb516115ad83879cee3d35d5c8786d72e9dd509c3b665d921b0e6bc90177
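Reviewer note: the digests recorded in checksums.yaml can be checked against the packaged archives. The following is a minimal sketch using Ruby's standard `digest` library; `checksum_matches?` is a hypothetical helper named here for illustration and is not part of the gem or of RubyGems.

```ruby
require "digest/sha1"
require "digest/sha2"

# Hypothetical helper (an assumption for illustration, not a RubyGems API).
# Compares the hex digest of some bytes with an expected value such as
# the ones listed under SHA1: or SHA512: in checksums.yaml.
def checksum_matches?(bytes, algorithm, expected_hex)
  digest_class = { "SHA1" => Digest::SHA1, "SHA512" => Digest::SHA512 }.fetch(algorithm)
  digest_class.hexdigest(bytes) == expected_hex.downcase
end

# Demonstration on an in-memory string so the sketch runs without any files.
sample = "abc"
puts checksum_matches?(sample, "SHA1", Digest::SHA1.hexdigest(sample))
```

To check a real archive, one would presumably read it with `File.binread` and pass the bytes together with the matching `data.tar.gz` or `metadata.gz` value from the hunk above.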
data/CHANGELOG.md
ADDED
@@ -0,0 +1,38 @@
+## 2.0.0
+
+- Remove Iconv dependency (requires Ruby 1.9+)
+
+## 1.1.6
+
+- Remove dependency on activesupport
+- Remove dependency on jeweler
+- Migrate test suite to Rspec2
+
+## 1.1.4
+
+- New gemspec and Rakefile
+- Rename History.txt to CHANGELOG.md
+
+## 1.1.3
+
+- MIT license
+
+## 1.1.2
+
+- Speed up conversion
+
+## 1.1.0
+
+- Ruby 1.9 compatibility
+
+## 1.0.5
+
+- Requires activesupport 2.3.5 and works when 3.0 is installed
+
+## 1.0.3
+
+- Fix ActiveSupport deprecation warning
+
+## 1.0.0
+
+- Initial public release
data/Gemfile
ADDED
data/Gemfile.lock
ADDED
@@ -0,0 +1,28 @@
+PATH
+  remote: .
+  specs:
+    ansel (2.0.0)
+
+GEM
+  remote: https://rubygems.org/
+  specs:
+    diff-lcs (1.2.5)
+    rspec (3.1.0)
+      rspec-core (~> 3.1.0)
+      rspec-expectations (~> 3.1.0)
+      rspec-mocks (~> 3.1.0)
+    rspec-core (3.1.7)
+      rspec-support (~> 3.1.0)
+    rspec-expectations (3.1.2)
+      diff-lcs (>= 1.2.0, < 2.0)
+      rspec-support (~> 3.1.0)
+    rspec-mocks (3.1.3)
+      rspec-support (~> 3.1.0)
+    rspec-support (3.1.2)
+
+PLATFORMS
+  ruby
+
+DEPENDENCIES
+  ansel!
+  rspec
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
+Copyright (c) 2009-2012 Keith Morrison <keithm@infused.org>
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,81 @@
+# ANSEL
+
+[![Version](http://img.shields.io/gem/v/ansel.svg?style=flat)](https://rubygems.org/gems/ansel)
+[![Build Status](http://img.shields.io/travis/infused/ansel/master.svg?style=flat)](http://travis-ci.org/infused/ansel)
+[![Code Quality](http://img.shields.io/codeclimate/github/infused/ansel.svg?style=flat)](https://codeclimate.com/github/infused/ansel)
+
+ANSEL provides character set conversion from ANSEL to UTF-8
+
+Copyright (c) 2006-2015 Keith Morrison <mailto:keithm@infused.org>, <http://www.infused.org>
+
+- Project page: <http://github.com/infused/ansel>
+- API Documentation: <http://rubydoc.info/github/infused/ansel/frames>
+- Report bugs: <http://github.com/infused/ansel/issues>
+- Questions? Email [keithm@infused.org](mailto:keithm@infused.org?subject=ANSE)
+  with ANSEL in the subject line
+
+## Compatibility
+
+ANSEL is [tested](https://travis-ci.org/infused/ansel) to be compatible with the following Rubies:
+
+* 1.9.2
+* 1.9.3
+* 2.0.0
+* 2.1.0
+* 2.1.1
+* 2.1.2
+* 2.1.3
+* 2.1.4
+* 2.1.5
+* 2.2.0
+* jruby 1.7+
+
+
+If you need ANSEL conversion in Ruby 1.8, see my [ansel_iconv](http://github.com/infused/ansel_iconv) project.
+
+## Installation
+
+    gem install ansel
+
+## Basic Usage
+
+Conversion from ANSEL to UTF-8 is fully supported.
+
+    require 'ansel'
+
+    converter = ANSEL::Converter.new
+    converter.convert("\xB9\x004.59") # => "£4.59"
+
+
+## About the ANSEL character set
+
+[ANSI/NISO
+Z39.47](http://www.niso.org/kst/reports/standards?step=2&gid%3Austring%3Aiso-8859-1=&project_key%3Austring%3Aiso-8859-1=0b5d2bd7b690b60fcc75cde9256ed9f9e526e531),
+also known as ANSEL, is a character set encoding used primarily for
+bibliographic and genealogical data. It is one of the official character
+encodings supported by the [Gedcom
+5.5](http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gctoc.htm)
+standard.
+
+## LICENSE:
+
+Copyright (c) 2006-2015 Keith Morrison <keithm@infused.org>
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+'Software'), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
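Reviewer note: the README describes ANSEL to UTF-8 conversion, and the character map added in this release stores each target character as a UTF-16BE byte pair keyed by the ANSEL byte in hex, with a combining diacritic written before its base letter (for example `"E2+65"`). The sketch below illustrates that lookup idea only. It is not the gem's `ANSEL::Converter`, whose source is not shown in this part of the diff; `SAMPLE_MAP`, `utf16be_to_utf8`, and `decode_sample` are hypothetical names, and the two table entries are copied from the character map.

```ruby
# Hypothetical mini-decoder for illustration; NOT the gem's converter.
# Values are UTF-16BE byte pairs, as in the character map of this diff.
SAMPLE_MAP = {
  "B9"    => "\x00\xA3", # POUND SIGN
  "E2+65" => "\x00\xE9"  # LATIN SMALL LETTER E WITH ACUTE
}.freeze

# Reinterpret the raw bytes as UTF-16BE and transcode them to UTF-8.
def utf16be_to_utf8(bytes)
  bytes.dup.force_encoding(Encoding::UTF_16BE).encode(Encoding::UTF_8)
end

def decode_sample(ansel)
  out = +""
  bytes = ansel.bytes
  i = 0
  while i < bytes.length
    byte = bytes[i]
    single_key = format("%02X", byte)
    pair_key = format("%02X+%02X", byte, bytes[i + 1]) if i + 1 < bytes.length
    if byte < 0x80
      out << byte.chr                               # ASCII passes through
      i += 1
    elsif pair_key && SAMPLE_MAP.key?(pair_key)
      out << utf16be_to_utf8(SAMPLE_MAP[pair_key])  # diacritic + base letter
      i += 2
    elsif SAMPLE_MAP.key?(single_key)
      out << utf16be_to_utf8(SAMPLE_MAP[single_key])
      i += 1
    else
      out << "\uFFFD"                               # unmapped byte in this sample
      i += 1
    end
  end
  out
end

puts decode_sample("\xB94.59")  # prints £4.59
puts decode_sample("caf\xE2e")  # prints café
```

The sample table is deliberately tiny, so any other high byte comes out as U+FFFD here; the real map covers far more sequences, including stacked diacritics.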
data/Rakefile
ADDED
@@ -0,0 +1,17 @@
+# encoding: ascii-8bit
+
+require 'rubygems'
+require 'bundler/setup';
+Bundler.setup(:default, :development)
+
+require 'rspec/core/rake_task'
+RSpec::Core::RakeTask.new :spec do |t|
+  t.rspec_opts = %w(--color)
+end
+
+task :default => :spec
+
+desc "Open an irb session preloaded with this library"
+task :console do
+  sh "irb -rubygems -I lib -r ansel.rb"
+end
data/ansel.gemspec
ADDED
@@ -0,0 +1,23 @@
+# encoding: ascii-8bit
+
+lib = File.expand_path('../lib/', __FILE__)
+$:.unshift lib unless $:.include?(lib)
+require 'ansel/version'
+
+Gem::Specification.new do |s|
+  s.name = 'ansel'
+  s.version = ANSEL::VERSION
+  s.authors = ["Keith Morrison"]
+  s.email = 'keithm@infused.org'
+  s.homepage = 'http://github.com/infused/ansel'
+  s.summary = 'Convert ANSEL encoded text to UTF-8'
+  s.description = 'Convert ANSEL encoded text to UTF-8'
+
+  s.rdoc_options = ['--charset=UTF-8']
+  s.extra_rdoc_files = ['README.md', 'CHANGELOG.md', 'MIT-LICENSE']
+  s.files = Dir['[A-Z]*', '{lib,test}/**/*', 'ansel.gemspec']
+  s.test_files = Dir.glob('test/**/*_test.rb')
+  s.require_paths = ['lib']
+
+  s.required_rubygems_version = '>= 1.3.0'
+end
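Reviewer note: the `'[A-Z]*'` pattern in `s.files` is meant to pick up the capitalized top-level files. As a rough sketch, `File.fnmatch` can show which names from the file list at the top of this diff match that pattern. It only approximates `Dir[]`, whose matching also depends on the filesystem; `TOP_LEVEL_FILES` and `capitalized_entries` are hypothetical names used for illustration.

```ruby
# Top-level file names taken from the file list at the top of this diff.
TOP_LEVEL_FILES = %w[
  CHANGELOG.md Gemfile Gemfile.lock MIT-LICENSE README.md Rakefile ansel.gemspec
].freeze

# Hypothetical helper: keep the names that match the gemspec's '[A-Z]*' glob.
# File.fnmatch is case-sensitive by default, so 'ansel.gemspec' is excluded,
# which is presumably why the gemspec lists it explicitly.
def capitalized_entries(names)
  names.select { |name| File.fnmatch('[A-Z]*', name) }
end

puts capitalized_entries(TOP_LEVEL_FILES).inspect
```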
data/lib/ansel.rb
ADDED
data/lib/ansel/character_map.rb
ADDED
@@ -0,0 +1,568 @@
+# encoding: ascii-8bit
+
+module ANSEL
+  module CharacterMap
+    @@non_combining = {
+      "ERR" => "\xFF\xFD", # � - REPLACEMENT CHARACTER
+      "88" => "", # NON-SORT BEGIN / START OF STRING
+      "89" => "", # NON-SORT END / STRING TERMINATOR
+      "8D" => "", # JOINER / ZERO WIDTH JOINER
+      "8E" => "", # NON-JOINER / ZERO WIDTH NON-JOINER
+      "A1" => "\x01\x41", # Ł - UPPERCASE POLISH L / LATIN CAPITAL LETTER L WITH STROKE
+      "A2" => "\x00\xD8", # Ø - UPPERCASE SCANDINAVIAN O / LATIN CAPITAL LETTER O WITH STROKE
+      "A3" => "\x01\x10", # Đ - UPPERCASE D WITH CROSSBAR / LATIN CAPITAL LETTER D WITH STROKE
+      "A4" => "\x00\xDE", # Þ - UPPERCASE ICELANDIC THORN / LATIN CAPITAL LETTER THORN (Icelandic)
+      "A5" => "\x00\xC6", # Æ - UPPERCASE DIGRAPH AE / LATIN CAPITAL LIGATURE AE
+      "A6" => "\x01\x52", # Œ - UPPERCASE DIGRAPH OE / LATIN CAPITAL LIGATURE OE
+      "A7" => "\x02\xB9", # ʹ - SOFT SIGN, PRIME / MODIFIER LETTER PRIME
+      "A8" => "\x00\xB7", # · - MIDDLE DOT
+      "A9" => "\x26\x6D", # ♭ - MUSIC FLAT SIGN
+      "AA" => "\x00\xAE", # ® - PATENT MARK / REGISTERED SIGN
+      "AB" => "\x00\xB1", # ± - PLUS OR MINUS / PLUS-MINUS SIGN
+      "AC" => "\x01\xA0", # Ơ - UPPERCASE O-HOOK / LATIN CAPITAL LETTER O WITH HORN
+      "AD" => "\x01\xAF", # Ư - UPPERCASE U-HOOK / LATIN CAPITAL LETTER U WITH HORN
+      "AE" => "\x02\xBC", # ʼ - ALIF / MODIFIER LETTER APOSTROPHE
+      "B0" => "\x02\xBB", # ʻ - AYN / MODIFIER LETTER TURNED COMMA
+      "B1" => "\x01\x42", # ł - LOWERCASE POLISH L / LATIN SMALL LETTER L WITH STROKE
+      "B2" => "\x00\xF8", # ø - LOWERCASE SCANDINAVIAN O / LATIN SMALL LETTER O WITH STROKE
+      "B3" => "\x01\x11", # đ - LOWERCASE D WITH CROSSBAR / LATIN SMALL LETTER D WITH STROKE
+      "B4" => "\x00\xFE", # þ - LOWERCASE ICELANDIC THORN / LATIN SMALL LETTER THORN (Icelandic)
+      "B5" => "\x00\xE6", # æ - LOWERCASE DIGRAPH AE / LATIN SMALL LIGATURE AE
+      "B6" => "\x01\x53", # œ - LOWERCASE DIGRAPH OE / LATIN SMALL LIGATURE OE
+      "B7" => "\x02\xBA", # ʺ - HARD SIGN, DOUBLE PRIME / MODIFIER LETTER DOUBLE PRIME
+      "B8" => "\x01\x31", # ı - LOWERCASE TURKISH I / LATIN SMALL LETTER DOTLESS I
+      "B9" => "\x00\xA3", # £ - BRITISH POUND / POUND SIGN
+      "BA" => "\x00\xF0", # ð - LOWERCASE ETH / LATIN SMALL LETTER ETH (Icelandic)
+      "BC" => "\x01\xA1", # ơ - LOWERCASE O-HOOK / LATIN SMALL LETTER O WITH HORN
+      "BD" => "\x01\xB0", # ư - LOWERCASE U-HOOK / LATIN SMALL LETTER U WITH HORN
+      "C0" => "\x00\xB0", # ° - DEGREE SIGN
+      "C1" => "\x21\x13", # ℓ - SCRIPT SMALL L
+      "C2" => "\x21\x17", # ℗ - SOUND RECORDING COPYRIGHT
+      "C3" => "\x00\xA9", # © - COPYRIGHT SIGN
+      "C4" => "\x26\x6F", # ♯ - MUSIC SHARP SIGN
+      "C5" => "\x00\xBF", # ¿ - INVERTED QUESTION MARK
+      "C6" => "\x00\xA1", # ¡ - INVERTED EXCLAMATION MARK
+      "C7" => "\x00\xDF", # ß - ESZETT SYMBOL
+      "C8" => "\x20\xAC" # € - EURO SIGN
+    }
+
+    @@combining = {
+      "E0+41" => "\x1E\xA2", # Ả - LATIN CAPITAL LETTER A WITH HOOK ABOVE
+      "E0+45" => "\x1E\xBA", # LATIN CAPITAL LETTER E WITH HOOK ABOVE
+      "E0+49" => "\x1E\xC8", # LATIN CAPITAL LETTER I WITH HOOK ABOVE
+      "E0+4F" => "\x1E\xCE", # LATIN CAPITAL LETTER O WITH HOOK ABOVE
+      "E0+55" => "\x1E\xE6", # LATIN CAPITAL LETTER U WITH HOOK ABOVE
+      "E0+59" => "\x1E\xF6", # LATIN CAPITAL LETTER Y WITH HOOK ABOVE
+      "E0+61" => "\x1E\xA3", # LATIN SMALL LETTER A WITH HOOK ABOVE
+      "E0+65" => "\x1E\xBB", # LATIN SMALL LETTER E WITH HOOK ABOVE
+      "E0+69" => "\x1E\xC9", # LATIN SMALL LETTER I WITH HOOK ABOVE
+      "E0+6F" => "\x1E\xCF", # LATIN SMALL LETTER O WITH HOOK ABOVE
+      "E0+75" => "\x1E\xE7", # LATIN SMALL LETTER U WITH HOOK ABOVE
+      "E0+79" => "\x1E\xF7", # LATIN SMALL LETTER Y WITH HOOK ABOVE
+      "E0+E3+41" => "\x1E\xA8", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E3+45" => "\x1E\xC2", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E3+4F" => "\x1E\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E3+61" => "\x1E\xA9", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E3+65" => "\x1E\xC3", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E3+6F" => "\x1E\xD5", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+      "E0+E6+41" => "\x1E\xB2", # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+      "E0+E6+61" => "\x1E\xB3", # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+      "E0" => "\x03\x09", # COMBINING HOOK ABOVE
+      "E1+41" => "\x00\xC0", # LATIN CAPITAL LETTER A WITH GRAVE
+      "E1+45" => "\x00\xC8", # LATIN CAPITAL LETTER E WITH GRAVE
+      "E1+49" => "\x00\xCC", # LATIN CAPITAL LETTER I WITH GRAVE
+      "E1+4F" => "\x00\xD2", # LATIN CAPITAL LETTER O WITH GRAVE
+      "E1+55" => "\x00\xD9", # LATIN CAPITAL LETTER U WITH GRAVE
+      "E1+57" => "\x1E\x80", # LATIN CAPITAL LETTER W WITH GRAVE
+      "E1+59" => "\x1E\xF2", # LATIN CAPITAL LETTER Y WITH GRAVE
+      "E1+61" => "\x00\xE0", # LATIN SMALL LETTER A WITH GRAVE
+      "E1+65" => "\x00\xE8", # LATIN SMALL LETTER E WITH GRAVE
+      "E1+69" => "\x00\xEC", # LATIN SMALL LETTER I WITH GRAVE
+      "E1+6F" => "\x00\xF2", # LATIN SMALL LETTER O WITH GRAVE
+      "E1+75" => "\x00\xF9", # LATIN SMALL LETTER U WITH GRAVE
+      "E1+77" => "\x1E\x81", # LATIN SMALL LETTER W WITH GRAVE
+      "E1+79" => "\x1E\xF3", # LATIN SMALL LETTER Y WITH GRAVE
+      "E1+E3+41" => "\x1E\xA6", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+      "E1+E3+45" => "\x1E\xC0", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+      "E1+E3+4F" => "\x1E\xD2", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+      "E1+E3+61" => "\x1E\xA7", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+      "E1+E3+65" => "\x1E\xC1", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+      "E1+E3+6F" => "\x1E\xD3", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+      "E1+E5+45" => "\x1E\x14", # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+      "E1+E5+4F" => "\x1E\x50", # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+      "E1+E5+65" => "\x1E\x15", # LATIN SMALL LETTER E WITH MACRON AND GRAVE
+      "E1+E5+6F" => "\x1E\x51", # LATIN SMALL LETTER O WITH MACRON AND GRAVE
+      "E1+E6+41" => "\x1E\xB0", # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+      "E1+E6+61" => "\x1E\xB1", # LATIN SMALL LETTER A WITH BREVE AND GRAVE
+      "E1+E8+55" => "\x01\xDB", # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
+      "E1+E8+75" => "\x01\xDC", # LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
+      "E1" => "\x03\x00", # COMBINING GRAVE ACCENT
+      "E2+41" => "\x00\xC1", # LATIN CAPITAL LETTER A WITH ACUTE
+      "E2+43" => "\x01\x06", # LATIN CAPITAL LETTER C WITH ACUTE
+      "E2+45" => "\x00\xC9", # LATIN CAPITAL LETTER E WITH ACUTE
+      "E2+47" => "\x01\xF4", # LATIN CAPITAL LETTER G WITH ACUTE
+      "E2+49" => "\x00\xCD", # LATIN CAPITAL LETTER I WITH ACUTE
+      "E2+4B" => "\x1E\x30", # LATIN CAPITAL LETTER K WITH ACUTE
+      "E2+4C" => "\x01\x39", # LATIN CAPITAL LETTER L WITH ACUTE
+      "E2+4D" => "\x1E\x3E", # LATIN CAPITAL LETTER M WITH ACUTE
+      "E2+4E" => "\x01\x43", # LATIN CAPITAL LETTER N WITH ACUTE
+      "E2+4F" => "\x00\xD3", # LATIN CAPITAL LETTER O WITH ACUTE
+      "E2+50" => "\x1E\x54", # LATIN CAPITAL LETTER P WITH ACUTE
+      "E2+52" => "\x01\x54", # LATIN CAPITAL LETTER R WITH ACUTE
+      "E2+53" => "\x01\x5A", # LATIN CAPITAL LETTER S WITH ACUTE
+      "E2+55" => "\x00\xDA", # LATIN CAPITAL LETTER U WITH ACUTE
+      "E2+57" => "\x1E\x82", # LATIN CAPITAL LETTER W WITH ACUTE
+      "E2+59" => "\x00\xDD", # LATIN CAPITAL LETTER Y WITH ACUTE
+      "E2+5A" => "\x01\x79", # LATIN CAPITAL LETTER Z WITH ACUTE
+      "E2+61" => "\x00\xE1", # LATIN SMALL LETTER A WITH ACUTE
+      "E2+63" => "\x01\x07", # LATIN SMALL LETTER C WITH ACUTE
+      "E2+65" => "\x00\xE9", # LATIN SMALL LETTER E WITH ACUTE
+      "E2+67" => "\x01\xF5", # LATIN SMALL LETTER G WITH ACUTE
+      "E2+69" => "\x00\xED", # LATIN SMALL LETTER I WITH ACUTE
+      "E2+6B" => "\x1E\x31", # LATIN SMALL LETTER K WITH ACUTE
+      "E2+6C" => "\x01\x3A", # LATIN SMALL LETTER L WITH ACUTE
+      "E2+6D" => "\x1E\x3F", # LATIN SMALL LETTER M WITH ACUTE
+      "E2+6E" => "\x01\x44", # LATIN SMALL LETTER N WITH ACUTE
+      "E2+6F" => "\x00\xF3", # LATIN SMALL LETTER O WITH ACUTE
+      "E2+70" => "\x1E\x55", # LATIN SMALL LETTER P WITH ACUTE
+      "E2+72" => "\x01\x55", # LATIN SMALL LETTER R WITH ACUTE
+      "E2+73" => "\x01\x5B", # LATIN SMALL LETTER S WITH ACUTE
+      "E2+75" => "\x00\xFA", # LATIN SMALL LETTER U WITH ACUTE
+      "E2+77" => "\x1E\x83", # LATIN SMALL LETTER W WITH ACUTE
+      "E2+79" => "\x00\xFD", # LATIN SMALL LETTER Y WITH ACUTE
+      "E2+7A" => "\x01\x7A", # LATIN SMALL LETTER Z WITH ACUTE
+      "E2+A5" => "\x01\xFC", # LATIN CAPITAL LETTER AE WITH ACUTE
+      "E2+B5" => "\x01\xFD", # LATIN SMALL LETTER AE WITH ACUTE
+      "E2+E3+41" => "\x1E\xA4", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+      "E2+E3+45" => "\x1E\xBE", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+      "E2+E3+4F" => "\x1E\xD0", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+      "E2+E3+61" => "\x1E\xA5", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+      "E2+E3+65" => "\x1E\xBF", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+      "E2+E3+6F" => "\x1E\xD1", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+      "E2+E4+4F" => "\x1E\x4C", # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+      "E2+E4+55" => "\x1E\x78", # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+      "E2+E4+6F" => "\x1E\x4D", # LATIN SMALL LETTER O WITH TILDE AND ACUTE
+      "E2+E4+75" => "\x1E\x79", # LATIN SMALL LETTER U WITH TILDE AND ACUTE
+      "E2+E5+45" => "\x1E\x16", # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+      "E2+E5+4F" => "\x1E\x52", # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+      "E2+E5+65" => "\x1E\x17", # LATIN SMALL LETTER E WITH MACRON AND ACUTE
+      "E2+E5+6F" => "\x1E\x53", # LATIN SMALL LETTER O WITH MACRON AND ACUTE
+      "E2+E6+41" => "\x1E\xAE", # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+      "E2+E6+61" => "\x1E\xAF", # LATIN SMALL LETTER A WITH BREVE AND ACUTE
+      "E2+E7+53" => "\x1E\x64", # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
+      "E2+E7+73" => "\x1E\x65", # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
+      "E2+E8+49" => "\x1E\x2E", # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
+      "E2+E8+55" => "\x01\xD7", # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
+      "E2+E8+69" => "\x1E\x2F", # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
+      "E2+E8+75" => "\x01\xD8", # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
+      "E2+EA+41" => "\x01\xFA", # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
+      "E2+EA+61" => "\x01\xFB", # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
+      "E2+F0+43" => "\x1E\x08", # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
+      "E2+F0+63" => "\x1E\x09", # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
+      "E2" => "\x03\x01", # COMBINING ACUTE ACCENT
+      "E3+41" => "\x00\xC2", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+      "E3+43" => "\x01\x08", # LATIN CAPITAL LETTER C WITH CIRCUMFLEX
+      "E3+45" => "\x00\xCA", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+      "E3+47" => "\x01\x1C", # LATIN CAPITAL LETTER G WITH CIRCUMFLEX
+      "E3+48" => "\x01\x24", # LATIN CAPITAL LETTER H WITH CIRCUMFLEX
+      "E3+49" => "\x00\xCE", # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+      "E3+4A" => "\x01\x34", # LATIN CAPITAL LETTER J WITH CIRCUMFLEX
+      "E3+4F" => "\x00\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+      "E3+53" => "\x01\x5C", # LATIN CAPITAL LETTER S WITH CIRCUMFLEX
+      "E3+55" => "\x00\xDB", # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+      "E3+57" => "\x01\x74", # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+      "E3+59" => "\x01\x76", # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+      "E3+5A" => "\x1E\x90", # LATIN CAPITAL LETTER Z WITH CIRCUMFLEX
+      "E3+61" => "\x00\xE2", # LATIN SMALL LETTER A WITH CIRCUMFLEX
+      "E3+63" => "\x01\x09", # LATIN SMALL LETTER C WITH CIRCUMFLEX
+      "E3+65" => "\x00\xEA", # LATIN SMALL LETTER E WITH CIRCUMFLEX
+      "E3+67" => "\x01\x1D", # LATIN SMALL LETTER G WITH CIRCUMFLEX
+      "E3+68" => "\x01\x25", # LATIN SMALL LETTER H WITH CIRCUMFLEX
+      "E3+69" => "\x00\xEE", # LATIN SMALL LETTER I WITH CIRCUMFLEX
+      "E3+6A" => "\x01\x35", # LATIN SMALL LETTER J WITH CIRCUMFLEX
+      "E3+6F" => "\x00\xF4", # LATIN SMALL LETTER O WITH CIRCUMFLEX
+      "E3+73" => "\x01\x5D", # LATIN SMALL LETTER S WITH CIRCUMFLEX
+      "E3+75" => "\x00\xFB", # LATIN SMALL LETTER U WITH CIRCUMFLEX
+      "E3+77" => "\x01\x75", # LATIN SMALL LETTER W WITH CIRCUMFLEX
+      "E3+79" => "\x01\x77", # LATIN SMALL LETTER Y WITH CIRCUMFLEX
+      "E3+7A" => "\x1E\x91", # LATIN SMALL LETTER Z WITH CIRCUMFLEX
+      "E3+E0+41" => "\x1E\xA8", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E0+45" => "\x1E\xC2", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E0+4F" => "\x1E\xD4", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E0+61" => "\x1E\xA9", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E0+65" => "\x1E\xC3", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E0+6F" => "\x1E\xD5", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE
+      "E3+E1+41" => "\x1E\xA6", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
+      "E3+E1+45" => "\x1E\xC0", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND GRAVE
+      "E3+E1+4F" => "\x1E\xD2", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND GRAVE
+      "E3+E1+61" => "\x1E\xA7", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE
+      "E3+E1+65" => "\x1E\xC1", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE
+      "E3+E1+6F" => "\x1E\xD3", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND GRAVE
+      "E3+E2+41" => "\x1E\xA4", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND ACUTE
+      "E3+E2+45" => "\x1E\xBE", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND ACUTE
+      "E3+E2+4F" => "\x1E\xD0", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND ACUTE
+      "E3+E2+61" => "\x1E\xA5", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE
+      "E3+E2+65" => "\x1E\xBF", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE
+      "E3+E2+6F" => "\x1E\xD1", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE
+      "E3+E4+41" => "\x1E\xAA", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+      "E3+E4+45" => "\x1E\xC4", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+      "E3+E4+4F" => "\x1E\xD6", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+      "E3+E4+61" => "\x1E\xAB", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+      "E3+E4+65" => "\x1E\xC5", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+      "E3+E4+6F" => "\x1E\xD7", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+      "E3+F2+41" => "\x1E\xAC", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+      "E3+F2+45" => "\x1E\xC6", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+      "E3+F2+4F" => "\x1E\xD8", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+      "E3+F2+61" => "\x1E\xAD", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
+      "E3+F2+65" => "\x1E\xC7", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
+      "E3+F2+6F" => "\x1E\xD9", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
+      "E3" => "\x03\x02", # COMBINING CIRCUMFLEX ACCENT
+      "E4+41" => "\x00\xC3", # LATIN CAPITAL LETTER A WITH TILDE
+      "E4+45" => "\x1E\xBC", # LATIN CAPITAL LETTER E WITH TILDE
+      "E4+49" => "\x01\x28", # LATIN CAPITAL LETTER I WITH TILDE
+      "E4+4E" => "\x00\xD1", # LATIN CAPITAL LETTER N WITH TILDE
+      "E4+4F" => "\x00\xD5", # LATIN CAPITAL LETTER O WITH TILDE
+      "E4+55" => "\x01\x68", # LATIN CAPITAL LETTER U WITH TILDE
+      "E4+56" => "\x1E\x7C", # LATIN CAPITAL LETTER V WITH TILDE
+      "E4+59" => "\x1E\xF8", # LATIN CAPITAL LETTER Y WITH TILDE
+      "E4+61" => "\x00\xE3", # LATIN SMALL LETTER A WITH TILDE
+      "E4+65" => "\x1E\xBD", # LATIN SMALL LETTER E WITH TILDE
+      "E4+69" => "\x01\x29", # LATIN SMALL LETTER I WITH TILDE
+      "E4+6E" => "\x00\xF1", # LATIN SMALL LETTER N WITH TILDE
+      "E4+6F" => "\x00\xF5", # LATIN SMALL LETTER O WITH TILDE
+      "E4+75" => "\x01\x69", # LATIN SMALL LETTER U WITH TILDE
+      "E4+76" => "\x1E\x7D", # LATIN SMALL LETTER V WITH TILDE
+      "E4+79" => "\x1E\xF9", # LATIN SMALL LETTER Y WITH TILDE
+      "E4+E2+4F" => "\x1E\x4C", # LATIN CAPITAL LETTER O WITH TILDE AND ACUTE
+      "E4+E2+55" => "\x1E\x78", # LATIN CAPITAL LETTER U WITH TILDE AND ACUTE
+      "E4+E2+6F" => "\x1E\x4D", # LATIN SMALL LETTER O WITH TILDE AND ACUTE
+      "E4+E2+75" => "\x1E\x79", # LATIN SMALL LETTER U WITH TILDE AND ACUTE
+      "E4+E3+41" => "\x1E\xAA", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND TILDE
+      "E4+E3+45" => "\x1E\xC4", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND TILDE
+      "E4+E3+4F" => "\x1E\xD6", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND TILDE
+      "E4+E3+61" => "\x1E\xAB", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE
+      "E4+E3+65" => "\x1E\xC5", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE
+      "E4+E3+6F" => "\x1E\xD7", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE
+      "E4+E6+41" => "\x1E\xB4", # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+      "E4+E6+61" => "\x1E\xB5", # LATIN SMALL LETTER A WITH BREVE AND TILDE
+      "E4+E8+4F" => "\x1E\x4E", # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
+      "E4+E8+6F" => "\x1E\x4F", # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
+      "E4" => "\x03\x03", # COMBINING TILDE
+      "E5+41" => "\x01\x00", # LATIN CAPITAL LETTER A WITH MACRON
+      "E5+45" => "\x01\x12", # LATIN CAPITAL LETTER E WITH MACRON
+      "E5+47" => "\x1E\x20", # LATIN CAPITAL LETTER G WITH MACRON
+      "E5+49" => "\x01\x2A", # LATIN CAPITAL LETTER I WITH MACRON
+      "E5+4F" => "\x01\x4C", # LATIN CAPITAL LETTER O WITH MACRON
+      "E5+55" => "\x01\x6A", # LATIN CAPITAL LETTER U WITH MACRON
+      "E5+61" => "\x01\x01", # LATIN SMALL LETTER A WITH MACRON
+      "E5+65" => "\x01\x13", # LATIN SMALL LETTER E WITH MACRON
+      "E5+67" => "\x1E\x21", # LATIN SMALL LETTER G WITH MACRON
+      "E5+69" => "\x01\x2B", # LATIN SMALL LETTER I WITH MACRON
+      "E5+6F" => "\x01\x4D", # LATIN SMALL LETTER O WITH MACRON
+      "E5+75" => "\x01\x6B", # LATIN SMALL LETTER U WITH MACRON
+      "E5+A5" => "\x01\xE2", # LATIN CAPITAL LETTER AE WITH MACRON
+      "E5+B5" => "\x01\xE3", # LATIN SMALL LETTER AE WITH MACRON
+      "E5+E1+45" => "\x1E\x14", # LATIN CAPITAL LETTER E WITH MACRON AND GRAVE
+      "E5+E1+4F" => "\x1E\x50", # LATIN CAPITAL LETTER O WITH MACRON AND GRAVE
+      "E5+E1+65" => "\x1E\x15", # LATIN SMALL LETTER E WITH MACRON AND GRAVE
+      "E5+E1+6F" => "\x1E\x51", # LATIN SMALL LETTER O WITH MACRON AND GRAVE
+      "E5+E2+45" => "\x1E\x16", # LATIN CAPITAL LETTER E WITH MACRON AND ACUTE
+      "E5+E2+4F" => "\x1E\x52", # LATIN CAPITAL LETTER O WITH MACRON AND ACUTE
+      "E5+E2+65" => "\x1E\x17", # LATIN SMALL LETTER E WITH MACRON AND ACUTE
+      "E5+E2+6F" => "\x1E\x53", # LATIN SMALL LETTER O WITH MACRON AND ACUTE
+      "E5+E7+41" => "\x01\xE0", # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
+      "E5+E7+61" => "\x01\xE1", # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
+      "E5+E8+41" => "\x01\xDE", # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
+      "E5+E8+55" => "\x1E\x7A", # LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON
+      "E5+E8+61" => "\x01\xDF", # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
+      "E5+E8+75" => "\x1E\x7B", # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON
+      "E5+F1+4F" => "\x01\xEC", # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
+      "E5+F1+6F" => "\x01\xED", # LATIN SMALL LETTER O WITH OGONEK AND MACRON
+      "E5+F2+4C" => "\x1E\x38", # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
+      "E5+F2+52" => "\x1E\x5C", # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
+      "E5+F2+6C" => "\x1E\x39", # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
+      "E5+F2+72" => "\x1E\x5D", # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
+      "E5" => "\x03\x04", # COMBINING MACRON
+      "E6+41" => "\x01\x02", # LATIN CAPITAL LETTER A WITH BREVE
+      "E6+45" => "\x01\x14", # LATIN CAPITAL LETTER E WITH BREVE
+      "E6+47" => "\x01\x1E", # LATIN CAPITAL LETTER G WITH BREVE
+      "E6+49" => "\x01\x2C", # LATIN CAPITAL LETTER I WITH BREVE
+      "E6+4F" => "\x01\x4E", # LATIN CAPITAL LETTER O WITH BREVE
+      "E6+55" => "\x01\x6C", # LATIN CAPITAL LETTER U WITH BREVE
+      "E6+61" => "\x01\x03", # LATIN SMALL LETTER A WITH BREVE
+      "E6+65" => "\x01\x15", # LATIN SMALL LETTER E WITH BREVE
+      "E6+67" => "\x01\x1F", # LATIN SMALL LETTER G WITH BREVE
+      "E6+69" => "\x01\x2D", # LATIN SMALL LETTER I WITH BREVE
+      "E6+6F" => "\x01\x4F", # LATIN SMALL LETTER O WITH BREVE
+      "E6+75" => "\x01\x6D", # LATIN SMALL LETTER U WITH BREVE
+      "E6+E0+41" => "\x1E\xB2", # LATIN CAPITAL LETTER A WITH BREVE AND HOOK ABOVE
+      "E6+E0+61" => "\x1E\xB3", # LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE
+      "E6+E1+41" => "\x1E\xB0", # LATIN CAPITAL LETTER A WITH BREVE AND GRAVE
+      "E6+E1+61" => "\x1E\xB1", # LATIN SMALL LETTER A WITH BREVE AND GRAVE
+      "E6+E2+41" => "\x1E\xAE", # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
+      "E6+E2+61" => "\x1E\xAF", # LATIN SMALL LETTER A WITH BREVE AND ACUTE
+      "E6+E4+41" => "\x1E\xB4", # LATIN CAPITAL LETTER A WITH BREVE AND TILDE
+      "E6+E4+61" => "\x1E\xB5", # LATIN SMALL LETTER A WITH BREVE AND TILDE
+      "E6+F0+45" => "\x1E\x1C", # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
+      "E6+F0+65" => "\x1E\x1D", # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
+      "E6+F2+41" => "\x1E\xB6", # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
+      "E6+F2+61" => "\x1E\xB7", # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
+      "E6" => "\x03\x06", # COMBINING BREVE
"E7+42" => "\x1E\x02", # LATIN CAPITAL LETTER B WITH DOT ABOVE
"E7+43" => "\x01\x0A", # LATIN CAPITAL LETTER C WITH DOT ABOVE
"E7+44" => "\x1E\x0A", # LATIN CAPITAL LETTER D WITH DOT ABOVE
"E7+45" => "\x01\x16", # LATIN CAPITAL LETTER E WITH DOT ABOVE
"E7+46" => "\x1E\x1E", # LATIN CAPITAL LETTER F WITH DOT ABOVE
"E7+47" => "\x01\x20", # LATIN CAPITAL LETTER G WITH DOT ABOVE
"E7+48" => "\x1E\x22", # LATIN CAPITAL LETTER H WITH DOT ABOVE
"E7+49" => "\x01\x30", # LATIN CAPITAL LETTER I WITH DOT ABOVE
"E7+4D" => "\x1E\x40", # LATIN CAPITAL LETTER M WITH DOT ABOVE
"E7+4E" => "\x1E\x44", # LATIN CAPITAL LETTER N WITH DOT ABOVE
"E7+50" => "\x1E\x56", # LATIN CAPITAL LETTER P WITH DOT ABOVE
"E7+52" => "\x1E\x58", # LATIN CAPITAL LETTER R WITH DOT ABOVE
"E7+53" => "\x1E\x60", # LATIN CAPITAL LETTER S WITH DOT ABOVE
"E7+54" => "\x1E\x6A", # LATIN CAPITAL LETTER T WITH DOT ABOVE
"E7+57" => "\x1E\x86", # LATIN CAPITAL LETTER W WITH DOT ABOVE
"E7+58" => "\x1E\x8A", # LATIN CAPITAL LETTER X WITH DOT ABOVE
"E7+59" => "\x1E\x8E", # LATIN CAPITAL LETTER Y WITH DOT ABOVE
"E7+5A" => "\x01\x7B", # LATIN CAPITAL LETTER Z WITH DOT ABOVE
"E7+62" => "\x1E\x03", # LATIN SMALL LETTER B WITH DOT ABOVE
"E7+63" => "\x01\x0B", # LATIN SMALL LETTER C WITH DOT ABOVE
"E7+64" => "\x1E\x0B", # LATIN SMALL LETTER D WITH DOT ABOVE
"E7+65" => "\x01\x17", # LATIN SMALL LETTER E WITH DOT ABOVE
"E7+66" => "\x1E\x1F", # LATIN SMALL LETTER F WITH DOT ABOVE
"E7+67" => "\x01\x21", # LATIN SMALL LETTER G WITH DOT ABOVE
"E7+68" => "\x1E\x23", # LATIN SMALL LETTER H WITH DOT ABOVE
"E7+6D" => "\x1E\x41", # LATIN SMALL LETTER M WITH DOT ABOVE
"E7+6E" => "\x1E\x45", # LATIN SMALL LETTER N WITH DOT ABOVE
"E7+70" => "\x1E\x57", # LATIN SMALL LETTER P WITH DOT ABOVE
"E7+72" => "\x1E\x59", # LATIN SMALL LETTER R WITH DOT ABOVE
"E7+73" => "\x1E\x61", # LATIN SMALL LETTER S WITH DOT ABOVE
"E7+74" => "\x1E\x6B", # LATIN SMALL LETTER T WITH DOT ABOVE
"E7+77" => "\x1E\x87", # LATIN SMALL LETTER W WITH DOT ABOVE
"E7+78" => "\x1E\x8B", # LATIN SMALL LETTER X WITH DOT ABOVE
"E7+79" => "\x1E\x8F", # LATIN SMALL LETTER Y WITH DOT ABOVE
"E7+7A" => "\x01\x7C", # LATIN SMALL LETTER Z WITH DOT ABOVE
"E7+E2+53" => "\x1E\x64", # LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE
"E7+E2+73" => "\x1E\x65", # LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE
"E7+E5+41" => "\x01\xE0", # LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON
"E7+E5+61" => "\x01\xE1", # LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON
"E7+E9+53" => "\x1E\x66", # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
"E7+E9+73" => "\x1E\x67", # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
"E7+F2+53" => "\x1E\x68", # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
"E7+F2+73" => "\x1E\x69", # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
"E7" => "\x03\x07", # COMBINING DOT ABOVE
"E8+41" => "\x00\xC4", # LATIN CAPITAL LETTER A WITH DIAERESIS
"E8+45" => "\x00\xCB", # LATIN CAPITAL LETTER E WITH DIAERESIS
"E8+48" => "\x1E\x26", # LATIN CAPITAL LETTER H WITH DIAERESIS
"E8+49" => "\x00\xCF", # LATIN CAPITAL LETTER I WITH DIAERESIS
"E8+4F" => "\x00\xD6", # LATIN CAPITAL LETTER O WITH DIAERESIS
"E8+55" => "\x00\xDC", # LATIN CAPITAL LETTER U WITH DIAERESIS
"E8+57" => "\x1E\x84", # LATIN CAPITAL LETTER W WITH DIAERESIS
"E8+58" => "\x1E\x8C", # LATIN CAPITAL LETTER X WITH DIAERESIS
"E8+59" => "\x01\x78", # LATIN CAPITAL LETTER Y WITH DIAERESIS
"E8+61" => "\x00\xE4", # LATIN SMALL LETTER A WITH DIAERESIS
"E8+65" => "\x00\xEB", # LATIN SMALL LETTER E WITH DIAERESIS
"E8+68" => "\x1E\x27", # LATIN SMALL LETTER H WITH DIAERESIS
"E8+69" => "\x00\xEF", # LATIN SMALL LETTER I WITH DIAERESIS
"E8+6F" => "\x00\xF6", # LATIN SMALL LETTER O WITH DIAERESIS
"E8+74" => "\x1E\x97", # LATIN SMALL LETTER T WITH DIAERESIS
"E8+75" => "\x00\xFC", # LATIN SMALL LETTER U WITH DIAERESIS
"E8+77" => "\x1E\x85", # LATIN SMALL LETTER W WITH DIAERESIS
"E8+78" => "\x1E\x8D", # LATIN SMALL LETTER X WITH DIAERESIS
"E8+79" => "\x00\xFF", # LATIN SMALL LETTER Y WITH DIAERESIS
"E8+E1+55" => "\x01\xDB", # LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE
"E8+E1+75" => "\x01\xDC", # LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE
"E8+E2+49" => "\x1E\x2E", # LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE
"E8+E2+55" => "\x01\xD7", # LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE
"E8+E2+69" => "\x1E\x2F", # LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE
"E8+E2+75" => "\x01\xD8", # LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE
"E8+E4+4F" => "\x1E\x4E", # LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS
"E8+E4+6F" => "\x1E\x4F", # LATIN SMALL LETTER O WITH TILDE AND DIAERESIS
"E8+E5+41" => "\x01\xDE", # LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON
"E8+E5+55" => "\x1E\x7A", # LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS
"E8+E5+61" => "\x01\xDF", # LATIN SMALL LETTER A WITH DIAERESIS AND MACRON
"E8+E5+75" => "\x1E\x7B", # LATIN SMALL LETTER U WITH MACRON AND DIAERESIS
"E8+E9+55" => "\x01\xD9", # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
"E8+E9+75" => "\x01\xDA", # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
"E8" => "\x03\x08", # COMBINING DIAERESIS
"E9+41" => "\x01\xCD", # LATIN CAPITAL LETTER A WITH CARON
"E9+43" => "\x01\x0C", # LATIN CAPITAL LETTER C WITH CARON
"E9+44" => "\x01\x0E", # LATIN CAPITAL LETTER D WITH CARON
"E9+45" => "\x01\x1A", # LATIN CAPITAL LETTER E WITH CARON
"E9+47" => "\x01\xE6", # LATIN CAPITAL LETTER G WITH CARON
"E9+49" => "\x01\xCF", # LATIN CAPITAL LETTER I WITH CARON
"E9+4B" => "\x01\xE8", # LATIN CAPITAL LETTER K WITH CARON
"E9+4C" => "\x01\x3D", # LATIN CAPITAL LETTER L WITH CARON
"E9+4E" => "\x01\x47", # LATIN CAPITAL LETTER N WITH CARON
"E9+4F" => "\x01\xD1", # LATIN CAPITAL LETTER O WITH CARON
"E9+52" => "\x01\x58", # LATIN CAPITAL LETTER R WITH CARON
"E9+53" => "\x01\x60", # LATIN CAPITAL LETTER S WITH CARON
"E9+54" => "\x01\x64", # LATIN CAPITAL LETTER T WITH CARON
"E9+55" => "\x01\xD3", # LATIN CAPITAL LETTER U WITH CARON
"E9+5A" => "\x01\x7D", # LATIN CAPITAL LETTER Z WITH CARON
"E9+61" => "\x01\xCE", # LATIN SMALL LETTER A WITH CARON
"E9+63" => "\x01\x0D", # LATIN SMALL LETTER C WITH CARON
"E9+64" => "\x01\x0F", # LATIN SMALL LETTER D WITH CARON
"E9+65" => "\x01\x1B", # LATIN SMALL LETTER E WITH CARON
"E9+67" => "\x01\xE7", # LATIN SMALL LETTER G WITH CARON
"E9+69" => "\x01\xD0", # LATIN SMALL LETTER I WITH CARON
"E9+6A" => "\x01\xF0", # LATIN SMALL LETTER J WITH CARON
"E9+6B" => "\x01\xE9", # LATIN SMALL LETTER K WITH CARON
"E9+6C" => "\x01\x3E", # LATIN SMALL LETTER L WITH CARON
"E9+6E" => "\x01\x48", # LATIN SMALL LETTER N WITH CARON
"E9+6F" => "\x01\xD2", # LATIN SMALL LETTER O WITH CARON
"E9+72" => "\x01\x59", # LATIN SMALL LETTER R WITH CARON
"E9+73" => "\x01\x61", # LATIN SMALL LETTER S WITH CARON
"E9+74" => "\x01\x65", # LATIN SMALL LETTER T WITH CARON
"E9+75" => "\x01\xD4", # LATIN SMALL LETTER U WITH CARON
"E9+7A" => "\x01\x7E", # LATIN SMALL LETTER Z WITH CARON
"E9+E7+53" => "\x1E\x66", # LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE
"E9+E7+73" => "\x1E\x67", # LATIN SMALL LETTER S WITH CARON AND DOT ABOVE
"E9+E8+55" => "\x01\xD9", # LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON
"E9+E8+75" => "\x01\xDA", # LATIN SMALL LETTER U WITH DIAERESIS AND CARON
"E9" => "\x03\x0C", # COMBINING CARON
"EA+41" => "\x00\xC5", # LATIN CAPITAL LETTER A WITH RING ABOVE
"EA+55" => "\x01\x6E", # LATIN CAPITAL LETTER U WITH RING ABOVE
"EA+61" => "\x00\xE5", # LATIN SMALL LETTER A WITH RING ABOVE
"EA+75" => "\x01\x6F", # LATIN SMALL LETTER U WITH RING ABOVE
"EA+77" => "\x1E\x98", # LATIN SMALL LETTER W WITH RING ABOVE
"EA+79" => "\x1E\x99", # LATIN SMALL LETTER Y WITH RING ABOVE
"EA+E2+41" => "\x01\xFA", # LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE
"EA+E2+61" => "\x01\xFB", # LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE
"EA" => "\x03\x0A", # COMBINING RING ABOVE
"EB" => "\xFE\x20", # COMBINING LIGATURE LEFT HALF
"EC" => "\xFE\x21", # COMBINING LIGATURE RIGHT HALF
"ED" => "\x03\x15", # COMBINING COMMA ABOVE RIGHT
"EE+4F" => "\x01\x50", # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
"EE+55" => "\x01\x70", # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
"EE+6F" => "\x01\x51", # LATIN SMALL LETTER O WITH DOUBLE ACUTE
"EE+75" => "\x01\x71", # LATIN SMALL LETTER U WITH DOUBLE ACUTE
"EE" => "\x03\x0B", # COMBINING DOUBLE ACUTE ACCENT
"EF" => "\x03\x10", # COMBINING CANDRABINDU
"F0+43" => "\x00\xC7", # LATIN CAPITAL LETTER C WITH CEDILLA
"F0+44" => "\x1E\x10", # LATIN CAPITAL LETTER D WITH CEDILLA
"F0+47" => "\x01\x22", # LATIN CAPITAL LETTER G WITH CEDILLA
"F0+48" => "\x1E\x28", # LATIN CAPITAL LETTER H WITH CEDILLA
"F0+4B" => "\x01\x36", # LATIN CAPITAL LETTER K WITH CEDILLA
"F0+4C" => "\x01\x3B", # LATIN CAPITAL LETTER L WITH CEDILLA
"F0+4E" => "\x01\x45", # LATIN CAPITAL LETTER N WITH CEDILLA
"F0+52" => "\x01\x56", # LATIN CAPITAL LETTER R WITH CEDILLA
"F0+53" => "\x01\x5E", # LATIN CAPITAL LETTER S WITH CEDILLA
"F0+54" => "\x01\x62", # LATIN CAPITAL LETTER T WITH CEDILLA
"F0+63" => "\x00\xE7", # LATIN SMALL LETTER C WITH CEDILLA
"F0+64" => "\x1E\x11", # LATIN SMALL LETTER D WITH CEDILLA
"F0+67" => "\x01\x23", # LATIN SMALL LETTER G WITH CEDILLA
"F0+68" => "\x1E\x29", # LATIN SMALL LETTER H WITH CEDILLA
"F0+6B" => "\x01\x37", # LATIN SMALL LETTER K WITH CEDILLA
"F0+6C" => "\x01\x3C", # LATIN SMALL LETTER L WITH CEDILLA
"F0+6E" => "\x01\x46", # LATIN SMALL LETTER N WITH CEDILLA
"F0+72" => "\x01\x57", # LATIN SMALL LETTER R WITH CEDILLA
"F0+73" => "\x01\x5F", # LATIN SMALL LETTER S WITH CEDILLA
"F0+74" => "\x01\x63", # LATIN SMALL LETTER T WITH CEDILLA
"F0+E2+43" => "\x1E\x08", # LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE
"F0+E2+63" => "\x1E\x09", # LATIN SMALL LETTER C WITH CEDILLA AND ACUTE
"F0+E6+45" => "\x1E\x1C", # LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE
"F0+E6+65" => "\x1E\x1D", # LATIN SMALL LETTER E WITH CEDILLA AND BREVE
"F0" => "\x03\x27", # COMBINING CEDILLA
"F1+41" => "\x01\x04", # LATIN CAPITAL LETTER A WITH OGONEK
"F1+45" => "\x01\x18", # LATIN CAPITAL LETTER E WITH OGONEK
"F1+49" => "\x01\x2E", # LATIN CAPITAL LETTER I WITH OGONEK
"F1+4F" => "\x01\xEA", # LATIN CAPITAL LETTER O WITH OGONEK
"F1+55" => "\x01\x72", # LATIN CAPITAL LETTER U WITH OGONEK
"F1+61" => "\x01\x05", # LATIN SMALL LETTER A WITH OGONEK
"F1+65" => "\x01\x19", # LATIN SMALL LETTER E WITH OGONEK
"F1+69" => "\x01\x2F", # LATIN SMALL LETTER I WITH OGONEK
"F1+6F" => "\x01\xEB", # LATIN SMALL LETTER O WITH OGONEK
"F1+75" => "\x01\x73", # LATIN SMALL LETTER U WITH OGONEK
"F1+E5+4F" => "\x01\xEC", # LATIN CAPITAL LETTER O WITH OGONEK AND MACRON
"F1+E5+6F" => "\x01\xED", # LATIN SMALL LETTER O WITH OGONEK AND MACRON
"F1" => "\x03\x28", # COMBINING OGONEK
"F2+41" => "\x1E\xA0", # LATIN CAPITAL LETTER A WITH DOT BELOW
"F2+42" => "\x1E\x04", # LATIN CAPITAL LETTER B WITH DOT BELOW
"F2+44" => "\x1E\x0C", # LATIN CAPITAL LETTER D WITH DOT BELOW
"F2+45" => "\x1E\xB8", # LATIN CAPITAL LETTER E WITH DOT BELOW
"F2+48" => "\x1E\x24", # LATIN CAPITAL LETTER H WITH DOT BELOW
"F2+49" => "\x1E\xCA", # LATIN CAPITAL LETTER I WITH DOT BELOW
"F2+4B" => "\x1E\x32", # LATIN CAPITAL LETTER K WITH DOT BELOW
"F2+4C" => "\x1E\x36", # LATIN CAPITAL LETTER L WITH DOT BELOW
"F2+4D" => "\x1E\x42", # LATIN CAPITAL LETTER M WITH DOT BELOW
"F2+4E" => "\x1E\x46", # LATIN CAPITAL LETTER N WITH DOT BELOW
"F2+4F" => "\x1E\xCC", # LATIN CAPITAL LETTER O WITH DOT BELOW
"F2+52" => "\x1E\x5A", # LATIN CAPITAL LETTER R WITH DOT BELOW
"F2+53" => "\x1E\x62", # LATIN CAPITAL LETTER S WITH DOT BELOW
"F2+54" => "\x1E\x6C", # LATIN CAPITAL LETTER T WITH DOT BELOW
"F2+55" => "\x1E\xE4", # LATIN CAPITAL LETTER U WITH DOT BELOW
"F2+56" => "\x1E\x7E", # LATIN CAPITAL LETTER V WITH DOT BELOW
"F2+57" => "\x1E\x88", # LATIN CAPITAL LETTER W WITH DOT BELOW
"F2+59" => "\x1E\xF4", # LATIN CAPITAL LETTER Y WITH DOT BELOW
"F2+5A" => "\x1E\x92", # LATIN CAPITAL LETTER Z WITH DOT BELOW
"F2+61" => "\x1E\xA1", # LATIN SMALL LETTER A WITH DOT BELOW
"F2+62" => "\x1E\x05", # LATIN SMALL LETTER B WITH DOT BELOW
"F2+64" => "\x1E\x0D", # LATIN SMALL LETTER D WITH DOT BELOW
"F2+65" => "\x1E\xB9", # LATIN SMALL LETTER E WITH DOT BELOW
"F2+68" => "\x1E\x25", # LATIN SMALL LETTER H WITH DOT BELOW
"F2+69" => "\x1E\xCB", # LATIN SMALL LETTER I WITH DOT BELOW
"F2+6B" => "\x1E\x33", # LATIN SMALL LETTER K WITH DOT BELOW
"F2+6C" => "\x1E\x37", # LATIN SMALL LETTER L WITH DOT BELOW
"F2+6D" => "\x1E\x43", # LATIN SMALL LETTER M WITH DOT BELOW
"F2+6E" => "\x1E\x47", # LATIN SMALL LETTER N WITH DOT BELOW
"F2+6F" => "\x1E\xCD", # LATIN SMALL LETTER O WITH DOT BELOW
"F2+72" => "\x1E\x5B", # LATIN SMALL LETTER R WITH DOT BELOW
"F2+73" => "\x1E\x63", # LATIN SMALL LETTER S WITH DOT BELOW
"F2+74" => "\x1E\x6D", # LATIN SMALL LETTER T WITH DOT BELOW
"F2+75" => "\x1E\xE5", # LATIN SMALL LETTER U WITH DOT BELOW
"F2+76" => "\x1E\x7F", # LATIN SMALL LETTER V WITH DOT BELOW
"F2+77" => "\x1E\x89", # LATIN SMALL LETTER W WITH DOT BELOW
"F2+79" => "\x1E\xF5", # LATIN SMALL LETTER Y WITH DOT BELOW
"F2+7A" => "\x1E\x93", # LATIN SMALL LETTER Z WITH DOT BELOW
"F2+E3+41" => "\x1E\xAC", # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW
"F2+E3+45" => "\x1E\xC6", # LATIN CAPITAL LETTER E WITH CIRCUMFLEX AND DOT BELOW
"F2+E3+4F" => "\x1E\xD8", # LATIN CAPITAL LETTER O WITH CIRCUMFLEX AND DOT BELOW
"F2+E3+61" => "\x1E\xAD", # LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW
"F2+E3+65" => "\x1E\xC7", # LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW
"F2+E3+6F" => "\x1E\xD9", # LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
"F2+E5+4C" => "\x1E\x38", # LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON
"F2+E5+52" => "\x1E\x5C", # LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON
"F2+E5+6C" => "\x1E\x39", # LATIN SMALL LETTER L WITH DOT BELOW AND MACRON
"F2+E5+72" => "\x1E\x5D", # LATIN SMALL LETTER R WITH DOT BELOW AND MACRON
"F2+E6+41" => "\x1E\xB6", # LATIN CAPITAL LETTER A WITH BREVE AND DOT BELOW
"F2+E6+61" => "\x1E\xB7", # LATIN SMALL LETTER A WITH BREVE AND DOT BELOW
"F2+E7+53" => "\x1E\x68", # LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE
"F2+E7+73" => "\x1E\x69", # LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE
"F2" => "\x03\x23", # COMBINING DOT BELOW
"F3+55" => "\x1E\x72", # LATIN CAPITAL LETTER U WITH DIAERESIS BELOW
"F3+75" => "\x1E\x73", # LATIN SMALL LETTER U WITH DIAERESIS BELOW
"F3" => "\x03\x24", # COMBINING DIAERESIS BELOW
"F4+41" => "\x1E\x00", # LATIN CAPITAL LETTER A WITH RING BELOW
"F4+61" => "\x1E\x01", # LATIN SMALL LETTER A WITH RING BELOW
"F4" => "\x03\x25", # COMBINING RING BELOW
"F5" => "\x03\x33", # COMBINING DOUBLE LOW LINE
"F6+42" => "\x1E\x06", # LATIN CAPITAL LETTER B WITH LINE BELOW
"F6+44" => "\x1E\x0E", # LATIN CAPITAL LETTER D WITH LINE BELOW
"F6+4B" => "\x1E\x34", # LATIN CAPITAL LETTER K WITH LINE BELOW
"F6+4C" => "\x1E\x3A", # LATIN CAPITAL LETTER L WITH LINE BELOW
"F6+4E" => "\x1E\x48", # LATIN CAPITAL LETTER N WITH LINE BELOW
"F6+52" => "\x1E\x5E", # LATIN CAPITAL LETTER R WITH LINE BELOW
"F6+54" => "\x1E\x6E", # LATIN CAPITAL LETTER T WITH LINE BELOW
"F6+5A" => "\x1E\x94", # LATIN CAPITAL LETTER Z WITH LINE BELOW
"F6+62" => "\x1E\x07", # LATIN SMALL LETTER B WITH LINE BELOW
"F6+64" => "\x1E\x0F", # LATIN SMALL LETTER D WITH LINE BELOW
"F6+68" => "\x1E\x96", # LATIN SMALL LETTER H WITH LINE BELOW
"F6+6B" => "\x1E\x35", # LATIN SMALL LETTER K WITH LINE BELOW
"F6+6C" => "\x1E\x3B", # LATIN SMALL LETTER L WITH LINE BELOW
"F6+6E" => "\x1E\x49", # LATIN SMALL LETTER N WITH LINE BELOW
"F6+72" => "\x1E\x5F", # LATIN SMALL LETTER R WITH LINE BELOW
"F6+74" => "\x1E\x6F", # LATIN SMALL LETTER T WITH LINE BELOW
"F6+7A" => "\x1E\x95", # LATIN SMALL LETTER Z WITH LINE BELOW
"F6" => "\x03\x32", # COMBINING LOW LINE
"F7" => "\x03\x26", # COMBINING COMMA BELOW
"F8" => "\x03\x21", # COMBINING PALATALIZED HOOK BELOW
"F9+48" => "\x1E\x2A", # LATIN CAPITAL LETTER H WITH BREVE BELOW
"F9+68" => "\x1E\x2B", # LATIN SMALL LETTER H WITH BREVE BELOW
"F9" => "\x03\x2E", # COMBINING BREVE BELOW
"FA" => "\xFE\x22", # COMBINING DOUBLE TILDE LEFT HALF
"FB" => "\xFE\x23" # COMBINING DOUBLE TILDE RIGHT HALF
}
end
end
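Each value in the table is a two-byte string holding one big-endian UTF-16 code unit, so `"\x01\x02"` stands for U+0102. A minimal sketch of how such a value becomes UTF-8, using three entries copied from the table. The names `SAMPLE_MAP` and `decode_entry` are illustrative assumptions and are not part of the gem.

```ruby
# Three entries copied from the table; SAMPLE_MAP is a hypothetical name.
SAMPLE_MAP = {
  "E6+41" => "\x01\x02", # LATIN CAPITAL LETTER A WITH BREVE
  "E8+75" => "\x00\xFC", # LATIN SMALL LETTER U WITH DIAERESIS
  "E6"    => "\x03\x06"  # COMBINING BREVE
}

# Relabel the raw bytes as UTF-16BE, then transcode to UTF-8.
# dup keeps the table entry itself untouched.
def decode_entry(map, key)
  map.fetch(key).dup.force_encoding('UTF-16BE').encode('UTF-8')
end

a_breve     = decode_entry(SAMPLE_MAP, "E6+41")
u_diaeresis = decode_entry(SAMPLE_MAP, "E8+75")
breve       = decode_entry(SAMPLE_MAP, "E6")
```

The keys are the uppercase hex of the ANSEL bytes joined with `+`, with the diacritic byte first and the base letter last.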
data/lib/ansel/converter.rb
ADDED
@@ -0,0 +1,50 @@
# encoding: ascii-8bit

module ANSEL
  class Converter
    include ANSEL::CharacterMap

    def initialize(to_charset = 'UTF-8')
      @to_charset = to_charset
    end

    def ansi_to_utf16
      @ansi_to_utf16 ||= @@non_combining.merge(@@combining)
    end

    def convert(string)
      output = ''
      scanner = StringScanner.new(string)
      until scanner.eos? do
        byte = scanner.get_byte
        char = byte.unpack('C')[0]

        case char
        when 0x00..0x7F # ASCII passes through unchanged
          output << byte.force_encoding('UTF-8')
        when 0x88..0xC8 # single-byte table lookup
          hex_key = char.to_s(16).upcase
          output << (ansi_to_utf16[hex_key] || ansi_to_utf16['ERR']).force_encoding('UTF-16BE').encode('UTF-8')
          scanner.get_byte # ignore the next byte
        when 0xE0..0xFB # diacritic lead byte
          [2, 1, 0].each do |n| # try 3 bytes, then 2 bytes, then 1 byte
            bytes = [char.to_s(16).upcase]
            scanner.peek(n).each_byte {|b| bytes << b.to_s(16).upcase}
            hex_key = bytes.join('+')
            if ansi_to_utf16.has_key?(hex_key)
              output << ansi_to_utf16[hex_key].force_encoding('UTF-16BE').encode('UTF-8')
              n.times {scanner.get_byte}
              break
            end
          end
        else
          output << ansi_to_utf16['ERR'].force_encoding('UTF-16BE').encode('UTF-8')
          next_byte = scanner.get_byte # nil at end of input
          scanner.get_byte if next_byte && next_byte.unpack('C')[0] >= 0xE0 # ignore the next byte
        end
      end

      output.force_encoding('UTF-8')
    end
  end
end
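The `0xE0..0xFB` branch of `convert` does a longest-match lookup: it builds a key from the lead byte plus two, one, then zero following bytes and takes the first one found in the table. A self-contained sketch of that idea follows. It uses a three-entry subset of the table, and `SAMPLE_TABLE`, `longest_match`, and `decode_sample` are hypothetical names. Substituting U+FFFD for unmatched bytes is an assumption of this sketch; the gem itself looks up an `'ERR'` entry.

```ruby
require 'strscan'

# Hypothetical three-entry subset of the combining table.
SAMPLE_TABLE = {
  "E6+E2+41" => "\x1E\xAE", # LATIN CAPITAL LETTER A WITH BREVE AND ACUTE
  "E6+41"    => "\x01\x02", # LATIN CAPITAL LETTER A WITH BREVE
  "E6"       => "\x03\x06"  # COMBINING BREVE
}

# Try the lead byte plus two, one, then zero following bytes. Return the
# first table hit as UTF-8 and advance the scanner past the bytes used.
def longest_match(scanner, lead, table)
  [2, 1, 0].each do |n|
    tail = scanner.peek(n)
    key = ([lead] + tail.bytes).map { |b| b.to_s(16).upcase }.join('+')
    next unless table.key?(key)
    scanner.pos += tail.bytesize
    return table[key].dup.force_encoding('UTF-16BE').encode('UTF-8')
  end
  nil
end

def decode_sample(bytes, table = SAMPLE_TABLE)
  scanner = StringScanner.new(bytes.b) # scan raw bytes, not characters
  out = +''
  until scanner.eos?
    lead = scanner.get_byte.getbyte(0)
    piece = if lead < 0x80
              lead.chr(Encoding::UTF_8)                 # ASCII passes through
            else
              longest_match(scanner, lead, table) || "\uFFFD" # assumption: U+FFFD on a miss
            end
    out << piece
  end
  out
end

precomposed = decode_sample([0x48, 0xE6, 0xE2, 0x41].pack('C*'))
```

With the gem installed, the corresponding call on the real table is `ANSEL::Converter.new.convert(string)`, as defined in the class above.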
metadata
ADDED
@@ -0,0 +1,57 @@
--- !ruby/object:Gem::Specification
name: ansel
version: !ruby/object:Gem::Version
  version: 2.0.0
platform: ruby
authors:
- Keith Morrison
autorequire:
bindir: bin
cert_chain: []
date: 2015-01-31 00:00:00.000000000 Z
dependencies: []
description: Convert ANSEL encoded text to UTF-8
email: keithm@infused.org
executables: []
extensions: []
extra_rdoc_files:
- README.md
- CHANGELOG.md
- MIT-LICENSE
files:
- CHANGELOG.md
- Gemfile
- Gemfile.lock
- MIT-LICENSE
- README.md
- Rakefile
- ansel.gemspec
- lib/ansel.rb
- lib/ansel/character_map.rb
- lib/ansel/converter.rb
- lib/ansel/version.rb
homepage: http://github.com/infused/ansel
licenses: []
metadata: {}
post_install_message:
rdoc_options:
- "--charset=UTF-8"
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: 1.3.0
requirements: []
rubyforge_project:
rubygems_version: 2.4.3
signing_key:
specification_version: 4
summary: Convert ANSEL encoded text to UTF-8
test_files: []