htmlentities 4.2.1

data/COPYING.txt ADDED
@@ -0,0 +1,21 @@
1
+ == Licence (MIT)
2
+
3
+ Copyright (c) 2005-2009 Paul Battley
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/History.txt ADDED
@@ -0,0 +1,66 @@
1
+ == 4.2.1 (2010-04-05)
2
+ * Fixed error on Ruby 1.8.x when $KCODE was not set to "UTF8". Thanks to
3
+ Benoit Larroque for the bug report.
4
+
5
+ == 4.2.0 (2009-08-24)
6
+ * Added benchmarking code and improved performance.
7
+
8
+ == 4.1.0 (2009-08-15)
9
+ * Now works with Ruby 1.9.1 and JRuby.
10
+ * Reverted lazy loading of entity mappings as this is not thread-safe.
11
+ * Finally removed the deprecated String#encode_entities and #decode_entities
12
+ methods.
13
+
14
+ == 4.0.1 (2008-06-03)
15
+ * Added :expanded charset -- the ~1000 SGML entities from
16
+ ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/SGML.TXT (extra code by
17
+ Philip (flip) Kromer <flip@infochimps.org>, entity info from John Cowan
18
+ <cowan@ccil.org>)
19
+
20
+ == 4.0.0 (2007-03-15)
21
+ * New instantiation-based interface (but legacy interface is preserved for
22
+ compatibility).
23
+ * Handles HTML4 as well as XHTML1 (the former lacks the &apos; entity).
24
+ * Encodes basic entities numerically when :basic isn't specified and :decimal
25
+ or :hexadecimal is.
26
+ * Performs a maximum of two gsub passes instead of three when encoding, which
27
+ should be more efficient on long strings.
28
+
29
+ == 3.1.0 (2007-01-19)
30
+ * Now understands all the entities referred to in the XHTML 1.0 DTD (253
31
+ entities compared with 131 in version 3.0.1).
32
+ * Calls to_s on parameters to play nicely with Rails 1.2.1.
33
+ * Entity mapping data is now lazily loaded.
34
+
35
+ == 3.0.1 (2005-04-08)
36
+ * Improved documentation.
37
+
38
+ == 3.0.0 (2005-04-08)
39
+ * Changed licence to MIT due to confusion with previous 'Fair' licence (my
40
+ intention was to be liberal, not obscure).
41
+ * Moved basic functionality out of String class; for previous behaviour,
42
+ require 'htmlentities/string'.
43
+ * Changed version numbering scheme.
44
+ * Now available as a Gem.
45
+
46
+ == 2.2 (2005-11-07)
47
+ * Important bug fixes -- thanks to Moonwolf.
48
+ * Decoding hexadecimal entities now accepts 'f' as a hex digit. (D'oh!)
49
+ * Decimal decoding edge cases addressed.
50
+ * Test cases added.
51
+
52
+ == 2.1 (2005-10-31)
53
+ * Removed some unnecessary code in basic entity encoding.
54
+ * Improved handling of encoding: commands are now automatically sorted, so the
55
+ user doesn't have to worry about their order.
56
+ * Now using setup.rb.
57
+ * Tests moved to separate file.
58
+
59
+ == 2.0 (2005-08-23)
60
+ * Added encoding to entities.
61
+ * Decoding interface unchanged.
62
+ * Fixed a bug with handling high codepoints.
63
+
64
+ == 1.0 (2005-08-03)
65
+ * Initial release.
66
+ * Decoding only.
data/README.rdoc ADDED
@@ -0,0 +1,44 @@
1
+ == HTMLEntities
2
+
3
+ HTML entity encoding and decoding for Ruby
4
+
5
+ The HTMLEntities module facilitates encoding and decoding of
6
+ (X)HTML entities from/to their corresponding UTF-8 codepoints.
7
+
8
+ To install (requires root/admin privileges):
9
+
10
+ ruby setup.rb
11
+
12
+ Alternatively, you can just use the gem.
13
+
14
+ == Licence
15
+
16
+ This code is free to use under the terms of the MIT licence:
17
+
18
+ Copyright (c) 2005-2009 Paul Battley
19
+
20
+ Permission is hereby granted, free of charge, to any person obtaining a copy
21
+ of this software and associated documentation files (the "Software"), to
22
+ deal in the Software without restriction, including without limitation the
23
+ rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
24
+ sell copies of the Software, and to permit persons to whom the Software is
25
+ furnished to do so, subject to the following conditions:
26
+
27
+ The above copyright notice and this permission notice shall be included in
28
+ all copies or substantial portions of the Software.
29
+
30
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
31
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
32
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
33
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
34
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
35
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
36
+ IN THE SOFTWARE.
37
+
38
+ If you'd like to negotiate a different licence for a specific use, just
39
+ contact me -- I'll almost certainly permit it.
40
+
41
+ == Contact
42
+
43
+ Comments are welcome. Send an email to pbattley@gmail.com.
44
+
@@ -0,0 +1,76 @@
1
+ # encoding: UTF-8
2
+ require 'htmlentities/legacy'
3
+ require 'htmlentities/flavors'
4
+ require 'htmlentities/encoder'
5
+ require 'htmlentities/decoder'
6
+ require 'htmlentities/version'
7
+
8
+ #
9
+ # HTML entity encoding and decoding for Ruby
10
+ #
11
+ class HTMLEntities
12
+ UnknownFlavor = Class.new(RuntimeError)
13
+
14
+ #
15
+ # Create a new HTMLEntities coder for the specified flavor.
16
+ # Available flavors are 'html4', 'expanded' and 'xhtml1' (the default).
17
+ #
18
+ # The only difference in functionality between html4 and xhtml1 is in the
19
+ # handling of the apos (apostrophe) named entity, which is not defined in
20
+ # HTML4.
21
+ #
22
+ # 'expanded' includes a large number of additional SGML entities drawn from
23
+ # ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/SGML.TXT
24
+ # which "maps SGML character entities from various public sets (namely, ISOamsa,
25
+ # ISOamsb, ISOamsc, ISOamsn, ISOamso, ISOamsr, ISObox, ISOcyr1, ISOcyr2,
26
+ # ISOdia, ISOgrk1, ISOgrk2, ISOgrk3, ISOgrk4, ISOlat1, ISOlat2, ISOnum,
27
+ # ISOpub, ISOtech, HTMLspecial, HTMLsymbol) to corresponding Unicode
28
+ # characters." (sgml.txt).
29
+ #
30
+ # 'expanded' is a strict superset of the XHTML entities: every xhtml named
31
+ # entity encodes and decodes the same under :expanded as under :xhtml1.
32
+ #
33
+ def initialize(flavor='xhtml1')
34
+ @flavor = flavor.to_s.downcase
35
+ raise UnknownFlavor, "Unknown flavor #{flavor}" unless FLAVORS.include?(@flavor)
36
+ end
37
+
38
+ #
39
+ # Decode entities in a string into their UTF-8
40
+ # equivalents. The string should already be in UTF-8 encoding.
41
+ #
42
+ # Unknown named entities will not be converted.
43
+ #
44
+ def decode(source)
45
+ (@decoder ||= Decoder.new(@flavor)).decode(source)
46
+ end
47
+
48
+ #
49
+ # Encode codepoints into their corresponding entities. Various operations
50
+ # are possible, and may be specified in order:
51
+ #
52
+ # :basic :: Convert the five XML entities ('"<>&)
53
+ # :named :: Convert non-ASCII characters to their named HTML 4.01 equivalent
54
+ # :decimal :: Convert non-ASCII characters to decimal entities (e.g. &#1234;)
55
+ # :hexadecimal :: Convert non-ASCII characters to hexadecimal entities (e.g. &#x12ab;)
56
+ #
57
+ # You can specify the commands in any order, but they will be executed in
58
+ # the order listed above to ensure that entity ampersands are not
59
+ # clobbered and that named entities are replaced before numeric ones.
60
+ #
61
+ # If no instructions are specified, :basic will be used.
62
+ #
63
+ # Examples:
64
+ # encode(str) - XML-safe
65
+ # encode(str, :basic, :decimal) - XML-safe and 7-bit clean
66
+ # encode(str, :basic, :named, :decimal) - 7-bit clean, with all
67
+ # non-ASCII characters replaced with their named entity where possible, and
68
+ # decimal equivalents otherwise.
69
+ #
70
+ # Note: It is the program's responsibility to ensure that the source
71
+ # contains valid UTF-8 before calling this method.
72
+ #
73
+ def encode(source, *instructions)
74
+ Encoder.new(@flavor, instructions).encode(source)
75
+ end
76
+ end
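The ordering rule described in the encode documentation above (instructions may be given in any order but run in a fixed one) can be illustrated with a small standalone sketch. It is not the gem's code: `INSTRUCTION_ORDER` and `normalize_instructions` are hypothetical names, and `ArgumentError` stands in for the gem's own `InstructionError`.

```ruby
# Canonical execution order, as listed in the encode documentation.
INSTRUCTION_ORDER = [:basic, :named, :decimal, :hexadecimal].freeze

# Hypothetical helper: validate the requested instructions and return
# them in canonical order, whatever order the caller used.
def normalize_instructions(requested)
  # With no instructions given, :basic is the documented default.
  requested = [:basic] if requested.empty?

  unknown = requested - INSTRUCTION_ORDER
  raise ArgumentError, "unknown instruction(s): #{unknown.inspect}" unless unknown.empty?

  if requested.include?(:decimal) && requested.include?(:hexadecimal)
    raise ArgumentError, 'hexadecimal and decimal encoding are mutually exclusive'
  end

  # Array#& keeps the order of the receiver, so this sorts the request.
  INSTRUCTION_ORDER & requested
end
```

Intersecting against the canonical list is one simple way to get a stable order without sorting by index; the gem's encoder uses the same intersection idea when it builds its extended encoder.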
@@ -0,0 +1,29 @@
1
+ class HTMLEntities
2
+ class Decoder #:nodoc:
3
+ def initialize(flavor)
4
+ @flavor = flavor
5
+ @map = HTMLEntities::MAPPINGS[@flavor]
6
+ @named_entity_regexp = named_entity_regexp
7
+ end
8
+
9
+ def decode(source)
10
+ source.to_s.gsub(@named_entity_regexp) {
11
+ (codepoint = @map[$1]) ? [codepoint].pack('U') : $&
12
+ }.gsub(/&#(?:([0-9]{1,7})|x([0-9a-f]{1,6}));/i) {
13
+ $1 ? [$1.to_i].pack('U') : [$2.to_i(16)].pack('U')
14
+ }
15
+ end
16
+
17
+ private
18
+ def named_entity_regexp
19
+ key_lengths = @map.keys.map{ |k| k.length }
20
+ entity_name_pattern =
21
+ if @flavor == 'expanded'
22
+ '(?:b\.)?[a-z][a-z0-9]'
23
+ else
24
+ '[a-z][a-z0-9]'
25
+ end
26
+ /&(#{ entity_name_pattern }{#{ key_lengths.min - 1 },#{ key_lengths.max - 1 }});/i
27
+ end
28
+ end
29
+ end
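As a rough illustration of the two-pass approach in the decoder above (named entities first, then numeric references), here is a self-contained sketch. `SAMPLE_MAP` and `decode_sketch` are hypothetical; the real mapping tables are far larger, and the real named-entity pattern is built from the key lengths of the chosen flavor.

```ruby
# Hypothetical, cut-down mapping of entity names to codepoints.
SAMPLE_MAP = { 'amp' => 0x26, 'lt' => 0x3c, 'gt' => 0x3e, 'eacute' => 0xe9 }.freeze

def decode_sketch(source, map = SAMPLE_MAP)
  # Pass 1: named entities. Names missing from the map are left untouched.
  named = source.to_s.gsub(/&([a-z][a-z0-9]*);/i) do
    codepoint = map[$1]
    codepoint ? [codepoint].pack('U') : $&
  end

  # Pass 2: decimal (&#233;) and hexadecimal (&#xE9;) references.
  named.gsub(/&#(?:([0-9]{1,7})|x([0-9a-f]{1,6}));/i) do
    $1 ? [$1.to_i].pack('U') : [$2.to_i(16)].pack('U')
  end
end
```

Array#pack with the 'U' directive turns a codepoint into its UTF-8 byte sequence, which is why both passes can share the same conversion step.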
@@ -0,0 +1,107 @@
1
+ class HTMLEntities
2
+ InstructionError = Class.new(RuntimeError)
3
+
4
+ class Encoder #:nodoc:
5
+ INSTRUCTIONS = [:basic, :named, :decimal, :hexadecimal]
6
+
7
+ def initialize(flavor, instructions)
8
+ @flavor = flavor
9
+ instructions = [:basic] if instructions.empty?
10
+ validate_instructions(instructions)
11
+ build_basic_entity_encoder(instructions)
12
+ build_extended_entity_encoder(instructions)
13
+ end
14
+
15
+ def encode(source)
16
+ string = source.to_s.dup
17
+ string.gsub!(basic_entity_regexp){ encode_basic($&) }
18
+ string.gsub!(extended_entity_regexp){ encode_extended($&) }
19
+ string
20
+ end
21
+
22
+ private
23
+ def basic_entity_regexp
24
+ @basic_entity_regexp ||= (
25
+ case @flavor
26
+ when /^html/
27
+ /[<>"&]/
28
+ else
29
+ /[<>'"&]/
30
+ end
31
+ )
32
+ end
33
+
34
+ def extended_entity_regexp
35
+ @extended_entity_regexp ||= (
36
+ regexp_options = [nil]
37
+ if encoding_aware?
38
+ regexp = '[^\u{20}-\u{7E}]'
39
+ else
40
+ regexp = '[^\x20-\x7E]'
41
+ regexp_options << "U"
42
+ end
43
+ regexp += "|'" if @flavor == 'html4'
44
+ Regexp.new(regexp, *regexp_options)
45
+ )
46
+ end
47
+
48
+ def validate_instructions(instructions)
49
+ unknown_instructions = instructions - INSTRUCTIONS
50
+ if unknown_instructions.any?
51
+ raise InstructionError, "unknown encode_entities command(s): #{unknown_instructions.inspect}"
52
+ end
53
+
54
+ if (instructions.include?(:decimal) && instructions.include?(:hexadecimal))
55
+ raise InstructionError, "hexadecimal and decimal encoding are mutually exclusive"
56
+ end
57
+ end
58
+
59
+ def build_basic_entity_encoder(instructions)
60
+ if instructions.include?(:basic) || instructions.include?(:named)
61
+ method = :encode_named
62
+ elsif instructions.include?(:decimal)
63
+ method = :encode_decimal
64
+ elsif instructions.include?(:hexadecimal)
65
+ method = :encode_hexadecimal
66
+ end
67
+ instance_eval "def encode_basic(char)\n#{method}(char)\nend"
68
+ end
69
+
70
+ def build_extended_entity_encoder(instructions)
71
+ definition = "def encode_extended(char)\n"
72
+ ([:named, :decimal, :hexadecimal] & instructions).each do |encoder|
73
+ definition << "encoded = encode_#{encoder}(char)\n"
74
+ definition << "return encoded if encoded\n"
75
+ end
76
+ definition << "char\n"
77
+ definition << "end"
78
+ instance_eval definition
79
+ end
80
+
81
+ def encode_named(char)
82
+ cp = char.unpack('U')[0]
83
+ (e = reverse_map[cp]) && "&#{e};"
84
+ end
85
+
86
+ def encode_decimal(char)
87
+ "&##{char.unpack('U')[0]};"
88
+ end
89
+
90
+ def encode_hexadecimal(char)
91
+ "&#x#{char.unpack('U')[0].to_s(16)};"
92
+ end
93
+
94
+ def reverse_map
95
+ @reverse_map ||= (
96
+ skips = HTMLEntities::SKIP_DUP_ENCODINGS[@flavor]
97
+ map = HTMLEntities::MAPPINGS[@flavor]
98
+ uniqmap = skips ? map.reject{|ent,hx| skips.include? ent} : map
99
+ uniqmap.invert
100
+ )
101
+ end
102
+
103
+ def encoding_aware?
104
+ "1.9".respond_to?(:encoding)
105
+ end
106
+ end
107
+ end
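The encoder above escapes the basic characters first and then walks the non-ASCII characters, trying a named entity before falling back to a numeric one. A minimal standalone sketch of that fallback chain, assuming a tiny hypothetical reverse mapping (`SAMPLE_REVERSE_MAP`) and an html4-style basic set without the apostrophe, might look like this:

```ruby
# Hypothetical, cut-down reverse mapping (codepoint => entity name).
SAMPLE_REVERSE_MAP = { 0xe9 => 'eacute', 0xa9 => 'copy' }.freeze

# Basic characters and their named entities (no apostrophe, as in html4).
BASIC_ENTITIES = { '<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;' }.freeze

def encode_sketch(source, reverse_map = SAMPLE_REVERSE_MAP)
  # Pass 1: basic entities first, so ampersands added in pass 2 are not re-escaped.
  string = source.to_s.gsub(/[<>"&]/) { |char| BASIC_ENTITIES[char] }

  # Pass 2: every character outside printable ASCII.
  string.gsub(/[^\u{20}-\u{7E}]/) do |char|
    codepoint = char.unpack('U')[0]
    name = reverse_map[codepoint]
    # Prefer the named entity; fall back to a decimal reference.
    name ? "&#{name};" : "&##{codepoint};"
  end
end
```

This corresponds roughly to calling the real encoder with :basic, :named and :decimal together; the sketch hard-codes that combination rather than generating methods per instruction.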
@@ -0,0 +1,9 @@
1
+ class HTMLEntities
2
+ FLAVORS = %w[html4 xhtml1 expanded]
3
+ MAPPINGS = {} unless defined? MAPPINGS
4
+ SKIP_DUP_ENCODINGS = {} unless defined? SKIP_DUP_ENCODINGS
5
+ end
6
+
7
+ HTMLEntities::FLAVORS.each do |flavor|
8
+ require "htmlentities/mappings/#{flavor}"
9
+ end
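The SKIP_DUP_ENCODINGS table declared above exists because several entity names can share one codepoint, so a plain inversion of the mapping would be ambiguous. The sketch below shows the idea on a hypothetical four-entry sample, assuming the skip list holds the names marked "dup skip" in the expanded table; `SAMPLE_MAPPINGS`, `SAMPLE_SKIPS` and `reverse_map_sketch` are illustrative names only.

```ruby
# Hypothetical sample: three names for U+03B1, one for U+03B2.
SAMPLE_MAPPINGS = {
  'alpha'   => 0x03b1,
  'agr'     => 0x03b1,
  'b.alpha' => 0x03b1,
  'beta'    => 0x03b2
}.freeze

# Names assumed to be excluded when encoding.
SAMPLE_SKIPS = %w[agr b.alpha].freeze

def reverse_map_sketch(map, skips)
  # Drop the skipped names, then swap keys and values so that each
  # codepoint points at exactly one entity name.
  map.reject { |name, _codepoint| skips.include?(name) }.invert
end
```

With the duplicates removed before inversion, the choice of name for each codepoint no longer depends on hash ordering.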
@@ -0,0 +1,31 @@
1
+ class HTMLEntities
2
+ class << self
3
+
4
+ #
5
+ # Legacy compatibility class method allowing direct encoding of XHTML1 entities.
6
+ # See HTMLEntities#encode for description of parameters.
7
+ #
8
+ # Deprecated.
9
+ #
10
+ def encode_entities(*args)
11
+ xhtml1_entities.encode(*args)
12
+ end
13
+
14
+ #
15
+ # Legacy compatibility class method allowing direct decoding of XHTML1 entities.
16
+ # See HTMLEntities#decode for description of parameters.
17
+ #
18
+ # Deprecated.
19
+ #
20
+ def decode_entities(*args)
21
+ xhtml1_entities.decode(*args)
22
+ end
23
+
24
+ private
25
+
26
+ def xhtml1_entities
27
+ @xhtml1_entities ||= new('xhtml1')
28
+ end
29
+
30
+ end
31
+ end
@@ -0,0 +1,1050 @@
1
+ # encoding: UTF-8
2
+ class HTMLEntities
3
+ #
4
+ # This table added by Philip (flip) Kromer <flip@infochimps.org>
5
+ # using the mapping by John Cowan <cowan@ccil.org> (25 July 1997) at
6
+ # ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MISC/SGML.TXT
7
+ #
8
+ # The following table maps SGML character entities from various
9
+ # public sets (namely, ISOamsa, ISOamsb, ISOamsc, ISOamsn, ISOamso,
10
+ # ISOamsr, ISObox, ISOcyr1, ISOcyr2, ISOdia, ISOgrk1, ISOgrk2,
11
+ # ISOgrk3, ISOgrk4, ISOlat1, ISOlat2, ISOnum, ISOpub, ISOtech,
12
+ # HTMLspecial, HTMLsymbol) to corresponding Unicode characters.
13
+ #
14
+ # The table has five tab-separated fields:
15
+ # :bare => SGML character entity name
16
+ # :hex => Unicode 2.0 character code
17
+ # :entity => SGML character entity
18
+ # :type => SGML public entity set
19
+ # :udesc => Unicode 2.0 character name (UPPER CASE)
20
+ #
21
+ # Entries which don't have Unicode equivalents have "0x????" for
22
+ # :hex and a lower case :udesc (from the public entity set DTD).
23
+ #
24
+ # For reasons I (flip) don't understand, the source file mapped
25
+ # &apos; to 0x02BC rather than its XML definition of 0x027. I've
26
+ # added a line specifying 0x027; the 'original' is commented out.
27
+ # http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
28
+ #
29
+ # The mapping is not reversible, because many distinctions are
30
+ # unified away in Unicode, particularly between mathematical
31
+ # symbols. To make it reversible, one symbol was arbitrarily chosen
32
+ # to encode from hex using these rules:
33
+ #
34
+ # * if it's also an XHTML 1.0 entity, use its XHTML reverse mapping.
35
+ # * otherwise, just use the first entity encountered,
36
+ # * avoiding the &b.foo; type entities
37
+ #
38
+ # The table is sorted case-blind by SGML character entity name.
39
+ #
40
+ # The contents of this table are drawn from various sources, and are
41
+ # in the public domain.
42
+ #
43
+ MAPPINGS['expanded'] = {
44
+ 'Aacgr' => 0x0386, # Ά GREEK CAPITAL LETTER ALPHA WITH TONOS
45
+ 'aacgr' => 0x03ac, # ά GREEK SMALL LETTER ALPHA WITH TONOS
46
+ 'Aacute' => 0x00c1, # Á xhtml LATIN CAPITAL LETTER A WITH ACUTE
47
+ 'aacute' => 0x00e1, # á xhtml LATIN SMALL LETTER A WITH ACUTE
48
+ 'Abreve' => 0x0102, # Ă LATIN CAPITAL LETTER A WITH BREVE
49
+ 'abreve' => 0x0103, # ă LATIN SMALL LETTER A WITH BREVE
50
+ 'Acirc' => 0x00c2, # Â xhtml LATIN CAPITAL LETTER A WITH CIRCUMFLEX
51
+ 'acirc' => 0x00e2, # â xhtml LATIN SMALL LETTER A WITH CIRCUMFLEX
52
+ 'acute' => 0x00b4, # ´ xhtml ACUTE ACCENT
53
+ 'Acy' => 0x0410, # А CYRILLIC CAPITAL LETTER A
54
+ 'acy' => 0x0430, # а CYRILLIC SMALL LETTER A
55
+ 'AElig' => 0x00c6, # Æ xhtml LATIN CAPITAL LETTER AE
56
+ 'aelig' => 0x00e6, # æ xhtml LATIN SMALL LETTER AE
57
+ 'Agr' => 0x0391, # Α dup skip GREEK CAPITAL LETTER ALPHA
58
+ 'agr' => 0x03b1, # α dup skip GREEK SMALL LETTER ALPHA
59
+ 'Agrave' => 0x00c0, # À xhtml LATIN CAPITAL LETTER A WITH GRAVE
60
+ 'agrave' => 0x00e0, # à xhtml LATIN SMALL LETTER A WITH GRAVE
61
+ 'alefsym' => 0x2135, # ℵ dup xhtml ALEF SYMBOL
62
+ 'aleph' => 0x2135, # ℵ dup skip ALEF SYMBOL
63
+ 'Alpha' => 0x0391, # Α dup xhtml GREEK CAPITAL LETTER ALPHA
64
+ 'alpha' => 0x03b1, # α dup xhtml GREEK SMALL LETTER ALPHA
65
+ 'Amacr' => 0x0100, # Ā LATIN CAPITAL LETTER A WITH MACRON
66
+ 'amacr' => 0x0101, # ā LATIN SMALL LETTER A WITH MACRON
67
+ 'amalg' => 0x2210, # ∐ dup N-ARY COPRODUCT
68
+ 'amp' => 0x0026, # & xhtml AMPERSAND
69
+ 'and' => 0x2227, # ∧ xhtml LOGICAL AND
70
+ 'ang' => 0x2220, # ∠ xhtml ANGLE
71
+ 'ang90' => 0x221f, # ∟ RIGHT ANGLE
72
+ 'angmsd' => 0x2221, # ∡ MEASURED ANGLE
73
+ 'angsph' => 0x2222, # ∢ SPHERICAL ANGLE
74
+ 'angst' => 0x212b, # Å ANGSTROM SIGN
75
+ 'Aogon' => 0x0104, # Ą LATIN CAPITAL LETTER A WITH OGONEK
76
+ 'aogon' => 0x0105, # ą LATIN SMALL LETTER A WITH OGONEK
77
+ 'ap' => 0x2248, # ≈ dup skip ALMOST EQUAL TO
78
+ 'ape' => 0x224a, # ≊ ALMOST EQUAL OR EQUAL TO
79
+ 'apos' => 0x0027, # ' xhtml APOSTROPHE
80
+ 'Aring' => 0x00c5, # Å xhtml LATIN CAPITAL LETTER A WITH RING ABOVE
81
+ 'aring' => 0x00e5, # å xhtml LATIN SMALL LETTER A WITH RING ABOVE
82
+ 'ast' => 0x002a, # * ASTERISK
83
+ 'asymp' => 0x2248, # ≈ dup xhtml ALMOST EQUAL TO
84
+ 'Atilde' => 0x00c3, # Ã xhtml LATIN CAPITAL LETTER A WITH TILDE
85
+ 'atilde' => 0x00e3, # ã xhtml LATIN SMALL LETTER A WITH TILDE
86
+ 'Auml' => 0x00c4, # Ä xhtml LATIN CAPITAL LETTER A WITH DIAERESIS
87
+ 'auml' => 0x00e4, # ä xhtml LATIN SMALL LETTER A WITH DIAERESIS
88
+ 'b.alpha' => 0x03b1, # α dup skip GREEK SMALL LETTER ALPHA
89
+ 'b.beta' => 0x03b2, # β dup skip GREEK SMALL LETTER BETA
90
+ 'b.chi' => 0x03c7, # χ dup skip GREEK SMALL LETTER CHI
91
+ 'b.Delta' => 0x0394, # Δ dup skip GREEK CAPITAL LETTER DELTA
92
+ 'b.delta' => 0x03b4, # δ dup skip GREEK SMALL LETTER DELTA
93
+ 'b.epsi' => 0x03b5, # ε dup skip GREEK SMALL LETTER EPSILON
94
+ 'b.epsis' => 0x03b5, # ε dup skip GREEK SMALL LETTER EPSILON
95
+ 'b.epsiv' => 0x03b5, # ε dup skip GREEK SMALL LETTER EPSILON
96
+ 'b.eta' => 0x03b7, # η dup skip GREEK SMALL LETTER ETA
97
+ 'b.Gamma' => 0x0393, # Γ dup skip GREEK CAPITAL LETTER GAMMA
98
+ 'b.gamma' => 0x03b3, # γ dup skip GREEK SMALL LETTER GAMMA
99
+ 'b.gammad' => 0x03dc, # Ϝ dup skip GREEK LETTER DIGAMMA
100
+ 'b.iota' => 0x03b9, # ι dup skip GREEK SMALL LETTER IOTA
101
+ 'b.kappa' => 0x03ba, # κ dup skip GREEK SMALL LETTER KAPPA
102
+ 'b.kappav' => 0x03f0, # ϰ dup skip GREEK KAPPA SYMBOL
103
+ 'b.Lambda' => 0x039b, # Λ dup skip GREEK CAPITAL LETTER LAMDA
104
+ 'b.lambda' => 0x03bb, # λ dup skip GREEK SMALL LETTER LAMDA
105
+ 'b.mu' => 0x03bc, # μ dup skip GREEK SMALL LETTER MU
106
+ 'b.nu' => 0x03bd, # ν dup skip GREEK SMALL LETTER NU
107
+ 'b.Omega' => 0x03a9, # Ω dup skip GREEK CAPITAL LETTER OMEGA
108
+ 'b.omega' => 0x03ce, # ώ dup skip GREEK SMALL LETTER OMEGA WITH TONOS
109
+ 'b.Phi' => 0x03a6, # Φ dup skip GREEK CAPITAL LETTER PHI
110
+ 'b.phis' => 0x03c6, # φ dup skip GREEK SMALL LETTER PHI
111
+ 'b.phiv' => 0x03d5, # ϕ dup skip GREEK PHI SYMBOL
112
+ 'b.Pi' => 0x03a0, # Π dup skip GREEK CAPITAL LETTER PI
113
+ 'b.pi' => 0x03c0, # π dup skip GREEK SMALL LETTER PI
114
+ 'b.piv' => 0x03d6, # ϖ dup skip GREEK PI SYMBOL
115
+ 'b.Psi' => 0x03a8, # Ψ dup skip GREEK CAPITAL LETTER PSI
116
+ 'b.psi' => 0x03c8, # ψ dup skip GREEK SMALL LETTER PSI
117
+ 'b.rho' => 0x03c1, # ρ dup skip GREEK SMALL LETTER RHO
118
+ 'b.rhov' => 0x03f1, # ϱ dup skip GREEK RHO SYMBOL
119
+ 'b.Sigma' => 0x03a3, # Σ dup skip GREEK CAPITAL LETTER SIGMA
120
+ 'b.sigma' => 0x03c3, # σ dup skip GREEK SMALL LETTER SIGMA
121
+ 'b.sigmav' => 0x03c2, # ς dup skip GREEK SMALL LETTER FINAL SIGMA
122
+ 'b.tau' => 0x03c4, # τ dup skip GREEK SMALL LETTER TAU
123
+ 'b.Theta' => 0x0398, # Θ dup skip GREEK CAPITAL LETTER THETA
124
+ 'b.thetas' => 0x03b8, # θ dup skip GREEK SMALL LETTER THETA
125
+ 'b.thetav' => 0x03d1, # ϑ dup skip GREEK THETA SYMBOL
126
+ 'b.Upsi' => 0x03a5, # Υ dup skip GREEK CAPITAL LETTER UPSILON
127
+ 'b.upsi' => 0x03c5, # υ dup skip GREEK SMALL LETTER UPSILON
128
+ 'b.Xi' => 0x039e, # Ξ dup skip GREEK CAPITAL LETTER XI
129
+ 'b.xi' => 0x03be, # ξ dup skip GREEK SMALL LETTER XI
130
+ 'b.zeta' => 0x03b6, # ζ dup skip GREEK SMALL LETTER ZETA
131
+ 'barwed' => 0x22bc, # ⊼ NAND
132
+ 'Barwed' => 0x2306, # ⌆ PERSPECTIVE
133
+ 'bcong' => 0x224c, # ≌ ALL EQUAL TO
134
+ 'Bcy' => 0x0411, # Б CYRILLIC CAPITAL LETTER BE
135
+ 'bcy' => 0x0431, # б CYRILLIC SMALL LETTER BE
136
+ 'bdquo' => 0x201e, # „ dup xhtml DOUBLE LOW-9 QUOTATION MARK
137
+ 'becaus' => 0x2235, # ∵ BECAUSE
138
+ 'bepsi' => 0x220d, # ∍ SMALL CONTAINS AS MEMBER
139
+ 'bernou' => 0x212c, # ℬ SCRIPT CAPITAL B
140
+ 'Beta' => 0x0392, # Β dup xhtml GREEK CAPITAL LETTER BETA
141
+ 'beta' => 0x03b2, # β dup xhtml GREEK SMALL LETTER BETA
142
+ 'beth' => 0x2136, # ℶ BET SYMBOL
143
+ 'Bgr' => 0x0392, # Β dup skip GREEK CAPITAL LETTER BETA
144
+ 'bgr' => 0x03b2, # β dup skip GREEK SMALL LETTER BETA
145
+ 'blank' => 0x2423, # ␣ OPEN BOX
146
+ 'blk12' => 0x2592, # ▒ MEDIUM SHADE
147
+ 'blk14' => 0x2591, # ░ LIGHT SHADE
148
+ 'blk34' => 0x2593, # ▓ DARK SHADE
149
+ 'block' => 0x2588, # █ FULL BLOCK
150
+ 'bottom' => 0x22a5, # ⊥ dup skip UP TACK
151
+ 'bowtie' => 0x22c8, # ⋈ BOWTIE
152
+ 'boxdl' => 0x2510, # ┐ BOX DRAWINGS LIGHT DOWN AND LEFT
153
+ 'boxdL' => 0x2555, # ╕ BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
154
+ 'boxDl' => 0x2556, # ╖ BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
155
+ 'boxDL' => 0x2557, # ╗ BOX DRAWINGS DOUBLE DOWN AND LEFT
156
+ 'boxdr' => 0x250c, # ┌ BOX DRAWINGS LIGHT DOWN AND RIGHT
157
+ 'boxdR' => 0x2552, # ╒ BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
158
+ 'boxDr' => 0x2553, # ╓ BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
159
+ 'boxDR' => 0x2554, # ╔ BOX DRAWINGS DOUBLE DOWN AND RIGHT
160
+ 'boxh' => 0x2500, # ─ BOX DRAWINGS LIGHT HORIZONTAL
161
+ 'boxH' => 0x2550, # ═ BOX DRAWINGS DOUBLE HORIZONTAL
162
+ 'boxhd' => 0x252c, # ┬ BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
163
+ 'boxHd' => 0x2564, # ╤ BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
164
+ 'boxhD' => 0x2565, # ╥ BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
165
+ 'boxHD' => 0x2566, # ╦ BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
166
+ 'boxhu' => 0x2534, # ┴ BOX DRAWINGS LIGHT UP AND HORIZONTAL
167
+ 'boxHu' => 0x2567, # ╧ BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
168
+ 'boxhU' => 0x2568, # ╨ BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
169
+ 'boxHU' => 0x2569, # ╩ BOX DRAWINGS DOUBLE UP AND HORIZONTAL
170
+ 'boxul' => 0x2518, # ┘ BOX DRAWINGS LIGHT UP AND LEFT
171
+ 'boxuL' => 0x255b, # ╛ BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
172
+ 'boxUl' => 0x255c, # ╜ BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
173
+ 'boxUL' => 0x255d, # ╝ BOX DRAWINGS DOUBLE UP AND LEFT
174
+ 'boxur' => 0x2514, # └ BOX DRAWINGS LIGHT UP AND RIGHT
175
+ 'boxuR' => 0x2558, # ╘ BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
176
+ 'boxUr' => 0x2559, # ╙ BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
177
+ 'boxUR' => 0x255a, # ╚ BOX DRAWINGS DOUBLE UP AND RIGHT
178
+ 'boxv' => 0x2502, # │ BOX DRAWINGS LIGHT VERTICAL
179
+ 'boxV' => 0x2551, # ║ BOX DRAWINGS DOUBLE VERTICAL
180
+ 'boxvh' => 0x253c, # ┼ BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
181
+ 'boxvH' => 0x256a, # ╪ BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
182
+ 'boxVh' => 0x256b, # ╫ BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
183
+ 'boxVH' => 0x256c, # ╬ BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
184
+ 'boxvl' => 0x2524, # ┤ BOX DRAWINGS LIGHT VERTICAL AND LEFT
185
+ 'boxvL' => 0x2561, # ╡ BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
186
+ 'boxVl' => 0x2562, # ╢ BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
187
+ 'boxVL' => 0x2563, # ╣ BOX DRAWINGS DOUBLE VERTICAL AND LEFT
188
+ 'boxvr' => 0x251c, # ├ BOX DRAWINGS LIGHT VERTICAL AND RIGHT
189
+ 'boxvR' => 0x255e, # ╞ BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
190
+ 'boxVr' => 0x255f, # ╟ BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
191
+ 'boxVR' => 0x2560, # ╠ BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
192
+ 'bprime' => 0x2035, # ‵ REVERSED PRIME
193
+ 'breve' => 0x02d8, # ˘ BREVE
194
+ 'brvbar' => 0x00a6, # ¦ xhtml BROKEN BAR
195
+ 'bsim' => 0x223d, # ∽ REVERSED TILDE
196
+ 'bsime' => 0x22cd, # ⋍ REVERSED TILDE EQUALS
197
+ 'bsol' => 0x005c, # \ dup REVERSE SOLIDUS
198
+ 'bull' => 0x2022, # • xhtml BULLET
199
+ 'bump' => 0x224e, # ≎ GEOMETRICALLY EQUIVALENT TO
200
+ 'bumpe' => 0x224f, # ≏ DIFFERENCE BETWEEN
201
+ 'Cacute' => 0x0106, # Ć LATIN CAPITAL LETTER C WITH ACUTE
202
+ 'cacute' => 0x0107, # ć LATIN SMALL LETTER C WITH ACUTE
203
+ 'cap' => 0x2229, # ∩ xhtml INTERSECTION
204
+ 'Cap' => 0x22d2, # ⋒ DOUBLE INTERSECTION
205
+ 'caret' => 0x2041, # ⁁ CARET INSERTION POINT
206
+ 'caron' => 0x02c7, # ˇ CARON
207
+ 'Ccaron' => 0x010c, # Č LATIN CAPITAL LETTER C WITH CARON
208
+ 'ccaron' => 0x010d, # č LATIN SMALL LETTER C WITH CARON
209
+ 'Ccedil' => 0x00c7, # Ç xhtml LATIN CAPITAL LETTER C WITH CEDILLA
210
+ 'ccedil' => 0x00e7, # ç xhtml LATIN SMALL LETTER C WITH CEDILLA
211
+ 'Ccirc' => 0x0108, # Ĉ LATIN CAPITAL LETTER C WITH CIRCUMFLEX
212
+ 'ccirc' => 0x0109, # ĉ LATIN SMALL LETTER C WITH CIRCUMFLEX
213
+ 'Cdot' => 0x010a, # Ċ LATIN CAPITAL LETTER C WITH DOT ABOVE
214
+ 'cdot' => 0x010b, # ċ LATIN SMALL LETTER C WITH DOT ABOVE
215
+ 'cedil' => 0x00b8, # ¸ xhtml CEDILLA
216
+ 'cent' => 0x00a2, # ¢ xhtml CENT SIGN
217
+ 'CHcy' => 0x0427, # Ч CYRILLIC CAPITAL LETTER CHE
218
+ 'chcy' => 0x0447, # ч CYRILLIC SMALL LETTER CHE
219
+ 'check' => 0x2713, # ✓ CHECK MARK
220
+ 'Chi' => 0x03a7, # Χ dup xhtml GREEK CAPITAL LETTER CHI
221
+ 'chi' => 0x03c7, # χ dup xhtml GREEK SMALL LETTER CHI
222
+ 'cir' => 0x25cb, # ○ dup WHITE CIRCLE
223
+ 'circ' => 0x02c6, # ˆ xhtml MODIFIER LETTER CIRCUMFLEX ACCENT
224
+ 'cire' => 0x2257, # ≗ RING EQUAL TO
225
+ 'clubs' => 0x2663, # ♣ xhtml BLACK CLUB SUIT
226
+ 'colon' => 0x003a, # : COLON
227
+ 'colone' => 0x2254, # ≔ COLON EQUALS
228
+ 'comma' => 0x002c, # , COMMA
229
+ 'commat' => 0x0040, # @ COMMERCIAL AT
230
+ 'comp' => 0x2201, # ∁ COMPLEMENT
231
+ 'compfn' => 0x2218, # ∘ RING OPERATOR
232
+ 'cong' => 0x2245, # ≅ xhtml APPROXIMATELY EQUAL TO
233
+ 'conint' => 0x222e, # ∮ CONTOUR INTEGRAL
234
+ 'coprod' => 0x2210, # ∐ dup skip N-ARY COPRODUCT
235
+ 'copy' => 0x00a9, # © xhtml COPYRIGHT SIGN
236
+ 'copysr' => 0x2117, # ℗ SOUND RECORDING COPYRIGHT
237
+ 'crarr' => 0x21b5, # ↵ xhtml DOWNWARDS ARROW WITH CORNER LEFTWARDS
238
+ 'cross' => 0x2717, # ✗ BALLOT X
239
+ 'cuepr' => 0x22de, # ⋞ EQUAL TO OR PRECEDES
240
+ 'cuesc' => 0x22df, # ⋟ EQUAL TO OR SUCCEEDS
241
+ 'cularr' => 0x21b6, # ↶ ANTICLOCKWISE TOP SEMICIRCLE ARROW
242
+ 'cup' => 0x222a, # ∪ xhtml UNION
243
+ 'Cup' => 0x22d3, # ⋓ DOUBLE UNION
244
+ 'cupre' => 0x227c, # ≼ dup PRECEDES OR EQUAL TO
245
+ 'curarr' => 0x21b7, # ↷ CLOCKWISE TOP SEMICIRCLE ARROW
246
+ 'curren' => 0x00a4, # ¤ xhtml CURRENCY SIGN
247
+ 'cuvee' => 0x22ce, # ⋎ CURLY LOGICAL OR
248
+ 'cuwed' => 0x22cf, # ⋏ CURLY LOGICAL AND
249
+ 'dagger' => 0x2020, # † xhtml DAGGER
250
+ 'Dagger' => 0x2021, # ‡ xhtml DOUBLE DAGGER
251
+ 'daleth' => 0x2138, # ℸ DALET SYMBOL
252
+ 'darr' => 0x2193, # ↓ xhtml DOWNWARDS ARROW
253
+ 'dArr' => 0x21d3, # ⇓ xhtml DOWNWARDS DOUBLE ARROW
254
+ 'darr2' => 0x21ca, # ⇊ DOWNWARDS PAIRED ARROWS
255
+ 'dash' => 0x2010, # ‐ HYPHEN
256
+ 'dashv' => 0x22a3, # ⊣ LEFT TACK
257
+ 'dblac' => 0x02dd, # ˝ DOUBLE ACUTE ACCENT
258
+ 'Dcaron' => 0x010e, # Ď LATIN CAPITAL LETTER D WITH CARON
259
+ 'dcaron' => 0x010f, # ď LATIN SMALL LETTER D WITH CARON
260
+ 'Dcy' => 0x0414, # Д CYRILLIC CAPITAL LETTER DE
261
+ 'dcy' => 0x0434, # д CYRILLIC SMALL LETTER DE
262
+ 'deg' => 0x00b0, # ° xhtml DEGREE SIGN
263
+ 'Delta' => 0x0394, # Δ dup xhtml GREEK CAPITAL LETTER DELTA
264
+ 'delta' => 0x03b4, # δ dup xhtml GREEK SMALL LETTER DELTA
265
+ 'Dgr' => 0x0394, # Δ dup skip GREEK CAPITAL LETTER DELTA
266
+ 'dgr' => 0x03b4, # δ dup skip GREEK SMALL LETTER DELTA
267
+ 'dharl' => 0x21c3, # ⇃ DOWNWARDS HARPOON WITH BARB LEFTWARDS
268
+ 'dharr' => 0x21c2, # ⇂ DOWNWARDS HARPOON WITH BARB RIGHTWARDS
269
+ 'diam' => 0x22c4, # ⋄ DIAMOND OPERATOR
270
+ 'diams' => 0x2666, # ♦ xhtml BLACK DIAMOND SUIT
271
+ 'die' => 0x00a8, # ¨ dup skip DIAERESIS
272
+ 'divide' => 0x00f7, # ÷ xhtml DIVISION SIGN
273
+ 'divonx' => 0x22c7, # ⋇ DIVISION TIMES
274
+ 'DJcy' => 0x0402, # Ђ CYRILLIC CAPITAL LETTER DJE
275
+ 'djcy' => 0x0452, # ђ CYRILLIC SMALL LETTER DJE
276
+ 'dlarr' => 0x2199, # ↙ SOUTH WEST ARROW
277
+ 'dlcorn' => 0x231e, # ⌞ BOTTOM LEFT CORNER
278
+ 'dlcrop' => 0x230d, # ⌍ BOTTOM LEFT CROP
279
+ 'dollar' => 0x0024, # $ DOLLAR SIGN
280
+ 'Dot' => 0x00a8, # ¨ dup skip DIAERESIS
281
+ 'dot' => 0x02d9, # ˙ DOT ABOVE
282
+ 'DotDot' => 0x20dc, # ⃜ COMBINING FOUR DOTS ABOVE
283
+ 'drarr' => 0x2198, # ↘ SOUTH EAST ARROW
284
+ 'drcorn' => 0x231f, # ⌟ BOTTOM RIGHT CORNER
285
+ 'drcrop' => 0x230c, # ⌌ BOTTOM RIGHT CROP
286
+ 'DScy' => 0x0405, # Ѕ CYRILLIC CAPITAL LETTER DZE
287
+ 'dscy' => 0x0455, # ѕ CYRILLIC SMALL LETTER DZE
288
+ 'Dstrok' => 0x0110, # Đ LATIN CAPITAL LETTER D WITH STROKE
289
+ 'dstrok' => 0x0111, # đ LATIN SMALL LETTER D WITH STROKE
290
+ 'dtri' => 0x25bf, # ▿ WHITE DOWN-POINTING SMALL TRIANGLE
291
+ 'dtrif' => 0x25be, # ▾ BLACK DOWN-POINTING SMALL TRIANGLE
292
+ 'DZcy' => 0x040f, # Џ CYRILLIC CAPITAL LETTER DZHE
293
+ 'dzcy' => 0x045f, # џ CYRILLIC SMALL LETTER DZHE
294
+ 'Eacgr' => 0x0388, # Έ GREEK CAPITAL LETTER EPSILON WITH TONOS
295
+ 'eacgr' => 0x03ad, # έ GREEK SMALL LETTER EPSILON WITH TONOS
296
+ 'Eacute' => 0x00c9, # É xhtml LATIN CAPITAL LETTER E WITH ACUTE
297
+ 'eacute' => 0x00e9, # é xhtml LATIN SMALL LETTER E WITH ACUTE
298
+ 'Ecaron' => 0x011a, # Ě LATIN CAPITAL LETTER E WITH CARON
299
+ 'ecaron' => 0x011b, # ě LATIN SMALL LETTER E WITH CARON
300
+ 'ecir' => 0x2256, # ≖ RING IN EQUAL TO
301
+ 'Ecirc' => 0x00ca, # Ê xhtml LATIN CAPITAL LETTER E WITH CIRCUMFLEX
302
+ 'ecirc' => 0x00ea, # ê xhtml LATIN SMALL LETTER E WITH CIRCUMFLEX
303
+ 'ecolon' => 0x2255, # ≕ EQUALS COLON
304
+ 'Ecy' => 0x042d, # Э CYRILLIC CAPITAL LETTER E
305
+ 'ecy' => 0x044d, # э CYRILLIC SMALL LETTER E
306
+ 'Edot' => 0x0116, # Ė LATIN CAPITAL LETTER E WITH DOT ABOVE
307
+ 'edot' => 0x0117, # ė LATIN SMALL LETTER E WITH DOT ABOVE
308
+ 'eDot' => 0x2251, # ≑ GEOMETRICALLY EQUAL TO
309
+ 'EEacgr' => 0x0389, # Ή GREEK CAPITAL LETTER ETA WITH TONOS
310
+ 'eeacgr' => 0x03ae, # ή GREEK SMALL LETTER ETA WITH TONOS
311
+ 'EEgr' => 0x0397, # Η dup skip GREEK CAPITAL LETTER ETA
312
+ 'eegr' => 0x03b7, # η dup skip GREEK SMALL LETTER ETA
313
+ 'efDot' => 0x2252, # ≒ APPROXIMATELY EQUAL TO OR THE IMAGE OF
314
+ 'Egr' => 0x0395, # Ε dup skip GREEK CAPITAL LETTER EPSILON
315
+ 'egr' => 0x03b5, # ε dup skip GREEK SMALL LETTER EPSILON
316
+ 'Egrave' => 0x00c8, # È xhtml LATIN CAPITAL LETTER E WITH GRAVE
317
+ 'egrave' => 0x00e8, # è xhtml LATIN SMALL LETTER E WITH GRAVE
318
+ 'egs' => 0x22dd, # ⋝ EQUAL TO OR GREATER-THAN
319
+ 'ell' => 0x2113, # ℓ SCRIPT SMALL L
320
+ 'els' => 0x22dc, # ⋜ EQUAL TO OR LESS-THAN
321
+ 'Emacr' => 0x0112, # Ē LATIN CAPITAL LETTER E WITH MACRON
322
+ 'emacr' => 0x0113, # ē LATIN SMALL LETTER E WITH MACRON
323
+ 'empty' => 0x2205, # ∅ xhtml EMPTY SET
324
+ 'emsp' => 0x2003, #   xhtml EM SPACE
325
+ 'emsp13' => 0x2004, #   THREE-PER-EM SPACE
326
+ 'emsp14' => 0x2005, #   FOUR-PER-EM SPACE
327
+ 'ENG' => 0x014a, # Ŋ LATIN CAPITAL LETTER ENG
328
+ 'eng' => 0x014b, # ŋ LATIN SMALL LETTER ENG
329
+ 'ensp' => 0x2002, #   xhtml EN SPACE
330
+ 'Eogon' => 0x0118, # Ę LATIN CAPITAL LETTER E WITH OGONEK
331
+ 'eogon' => 0x0119, # ę LATIN SMALL LETTER E WITH OGONEK
332
+ 'epsi' => 0x03b5, # ε dup skip GREEK SMALL LETTER EPSILON
333
+ 'Epsilon' => 0x0395, # Ε dup xhtml GREEK CAPITAL LETTER EPSILON
334
+ 'epsilon' => 0x03b5, # ε dup xhtml GREEK SMALL LETTER EPSILON
335
+ 'epsis' => 0x220a, # ∊ SMALL ELEMENT OF
336
+ 'equals' => 0x003d, # = EQUALS SIGN
337
+ 'equiv' => 0x2261, # ≡ xhtml IDENTICAL TO
338
+ 'erDot' => 0x2253, # ≓ IMAGE OF OR APPROXIMATELY EQUAL TO
339
+ 'esdot' => 0x2250, # ≐ APPROACHES THE LIMIT
340
+ 'Eta' => 0x0397, # Η dup xhtml GREEK CAPITAL LETTER ETA
341
+ 'eta' => 0x03b7, # η dup xhtml GREEK SMALL LETTER ETA
342
+ 'ETH' => 0x00d0, # Ð xhtml LATIN CAPITAL LETTER ETH
343
+ 'eth' => 0x00f0, # ð xhtml LATIN SMALL LETTER ETH
344
+ 'Euml' => 0x00cb, # Ë xhtml LATIN CAPITAL LETTER E WITH DIAERESIS
345
+ 'euml' => 0x00eb, # ë xhtml LATIN SMALL LETTER E WITH DIAERESIS
346
+ 'excl' => 0x0021, # ! EXCLAMATION MARK
347
+ 'exist' => 0x2203, # ∃ xhtml THERE EXISTS
348
+ 'Fcy' => 0x0424, # Ф CYRILLIC CAPITAL LETTER EF
349
+ 'fcy' => 0x0444, # ф CYRILLIC SMALL LETTER EF
350
+ 'female' => 0x2640, # ♀ FEMALE SIGN
351
+ 'ffilig' => 0xfb03, # ﬃ LATIN SMALL LIGATURE FFI
352
+ 'fflig' => 0xfb00, # ﬀ LATIN SMALL LIGATURE FF
353
+ 'ffllig' => 0xfb04, # ﬄ LATIN SMALL LIGATURE FFL
354
+ 'filig' => 0xfb01, # ﬁ LATIN SMALL LIGATURE FI
355
+ 'flat' => 0x266d, # ♭ MUSIC FLAT SIGN
356
+ 'fllig' => 0xfb02, # ﬂ LATIN SMALL LIGATURE FL
357
+ 'fnof' => 0x0192, # ƒ xhtml LATIN SMALL LETTER F WITH HOOK
358
+ 'forall' => 0x2200, # ∀ xhtml FOR ALL
359
+ 'fork' => 0x22d4, # ⋔ PITCHFORK
360
+ 'frac12' => 0x00bd, # ½ dup xhtml VULGAR FRACTION ONE HALF
361
+ 'frac13' => 0x2153, # ⅓ VULGAR FRACTION ONE THIRD
362
+ 'frac14' => 0x00bc, # ¼ xhtml VULGAR FRACTION ONE QUARTER
363
+ 'frac15' => 0x2155, # ⅕ VULGAR FRACTION ONE FIFTH
364
+ 'frac16' => 0x2159, # ⅙ VULGAR FRACTION ONE SIXTH
365
+ 'frac18' => 0x215b, # ⅛ VULGAR FRACTION ONE EIGHTH
366
+ 'frac23' => 0x2154, # ⅔ VULGAR FRACTION TWO THIRDS
367
+ 'frac25' => 0x2156, # ⅖ VULGAR FRACTION TWO FIFTHS
368
+ 'frac34' => 0x00be, # ¾ xhtml VULGAR FRACTION THREE QUARTERS
369
+ 'frac35' => 0x2157, # ⅗ VULGAR FRACTION THREE FIFTHS
370
+ 'frac38' => 0x215c, # ⅜ VULGAR FRACTION THREE EIGHTHS
371
+ 'frac45' => 0x2158, # ⅘ VULGAR FRACTION FOUR FIFTHS
372
+ 'frac56' => 0x215a, # ⅚ VULGAR FRACTION FIVE SIXTHS
373
+ 'frac58' => 0x215d, # ⅝ VULGAR FRACTION FIVE EIGHTHS
374
+ 'frac78' => 0x215e, # ⅞ VULGAR FRACTION SEVEN EIGHTHS
375
+ 'frasl' => 0x2044, # ⁄ xhtml FRACTION SLASH
376
+ 'frown' => 0x2322, # ⌢ dup FROWN
377
+ 'gacute' => 0x01f5, # ǵ LATIN SMALL LETTER G WITH ACUTE
378
+ 'Gamma' => 0x0393, # Γ dup xhtml GREEK CAPITAL LETTER GAMMA
379
+ 'gamma' => 0x03b3, # γ dup xhtml GREEK SMALL LETTER GAMMA
380
+ 'gammad' => 0x03dc, # Ϝ dup GREEK LETTER DIGAMMA
381
+ 'Gbreve' => 0x011e, # Ğ LATIN CAPITAL LETTER G WITH BREVE
382
+ 'gbreve' => 0x011f, # ğ LATIN SMALL LETTER G WITH BREVE
383
+ 'Gcedil' => 0x0122, # Ģ LATIN CAPITAL LETTER G WITH CEDILLA
384
+ 'gcedil' => 0x0123, # ģ LATIN SMALL LETTER G WITH CEDILLA
385
+ 'Gcirc' => 0x011c, # Ĝ LATIN CAPITAL LETTER G WITH CIRCUMFLEX
386
+ 'gcirc' => 0x011d, # ĝ LATIN SMALL LETTER G WITH CIRCUMFLEX
387
+ 'Gcy' => 0x0413, # Г CYRILLIC CAPITAL LETTER GHE
388
+ 'gcy' => 0x0433, # г CYRILLIC SMALL LETTER GHE
389
+ 'Gdot' => 0x0120, # Ġ LATIN CAPITAL LETTER G WITH DOT ABOVE
390
+ 'gdot' => 0x0121, # ġ LATIN SMALL LETTER G WITH DOT ABOVE
391
+ 'ge' => 0x2265, # ≥ dup xhtml GREATER-THAN OR EQUAL TO
392
+ 'gE' => 0x2267, # ≧ GREATER-THAN OVER EQUAL TO
393
+ 'gel' => 0x22db, # ⋛ GREATER-THAN EQUAL TO OR LESS-THAN
394
+ 'ges' => 0x2265, # ≥ dup skip GREATER-THAN OR EQUAL TO
395
+ 'Gg' => 0x22d9, # ⋙ VERY MUCH GREATER-THAN
396
+ 'Ggr' => 0x0393, # Γ dup skip GREEK CAPITAL LETTER GAMMA
397
+ 'ggr' => 0x03b3, # γ dup skip GREEK SMALL LETTER GAMMA
398
+ 'gimel' => 0x2137, # ℷ GIMEL SYMBOL
399
+ 'GJcy' => 0x0403, # Ѓ CYRILLIC CAPITAL LETTER GJE
400
+ 'gjcy' => 0x0453, # ѓ CYRILLIC SMALL LETTER GJE
401
+ 'gl' => 0x2277, # ≷ GREATER-THAN OR LESS-THAN
402
+ 'gnE' => 0x2269, # ≩ dup GREATER-THAN BUT NOT EQUAL TO
403
+ 'gne' => 0x2269, # ≩ dup skip GREATER-THAN BUT NOT EQUAL TO
404
+ 'gnsim' => 0x22e7, # ⋧ GREATER-THAN BUT NOT EQUIVALENT TO
405
+ 'grave' => 0x0060, # ` GRAVE ACCENT
406
+ 'gsdot' => 0x22d7, # ⋗ GREATER-THAN WITH DOT
407
+ 'gsim' => 0x2273, # ≳ GREATER-THAN OR EQUIVALENT TO
408
+ 'gt' => 0x003e, # > xhtml GREATER-THAN SIGN
409
+ 'Gt' => 0x226b, # ≫ MUCH GREATER-THAN
410
+ 'gvnE' => 0x2269, # ≩ dup skip GREATER-THAN BUT NOT EQUAL TO
411
+ 'hairsp' => 0x200a, #   HAIR SPACE
412
+ 'half' => 0x00bd, # ½ dup skip VULGAR FRACTION ONE HALF
413
+ 'hamilt' => 0x210b, # ℋ SCRIPT CAPITAL H
414
+ 'HARDcy' => 0x042a, # Ъ CYRILLIC CAPITAL LETTER HARD SIGN
415
+ 'hardcy' => 0x044a, # ъ CYRILLIC SMALL LETTER HARD SIGN
416
+ 'harr' => 0x2194, # ↔ dup xhtml LEFT RIGHT ARROW
417
+ 'hArr' => 0x21d4, # ⇔ dup xhtml LEFT RIGHT DOUBLE ARROW
418
+ 'harrw' => 0x21ad, # ↭ LEFT RIGHT WAVE ARROW
419
+ 'Hcirc' => 0x0124, # Ĥ LATIN CAPITAL LETTER H WITH CIRCUMFLEX
420
+ 'hcirc' => 0x0125, # ĥ LATIN SMALL LETTER H WITH CIRCUMFLEX
421
+ 'hearts' => 0x2665, # ♥ xhtml BLACK HEART SUIT
422
+ 'hellip' => 0x2026, # … dup xhtml HORIZONTAL ELLIPSIS
423
+ 'horbar' => 0x2015, # ― HORIZONTAL BAR
424
+ 'Hstrok' => 0x0126, # Ħ LATIN CAPITAL LETTER H WITH STROKE
425
+ 'hstrok' => 0x0127, # ħ LATIN SMALL LETTER H WITH STROKE
426
+ 'hybull' => 0x2043, # ⁃ HYPHEN BULLET
427
+ 'hyphen' => 0x002d, # - HYPHEN-MINUS
428
+ 'Iacgr' => 0x038a, # Ί GREEK CAPITAL LETTER IOTA WITH TONOS
429
+ 'iacgr' => 0x03af, # ί GREEK SMALL LETTER IOTA WITH TONOS
430
+ 'Iacute' => 0x00cd, # Í xhtml LATIN CAPITAL LETTER I WITH ACUTE
431
+ 'iacute' => 0x00ed, # í xhtml LATIN SMALL LETTER I WITH ACUTE
432
+ 'Icirc' => 0x00ce, # Î xhtml LATIN CAPITAL LETTER I WITH CIRCUMFLEX
433
+ 'icirc' => 0x00ee, # î xhtml LATIN SMALL LETTER I WITH CIRCUMFLEX
434
+ 'Icy' => 0x0418, # И CYRILLIC CAPITAL LETTER I
435
+ 'icy' => 0x0438, # и CYRILLIC SMALL LETTER I
436
+ 'idiagr' => 0x0390, # ΐ GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
437
+ 'Idigr' => 0x03aa, # Ϊ GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
438
+ 'idigr' => 0x03ca, # ϊ GREEK SMALL LETTER IOTA WITH DIALYTIKA
439
+ 'Idot' => 0x0130, # İ LATIN CAPITAL LETTER I WITH DOT ABOVE
440
+ 'IEcy' => 0x0415, # Е CYRILLIC CAPITAL LETTER IE
441
+ 'iecy' => 0x0435, # е CYRILLIC SMALL LETTER IE
442
+ 'iexcl' => 0x00a1, # ¡ xhtml INVERTED EXCLAMATION MARK
443
+ 'iff' => 0x21d4, # ⇔ dup skip LEFT RIGHT DOUBLE ARROW
444
+ 'Igr' => 0x0399, # Ι dup skip GREEK CAPITAL LETTER IOTA
445
+ 'igr' => 0x03b9, # ι dup skip GREEK SMALL LETTER IOTA
446
+ 'Igrave' => 0x00cc, # Ì xhtml LATIN CAPITAL LETTER I WITH GRAVE
447
+ 'igrave' => 0x00ec, # ì xhtml LATIN SMALL LETTER I WITH GRAVE
448
+ 'IJlig' => 0x0132, # Ĳ LATIN CAPITAL LIGATURE IJ
449
+ 'ijlig' => 0x0133, # ĳ LATIN SMALL LIGATURE IJ
450
+ 'Imacr' => 0x012a, # Ī LATIN CAPITAL LETTER I WITH MACRON
451
+ 'imacr' => 0x012b, # ī LATIN SMALL LETTER I WITH MACRON
452
+ 'image' => 0x2111, # ℑ xhtml BLACK-LETTER CAPITAL I
453
+ 'incare' => 0x2105, # ℅ CARE OF
454
+ 'infin' => 0x221e, # ∞ xhtml INFINITY
455
+ 'inodot' => 0x0131, # ı dup LATIN SMALL LETTER DOTLESS I
457
+ 'int' => 0x222b, # ∫ xhtml INTEGRAL
458
+ 'intcal' => 0x22ba, # ⊺ INTERCALATE
459
+ 'IOcy' => 0x0401, # Ё CYRILLIC CAPITAL LETTER IO
460
+ 'iocy' => 0x0451, # ё CYRILLIC SMALL LETTER IO
461
+ 'Iogon' => 0x012e, # Į LATIN CAPITAL LETTER I WITH OGONEK
462
+ 'iogon' => 0x012f, # į LATIN SMALL LETTER I WITH OGONEK
463
+ 'Iota' => 0x0399, # Ι dup xhtml GREEK CAPITAL LETTER IOTA
464
+ 'iota' => 0x03b9, # ι dup xhtml GREEK SMALL LETTER IOTA
465
+ 'iquest' => 0x00bf, # ¿ xhtml INVERTED QUESTION MARK
466
+ 'isin' => 0x2208, # ∈ xhtml ELEMENT OF
467
+ 'Itilde' => 0x0128, # Ĩ LATIN CAPITAL LETTER I WITH TILDE
468
+ 'itilde' => 0x0129, # ĩ LATIN SMALL LETTER I WITH TILDE
469
+ 'Iukcy' => 0x0406, # І CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
470
+ 'iukcy' => 0x0456, # і CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
471
+ 'Iuml' => 0x00cf, # Ï xhtml LATIN CAPITAL LETTER I WITH DIAERESIS
472
+ 'iuml' => 0x00ef, # ï xhtml LATIN SMALL LETTER I WITH DIAERESIS
473
+ 'Jcirc' => 0x0134, # Ĵ LATIN CAPITAL LETTER J WITH CIRCUMFLEX
474
+ 'jcirc' => 0x0135, # ĵ LATIN SMALL LETTER J WITH CIRCUMFLEX
475
+ 'Jcy' => 0x0419, # Й CYRILLIC CAPITAL LETTER SHORT I
476
+ 'jcy' => 0x0439, # й CYRILLIC SMALL LETTER SHORT I
477
+ 'Jsercy' => 0x0408, # Ј CYRILLIC CAPITAL LETTER JE
478
+ 'jsercy' => 0x0458, # ј CYRILLIC SMALL LETTER JE
479
+ 'Jukcy' => 0x0404, # Є CYRILLIC CAPITAL LETTER UKRAINIAN IE
480
+ 'jukcy' => 0x0454, # є CYRILLIC SMALL LETTER UKRAINIAN IE
481
+ 'Kappa' => 0x039a, # Κ dup xhtml GREEK CAPITAL LETTER KAPPA
482
+ 'kappa' => 0x03ba, # κ dup xhtml GREEK SMALL LETTER KAPPA
483
+ 'kappav' => 0x03f0, # ϰ dup GREEK KAPPA SYMBOL
484
+ 'Kcedil' => 0x0136, # Ķ LATIN CAPITAL LETTER K WITH CEDILLA
485
+ 'kcedil' => 0x0137, # ķ LATIN SMALL LETTER K WITH CEDILLA
486
+ 'Kcy' => 0x041a, # К CYRILLIC CAPITAL LETTER KA
487
+ 'kcy' => 0x043a, # к CYRILLIC SMALL LETTER KA
488
+ 'Kgr' => 0x039a, # Κ dup skip GREEK CAPITAL LETTER KAPPA
489
+ 'kgr' => 0x03ba, # κ dup skip GREEK SMALL LETTER KAPPA
490
+ 'kgreen' => 0x0138, # ĸ LATIN SMALL LETTER KRA
491
+ 'KHcy' => 0x0425, # Х CYRILLIC CAPITAL LETTER HA
492
+ 'khcy' => 0x0445, # х CYRILLIC SMALL LETTER HA
493
+ 'KHgr' => 0x03a7, # Χ dup skip GREEK CAPITAL LETTER CHI
494
+ 'khgr' => 0x03c7, # χ dup skip GREEK SMALL LETTER CHI
495
+ 'KJcy' => 0x040c, # Ќ CYRILLIC CAPITAL LETTER KJE
496
+ 'kjcy' => 0x045c, # ќ CYRILLIC SMALL LETTER KJE
497
+ 'lAarr' => 0x21da, # ⇚ LEFTWARDS TRIPLE ARROW
498
+ 'Lacute' => 0x0139, # Ĺ LATIN CAPITAL LETTER L WITH ACUTE
499
+ 'lacute' => 0x013a, # ĺ LATIN SMALL LETTER L WITH ACUTE
500
+ 'lagran' => 0x2112, # ℒ SCRIPT CAPITAL L
501
+ 'Lambda' => 0x039b, # Λ dup xhtml GREEK CAPITAL LETTER LAMDA
502
+ 'lambda' => 0x03bb, # λ dup xhtml GREEK SMALL LETTER LAMDA
503
+ 'lang' => 0x2329, # 〈 xhtml LEFT-POINTING ANGLE BRACKET
504
+ 'laquo' => 0x00ab, # « xhtml LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
505
+ 'larr' => 0x2190, # ← xhtml LEFTWARDS ARROW
506
+ 'Larr' => 0x219e, # ↞ LEFTWARDS TWO HEADED ARROW
507
+ 'lArr' => 0x21d0, # ⇐ dup xhtml LEFTWARDS DOUBLE ARROW
508
+ 'larr2' => 0x21c7, # ⇇ LEFTWARDS PAIRED ARROWS
509
+ 'larrhk' => 0x21a9, # ↩ LEFTWARDS ARROW WITH HOOK
510
+ 'larrlp' => 0x21ab, # ↫ LEFTWARDS ARROW WITH LOOP
511
+ 'larrtl' => 0x21a2, # ↢ LEFTWARDS ARROW WITH TAIL
512
+ 'Lcaron' => 0x013d, # Ľ LATIN CAPITAL LETTER L WITH CARON
513
+ 'lcaron' => 0x013e, # ľ LATIN SMALL LETTER L WITH CARON
514
+ 'Lcedil' => 0x013b, # Ļ LATIN CAPITAL LETTER L WITH CEDILLA
515
+ 'lcedil' => 0x013c, # ļ LATIN SMALL LETTER L WITH CEDILLA
516
+ 'lceil' => 0x2308, # ⌈ xhtml LEFT CEILING
517
+ 'lcub' => 0x007b, # { LEFT CURLY BRACKET
518
+ 'Lcy' => 0x041b, # Л CYRILLIC CAPITAL LETTER EL
519
+ 'lcy' => 0x043b, # л CYRILLIC SMALL LETTER EL
520
+ 'ldot' => 0x22d6, # ⋖ LESS-THAN WITH DOT
521
+ 'ldquo' => 0x201c, # “ dup xhtml LEFT DOUBLE QUOTATION MARK
522
+ 'ldquor' => 0x201e, # „ dup skip DOUBLE LOW-9 QUOTATION MARK
523
+ 'le' => 0x2264, # ≤ dup xhtml LESS-THAN OR EQUAL TO
524
+ 'lE' => 0x2266, # ≦ LESS-THAN OVER EQUAL TO
525
+ 'leg' => 0x22da, # ⋚ LESS-THAN EQUAL TO OR GREATER-THAN
526
+ 'les' => 0x2264, # ≤ dup skip LESS-THAN OR EQUAL TO
527
+ 'lfloor' => 0x230a, # ⌊ xhtml LEFT FLOOR
528
+ 'lg' => 0x2276, # ≶ LESS-THAN OR GREATER-THAN
529
+ 'Lgr' => 0x039b, # Λ dup skip GREEK CAPITAL LETTER LAMDA
530
+ 'lgr' => 0x03bb, # λ dup skip GREEK SMALL LETTER LAMDA
531
+ 'lhard' => 0x21bd, # ↽ LEFTWARDS HARPOON WITH BARB DOWNWARDS
532
+ 'lharu' => 0x21bc, # ↼ LEFTWARDS HARPOON WITH BARB UPWARDS
533
+ 'lhblk' => 0x2584, # ▄ LOWER HALF BLOCK
534
+ 'LJcy' => 0x0409, # Љ CYRILLIC CAPITAL LETTER LJE
535
+ 'ljcy' => 0x0459, # љ CYRILLIC SMALL LETTER LJE
536
+ 'Ll' => 0x22d8, # ⋘ VERY MUCH LESS-THAN
537
+ 'Lmidot' => 0x013f, # Ŀ LATIN CAPITAL LETTER L WITH MIDDLE DOT
538
+ 'lmidot' => 0x0140, # ŀ LATIN SMALL LETTER L WITH MIDDLE DOT
539
+ 'lnE' => 0x2268, # ≨ dup LESS-THAN BUT NOT EQUAL TO
540
+ 'lne' => 0x2268, # ≨ dup skip LESS-THAN BUT NOT EQUAL TO
541
+ 'lnsim' => 0x22e6, # ⋦ LESS-THAN BUT NOT EQUIVALENT TO
542
+ 'lowast' => 0x2217, # ∗ xhtml ASTERISK OPERATOR
543
+ 'lowbar' => 0x005f, # _ LOW LINE
544
+ 'loz' => 0x25ca, # ◊ xhtml LOZENGE
545
+ 'lozf' => 0x2726, # ✦ BLACK FOUR POINTED STAR
546
+ 'lpar' => 0x0028, # ( LEFT PARENTHESIS
547
+ 'lrarr2' => 0x21c6, # ⇆ LEFTWARDS ARROW OVER RIGHTWARDS ARROW
548
+ 'lrhar2' => 0x21cb, # ⇋ LEFTWARDS HARPOON OVER RIGHTWARDS HARPOON
549
+ 'lrm' => 0x200e, # ‎ xhtml LEFT-TO-RIGHT MARK
550
+ 'lsaquo' => 0x2039, # ‹ xhtml SINGLE LEFT-POINTING ANGLE QUOTATION MARK
551
+ 'lsh' => 0x21b0, # ↰ UPWARDS ARROW WITH TIP LEFTWARDS
552
+ 'lsim' => 0x2272, # ≲ LESS-THAN OR EQUIVALENT TO
553
+ 'lsqb' => 0x005b, # [ LEFT SQUARE BRACKET
554
+ 'lsquo' => 0x2018, # ‘ dup xhtml LEFT SINGLE QUOTATION MARK
555
+ 'lsquor' => 0x201a, # ‚ dup skip SINGLE LOW-9 QUOTATION MARK
556
+ 'Lstrok' => 0x0141, # Ł LATIN CAPITAL LETTER L WITH STROKE
557
+ 'lstrok' => 0x0142, # ł LATIN SMALL LETTER L WITH STROKE
558
+ 'lt' => 0x003c, # < xhtml LESS-THAN SIGN
559
+ 'Lt' => 0x226a, # ≪ MUCH LESS-THAN
560
+ 'lthree' => 0x22cb, # ⋋ LEFT SEMIDIRECT PRODUCT
561
+ 'ltimes' => 0x22c9, # ⋉ LEFT NORMAL FACTOR SEMIDIRECT PRODUCT
562
+ 'ltri' => 0x25c3, # ◃ WHITE LEFT-POINTING SMALL TRIANGLE
563
+ 'ltrie' => 0x22b4, # ⊴ NORMAL SUBGROUP OF OR EQUAL TO
564
+ 'ltrif' => 0x25c2, # ◂ BLACK LEFT-POINTING SMALL TRIANGLE
565
+ 'lvnE' => 0x2268, # ≨ dup skip LESS-THAN BUT NOT EQUAL TO
566
+ 'macr' => 0x00af, # ¯ xhtml MACRON
567
+ 'male' => 0x2642, # ♂ MALE SIGN
568
+ 'malt' => 0x2720, # ✠ MALTESE CROSS
569
+ 'map' => 0x21a6, # ↦ RIGHTWARDS ARROW FROM BAR
570
+ 'marker' => 0x25ae, # ▮ BLACK VERTICAL RECTANGLE
571
+ 'Mcy' => 0x041c, # М CYRILLIC CAPITAL LETTER EM
572
+ 'mcy' => 0x043c, # м CYRILLIC SMALL LETTER EM
573
+ 'mdash' => 0x2014, # — xhtml EM DASH
574
+ 'Mgr' => 0x039c, # Μ dup skip GREEK CAPITAL LETTER MU
575
+ 'mgr' => 0x03bc, # μ dup skip GREEK SMALL LETTER MU
576
+ 'micro' => 0x00b5, # µ xhtml MICRO SIGN
577
+ 'mid' => 0x2223, # ∣ DIVIDES
578
+ 'middot' => 0x00b7, # · xhtml MIDDLE DOT
579
+ 'minus' => 0x2212, # − xhtml MINUS SIGN
580
+ 'minusb' => 0x229f, # ⊟ SQUARED MINUS
581
+ 'mldr' => 0x2026, # … dup skip HORIZONTAL ELLIPSIS
582
+ 'mnplus' => 0x2213, # ∓ MINUS-OR-PLUS SIGN
583
+ 'models' => 0x22a7, # ⊧ MODELS
584
+ 'Mu' => 0x039c, # Μ dup xhtml GREEK CAPITAL LETTER MU
585
+ 'mu' => 0x03bc, # μ dup xhtml GREEK SMALL LETTER MU
586
+ 'mumap' => 0x22b8, # ⊸ MULTIMAP
587
+ 'nabla' => 0x2207, # ∇ xhtml NABLA
588
+ 'Nacute' => 0x0143, # Ń LATIN CAPITAL LETTER N WITH ACUTE
589
+ 'nacute' => 0x0144, # ń LATIN SMALL LETTER N WITH ACUTE
590
+ 'nap' => 0x2249, # ≉ NOT ALMOST EQUAL TO
591
+ 'napos' => 0x0149, # ŉ LATIN SMALL LETTER N PRECEDED BY APOSTROPHE
592
+ 'natur' => 0x266e, # ♮ MUSIC NATURAL SIGN
593
+ 'nbsp' => 0x00a0, #   xhtml NO-BREAK SPACE
594
+ 'Ncaron' => 0x0147, # Ň LATIN CAPITAL LETTER N WITH CARON
595
+ 'ncaron' => 0x0148, # ň LATIN SMALL LETTER N WITH CARON
596
+ 'Ncedil' => 0x0145, # Ņ LATIN CAPITAL LETTER N WITH CEDILLA
597
+ 'ncedil' => 0x0146, # ņ LATIN SMALL LETTER N WITH CEDILLA
598
+ 'ncong' => 0x2247, # ≇ NEITHER APPROXIMATELY NOR ACTUALLY EQUAL TO
599
+ 'Ncy' => 0x041d, # Н CYRILLIC CAPITAL LETTER EN
600
+ 'ncy' => 0x043d, # н CYRILLIC SMALL LETTER EN
601
+ 'ndash' => 0x2013, # – xhtml EN DASH
602
+ 'ne' => 0x2260, # ≠ xhtml NOT EQUAL TO
603
+ 'nearr' => 0x2197, # ↗ NORTH EAST ARROW
604
+ 'nequiv' => 0x2262, # ≢ NOT IDENTICAL TO
605
+ 'nexist' => 0x2204, # ∄ THERE DOES NOT EXIST
606
+ 'nge' => 0x2271, # ≱ dup NEITHER GREATER-THAN NOR EQUAL TO
607
+ 'nges' => 0x2271, # ≱ dup skip NEITHER GREATER-THAN NOR EQUAL TO
608
+ 'Ngr' => 0x039d, # Ν dup skip GREEK CAPITAL LETTER NU
609
+ 'ngr' => 0x03bd, # ν dup skip GREEK SMALL LETTER NU
610
+ 'ngt' => 0x226f, # ≯ NOT GREATER-THAN
611
+ 'nharr' => 0x21ae, # ↮ LEFT RIGHT ARROW WITH STROKE
612
+ 'nhArr' => 0x21ce, # ⇎ LEFT RIGHT DOUBLE ARROW WITH STROKE
613
+ 'ni' => 0x220b, # ∋ xhtml CONTAINS AS MEMBER
614
+ 'NJcy' => 0x040a, # Њ CYRILLIC CAPITAL LETTER NJE
615
+ 'njcy' => 0x045a, # њ CYRILLIC SMALL LETTER NJE
616
+ 'nlarr' => 0x219a, # ↚ LEFTWARDS ARROW WITH STROKE
617
+ 'nlArr' => 0x21cd, # ⇍ LEFTWARDS DOUBLE ARROW WITH STROKE
618
+ 'nldr' => 0x2025, # ‥ TWO DOT LEADER
619
+ 'nle' => 0x2270, # ≰ dup NEITHER LESS-THAN NOR EQUAL TO
620
+ 'nles' => 0x2270, # ≰ dup skip NEITHER LESS-THAN NOR EQUAL TO
621
+ 'nlt' => 0x226e, # ≮ NOT LESS-THAN
622
+ 'nltri' => 0x22ea, # ⋪ NOT NORMAL SUBGROUP OF
623
+ 'nltrie' => 0x22ec, # ⋬ NOT NORMAL SUBGROUP OF OR EQUAL TO
624
+ 'nmid' => 0x2224, # ∤ DOES NOT DIVIDE
625
+ 'not' => 0x00ac, # ¬ xhtml NOT SIGN
626
+ 'notin' => 0x2209, # ∉ xhtml NOT AN ELEMENT OF
627
+ 'npar' => 0x2226, # ∦ dup NOT PARALLEL TO
628
+ 'npr' => 0x2280, # ⊀ DOES NOT PRECEDE
629
+ 'npre' => 0x22e0, # ⋠ DOES NOT PRECEDE OR EQUAL
630
+ 'nrarr' => 0x219b, # ↛ RIGHTWARDS ARROW WITH STROKE
631
+ 'nrArr' => 0x21cf, # ⇏ RIGHTWARDS DOUBLE ARROW WITH STROKE
632
+ 'nrtri' => 0x22eb, # ⋫ DOES NOT CONTAIN AS NORMAL SUBGROUP
633
+ 'nrtrie' => 0x22ed, # ⋭ DOES NOT CONTAIN AS NORMAL SUBGROUP OR EQUAL
634
+ 'nsc' => 0x2281, # ⊁ DOES NOT SUCCEED
635
+ 'nsce' => 0x22e1, # ⋡ DOES NOT SUCCEED OR EQUAL
636
+ 'nsim' => 0x2241, # ≁ NOT TILDE
637
+ 'nsime' => 0x2244, # ≄ NOT ASYMPTOTICALLY EQUAL TO
638
+ 'nspar' => 0x2226, # ∦ dup skip NOT PARALLEL TO
639
+ 'nsub' => 0x2284, # ⊄ xhtml NOT A SUBSET OF
640
+ 'nsubE' => 0x2288, # ⊈ dup NEITHER A SUBSET OF NOR EQUAL TO
641
+ 'nsube' => 0x2288, # ⊈ dup skip NEITHER A SUBSET OF NOR EQUAL TO
642
+ 'nsup' => 0x2285, # ⊅ NOT A SUPERSET OF
643
+ 'nsupE' => 0x2289, # ⊉ dup NEITHER A SUPERSET OF NOR EQUAL TO
644
+ 'nsupe' => 0x2289, # ⊉ dup skip NEITHER A SUPERSET OF NOR EQUAL TO
645
+ 'Ntilde' => 0x00d1, # Ñ xhtml LATIN CAPITAL LETTER N WITH TILDE
646
+ 'ntilde' => 0x00f1, # ñ xhtml LATIN SMALL LETTER N WITH TILDE
647
+ 'Nu' => 0x039d, # Ν dup xhtml GREEK CAPITAL LETTER NU
648
+ 'nu' => 0x03bd, # ν dup xhtml GREEK SMALL LETTER NU
649
+ 'num' => 0x0023, # # NUMBER SIGN
650
+ 'numero' => 0x2116, # № NUMERO SIGN
651
+ 'numsp' => 0x2007, #   FIGURE SPACE
652
+ 'nvdash' => 0x22ac, # ⊬ DOES NOT PROVE
653
+ 'nvDash' => 0x22ad, # ⊭ NOT TRUE
654
+ 'nVdash' => 0x22ae, # ⊮ DOES NOT FORCE
655
+ 'nVDash' => 0x22af, # ⊯ NEGATED DOUBLE VERTICAL BAR DOUBLE RIGHT TURNSTILE
656
+ 'nwarr' => 0x2196, # ↖ NORTH WEST ARROW
657
+ 'Oacgr' => 0x038c, # Ό GREEK CAPITAL LETTER OMICRON WITH TONOS
658
+ 'oacgr' => 0x03cc, # ό GREEK SMALL LETTER OMICRON WITH TONOS
659
+ 'Oacute' => 0x00d3, # Ó xhtml LATIN CAPITAL LETTER O WITH ACUTE
660
+ 'oacute' => 0x00f3, # ó xhtml LATIN SMALL LETTER O WITH ACUTE
661
+ 'oast' => 0x229b, # ⊛ CIRCLED ASTERISK OPERATOR
662
+ 'ocir' => 0x229a, # ⊚ CIRCLED RING OPERATOR
663
+ 'Ocirc' => 0x00d4, # Ô xhtml LATIN CAPITAL LETTER O WITH CIRCUMFLEX
664
+ 'ocirc' => 0x00f4, # ô xhtml LATIN SMALL LETTER O WITH CIRCUMFLEX
665
+ 'Ocy' => 0x041e, # О CYRILLIC CAPITAL LETTER O
666
+ 'ocy' => 0x043e, # о CYRILLIC SMALL LETTER O
667
+ 'odash' => 0x229d, # ⊝ CIRCLED DASH
668
+ 'Odblac' => 0x0150, # Ő LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
669
+ 'odblac' => 0x0151, # ő LATIN SMALL LETTER O WITH DOUBLE ACUTE
670
+ 'odot' => 0x2299, # ⊙ CIRCLED DOT OPERATOR
671
+ 'OElig' => 0x0152, # Œ xhtml LATIN CAPITAL LIGATURE OE
672
+ 'oelig' => 0x0153, # œ xhtml LATIN SMALL LIGATURE OE
673
+ 'ogon' => 0x02db, # ˛ OGONEK
674
+ 'Ogr' => 0x039f, # Ο dup skip GREEK CAPITAL LETTER OMICRON
675
+ 'ogr' => 0x03bf, # ο dup skip GREEK SMALL LETTER OMICRON
676
+ 'Ograve' => 0x00d2, # Ò xhtml LATIN CAPITAL LETTER O WITH GRAVE
677
+ 'ograve' => 0x00f2, # ò xhtml LATIN SMALL LETTER O WITH GRAVE
678
+ 'OHacgr' => 0x038f, # Ώ GREEK CAPITAL LETTER OMEGA WITH TONOS
679
+ 'ohacgr' => 0x03ce, # ώ dup GREEK SMALL LETTER OMEGA WITH TONOS
680
+ 'OHgr' => 0x03a9, # Ω dup skip GREEK CAPITAL LETTER OMEGA
681
+ 'ohgr' => 0x03c9, # ω dup skip GREEK SMALL LETTER OMEGA
682
+ 'ohm' => 0x2126, # Ω OHM SIGN
683
+ 'olarr' => 0x21ba, # ↺ ANTICLOCKWISE OPEN CIRCLE ARROW
684
+ 'oline' => 0x203e, # ‾ xhtml OVERLINE
685
+ 'Omacr' => 0x014c, # Ō LATIN CAPITAL LETTER O WITH MACRON
686
+ 'omacr' => 0x014d, # ō LATIN SMALL LETTER O WITH MACRON
687
+ 'Omega' => 0x03a9, # Ω dup xhtml GREEK CAPITAL LETTER OMEGA
688
+ 'omega' => 0x03c9, # ω dup xhtml GREEK SMALL LETTER OMEGA
689
+ 'Omicron' => 0x039f, # Ο dup xhtml GREEK CAPITAL LETTER OMICRON
690
+ 'omicron' => 0x03bf, # ο dup xhtml GREEK SMALL LETTER OMICRON
691
+ 'ominus' => 0x2296, # ⊖ CIRCLED MINUS
692
+ 'oplus' => 0x2295, # ⊕ xhtml CIRCLED PLUS
693
+ 'or' => 0x2228, # ∨ xhtml LOGICAL OR
694
+ 'orarr' => 0x21bb, # ↻ CLOCKWISE OPEN CIRCLE ARROW
695
+ 'order' => 0x2134, # ℴ SCRIPT SMALL O
696
+ 'ordf' => 0x00aa, # ª xhtml FEMININE ORDINAL INDICATOR
697
+ 'ordm' => 0x00ba, # º xhtml MASCULINE ORDINAL INDICATOR
698
+ 'oS' => 0x24c8, # Ⓢ CIRCLED LATIN CAPITAL LETTER S
699
+ 'Oslash' => 0x00d8, # Ø xhtml LATIN CAPITAL LETTER O WITH STROKE
700
+ 'oslash' => 0x00f8, # ø xhtml LATIN SMALL LETTER O WITH STROKE
701
+ 'osol' => 0x2298, # ⊘ CIRCLED DIVISION SLASH
702
+ 'Otilde' => 0x00d5, # Õ xhtml LATIN CAPITAL LETTER O WITH TILDE
703
+ 'otilde' => 0x00f5, # õ xhtml LATIN SMALL LETTER O WITH TILDE
704
+ 'otimes' => 0x2297, # ⊗ xhtml CIRCLED TIMES
705
+ 'Ouml' => 0x00d6, # Ö xhtml LATIN CAPITAL LETTER O WITH DIAERESIS
706
+ 'ouml' => 0x00f6, # ö xhtml LATIN SMALL LETTER O WITH DIAERESIS
707
+ 'par' => 0x2225, # ∥ dup PARALLEL TO
708
+ 'para' => 0x00b6, # ¶ xhtml PILCROW SIGN
709
+ 'part' => 0x2202, # ∂ xhtml PARTIAL DIFFERENTIAL
710
+ 'Pcy' => 0x041f, # П CYRILLIC CAPITAL LETTER PE
711
+ 'pcy' => 0x043f, # п CYRILLIC SMALL LETTER PE
712
+ 'percnt' => 0x0025, # % PERCENT SIGN
713
+ 'period' => 0x002e, # . FULL STOP
714
+ 'permil' => 0x2030, # ‰ xhtml PER MILLE SIGN
715
+ 'perp' => 0x22a5, # ⊥ dup xhtml UP TACK
716
+ 'Pgr' => 0x03a0, # Π dup skip GREEK CAPITAL LETTER PI
717
+ 'pgr' => 0x03c0, # π dup skip GREEK SMALL LETTER PI
718
+ 'PHgr' => 0x03a6, # Φ dup skip GREEK CAPITAL LETTER PHI
719
+ 'phgr' => 0x03c6, # φ dup skip GREEK SMALL LETTER PHI
720
+ 'Phi' => 0x03a6, # Φ dup xhtml GREEK CAPITAL LETTER PHI
721
+ 'phi' => 0x03c6, # φ dup xhtml GREEK SMALL LETTER PHI
722
+ 'phis' => 0x03c6, # φ dup skip GREEK SMALL LETTER PHI
723
+ 'phiv' => 0x03d5, # ϕ dup GREEK PHI SYMBOL
724
+ 'phmmat' => 0x2133, # ℳ SCRIPT CAPITAL M
725
+ 'phone' => 0x260e, # ☎ BLACK TELEPHONE
726
+ 'Pi' => 0x03a0, # Π dup xhtml GREEK CAPITAL LETTER PI
727
+ 'pi' => 0x03c0, # π dup xhtml GREEK SMALL LETTER PI
728
+ 'piv' => 0x03d6, # ϖ dup xhtml GREEK PI SYMBOL
729
+ 'planck' => 0x210f, # ℏ PLANCK CONSTANT OVER TWO PI
730
+ 'plus' => 0x002b, # + PLUS SIGN
731
+ 'plusb' => 0x229e, # ⊞ SQUARED PLUS
732
+ 'plusdo' => 0x2214, # ∔ DOT PLUS
733
+ 'plusmn' => 0x00b1, # ± xhtml PLUS-MINUS SIGN
734
+ 'pound' => 0x00a3, # £ xhtml POUND SIGN
735
+ 'pr' => 0x227a, # ≺ PRECEDES
736
+ 'pre' => 0x227c, # ≼ dup skip PRECEDES OR EQUAL TO
737
+ 'prime' => 0x2032, # ′ dup xhtml PRIME
738
+ 'Prime' => 0x2033, # ″ xhtml DOUBLE PRIME
739
+ 'prnsim' => 0x22e8, # ⋨ PRECEDES BUT NOT EQUIVALENT TO
740
+ 'prod' => 0x220f, # ∏ xhtml N-ARY PRODUCT
741
+ 'prop' => 0x221d, # ∝ dup xhtml PROPORTIONAL TO
742
+ 'prsim' => 0x227e, # ≾ PRECEDES OR EQUIVALENT TO
743
+ 'PSgr' => 0x03a8, # Ψ dup skip GREEK CAPITAL LETTER PSI
744
+ 'psgr' => 0x03c8, # ψ dup skip GREEK SMALL LETTER PSI
745
+ 'Psi' => 0x03a8, # Ψ dup xhtml GREEK CAPITAL LETTER PSI
746
+ 'psi' => 0x03c8, # ψ dup xhtml GREEK SMALL LETTER PSI
747
+ 'puncsp' => 0x2008, #   PUNCTUATION SPACE
748
+ 'quest' => 0x003f, # ? QUESTION MARK
749
+ 'quot' => 0x0022, # " xhtml QUOTATION MARK
750
+ 'rAarr' => 0x21db, # ⇛ RIGHTWARDS TRIPLE ARROW
751
+ 'Racute' => 0x0154, # Ŕ LATIN CAPITAL LETTER R WITH ACUTE
752
+ 'racute' => 0x0155, # ŕ LATIN SMALL LETTER R WITH ACUTE
753
+ 'radic' => 0x221a, # √ xhtml SQUARE ROOT
754
+ 'rang' => 0x232a, # 〉 xhtml RIGHT-POINTING ANGLE BRACKET
755
+ 'raquo' => 0x00bb, # » xhtml RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
756
+ 'rarr' => 0x2192, # → xhtml RIGHTWARDS ARROW
757
+ 'Rarr' => 0x21a0, # ↠ RIGHTWARDS TWO HEADED ARROW
758
+ 'rArr' => 0x21d2, # ⇒ dup xhtml RIGHTWARDS DOUBLE ARROW
759
+ 'rarr2' => 0x21c9, # ⇉ RIGHTWARDS PAIRED ARROWS
760
+ 'rarrhk' => 0x21aa, # ↪ RIGHTWARDS ARROW WITH HOOK
761
+ 'rarrlp' => 0x21ac, # ↬ RIGHTWARDS ARROW WITH LOOP
762
+ 'rarrtl' => 0x21a3, # ↣ RIGHTWARDS ARROW WITH TAIL
763
+ 'rarrw' => 0x219d, # ↝ RIGHTWARDS WAVE ARROW
764
+ 'Rcaron' => 0x0158, # Ř LATIN CAPITAL LETTER R WITH CARON
765
+ 'rcaron' => 0x0159, # ř LATIN SMALL LETTER R WITH CARON
766
+ 'Rcedil' => 0x0156, # Ŗ LATIN CAPITAL LETTER R WITH CEDILLA
767
+ 'rcedil' => 0x0157, # ŗ LATIN SMALL LETTER R WITH CEDILLA
768
+ 'rceil' => 0x2309, # ⌉ xhtml RIGHT CEILING
769
+ 'rcub' => 0x007d, # } RIGHT CURLY BRACKET
770
+ 'Rcy' => 0x0420, # Р CYRILLIC CAPITAL LETTER ER
771
+ 'rcy' => 0x0440, # р CYRILLIC SMALL LETTER ER
772
+ 'rdquo' => 0x201d, # ” xhtml RIGHT DOUBLE QUOTATION MARK
773
+ 'rdquor' => 0x201c, # “ dup skip LEFT DOUBLE QUOTATION MARK
774
+ 'real' => 0x211c, # ℜ xhtml BLACK-LETTER CAPITAL R
775
+ 'rect' => 0x25ad, # ▭ WHITE RECTANGLE
776
+ 'reg' => 0x00ae, # ® xhtml REGISTERED SIGN
777
+ 'rfloor' => 0x230b, # ⌋ xhtml RIGHT FLOOR
778
+ 'Rgr' => 0x03a1, # Ρ dup skip GREEK CAPITAL LETTER RHO
779
+ 'rgr' => 0x03c1, # ρ dup skip GREEK SMALL LETTER RHO
780
+ 'rhard' => 0x21c1, # ⇁ RIGHTWARDS HARPOON WITH BARB DOWNWARDS
781
+ 'rharu' => 0x21c0, # ⇀ RIGHTWARDS HARPOON WITH BARB UPWARDS
782
+ 'Rho' => 0x03a1, # Ρ dup xhtml GREEK CAPITAL LETTER RHO
783
+ 'rho' => 0x03c1, # ρ dup xhtml GREEK SMALL LETTER RHO
784
+ 'rhov' => 0x03f1, # ϱ dup GREEK RHO SYMBOL
785
+ 'ring' => 0x02da, # ˚ RING ABOVE
786
+ 'rlarr2' => 0x21c4, # ⇄ RIGHTWARDS ARROW OVER LEFTWARDS ARROW
787
+ 'rlhar2' => 0x21cc, # ⇌ RIGHTWARDS HARPOON OVER LEFTWARDS HARPOON
788
+ 'rlm' => 0x200f, # ‏ xhtml RIGHT-TO-LEFT MARK
789
+ 'rpar' => 0x0029, # ) RIGHT PARENTHESIS
790
+ 'rsaquo' => 0x203a, # › xhtml SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
791
+ 'rsh' => 0x21b1, # ↱ UPWARDS ARROW WITH TIP RIGHTWARDS
792
+ 'rsqb' => 0x005d, # ] RIGHT SQUARE BRACKET
793
+ 'rsquo' => 0x2019, # ’ xhtml RIGHT SINGLE QUOTATION MARK
794
+ 'rsquor' => 0x2018, # ‘ dup skip LEFT SINGLE QUOTATION MARK
795
+ 'rthree' => 0x22cc, # ⋌ RIGHT SEMIDIRECT PRODUCT
796
+ 'rtimes' => 0x22ca, # ⋊ RIGHT NORMAL FACTOR SEMIDIRECT PRODUCT
797
+ 'rtri' => 0x25b9, # ▹ WHITE RIGHT-POINTING SMALL TRIANGLE
798
+ 'rtrie' => 0x22b5, # ⊵ CONTAINS AS NORMAL SUBGROUP OR EQUAL TO
799
+ 'rtrif' => 0x25b8, # ▸ BLACK RIGHT-POINTING SMALL TRIANGLE
800
+ 'rx' => 0x211e, # ℞ PRESCRIPTION TAKE
801
+ 'Sacute' => 0x015a, # Ś LATIN CAPITAL LETTER S WITH ACUTE
802
+ 'sacute' => 0x015b, # ś LATIN SMALL LETTER S WITH ACUTE
803
+ 'samalg' => 0x2210, # ∐ dup skip N-ARY COPRODUCT
804
+ 'sbquo' => 0x201a, # ‚ dup xhtml SINGLE LOW-9 QUOTATION MARK
805
+ 'sbsol' => 0x005c, # \ dup skip REVERSE SOLIDUS
806
+ 'sc' => 0x227b, # ≻ SUCCEEDS
807
+ 'Scaron' => 0x0160, # Š xhtml LATIN CAPITAL LETTER S WITH CARON
808
+ 'scaron' => 0x0161, # š xhtml LATIN SMALL LETTER S WITH CARON
809
+ 'sccue' => 0x227d, # ≽ dup SUCCEEDS OR EQUAL TO
810
+ 'sce' => 0x227d, # ≽ dup skip SUCCEEDS OR EQUAL TO
811
+ 'Scedil' => 0x015e, # Ş LATIN CAPITAL LETTER S WITH CEDILLA
812
+ 'scedil' => 0x015f, # ş LATIN SMALL LETTER S WITH CEDILLA
813
+ 'Scirc' => 0x015c, # Ŝ LATIN CAPITAL LETTER S WITH CIRCUMFLEX
814
+ 'scirc' => 0x015d, # ŝ LATIN SMALL LETTER S WITH CIRCUMFLEX
815
+ 'scnsim' => 0x22e9, # ⋩ SUCCEEDS BUT NOT EQUIVALENT TO
816
+ 'scsim' => 0x227f, # ≿ SUCCEEDS OR EQUIVALENT TO
817
+ 'Scy' => 0x0421, # С CYRILLIC CAPITAL LETTER ES
818
+ 'scy' => 0x0441, # с CYRILLIC SMALL LETTER ES
819
+ 'sdot' => 0x22c5, # ⋅ xhtml DOT OPERATOR
820
+ 'sdotb' => 0x22a1, # ⊡ SQUARED DOT OPERATOR
821
+ 'sect' => 0x00a7, # § xhtml SECTION SIGN
822
+ 'semi' => 0x003b, # ; SEMICOLON
823
+ 'setmn' => 0x2216, # ∖ dup SET MINUS
824
+ 'sext' => 0x2736, # ✶ SIX POINTED BLACK STAR
825
+ 'sfgr' => 0x03c2, # ς dup skip GREEK SMALL LETTER FINAL SIGMA
826
+ 'sfrown' => 0x2322, # ⌢ dup skip FROWN
827
+ 'Sgr' => 0x03a3, # Σ dup skip GREEK CAPITAL LETTER SIGMA
828
+ 'sgr' => 0x03c3, # σ dup skip GREEK SMALL LETTER SIGMA
829
+ 'sharp' => 0x266f, # ♯ MUSIC SHARP SIGN
830
+ 'SHCHcy' => 0x0429, # Щ CYRILLIC CAPITAL LETTER SHCHA
831
+ 'shchcy' => 0x0449, # щ CYRILLIC SMALL LETTER SHCHA
832
+ 'SHcy' => 0x0428, # Ш CYRILLIC CAPITAL LETTER SHA
833
+ 'shcy' => 0x0448, # ш CYRILLIC SMALL LETTER SHA
834
+ 'shy' => 0x00ad, # ­ xhtml SOFT HYPHEN
835
+ 'Sigma' => 0x03a3, # Σ dup xhtml GREEK CAPITAL LETTER SIGMA
836
+ 'sigma' => 0x03c3, # σ dup xhtml GREEK SMALL LETTER SIGMA
837
+ 'sigmaf' => 0x03c2, # ς dup xhtml GREEK SMALL LETTER FINAL SIGMA
838
+ 'sigmav' => 0x03c2, # ς dup skip GREEK SMALL LETTER FINAL SIGMA
839
+ 'sim' => 0x223c, # ∼ dup xhtml TILDE OPERATOR
840
+ 'sime' => 0x2243, # ≃ ASYMPTOTICALLY EQUAL TO
841
+ 'smile' => 0x2323, # ⌣ dup SMILE
842
+ 'SOFTcy' => 0x042c, # Ь CYRILLIC CAPITAL LETTER SOFT SIGN
843
+ 'softcy' => 0x044c, # ь CYRILLIC SMALL LETTER SOFT SIGN
844
+ 'sol' => 0x002f, # / SOLIDUS
845
+ 'spades' => 0x2660, # ♠ xhtml BLACK SPADE SUIT
846
+ 'spar' => 0x2225, # ∥ dup skip PARALLEL TO
847
+ 'sqcap' => 0x2293, # ⊓ SQUARE CAP
848
+ 'sqcup' => 0x2294, # ⊔ SQUARE CUP
849
+ 'sqsub' => 0x228f, # ⊏ SQUARE IMAGE OF
850
+ 'sqsube' => 0x2291, # ⊑ SQUARE IMAGE OF OR EQUAL TO
851
+ 'sqsup' => 0x2290, # ⊐ SQUARE ORIGINAL OF
852
+ 'sqsupe' => 0x2292, # ⊒ SQUARE ORIGINAL OF OR EQUAL TO
853
+ 'squ' => 0x25a1, # □ dup WHITE SQUARE
854
+ 'square' => 0x25a1, # □ dup skip WHITE SQUARE
855
+ 'squf' => 0x25aa, # ▪ BLACK SMALL SQUARE
856
+ 'ssetmn' => 0x2216, # ∖ dup skip SET MINUS
857
+ 'ssmile' => 0x2323, # ⌣ dup skip SMILE
858
+ 'sstarf' => 0x22c6, # ⋆ STAR OPERATOR
859
+ 'star' => 0x2606, # ☆ WHITE STAR
860
+ 'starf' => 0x2605, # ★ BLACK STAR
861
+ 'sub' => 0x2282, # ⊂ xhtml SUBSET OF
862
+ 'Sub' => 0x22d0, # ⋐ DOUBLE SUBSET
863
+ 'subE' => 0x2286, # ⊆ dup skip SUBSET OF OR EQUAL TO
864
+ 'sube' => 0x2286, # ⊆ dup xhtml SUBSET OF OR EQUAL TO
865
+ 'subnE' => 0x228a, # ⊊ dup SUBSET OF WITH NOT EQUAL TO
866
+ 'subne' => 0x228a, # ⊊ dup skip SUBSET OF WITH NOT EQUAL TO
867
+ 'sum' => 0x2211, # ∑ xhtml N-ARY SUMMATION
868
+ 'sung' => 0x266a, # ♪ EIGHTH NOTE
869
+ 'sup' => 0x2283, # ⊃ xhtml SUPERSET OF
870
+ 'Sup' => 0x22d1, # ⋑ DOUBLE SUPERSET
871
+ 'sup1' => 0x00b9, # ¹ xhtml SUPERSCRIPT ONE
872
+ 'sup2' => 0x00b2, # ² xhtml SUPERSCRIPT TWO
873
+ 'sup3' => 0x00b3, # ³ xhtml SUPERSCRIPT THREE
874
+ 'supE' => 0x2287, # ⊇ dup skip SUPERSET OF OR EQUAL TO
875
+ 'supe' => 0x2287, # ⊇ dup xhtml SUPERSET OF OR EQUAL TO
876
+ 'supnE' => 0x228b, # ⊋ dup SUPERSET OF WITH NOT EQUAL TO
877
+ 'supne' => 0x228b, # ⊋ dup skip SUPERSET OF WITH NOT EQUAL TO
878
+ 'szlig' => 0x00df, # ß xhtml LATIN SMALL LETTER SHARP S
879
+ 'target' => 0x2316, # ⌖ POSITION INDICATOR
880
+ 'Tau' => 0x03a4, # Τ dup xhtml GREEK CAPITAL LETTER TAU
881
+ 'tau' => 0x03c4, # τ dup xhtml GREEK SMALL LETTER TAU
882
+ 'Tcaron' => 0x0164, # Ť LATIN CAPITAL LETTER T WITH CARON
883
+ 'tcaron' => 0x0165, # ť LATIN SMALL LETTER T WITH CARON
884
+ 'Tcedil' => 0x0162, # Ţ LATIN CAPITAL LETTER T WITH CEDILLA
885
+ 'tcedil' => 0x0163, # ţ LATIN SMALL LETTER T WITH CEDILLA
886
+ 'Tcy' => 0x0422, # Т CYRILLIC CAPITAL LETTER TE
887
+ 'tcy' => 0x0442, # т CYRILLIC SMALL LETTER TE
888
+ 'tdot' => 0x20db, # ⃛ COMBINING THREE DOTS ABOVE
889
+ 'telrec' => 0x2315, # ⌕ TELEPHONE RECORDER
+ 'Tgr' => 0x03a4, # Τ dup skip GREEK CAPITAL LETTER TAU
+ 'tgr' => 0x03c4, # τ dup skip GREEK SMALL LETTER TAU
+ 'there4' => 0x2234, # ∴ xhtml THEREFORE
+ 'Theta' => 0x0398, # Θ dup xhtml GREEK CAPITAL LETTER THETA
+ 'theta' => 0x03b8, # θ dup xhtml GREEK SMALL LETTER THETA
+ 'thetas' => 0x03b8, # θ dup skip GREEK SMALL LETTER THETA
+ 'thetasym' => 0x03d1, # ϑ dup xhtml GREEK THETA SYMBOL
+ 'thetav' => 0x03d1, # ϑ dup skip GREEK THETA SYMBOL
+ 'THgr' => 0x0398, # Θ dup skip GREEK CAPITAL LETTER THETA
+ 'thgr' => 0x03b8, # θ dup skip GREEK SMALL LETTER THETA
+ 'thinsp' => 0x2009, #   xhtml THIN SPACE
+ 'thkap' => 0x2248, # ≈ dup skip ALMOST EQUAL TO
+ 'thksim' => 0x223c, # ∼ dup skip TILDE OPERATOR
+ 'THORN' => 0x00de, # Þ xhtml LATIN CAPITAL LETTER THORN
+ 'thorn' => 0x00fe, # þ xhtml LATIN SMALL LETTER THORN
+ 'tilde' => 0x02dc, # ˜ xhtml SMALL TILDE
+ 'times' => 0x00d7, # × xhtml MULTIPLICATION SIGN
+ 'timesb' => 0x22a0, # ⊠ SQUARED TIMES
+ 'top' => 0x22a4, # ⊤ DOWN TACK
+ 'tprime' => 0x2034, # ‴ TRIPLE PRIME
+ 'trade' => 0x2122, # ™ xhtml TRADE MARK SIGN
+ 'trie' => 0x225c, # ≜ DELTA EQUAL TO
+ 'TScy' => 0x0426, # Ц CYRILLIC CAPITAL LETTER TSE
+ 'tscy' => 0x0446, # ц CYRILLIC SMALL LETTER TSE
+ 'TSHcy' => 0x040b, # Ћ CYRILLIC CAPITAL LETTER TSHE
+ 'tshcy' => 0x045b, # ћ CYRILLIC SMALL LETTER TSHE
+ 'Tstrok' => 0x0166, # Ŧ LATIN CAPITAL LETTER T WITH STROKE
+ 'tstrok' => 0x0167, # ŧ LATIN SMALL LETTER T WITH STROKE
+ 'twixt' => 0x226c, # ≬ BETWEEN
+ 'Uacgr' => 0x038e, # Ύ GREEK CAPITAL LETTER UPSILON WITH TONOS
+ 'uacgr' => 0x03cd, # ύ GREEK SMALL LETTER UPSILON WITH TONOS
+ 'Uacute' => 0x00da, # Ú xhtml LATIN CAPITAL LETTER U WITH ACUTE
+ 'uacute' => 0x00fa, # ú xhtml LATIN SMALL LETTER U WITH ACUTE
+ 'uarr' => 0x2191, # ↑ xhtml UPWARDS ARROW
+ 'uArr' => 0x21d1, # ⇑ xhtml UPWARDS DOUBLE ARROW
+ 'uarr2' => 0x21c8, # ⇈ UPWARDS PAIRED ARROWS
+ 'Ubrcy' => 0x040e, # Ў CYRILLIC CAPITAL LETTER SHORT U
+ 'ubrcy' => 0x045e, # ў CYRILLIC SMALL LETTER SHORT U
+ 'Ubreve' => 0x016c, # Ŭ LATIN CAPITAL LETTER U WITH BREVE
+ 'ubreve' => 0x016d, # ŭ LATIN SMALL LETTER U WITH BREVE
+ 'Ucirc' => 0x00db, # Û xhtml LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+ 'ucirc' => 0x00fb, # û xhtml LATIN SMALL LETTER U WITH CIRCUMFLEX
+ 'Ucy' => 0x0423, # У CYRILLIC CAPITAL LETTER U
+ 'ucy' => 0x0443, # у CYRILLIC SMALL LETTER U
+ 'Udblac' => 0x0170, # Ű LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+ 'udblac' => 0x0171, # ű LATIN SMALL LETTER U WITH DOUBLE ACUTE
+ 'udiagr' => 0x03b0, # ΰ GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
+ 'Udigr' => 0x03ab, # Ϋ GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+ 'udigr' => 0x03cb, # ϋ GREEK SMALL LETTER UPSILON WITH DIALYTIKA
+ 'Ugr' => 0x03a5, # Υ dup skip GREEK CAPITAL LETTER UPSILON
+ 'ugr' => 0x03c5, # υ dup skip GREEK SMALL LETTER UPSILON
+ 'Ugrave' => 0x00d9, # Ù xhtml LATIN CAPITAL LETTER U WITH GRAVE
+ 'ugrave' => 0x00f9, # ù xhtml LATIN SMALL LETTER U WITH GRAVE
+ 'uharl' => 0x21bf, # ↿ UPWARDS HARPOON WITH BARB LEFTWARDS
+ 'uharr' => 0x21be, # ↾ UPWARDS HARPOON WITH BARB RIGHTWARDS
+ 'uhblk' => 0x2580, # ▀ UPPER HALF BLOCK
+ 'ulcorn' => 0x231c, # ⌜ TOP LEFT CORNER
+ 'ulcrop' => 0x230f, # ⌏ TOP LEFT CROP
+ 'Umacr' => 0x016a, # Ū LATIN CAPITAL LETTER U WITH MACRON
+ 'umacr' => 0x016b, # ū LATIN SMALL LETTER U WITH MACRON
+ 'uml' => 0x00a8, # ¨ dup xhtml DIAERESIS
+ 'Uogon' => 0x0172, # Ų LATIN CAPITAL LETTER U WITH OGONEK
+ 'uogon' => 0x0173, # ų LATIN SMALL LETTER U WITH OGONEK
+ 'uplus' => 0x228e, # ⊎ MULTISET UNION
+ 'Upsi' => 0x03a5, # Υ dup skip GREEK CAPITAL LETTER UPSILON
+ 'upsi' => 0x03c5, # υ dup skip GREEK SMALL LETTER UPSILON
+ 'upsih' => 0x03d2, # ϒ xhtml GREEK UPSILON WITH HOOK SYMBOL
+ 'Upsilon' => 0x03a5, # Υ dup xhtml GREEK CAPITAL LETTER UPSILON
+ 'upsilon' => 0x03c5, # υ dup xhtml GREEK SMALL LETTER UPSILON
+ 'urcorn' => 0x231d, # ⌝ TOP RIGHT CORNER
+ 'urcrop' => 0x230e, # ⌎ TOP RIGHT CROP
+ 'Uring' => 0x016e, # Ů LATIN CAPITAL LETTER U WITH RING ABOVE
+ 'uring' => 0x016f, # ů LATIN SMALL LETTER U WITH RING ABOVE
+ 'Utilde' => 0x0168, # Ũ LATIN CAPITAL LETTER U WITH TILDE
+ 'utilde' => 0x0169, # ũ LATIN SMALL LETTER U WITH TILDE
+ 'utri' => 0x25b5, # ▵ WHITE UP-POINTING SMALL TRIANGLE
+ 'utrif' => 0x25b4, # ▴ BLACK UP-POINTING SMALL TRIANGLE
+ 'Uuml' => 0x00dc, # Ü xhtml LATIN CAPITAL LETTER U WITH DIAERESIS
+ 'uuml' => 0x00fc, # ü xhtml LATIN SMALL LETTER U WITH DIAERESIS
+ 'varr' => 0x2195, # ↕ UP DOWN ARROW
+ 'vArr' => 0x21d5, # ⇕ UP DOWN DOUBLE ARROW
+ 'Vcy' => 0x0412, # В CYRILLIC CAPITAL LETTER VE
+ 'vcy' => 0x0432, # в CYRILLIC SMALL LETTER VE
+ 'vdash' => 0x22a2, # ⊢ RIGHT TACK
+ 'vDash' => 0x22a8, # ⊨ TRUE
+ 'Vdash' => 0x22a9, # ⊩ FORCES
+ 'veebar' => 0x22bb, # ⊻ XOR
+ 'vellip' => 0x22ee, # ⋮ VERTICAL ELLIPSIS
+ 'verbar' => 0x007c, # | VERTICAL LINE
+ 'Verbar' => 0x2016, # ‖ DOUBLE VERTICAL LINE
+ 'vltri' => 0x22b2, # ⊲ NORMAL SUBGROUP OF
+ 'vprime' => 0x2032, # ′ dup skip PRIME
+ 'vprop' => 0x221d, # ∝ dup skip PROPORTIONAL TO
+ 'vrtri' => 0x22b3, # ⊳ CONTAINS AS NORMAL SUBGROUP
+ 'vsubnE' => 0x228a, # ⊊ dup skip SUBSET OF WITH NOT EQUAL TO
+ 'vsubne' => 0x228a, # ⊊ dup skip SUBSET OF WITH NOT EQUAL TO
+ 'vsupnE' => 0x228b, # ⊋ dup skip SUPERSET OF WITH NOT EQUAL TO
+ 'vsupne' => 0x228b, # ⊋ dup skip SUPERSET OF WITH NOT EQUAL TO
+ 'Vvdash' => 0x22aa, # ⊪ TRIPLE VERTICAL BAR RIGHT TURNSTILE
+ 'Wcirc' => 0x0174, # Ŵ LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+ 'wcirc' => 0x0175, # ŵ LATIN SMALL LETTER W WITH CIRCUMFLEX
+ 'wedgeq' => 0x2259, # ≙ ESTIMATES
+ 'weierp' => 0x2118, # ℘ xhtml SCRIPT CAPITAL P
+ 'wreath' => 0x2240, # ≀ WREATH PRODUCT
+ 'xcirc' => 0x25cb, # ○ dup skip WHITE CIRCLE
+ 'xdtri' => 0x25bd, # ▽ WHITE DOWN-POINTING TRIANGLE
+ 'Xgr' => 0x039e, # Ξ dup skip GREEK CAPITAL LETTER XI
+ 'xgr' => 0x03be, # ξ dup skip GREEK SMALL LETTER XI
+ 'xhArr' => 0x2194, # ↔ dup skip LEFT RIGHT ARROW
+ 'xharr' => 0x2194, # ↔ dup skip LEFT RIGHT ARROW
+ 'Xi' => 0x039e, # Ξ dup xhtml GREEK CAPITAL LETTER XI
+ 'xi' => 0x03be, # ξ dup xhtml GREEK SMALL LETTER XI
+ 'xlArr' => 0x21d0, # ⇐ dup skip LEFTWARDS DOUBLE ARROW
+ 'xrArr' => 0x21d2, # ⇒ dup skip RIGHTWARDS DOUBLE ARROW
+ 'xutri' => 0x25b3, # △ WHITE UP-POINTING TRIANGLE
+ 'Yacute' => 0x00dd, # Ý xhtml LATIN CAPITAL LETTER Y WITH ACUTE
+ 'yacute' => 0x00fd, # ý xhtml LATIN SMALL LETTER Y WITH ACUTE
+ 'YAcy' => 0x042f, # Я CYRILLIC CAPITAL LETTER YA
+ 'yacy' => 0x044f, # я CYRILLIC SMALL LETTER YA
+ 'Ycirc' => 0x0176, # Ŷ LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+ 'ycirc' => 0x0177, # ŷ LATIN SMALL LETTER Y WITH CIRCUMFLEX
+ 'Ycy' => 0x042b, # Ы CYRILLIC CAPITAL LETTER YERU
+ 'ycy' => 0x044b, # ы CYRILLIC SMALL LETTER YERU
+ 'yen' => 0x00a5, # ¥ xhtml YEN SIGN
+ 'YIcy' => 0x0407, # Ї CYRILLIC CAPITAL LETTER YI
+ 'yicy' => 0x0457, # ї CYRILLIC SMALL LETTER YI
+ 'YUcy' => 0x042e, # Ю CYRILLIC CAPITAL LETTER YU
+ 'yucy' => 0x044e, # ю CYRILLIC SMALL LETTER YU
+ 'yuml' => 0x00ff, # ÿ xhtml LATIN SMALL LETTER Y WITH DIAERESIS
+ 'Yuml' => 0x0178, # Ÿ xhtml LATIN CAPITAL LETTER Y WITH DIAERESIS
+ 'Zacute' => 0x0179, # Ź LATIN CAPITAL LETTER Z WITH ACUTE
+ 'zacute' => 0x017a, # ź LATIN SMALL LETTER Z WITH ACUTE
+ 'Zcaron' => 0x017d, # Ž LATIN CAPITAL LETTER Z WITH CARON
+ 'zcaron' => 0x017e, # ž LATIN SMALL LETTER Z WITH CARON
+ 'Zcy' => 0x0417, # З CYRILLIC CAPITAL LETTER ZE
+ 'zcy' => 0x0437, # з CYRILLIC SMALL LETTER ZE
+ 'Zdot' => 0x017b, # Ż LATIN CAPITAL LETTER Z WITH DOT ABOVE
+ 'zdot' => 0x017c, # ż LATIN SMALL LETTER Z WITH DOT ABOVE
+ 'Zeta' => 0x0396, # Ζ dup xhtml GREEK CAPITAL LETTER ZETA
+ 'zeta' => 0x03b6, # ζ dup xhtml GREEK SMALL LETTER ZETA
+ 'Zgr' => 0x0396, # Ζ dup skip GREEK CAPITAL LETTER ZETA
+ 'zgr' => 0x03b6, # ζ dup skip GREEK SMALL LETTER ZETA
+ 'ZHcy' => 0x0416, # Ж CYRILLIC CAPITAL LETTER ZHE
+ 'zhcy' => 0x0436, # ж CYRILLIC SMALL LETTER ZHE
+ 'zwj' => 0x200d, # ‍ xhtml ZERO WIDTH JOINER
+ 'zwnj' => 0x200c, # ‌ xhtml ZERO WIDTH NON-JOINER
+ 'euro' => 0x20ac, # € xhtml EURO SIGN
+ }
+ SKIP_DUP_ENCODINGS['expanded'] = %w[
+ ap thkap rsquor aleph lsquor square rdquor ldquor b.kappav b.rhov mldr xlArr die Dot xrArr iff
+ les ges vprime lne lvnE gne gvnE nles nges half xcirc pre sce Agr Bgr subE b.Gamma Ggr supE
+ b.Delta Dgr nsube nsupe Egr Zgr subne vsubnE vsubne EEgr supne vsupnE vsupne b.Theta THgr Igr
+ Kgr b.Lambda Lgr Mgr Ngr b.Xi Xgr Ogr b.Pi Pgr sfrown Rgr ssmile b.Sigma Sgr Tgr b.Upsi Ugr
+ Upsi b.Phi PHgr KHgr b.Psi PSgr b.Omega OHgr coprod samalg sbsol ssetmn agr b.alpha bottom
+ b.beta bgr b.gamma ggr b.delta dgr b.epsi b.epsis b.epsiv egr epsi b.zeta zgr vprop b.eta eegr
+ b.thetas thetas thgr b.iota igr b.kappa kgr b.lambda lgr xhArr xharr b.mu mgr b.nu ngr b.xi
+ xgr spar ogr nspar b.pi pgr b.rho rgr b.sigmav sfgr sigmav b.sigma sgr b.tau tgr b.upsi ugr
+ upsi b.phis phgr phis b.chi khgr b.psi psgr ohgr b.omega b.thetav thetav b.phiv thksim b.piv
+ b.gammad
+ ]
+ end
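Note on the `dup skip` comments and the SKIP_DUP_ENCODINGS['expanded'] list above: several entity names share one codepoint (for example 'zeta' and 'zgr' both map to 0x03b6), so a codepoint-to-name table for encoding has to settle on one name per codepoint. The following is a minimal sketch of one way such a skip list could be applied, assuming the reverse table is built by dropping the skipped names before inverting the hash. SAMPLE_MAPPINGS, SAMPLE_SKIPS and build_reverse_map are hypothetical names used for illustration only; they are not taken from the gem.

```ruby
# Hypothetical excerpt: four name => codepoint pairs copied from the table above.
SAMPLE_MAPPINGS = {
  'Zeta' => 0x0396,
  'zeta' => 0x03b6,
  'Zgr'  => 0x0396,
  'zgr'  => 0x03b6,
}

# Hypothetical excerpt of the skip list: names that should never be chosen
# when encoding a codepoint back to a named entity.
SAMPLE_SKIPS = %w[Zgr zgr]

# Hypothetical helper (an assumption, not the gem's API): drop the skipped
# names, then invert so that each codepoint maps to its one remaining name.
def build_reverse_map(mappings, skips)
  mappings.reject { |name, _codepoint| skips.include?(name) }.invert
end

reverse = build_reverse_map(SAMPLE_MAPPINGS, SAMPLE_SKIPS)
puts reverse[0x03b6]

# Without the skip list, Hash#invert keeps whichever duplicate it visits last,
# so the chosen name would depend on the order of the entries.
puts SAMPLE_MAPPINGS.invert[0x03b6]
```

Decoding is unaffected by a filter like this: every name, skipped or not, still resolves to its codepoint through the forward table.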