alphabets 0.1.0 → 0.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: b5d826c435c38e5c8faf7963d7de6e6dcf0e2fb7
- data.tar.gz: 5187392b8e6fbb12e249709526edf1bb2def2513
+ metadata.gz: 7310d705b53b7f04b8a588b831d71940728aa333
+ data.tar.gz: 3306b5393e10208c4f4f97d523b3a7366042d669
  SHA512:
- metadata.gz: 0e8aac5a5a65d137c710a9623d444d9f996171caee9869cb32af78cd09b26ac0404fa88a40499279527ba6d9276c029396a34241f507adf289551e9327a6ce05
- data.tar.gz: 7cb3dda8f5804fc39f67c866c319b24af8ee6119609052a7eb7ff6c0927f1ede8b50cbcf89d98709d83ba72bc900d56efafb4ce038d4f80554c916f057b6b5ec
+ metadata.gz: c3ba94979d0141b763f9370a520e60124bcadb58d1358f87054f7c10973cb35eecbbc3f36a8fdf75589a4b38911124bc1b5dcad05a638e6fb4b3ae83bc4f1edd
+ data.tar.gz: aac6b538a571553b4d6aa8af63d1bbae722863feb9fdaa22a2b7a1ce1caa06798d0e38ce8e8be3d6ecaf98441d433d6ad0f0a410af2056e210e9edf943a910fa
File without changes
@@ -1,4 +1,4 @@
- HISTORY.md
+ CHANGELOG.md
  Manifest.txt
  NOTES.md
  README.md
data/NOTES.md CHANGED
@@ -14,8 +14,25 @@ Use Upcase, Downcase AND Titlecase (!)
14
14
 
15
15
  ## Libraries
16
16
 
17
+ **Ruby**
18
+
17
19
  - <https://github.com/SixArm/sixarm_ruby_unaccent> - Replace a string's accent characters with ASCII characters. Based on Perl Text::Unaccent from CPAN.
18
20
 
21
+ - <https://github.com/fractalsoft/diacritics> - support downcase, upcase and permanent link with diacritical characters
22
+
23
+ **Perl**
24
+
25
+ - <https://metacpan.org/pod/Unicode::Diacritic::Strip> - strip diacritics from Unicode text
26
+
27
+ **JavaScript**
28
+
29
+ - <https://github.com/dundalek/latinize> - convert accents (diacritics) from strings to latin characters
30
+
31
+ - <https://github.com/tyxla/remove-accents> - removes the accents from a string, converting them to their corresponding non-accented ascii characters
32
+
33
+ **PostgreSQL**
34
+
35
+ - <https://www.postgresql.org/docs/current/unaccent.html> - unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes
19
36
 
20
37
 
21
38
  ## Links
@@ -35,9 +52,142 @@ Proper Unicoding - Ruby's Regexp engine has a powerful feature built in: It can
35
52
  Regex with Class - Ruby's regex engine defines a lot of shortcut character classes. Besides the common meta characters (\w, etc.), there is also the POSIX style expressions and the unicode property syntax. This is an overview of all character classes
36
53
 
37
54
 
55
+ **Unicode**
56
+
57
+ - <https://unicode.org/reports/tr15/> - Unicode Standard Annex #15 - UNICODE NORMALIZATION FORMS
58
+
38
59
  **W3C**
39
60
 
40
61
  - <https://www.w3.org/TR/charmod-norm/>
41
62
  - <https://www.w3.org/International/wiki/Case_folding>
42
63
 
43
64
  In Western European languages, the letter 'i' (U+0069) upper cases to a dotless 'I' (U+0049). In Turkish, this letter upper cases to a dotted upper case letter 'İ' (U+0130). Similarly, 'I' (U+0049) lower cases to 'ı' (U+0131), which is a dotless lowercase letter i.
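The language-dependent case mapping described above can be tried directly in Ruby. This is a small sketch, assuming Ruby 2.4 or later, where `String#upcase` and `String#downcase` accept a `:turkic` option; the constant names are made up for illustration and are not part of the alphabets gem.

```ruby
# Default (language-independent) case mapping.
DEFAULT_UPPER = 'i'.upcase            # "I" (U+0049), no dot
DEFAULT_LOWER = 'I'.downcase          # "i" (U+0069), with dot

# Turkic case mapping (Turkish, Azerbaijani), opt-in via :turkic.
TURKIC_UPPER = 'i'.upcase(:turkic)    # "İ" (U+0130), dotted capital
TURKIC_LOWER = 'I'.downcase(:turkic)  # "ı" (U+0131), dotless small

[DEFAULT_UPPER, DEFAULT_LOWER, TURKIC_UPPER, TURKIC_LOWER].each do |ch|
  puts format('%s  U+%04X', ch, ch.ord)
end
```

A hand-written table such as `DOWNCASE` has no notion of locale, so it can hold only one of the two mappings for `I`.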
65
+
66
+ **Wikipedia**
67
+
68
+ - <https://en.wikipedia.org/wiki/Diacritic>
69
+
70
+ **More**
71
+
72
+ - [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/)
73
+ by Joel Spolsky, 2003
74
+
75
+ - [Unicode Normalization in Ruby](https://www.honeybadger.io/blog/ruby-unicode-normalization/) by Starr Horne, 2017
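As a point of comparison with the hand-written tables in this gem, here is a minimal sketch of the normalization-based approach that the links above discuss: decompose to NFD, then drop the combining marks. The helper name `strip_marks` is hypothetical. It only handles letters that have a canonical decomposition, so letters such as Ł, ø, æ or ß pass through unchanged and still need an explicit mapping.

```ruby
# Remove accents by canonical decomposition (NFD) and then deleting
# all nonspacing marks (Unicode general category Mn).
def strip_marks(text)
  text.unicode_normalize(:nfd).gsub(/\p{Mn}/, '')
end

puts strip_marks('é')      # => "e"
puts strip_marks('Łódź')   # => "Łodz" (Ł has no decomposition)
puts strip_marks('Straße') # => "Straße" (ß is not an accented s)
```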
76
+
77
+
78
+ ## Mappings
79
+
80
+ Open questions ...
81
+
82
+ ```
83
+ Þ => TH ???
84
+ þ => th ???
85
+ ```
86
+
87
+
88
+ ## Alphabets
89
+
90
+ Add more alphabets... why? why not?
91
+
92
+
93
+ - Portuguese [Â, "abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú", "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÂÃÀÇÉÊÍÓÔÕÚ"]
94
+ - Russian [Щ, Ъ, Э, "абвгдеёжзийклмнопрстуфхцчшщъыьэюя", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"]
95
+ - Greek [Β, Μ, Χ, Ω, Ή, Ύ, Ώ, ΐ, ΰ, Ϊ, Ϋ]
96
+ - Slovak ["aáäeéiíoóôuúyýbcčdďfghjklĺľmnňpqrŕsštťvwxzž", "AÁÄEÉIÍOÓÔUÚYÝBCČDĎFGHJKLĹĽMNŇPQRŔSŠTŤVWXZŽ"]
97
+ - Italian ["aàbcdeèéfghiìíîlmnoòópqrstuùúvz", "AÀBCDEÈÉFGHIÌÍÎLMNOÒÓPQRSTUÙÚVZ"]
98
+ - Romanian ["aăâbcdefghiîjklmnopqrsștțuvwxyz", "AĂÂBCDEFGHIÎJKLMNOPQRSȘTȚUVWXYZ"]
99
+ - Danish [å, â, ô, Å, Â, Ô]
100
+
101
+ ```
102
+ def de
103
+ { # German
104
+ downcase: %w(ä ö ü ß),
105
+ upcase: %w(Ä Ö Ü ẞ),
106
+ permanent: %w(ae oe ue ss)
107
+ }
108
+ end
109
+
110
+ def pl
111
+ { # Polish
112
+ downcase: %w(ą ć ę ł ń ó ś ż ź),
113
+ upcase: %w(Ą Ć Ę Ł Ń Ó Ś Ż Ź),
114
+ permanent: %w(a c e l n o s z z)
115
+ }
116
+ end
117
+
118
+ def cs
119
+ { # Czech uses acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), ring (ů)
120
+ # aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
121
+ # AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
122
+ downcase: %w(á é í ó ú ý č ď ě ň ř š ť ů ž),
123
+ upcase: %w(Á É Í Ó Ú Ý Č Ď Ě Ň Ř Š Ť Ů Ž),
124
+ permanent: %w(a e i o u y c d e n r s t u z)
125
+ }
126
+ end
127
+
128
+ def fr
129
+ { # French
130
+ # abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ
131
+ # ABCDEFGHIJKLMNOPQRSTUVWXYZÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ
132
+ downcase: %w(à â é è ë ê ï î ô ù û ü ÿ ç œ æ),
133
+ upcase: %w(À Â É È Ë Ê Ï Î Ô Ù Û Ü Ÿ Ç Œ Æ),
134
+ permanent: %w(a a e e e e i i o u u ue y c oe ae)
135
+ }
136
+ end
137
+
138
+ def it
139
+ { # Italian
140
+ downcase: %w(à è é ì î ò ó ù),
141
+ upcase: %w(À È É Ì Î Ò Ó Ù),
142
+ permanent: %w(a e e i i o o u)
143
+ }
144
+ end
145
+
146
+ def eo
147
+ { # Esperanto has the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ
148
+ downcase: %w(ĉ ĝ ĥ ĵ ŝ ŭ),
149
+ upcase: %w(Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ),
150
+ permanent: %w(c g h j s u)
151
+ }
152
+ end
153
+
154
+ def is
155
+ { # Icelandic
156
+ downcase: %w(ð þ),
157
+ upcase: %w(Ð Þ),
158
+ permanent: %w(d p)
159
+ }
160
+ end
161
+
162
+ def pt
163
+ { # Portuguese uses á, â, ã, à, ç, é, ê, í, ó, ô, õ and ú
164
+ downcase: %w(ã ç),
165
+ upcase: %w(Ã Ç),
166
+ permanent: %w(a c)
167
+ }
168
+ end
169
+
170
+ def sp
171
+ { # Spanish
172
+ downcase: ['ñ', 'õ', '¿', '¡'],
173
+ upcase: ['Ñ', 'Õ', '¿', '¡'],
174
+ permanent: ['n', 'o', '', '']
175
+ }
176
+ end
177
+
178
+ def hu
179
+ { # Hungarian
180
+ downcase: %w(ő),
181
+ upcase: %w(Ő),
182
+ permanent: %w(oe)
183
+ }
184
+ end
185
+
186
+ def nn
187
+ { # Norwegian
188
+ downcase: %w(æ å),
189
+ upcase: %w(Æ Å),
190
+ permanent: %w(ae a)
191
+ }
192
+ end
193
+ ```
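The tables above pair each accented letter with an ASCII replacement by position (`downcase[i]` and `upcase[i]` both map to `permanent[i]`). The following is a sketch of how such a table could be turned into a lookup hash and applied; the names `GERMAN` and `permalink` are hypothetical and this is not the API of the diacritics gem.

```ruby
GERMAN = {
  downcase:  %w(ä ö ü ß),
  upcase:    %w(Ä Ö Ü ẞ),
  permanent: %w(ae oe ue ss)
}.freeze

# Build one char => replacement hash from the three parallel arrays,
# substitute character by character, then lowercase the result.
def permalink(text, table)
  mapping = table[:downcase].zip(table[:permanent]).to_h
  mapping.merge!(table[:upcase].zip(table[:permanent]).to_h)
  text.each_char.map { |ch| mapping.fetch(ch, ch) }.join.downcase
end

puts permalink('Größe', GERMAN)  # => "groesse"
puts permalink('Ärger', GERMAN)  # => "aerger"
```

Positional pairing keeps the tables compact, but all three arrays have to stay the same length.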
data/Rakefile CHANGED
@@ -15,7 +15,7 @@ Hoe.spec 'alphabets' do

  # switch extension to .markdown for github formatting
  self.readme_file = 'README.md'
- self.history_file = 'HISTORY.md'
+ self.history_file = 'CHANGELOG.md'

  self.licenses = ['Public Domain']
@@ -12,9 +12,9 @@ UNACCENT = Reader.parse( <<TXT )
12
12
  Æ AE æ ae # ae ligature
13
13
  ā a
14
14
  ă a
15
- ą a
15
+ ą a # ą - U+0105 (261) - LATIN SMALL LETTER A WITH OGONEK
16
16
 
17
- Ç C ç c
17
+ Ç C ç c # ç - U+00E7 (231) - LATIN SMALL LETTER C WITH CEDILLA
18
18
  ć c
19
19
  Č C č c
20
20
 
@@ -31,7 +31,7 @@ UNACCENT = Reader.parse( <<TXT )
31
31
  Í I í i
32
32
  î i
33
33
  ī i
34
- ı i # small dotless i
34
+ ı i # ı - U+0131 (305) - LATIN SMALL LETTER DOTLESS I
35
35
 
36
36
  Ł L ł l
37
37
 
@@ -41,6 +41,7 @@ UNACCENT = Reader.parse( <<TXT )
41
41
 
42
42
  Ö O ö o
43
43
  ó o
44
+ ò o
44
45
  õ o
45
46
  ô o
46
47
  ø o
@@ -50,15 +51,16 @@ UNACCENT = Reader.parse( <<TXT )
50
51
  ř r
51
52
 
52
53
  Ś S ś s
53
- Ş S ş s
54
+ Ş S ş s # ş - U+015F (351) - LATIN SMALL LETTER S WITH CEDILLA
55
+ Ș S ș s # ș - U+0219 (537) - LATIN SMALL LETTER S WITH COMMA BELOW
54
56
  Š S š s
55
- ș s # U+0219
56
- ß ss
57
+ ß ss # ß - U+00DF (223) - LATIN SMALL LETTER SHARP S
57
58
 
58
- ţ t # U+0163
59
- ț t # U+021B
59
+ Ţ T ţ t # ţ - U+0163 (355) - LATIN SMALL LETTER T WITH CEDILLA
60
+ Ț T ț t # ț - U+021B (539) - LATIN SMALL LETTER T WITH COMMA BELOW
60
61
 
61
- þ p #### fix/check!!!! icelandic - use p is p or th - why? why not?
62
+ þ p # þ - U+00FE (254) - LATIN SMALL LETTER THORN
63
+ #### fix/check!!!! icelandic - use p is p or th - why? why not?
62
64
 
63
65
  Ü U ü u
64
66
  Ú U ú u
@@ -71,6 +73,14 @@ UNACCENT = Reader.parse( <<TXT )
71
73
  Ž Z ž z
72
74
  TXT
73
75
 
76
+ ##
77
+ # Notes:
78
+ # Romanian did NOT initially get its Ș/ș and Ț/ț (with comma) letters,
79
+ # because these letters were initially unified with Ş/ş and Ţ/ţ (with cedilla)
80
+ # by the Unicode Consortium, considering the shapes with comma beneath
81
+ # to be glyph variants of the shapes with cedilla.
82
+ # However, the letters with explicit comma below were later added to the Unicode standard and are also in ISO 8859-16.
83
+
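The note above can be checked from Ruby: the cedilla and comma-below letters are separate code points and decompose to different combining marks, which is why the table needs an entry for each. A small sketch; the constant and helper names are made up for illustration.

```ruby
CEDILLA_S = "\u015F"  # ş - LATIN SMALL LETTER S WITH CEDILLA
COMMA_S   = "\u0219"  # ș - LATIN SMALL LETTER S WITH COMMA BELOW

# List the code points of a string after canonical decomposition (NFD).
def decomposed_codepoints(str)
  str.unicode_normalize(:nfd).codepoints.map { |cp| format('U+%04X', cp) }
end

puts CEDILLA_S == COMMA_S                       # => false
puts decomposed_codepoints(CEDILLA_S).inspect   # => ["U+0073", "U+0327"]
puts decomposed_codepoints(COMMA_S).inspect     # => ["U+0073", "U+0326"]
```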
74
84
 
75
85
  ## de,at,ch translation for umlauts
76
86
  UNACCENT_DE = Reader.parse( <<TXT )
@@ -90,9 +100,9 @@ DOWNCASE = %w[A B C D E F G H I J K L M N O P Q R S T U V W X Y Z].reduce({}) do
90
100
  Ä ä
91
101
  Á á
92
102
  Å å
93
- Æ æ # ae ligature
103
+ Æ æ # LATIN LETTER AE - ae ligature
94
104
 
95
- Ç ç
105
+ Ç ç # LATIN LETTER C WITH CEDILLA
96
106
  Č č
97
107
 
98
108
  É é
@@ -103,12 +113,16 @@ DOWNCASE = %w[A B C D E F G H I J K L M N O P Q R S T U V W X Y Z].reduce({}) do
103
113
  Ł ł
104
114
 
105
115
  Ö ö
106
- Œ œ # oe ligature
116
+ Œ œ # LATIN LIGATURE OE
107
117
 
108
118
  Ś ś
109
- Ş ş
119
+ Ş ş # LATIN LETTER S WITH CEDILLA
120
+ Ș ș # LATIN LETTER S WITH COMMA BELOW
110
121
  Š š
111
122
 
123
+ Ţ ţ # LATIN LETTER T WITH CEDILLA
124
+ Ț ț # LATIN LETTER T WITH COMMA BELOW
125
+
112
126
  Ü ü
113
127
  Ú ú
114
128
 
@@ -1,62 +1,62 @@
1
-
2
- class Alphabet
3
- class Reader ## todo/check: rename to CharReader or something - why? why not?
4
-
5
- def self.read( path ) ## use - rename to read_file or from_file etc. - why? why not?
6
- txt = File.open( path, 'r:utf-8' ).read
7
- parse( txt )
8
- end
9
-
10
- def self.parse( txt )
11
- h = {} ## char(acter) table mappings
12
-
13
- txt.each_line do |line|
14
- line = line.strip
15
-
16
- next if line.empty?
17
- next if line.start_with?( '#' ) ## skip comments too
18
-
19
- ## strip inline (until end-of-line) comments too
20
- ## e.g ţ t ## U+0163
21
- ## => ţ t
22
- line = line.sub( /#.*/, '' ).strip
23
- ## pp line
24
-
25
- values = line.split( /[ \t]+/ )
26
- ## pp values
27
-
28
- ## check - must be even - a multiple of two
29
- if values.size % 2 != 0
30
- puts "** !!! ERROR !!! - missing mapping pair - mappings must be even (a multiple of two):"
31
- pp values
32
- exit 1
33
- end
34
-
35
- # add mappings in pairs
36
- values.each_slice(2) do |slice|
37
- ## pp slice
38
- key = slice[0]
39
- value = slice[1]
40
-
41
- ## check - key must be a single-character/letter in unicode
42
- if key.size != 1
43
- puts "** !!! ERROR !!! - mapping character must be a single-character, size is #{key.size}"
44
- pp slice
45
- exit 1
46
- end
47
-
48
- ## check - check for duplicates
49
- if h[ key ]
50
- puts "** !!! ERROR !!! - duplicate mapping character; key already present"
51
- pp slice
52
- exit 1
53
- else
54
- h[ key ] = value
55
- end
56
- end
57
- end
58
- h
59
- end # method parse
60
-
61
- end # class Reader
62
- end # class Alphabet
1
+
2
+ class Alphabet
3
+ class Reader ## todo/check: rename to CharReader or something - why? why not?
4
+
5
+ def self.read( path ) ## use - rename to read_file or from_file etc. - why? why not?
6
+ txt = File.open( path, 'r:utf-8' ).read
7
+ parse( txt )
8
+ end
9
+
10
+ def self.parse( txt )
11
+ h = {} ## char(acter) table mappings
12
+
13
+ txt.each_line do |line|
14
+ line = line.strip
15
+
16
+ next if line.empty?
17
+ next if line.start_with?( '#' ) ## skip comments too
18
+
19
+ ## strip inline (until end-of-line) comments too
20
+ ## e.g ţ t ## U+0163
21
+ ## => ţ t
22
+ line = line.sub( /#.*/, '' ).strip
23
+ ## pp line
24
+
25
+ values = line.split( /[ \t]+/ )
26
+ ## pp values
27
+
28
+ ## check - must be even - a multiple of two
29
+ if values.size % 2 != 0
30
+ puts "** !!! ERROR !!! - missing mapping pair - mappings must be even (a multiple of two):"
31
+ pp values
32
+ exit 1
33
+ end
34
+
35
+ # add mappings in pairs
36
+ values.each_slice(2) do |slice|
37
+ ## pp slice
38
+ key = slice[0]
39
+ value = slice[1]
40
+
41
+ ## check - key must be a single-character/letter in unicode
42
+ if key.size != 1
43
+ puts "** !!! ERROR !!! - mapping character must be a single-character, size is #{key.size}"
44
+ pp slice
45
+ exit 1
46
+ end
47
+
48
+ ## check - check for duplicates
49
+ if h[ key ]
50
+ puts "** !!! ERROR !!! - duplicate mapping character; key already present"
51
+ pp slice
52
+ exit 1
53
+ else
54
+ h[ key ] = value
55
+ end
56
+ end
57
+ end
58
+ h
59
+ end # method parse
60
+
61
+ end # class Reader
62
+ end # class Alphabet
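For readers who want to try the parsing rules without installing the gem, here is a standalone sketch of the same steps: strip comments, split on whitespace, require pairs, require single-character keys, and reject duplicates. The method name `parse_mappings` is hypothetical, and unlike `Reader.parse` it raises `ArgumentError` rather than printing and calling `exit 1`; that is a deliberate change in this sketch, not the gem's behaviour.

```ruby
def parse_mappings(txt)
  mappings = {}
  txt.each_line do |line|
    # Drop full-line and inline comments, then surrounding whitespace.
    line = line.sub(/#.*/, '').strip
    next if line.empty?

    values = line.split(/[ \t]+/)
    raise ArgumentError, "odd number of values: #{values.inspect}" if values.size.odd?

    values.each_slice(2) do |key, value|
      raise ArgumentError, "key must be one character: #{key.inspect}" unless key.size == 1
      raise ArgumentError, "duplicate key: #{key.inspect}" if mappings.key?(key)
      mappings[key] = value
    end
  end
  mappings
end

TABLE = parse_mappings(<<~TXT)
  ## sample table
  Æ AE  æ ae   # ae ligature
  Ł L   ł l
  ß ss
TXT

puts TABLE.inspect
```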
@@ -1,75 +1,75 @@
1
-
2
- class Alphabet
3
-
4
- def self.frequency_table( name ) ## todo/check: use/rename to char_frequency_table
5
- ## calculate the frequency table of letters, digits, etc.
6
- freq = Hash.new(0)
7
- name.each_char do |ch|
8
- freq[ch] += 1
9
- end
10
- freq
11
- end
12
-
13
-
14
- def self.count( freq, mapping_or_chars )
15
- chars = if mapping_or_chars.is_a?( Hash )
16
- mapping_or_chars.keys
17
- else ## todo/fix: check for is_a? Array and if is String split into Array (on char at a time?) - why? why not?
18
- mapping_or_chars ## assume it's an array/list of characters
19
- end
20
-
21
- chars.reduce(0) do |count,ch|
22
- count += freq[ch]
23
- count
24
- end
25
- end
26
-
27
-
28
- def self.sub( name, mapping ) ## todo/check: use a different/better name - gsub/map/replace/fold/... - why? why not?
29
- buf = String.new
30
- name.each_char do |ch|
31
- buf << if mapping[ch]
32
- mapping[ch]
33
- else
34
- ch
35
- end
36
- end
37
- buf
38
- end
39
-
40
-
41
- class Unaccenter #Worker ## todo/change - find a better name - why? why not?
42
- def initialize( mapping )
43
- @mapping = mapping
44
- end
45
-
46
- def count( freq ) Alphabet.count( freq, @mapping ); end
47
- def unaccent( name ) Alphabet.sub( name, @mapping ); end
48
- end # class Unaccent Worker
49
-
50
-
51
- def self.find_unaccenter( key )
52
- if key == :de
53
- @de ||= Unaccenter.new( UNACCENT_DE )
54
- @de
55
- else
56
- ## use uni(versal) or unicode or something - why? why not?
57
- ## use all or int'l (international) - why? why not?
58
- ## use en (english) - why? why not?
59
- @default ||= Unaccenter.new( UNACCENT )
60
- @default
61
- end
62
- end
63
-
64
- def self.unaccent( name )
65
- @default ||= Unaccenter.new( UNACCENT )
66
- @default.unaccent( name )
67
- end
68
-
69
-
70
- def self.downcase_i18n( name ) ## our very own downcase for int'l characters / letters
71
- sub( name, DOWNCASE )
72
- end
73
- ## add downcase_uni - universal/unicode - why? why not?
74
-
75
- end # class Alphabet
1
+
2
+ class Alphabet
3
+
4
+ def self.frequency_table( name ) ## todo/check: use/rename to char_frequency_table
5
+ ## calculate the frequency table of letters, digits, etc.
6
+ freq = Hash.new(0)
7
+ name.each_char do |ch|
8
+ freq[ch] += 1
9
+ end
10
+ freq
11
+ end
12
+
13
+
14
+ def self.count( freq, mapping_or_chars )
15
+ chars = if mapping_or_chars.is_a?( Hash )
16
+ mapping_or_chars.keys
17
+ else ## todo/fix: check for is_a? Array and if is String split into Array (on char at a time?) - why? why not?
18
+ mapping_or_chars ## assume it's an array/list of characters
19
+ end
20
+
21
+ chars.reduce(0) do |count,ch|
22
+ count += freq[ch]
23
+ count
24
+ end
25
+ end
26
+
27
+
28
+ def self.sub( name, mapping ) ## todo/check: use a different/better name - gsub/map/replace/fold/... - why? why not?
29
+ buf = String.new
30
+ name.each_char do |ch|
31
+ buf << if mapping[ch]
32
+ mapping[ch]
33
+ else
34
+ ch
35
+ end
36
+ end
37
+ buf
38
+ end
39
+
40
+
41
+ class Unaccenter #Worker ## todo/change - find a better name - why? why not?
42
+ def initialize( mapping )
43
+ @mapping = mapping
44
+ end
45
+
46
+ def count( freq ) Alphabet.count( freq, @mapping ); end
47
+ def unaccent( name ) Alphabet.sub( name, @mapping ); end
48
+ end # class Unaccent Worker
49
+
50
+
51
+ def self.find_unaccenter( key )
52
+ if key == :de
53
+ @de ||= Unaccenter.new( UNACCENT_DE )
54
+ @de
55
+ else
56
+ ## use uni(versal) or unicode or something - why? why not?
57
+ ## use all or int'l (international) - why? why not?
58
+ ## use en (english) - why? why not?
59
+ @default ||= Unaccenter.new( UNACCENT )
60
+ @default
61
+ end
62
+ end
63
+
64
+ def self.unaccent( name )
65
+ @default ||= Unaccenter.new( UNACCENT )
66
+ @default.unaccent( name )
67
+ end
68
+
69
+
70
+ def self.downcase_i18n( name ) ## our very own downcase for int'l characters / letters
71
+ sub( name, DOWNCASE )
72
+ end
73
+ ## add downcase_uni - universal/unicode - why? why not?
74
+
75
+ end # class Alphabet
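The three class methods above (`frequency_table`, `count`, `sub`) can be illustrated with a self-contained sketch. The method bodies below are compact rewrites that should behave the same way for hash mappings and character arrays; the `SAMPLE` mapping is a hypothetical four-entry excerpt, not the gem's full `UNACCENT` table.

```ruby
SAMPLE = { 'ß' => 'ss', 'Ł' => 'L', 'ł' => 'l', 'ø' => 'o' }.freeze

# Count how often each character occurs.
def frequency_table(text)
  text.each_char.each_with_object(Hash.new(0)) { |ch, freq| freq[ch] += 1 }
end

# Sum the frequencies of the mapping's keys (or of a plain char list).
def count(freq, mapping_or_chars)
  chars = mapping_or_chars.is_a?(Hash) ? mapping_or_chars.keys : mapping_or_chars
  chars.sum { |ch| freq[ch] }
end

# Replace every character that has an entry in the mapping.
def sub(text, mapping)
  text.each_char.map { |ch| mapping.fetch(ch, ch) }.join
end

text = 'Straße in Łask'
freq = frequency_table(text)
puts count(freq, SAMPLE)  # => 2
puts sub(text, SAMPLE)    # => "Strasse in Lask"
```

Counting mapped characters first is one way to decide which table fits a text before substituting, which appears to be what `Unaccenter#count` is for.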
@@ -6,7 +6,7 @@
  class Alphabet
  MAJOR = 0 ## todo: namespace inside version or something - why? why not??
  MINOR = 1
- PATCH = 0
+ PATCH = 1
  VERSION = [MAJOR,MINOR,PATCH].join('.')

  def self.version
@@ -1,37 +1,37 @@
1
- ###
2
- # to run use
3
- # ruby -I ./lib -I ./test test/test_reader.rb
4
-
5
-
6
- require 'helper'
7
-
8
- class TestReader < MiniTest::Test
9
-
10
- def test_parse
11
- h = Alphabet::Reader.parse( <<TXT )
12
- ## hello
13
-
14
- Ä A ä a ## hello
15
- Á A á a
16
- à a
17
- ã a
18
- â a ### yada yada
19
- Å A å a
20
- æ ae
21
-
22
- Ç C ç c
23
- ć c
24
-
25
- ß ss
26
- TXT
27
-
28
- pp h
29
-
30
- assert_equal 'A', h['Ä']
31
- assert_equal 'a', h['ä']
32
- assert_equal 'ae', h['æ']
33
-
34
- assert_equal 'ss', h['ß']
35
- end
36
-
37
- end # class TestReader
1
+ ###
2
+ # to run use
3
+ # ruby -I ./lib -I ./test test/test_reader.rb
4
+
5
+
6
+ require 'helper'
7
+
8
+ class TestReader < MiniTest::Test
9
+
10
+ def test_parse
11
+ h = Alphabet::Reader.parse( <<TXT )
12
+ ## hello
13
+
14
+ Ä A ä a ## hello
15
+ Á A á a
16
+ à a
17
+ ã a
18
+ â a ### yada yada
19
+ Å A å a
20
+ æ ae
21
+
22
+ Ç C ç c
23
+ ć c
24
+
25
+ ß ss
26
+ TXT
27
+
28
+ pp h
29
+
30
+ assert_equal 'A', h['Ä']
31
+ assert_equal 'a', h['ä']
32
+ assert_equal 'ae', h['æ']
33
+
34
+ assert_equal 'ss', h['ß']
35
+ end
36
+
37
+ end # class TestReader
metadata CHANGED
@@ -1,60 +1,54 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: alphabets
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.1.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Gerald Bauer
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-08-14 00:00:00.000000000 Z
11
+ date: 2020-01-07 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rdoc
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - ">="
17
+ - - "~>"
18
18
  - !ruby/object:Gem::Version
19
19
  version: '4.0'
20
- - - "<"
21
- - !ruby/object:Gem::Version
22
- version: '7'
23
20
  type: :development
24
21
  prerelease: false
25
22
  version_requirements: !ruby/object:Gem::Requirement
26
23
  requirements:
27
- - - ">="
24
+ - - "~>"
28
25
  - !ruby/object:Gem::Version
29
26
  version: '4.0'
30
- - - "<"
31
- - !ruby/object:Gem::Version
32
- version: '7'
33
27
  - !ruby/object:Gem::Dependency
34
28
  name: hoe
35
29
  requirement: !ruby/object:Gem::Requirement
36
30
  requirements:
37
31
  - - "~>"
38
32
  - !ruby/object:Gem::Version
39
- version: '3.18'
33
+ version: '3.16'
40
34
  type: :development
41
35
  prerelease: false
42
36
  version_requirements: !ruby/object:Gem::Requirement
43
37
  requirements:
44
38
  - - "~>"
45
39
  - !ruby/object:Gem::Version
46
- version: '3.18'
40
+ version: '3.16'
47
41
  description: 'alphabets - '
48
42
  email: opensport@googlegroups.com
49
43
  executables: []
50
44
  extensions: []
51
45
  extra_rdoc_files:
52
- - HISTORY.md
46
+ - CHANGELOG.md
53
47
  - Manifest.txt
54
48
  - NOTES.md
55
49
  - README.md
56
50
  files:
57
- - HISTORY.md
51
+ - CHANGELOG.md
58
52
  - Manifest.txt
59
53
  - NOTES.md
60
54
  - README.md