alphabets 1.0.0 → 1.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
- metadata.gz: 20cc18ecb226c6c72b3d03c80c58f5ad16b2e385
- data.tar.gz: 5d503d887290c139ea920cd06aa05c5f92431a3b
+ SHA256:
+ metadata.gz: 302aef0fb6736e1c2fa48e12ee810d2c9279ed98c5aabbeee1f2decb2babb943
+ data.tar.gz: 693f249af93afb58425a6189737ae2b3e2934dbb4cff5b555726529778bbf750
  SHA512:
- metadata.gz: 4a375e07ac355c395c8e520e216c247ee72bb5b9b8b85c8110e5c1e72f5be11887815ddb5e80976411f914a02911ab5dc39f99e9ce6d6aa786713e5a8791b30d
- data.tar.gz: 4259ac17a3c0743e2a70e79ae1dc65def9bfacc7baa5f06f1693d39cbfbd902c88ac222d6784282e18261e7dcf605e12777590ea1f2fae6b8408a10f22206fdd
+ metadata.gz: 53b87c802b8fef9be45fa3b294798f302087f683c44af876ad2278e18238c9dd098302c2c4fd5dcc15c8bdddd92d3ad3b3ec1b70f38f229f3bc87c512e264e73
+ data.tar.gz: 5d349a536390743c62ccfbc4332645597fda7e075aa829ff3bf06fb139253353452c26a35210f417b2a8461844c144b6ada57d90651dcf88a1cad5203c0f49e3
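The checksums.yaml hunk above swaps the SHA1 entries for SHA256 ones and keeps SHA512. As a minimal sketch, assuming you already have the raw bytes of `metadata.gz` or `data.tar.gz` in memory, a digest like these could be recomputed and compared using Ruby's standard `digest` library. The helper name `checksum_ok?` is hypothetical and is not part of the gem or of RubyGems.

```ruby
require 'digest'

# Hypothetical helper (not from the alphabets gem or RubyGems):
# recompute a hex digest over the given bytes and compare it with
# the expected hex string taken from checksums.yaml.
def checksum_ok?(bytes, expected_hex, algorithm: Digest::SHA256)
  algorithm.hexdigest(bytes) == expected_hex.strip.downcase
end

# The three digest families seen in the hunk differ in hex length:
# SHA1 = 40, SHA256 = 64, SHA512 = 128 characters.
[Digest::SHA1, Digest::SHA256, Digest::SHA512].each do |algo|
  puts "#{algo.name}: #{algo.hexdigest('').length} hex chars"
end
```

In practice the bytes would come from something like `File.binread('data.tar.gz')` after unpacking the `.gem` archive; that path is an assumption for illustration.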
data/CHANGELOG.md CHANGED
@@ -1,3 +1,5 @@
- ### 0.0.1 / 2019-08-13
-
- * Everything is new. First release.
+ ### 1.0.1 / 2024-06-07
+
+ ### 0.0.1 / 2019-08-13
+
+ * Everything is new. First release.
data/NOTES.md CHANGED
@@ -1,193 +1,193 @@
- # Notes
-
- ## Todos
-
-
- ## Terminology
-
- Use Upcase, Downcase AND Titlecase (!)
-
- - Example: Ö -> Upcase: OE, Downcase: oe, Titlecase: Oe (!)
- - Example: Æ -> Upcase: AE, Downcase: ae, Titlecase: Ae (!)
-
-
-
- ## Libraries
-
- **Ruby**
-
- - <https://github.com/SixArm/sixarm_ruby_unaccent> - Replace a string's accent characters with ASCII characters. Based on Perl Text::Unaccent from CPAN.
-
- - <https://github.com/fractalsoft/diacritics> - support downcase, upcase and permanent link with diacritical characters
-
- **Perl**
-
- - <https://metacpan.org/pod/Unicode::Diacritic::Strip> - strip diacritics from Unicode text
-
- **JavaScript**
-
- - <https://github.com/dundalek/latinize> - convert accents (diacritics) from strings to latin characters
-
- - <https://github.com/tyxla/remove-accents> - removes the accents from a string, converting them to their corresponding non-accented ascii characters
-
- **PostgreSQL**
-
- - <https://www.postgresql.org/docs/current/unaccent.html> - unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes
-
-
- ## Links
-
- **Unicode w/ Ruby - Ruby ♡ Unicode**
-
- - <https://idiosyncratic-ruby.com/66-ruby-has-character>
-
- Ruby has Character - Ruby comes with good support for Unicode-related features. Read on if you want to learn more about important Unicode fundamentals and how to use them in Ruby...
-
- - <https://idiosyncratic-ruby.com/41-proper-unicoding>
-
- Proper Unicoding - Ruby's Regexp engine has a powerful feature built in: It can match for Unicode character properties. But what exactly are properties you can match for?
-
- - <https://idiosyncratic-ruby.com/30-regex-with-class>
-
- Regex with Class - Ruby's regex engine defines a lot of shortcut character classes. Besides the common meta characters (\w, etc.), there is also the POSIX style expressions and the unicode property syntax. This is an overview of all character classes
-
-
- **Unicode**
-
- - <https://unicode.org/reports/tr15/> - Unicode Standard Annex #15 - UNICODE NORMALIZATION FORMS
-
- **W3C**
-
- - <https://www.w3.org/TR/charmod-norm/>
- - <https://www.w3.org/International/wiki/Case_folding>
-
- In Western European languages, the letter 'i' (U+0069) upper cases to a dotless 'I' (U+0049). In Turkish, this letter upper cases to a dotted upper case letter 'İ' (U+0130). Similarly, 'I' (U+0049) lower cases to 'ı' (U+0131), which is a dotless lowercase letter i.
-
- **Wikipedia**
-
- - <https://en.wikipedia.org/wiki/Diacritic>
-
- **More**
-
- - [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/)
- by Joel Spolsky, 2003
-
- - [Unicode Normalization in Ruby](https://www.honeybadger.io/blog/ruby-unicode-normalization/) by Starr Horne, 2017
-
-
- ## Mappings
-
- Open questions ...
-
- ```
- Þ => TH ???
- þ => th ???
- ```
-
-
- ## Alphabets
-
- Add more alphabets... why? why not?
-
-
- - Portuguese [Â, "abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú", "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÂÃÀÇÉÊÍÓÔÕÚ"]
- - Russian [Щ, Ъ, Э, "абвгдеёжзийклмнопрстуфхцчшщъыьэюя", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"]
- - Greek [Β, Μ, Χ, Ω, Ή, Ύ, Ώ, ΐ, ΰ, Ϊ, Ϋ]
- - Slovak ["aáäeéiíoóôuúyýbcčdďfghjklĺľmnňpqrŕsštťvwxzž", "AÁÄEÉIÍOÓÔUÚYÝBCČDĎFGHJKLĹĽMNŇPQRŔSŠTŤVWXZŽ"]
- - Italian ["aàbcdeèéfghiìíîlmnoòópqrstuùúvz", "AÀBCDEÈÉFGHIÌÍÎLMNOÒÓPQRSTUÙÚVZ"]
- - Romanian ["aăâbcdefghiîjklmnopqrsștțuvwxyz", "AĂÂBCDEFGHIÎJKLMNOPQRSȘTȚUVWXYZ"]
- - Danish [å, â, ô, Å, Â, Ô]
-
- ```
- def de
- { # German
- downcase: %w(ä ö ü ß),
- upcase: %w(Ä Ö Ü ẞ),
- permanent: %w(ae oe ue ss)
- }
- end
-
- def pl
- { # Polish
- downcase: %w(ą ć ę ł ń ó ś ż ź),
- upcase: %w(Ą Ć Ę Ł Ń Ó Ś Ż Ź),
- permanent: %w(a c e l n o s z z)
- }
- end
-
- def cs
- { # Czech uses acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), ring (ů)
- # aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
- # AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
- downcase: %w(á é í ó ú ý č ď ě ň ř š ť ů ž),
- upcase: %w(Á É Í Ó Ú Ý Č Ď Ě Ň Ř Š Ť Ů Ž),
- permanent: %w(a e i o u y c d e n r s t u z)
- }
- end
-
- def fr
- { # French
- # abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ
- # ABCDEFGHIJKLMNOPQRSTUVWXYZÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ
- downcase: %w(à â é è ë ê ï î ô ù û ü ÿ ç œ æ),
- upcase: %w(À Â É È Ë Ê Ï Î Ô Ù Û Ü Ÿ Ç Œ Æ),
- permanent: %w(a a e e e e i i o u u ue y c oe ae)
- }
- end
-
- def it
- { # Italian
- downcase: %w(à è é ì î ò ó ù),
- upcase: %w(À È É Ì Î Ò Ó Ù),
- permanent: %w(a e e i i o o u)
- }
- end
-
- def eo
- { # Esperantohas the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ
- downcase: %w(ĉ ĝ ĥ ĵ ŝ ŭ),
- upcase: %w(Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ),
- permanent: %w(c g h j s u)
- }
- end
-
- def is
- { # Iceland
- downcase: %w(ð þ),
- upcase: %w(Ð Þ),
- permanent: %w(d p)
- }
- end
-
- def pt
- { # Portugal uses á, â, ã, à, ç, é, ê, í, ó, ô, õ and ú
- downcase: %w(ã ç),
- upcase: %w(Ã Ç),
- permanent: %w(a c)
- }
- end
-
- def sp
- { # Spanish
- downcase: ['ñ', 'õ', '¿', '¡'],
- upcase: ['Ñ', 'Õ', '¿', '¡'],
- permanent: ['n', 'o', '', '']
- }
- end
-
- def hu
- { # Hungarian
- downcase: %w(ő),
- upcase: %w(Ő),
- permanent: %w(oe)
- }
- end
-
- def nn
- { # Norwegian
- downcase: %w(æ å),
- upcase: %w(Æ Å),
- permanent: %w(ae a)
- }
- end
- ```
+ # Notes
+
+ ## Todos
+
+
+ ## Terminology
+
+ Use Upcase, Downcase AND Titlecase (!)
+
+ - Example: Ö -> Upcase: OE, Downcase: oe, Titlecase: Oe (!)
+ - Example: Æ -> Upcase: AE, Downcase: ae, Titlecase: Ae (!)
+
+
+
+ ## Libraries
+
+ **Ruby**
+
+ - <https://github.com/SixArm/sixarm_ruby_unaccent> - Replace a string's accent characters with ASCII characters. Based on Perl Text::Unaccent from CPAN.
+
+ - <https://github.com/fractalsoft/diacritics> - support downcase, upcase and permanent link with diacritical characters
+
+ **Perl**
+
+ - <https://metacpan.org/pod/Unicode::Diacritic::Strip> - strip diacritics from Unicode text
+
+ **JavaScript**
+
+ - <https://github.com/dundalek/latinize> - convert accents (diacritics) from strings to latin characters
+
+ - <https://github.com/tyxla/remove-accents> - removes the accents from a string, converting them to their corresponding non-accented ascii characters
+
+ **PostgreSQL**
+
+ - <https://www.postgresql.org/docs/current/unaccent.html> - unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes
+
+
+ ## Links
+
+ **Unicode w/ Ruby - Ruby ♡ Unicode**
+
+ - <https://idiosyncratic-ruby.com/66-ruby-has-character>
+
+ Ruby has Character - Ruby comes with good support for Unicode-related features. Read on if you want to learn more about important Unicode fundamentals and how to use them in Ruby...
+
+ - <https://idiosyncratic-ruby.com/41-proper-unicoding>
+
+ Proper Unicoding - Ruby's Regexp engine has a powerful feature built in: It can match for Unicode character properties. But what exactly are properties you can match for?
+
+ - <https://idiosyncratic-ruby.com/30-regex-with-class>
+
+ Regex with Class - Ruby's regex engine defines a lot of shortcut character classes. Besides the common meta characters (\w, etc.), there is also the POSIX style expressions and the unicode property syntax. This is an overview of all character classes
+
+
+ **Unicode**
+
+ - <https://unicode.org/reports/tr15/> - Unicode Standard Annex #15 - UNICODE NORMALIZATION FORMS
+
+ **W3C**
+
+ - <https://www.w3.org/TR/charmod-norm/>
+ - <https://www.w3.org/International/wiki/Case_folding>
+
+ In Western European languages, the letter 'i' (U+0069) upper cases to a dotless 'I' (U+0049). In Turkish, this letter upper cases to a dotted upper case letter 'İ' (U+0130). Similarly, 'I' (U+0049) lower cases to 'ı' (U+0131), which is a dotless lowercase letter i.
+
+ **Wikipedia**
+
+ - <https://en.wikipedia.org/wiki/Diacritic>
+
+ **More**
+
+ - [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/)
+ by Joel Spolsky, 2003
+
+ - [Unicode Normalization in Ruby](https://www.honeybadger.io/blog/ruby-unicode-normalization/) by Starr Horne, 2017
+
+
+ ## Mappings
+
+ Open questions ...
+
+ ```
+ Þ => TH ???
+ þ => th ???
+ ```
+
+
+ ## Alphabets
+
+ Add more alphabets... why? why not?
+
+
+ - Portuguese [Â, "abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú", "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÂÃÀÇÉÊÍÓÔÕÚ"]
+ - Russian [Щ, Ъ, Э, "абвгдеёжзийклмнопрстуфхцчшщъыьэюя", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"]
+ - Greek [Β, Μ, Χ, Ω, Ή, Ύ, Ώ, ΐ, ΰ, Ϊ, Ϋ]
+ - Slovak ["aáäeéiíoóôuúyýbcčdďfghjklĺľmnňpqrŕsštťvwxzž", "AÁÄEÉIÍOÓÔUÚYÝBCČDĎFGHJKLĹĽMNŇPQRŔSŠTŤVWXZŽ"]
+ - Italian ["aàbcdeèéfghiìíîlmnoòópqrstuùúvz", "AÀBCDEÈÉFGHIÌÍÎLMNOÒÓPQRSTUÙÚVZ"]
+ - Romanian ["aăâbcdefghiîjklmnopqrsștțuvwxyz", "AĂÂBCDEFGHIÎJKLMNOPQRSȘTȚUVWXYZ"]
+ - Danish [å, â, ô, Å, Â, Ô]
+
+ ```
+ def de
+ { # German
+ downcase: %w(ä ö ü ß),
+ upcase: %w(Ä Ö Ü ẞ),
+ permanent: %w(ae oe ue ss)
+ }
+ end
+
+ def pl
+ { # Polish
+ downcase: %w(ą ć ę ł ń ó ś ż ź),
+ upcase: %w(Ą Ć Ę Ł Ń Ó Ś Ż Ź),
+ permanent: %w(a c e l n o s z z)
+ }
+ end
+
+ def cs
+ { # Czech uses acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), ring (ů)
+ # aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
+ # AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
+ downcase: %w(á é í ó ú ý č ď ě ň ř š ť ů ž),
+ upcase: %w(Á É Í Ó Ú Ý Č Ď Ě Ň Ř Š Ť Ů Ž),
+ permanent: %w(a e i o u y c d e n r s t u z)
+ }
+ end
+
+ def fr
+ { # French
+ # abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ
+ # ABCDEFGHIJKLMNOPQRSTUVWXYZÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ
+ downcase: %w(à â é è ë ê ï î ô ù û ü ÿ ç œ æ),
+ upcase: %w(À Â É È Ë Ê Ï Î Ô Ù Û Ü Ÿ Ç Œ Æ),
+ permanent: %w(a a e e e e i i o u u ue y c oe ae)
+ }
+ end
+
+ def it
+ { # Italian
+ downcase: %w(à è é ì î ò ó ù),
+ upcase: %w(À È É Ì Î Ò Ó Ù),
+ permanent: %w(a e e i i o o u)
+ }
+ end
+
+ def eo
+ { # Esperantohas the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ
+ downcase: %w(ĉ ĝ ĥ ĵ ŝ ŭ),
+ upcase: %w(Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ),
+ permanent: %w(c g h j s u)
+ }
+ end
+
+ def is
+ { # Iceland
+ downcase: %w(ð þ),
+ upcase: %w(Ð Þ),
+ permanent: %w(d p)
+ }
+ end
+
+ def pt
+ { # Portugal uses á, â, ã, à, ç, é, ê, í, ó, ô, õ and ú
+ downcase: %w(ã ç),
+ upcase: %w(Ã Ç),
+ permanent: %w(a c)
+ }
+ end
+
+ def sp
+ { # Spanish
+ downcase: ['ñ', 'õ', '¿', '¡'],
+ upcase: ['Ñ', 'Õ', '¿', '¡'],
+ permanent: ['n', 'o', '', '']
+ }
+ end
+
+ def hu
+ { # Hungarian
+ downcase: %w(ő),
+ upcase: %w(Ő),
+ permanent: %w(oe)
+ }
+ end
+
+ def nn
+ { # Norwegian
+ downcase: %w(æ å),
+ upcase: %w(Æ Å),
+ permanent: %w(ae a)
+ }
+ end
+ ```
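The per-language tables in NOTES.md (`downcase`, `upcase`, `permanent`) line up position by position, so they can be zipped into a single lookup. The following is only a sketch of that idea using the German table from the notes; the names `DE`, `build_mapping`, and `unaccent` are hypothetical and are not taken from the alphabets gem. Uppercase letters are mapped to the titlecase variant (Ä -> Ae), following the Terminology note.

```ruby
# Hypothetical sketch; none of these names come from the alphabets gem.
DE = {
  downcase:  %w(ä ö ü ß),
  upcase:    %w(Ä Ö Ü ẞ),
  permanent: %w(ae oe ue ss)
}.freeze

# Zip the parallel arrays into one char => replacement hash.
# Uppercase letters get the titlecase form of the replacement.
def build_mapping(table)
  lower = table[:downcase].zip(table[:permanent])
  upper = table[:upcase].zip(table[:permanent].map(&:capitalize))
  (lower + upper).to_h
end

# Replace every character that has an entry; keep all others as-is.
def unaccent(text, mapping)
  text.each_char.map { |ch| mapping.fetch(ch, ch) }.join
end

DE_MAPPING = build_mapping(DE)
puts unaccent('Österreich grüßt', DE_MAPPING)
```

A single merged hash keeps the lookup to one pass per string; whether a real implementation should prefer titlecase or full upcase for capital letters is exactly the open question the Terminology section raises.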
data/README.md CHANGED
@@ -1,26 +1,19 @@
- # alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more
-
-
- * home :: [github.com/sportdb/sport.db](https://github.com/sportdb/sport.db)
- * bugs :: [github.com/sportdb/sport.db/issues](https://github.com/sportdb/sport.db/issues)
- * gem :: [rubygems.org/gems/alphabets](https://rubygems.org/gems/alphabets)
- * rdoc :: [rubydoc.info/gems/alphabets](http://rubydoc.info/gems/alphabets)
- * forum :: [opensport](http://groups.google.com/group/opensport)
-
-
- ## Usage
-
- To be done
-
-
- ## License
-
- The `alphabets` scripts are dedicated to the public domain.
- Use it as you please with no restrictions whatsoever.
-
-
- ## Questions? Comments?
-
- Send them along to the
- [Open Sports & Friends Forum/Mailing List](http://groups.google.com/group/opensport).
- Thanks!
+ # alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more
+
+
+ * home :: [github.com/rubycoco/core](https://github.com/rubycoco/core)
+ * bugs :: [github.com/rubycoco/core/issues](https://github.com/rubycoco/core/issues)
+ * gem :: [rubygems.org/gems/alphabets](https://rubygems.org/gems/alphabets)
+ * rdoc :: [rubydoc.info/gems/alphabets](http://rubydoc.info/gems/alphabets)
+
+
+ ## Usage
+
+ To be done
+
+
+ ## License
+
+ The `alphabets` scripts are dedicated to the public domain.
+ Use it as you please with no restrictions whatsoever.
+
data/Rakefile CHANGED
@@ -1,28 +1,28 @@
- require 'hoe'
- require './lib/alphabets/version.rb'
-
- Hoe.spec 'alphabets' do
-
- self.version = Alphabet::VERSION
-
- self.summary = "alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more"
- self.description = summary
-
- self.urls = ['https://github.com/sportdb/sport.db']
-
- self.author = 'Gerald Bauer'
- self.email = 'opensport@googlegroups.com'
-
- # switch extension to .markdown for gihub formatting
- self.readme_file = 'README.md'
- self.history_file = 'CHANGELOG.md'
-
- self.licenses = ['Public Domain']
-
- self.extra_deps = []
-
- self.spec_extras = {
- required_ruby_version: '>= 2.2.2'
- }
-
- end
+ require 'hoe'
+ require './lib/alphabets/version.rb'
+
+ Hoe.spec 'alphabets' do
+
+ self.version = Alphabet::VERSION
+
+ self.summary = "alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more"
+ self.description = summary
+
+ self.urls = { home: 'https://github.com/rubycoco/core' }
+
+ self.author = 'Gerald Bauer'
+ self.email = 'gerald.bauer@gmail.com'
+
+ # switch extension to .markdown for gihub formatting
+ self.readme_file = 'README.md'
+ self.history_file = 'CHANGELOG.md'
+
+ self.licenses = ['Public Domain']
+
+ self.extra_deps = []
+
+ self.spec_extras = {
+ required_ruby_version: '>= 2.2.2'
+ }
+
+ end