RubyGems - alphabets - Versions diffs - 1.0.0 → 1.0.1 - Mend

alphabets 1.0.0 → 1.0.1

Files changed (15)

checksums.yaml +5 -5
data/CHANGELOG.md +5 -3
data/NOTES.md +193 -193
data/README.md +19 -26
data/Rakefile +28 -28
data/lib/alphabets/alphabets.rb +186 -184
data/lib/alphabets/variants.rb +72 -72
data/lib/alphabets/version.rb +23 -23
data/lib/alphabets.rb +46 -40
data/test/helper.rb +10 -10
data/test/test_downcase.rb +18 -18
data/test/test_reader.rb +1 -1
data/test/test_unaccent.rb +36 -36
data/test/test_variants.rb +36 -36
metadata +18 -13

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
-SHA1:
-  metadata.gz: 20cc18ecb226c6c72b3d03c80c58f5ad16b2e385
-  data.tar.gz: 5d503d887290c139ea920cd06aa05c5f92431a3b
+SHA256:
+  metadata.gz: 302aef0fb6736e1c2fa48e12ee810d2c9279ed98c5aabbeee1f2decb2babb943
+  data.tar.gz: 693f249af93afb58425a6189737ae2b3e2934dbb4cff5b555726529778bbf750
 SHA512:
-  metadata.gz: 4a375e07ac355c395c8e520e216c247ee72bb5b9b8b85c8110e5c1e72f5be11887815ddb5e80976411f914a02911ab5dc39f99e9ce6d6aa786713e5a8791b30d
-  data.tar.gz: 4259ac17a3c0743e2a70e79ae1dc65def9bfacc7baa5f06f1693d39cbfbd902c88ac222d6784282e18261e7dcf605e12777590ea1f2fae6b8408a10f22206fdd
+  metadata.gz: 53b87c802b8fef9be45fa3b294798f302087f683c44af876ad2278e18238c9dd098302c2c4fd5dcc15c8bdddd92d3ad3b3ec1b70f38f229f3bc87c512e264e73
+  data.tar.gz: 5d349a536390743c62ccfbc4332645597fda7e075aa829ff3bf06fb139253353452c26a35210f417b2a8461844c144b6ada57d90651dcf88a1cad5203c0f49e3
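
Note on the hunk above: the SHA1 entries are replaced by SHA256 entries, while SHA512 is kept. A minimal sketch of how such hex digests can be recomputed with Ruby's standard `digest` library follows. The helper name `gem_checksums` and the choice of file path are assumptions for illustration; this is not the code RubyGems itself uses to write `checksums.yaml`.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems): compute the two digests
# that the new checksums.yaml records for a single file such as
# metadata.gz or data.tar.gz.
def gem_checksums(path)
  {
    'SHA256' => Digest::SHA256.file(path).hexdigest, # 64 hex characters
    'SHA512' => Digest::SHA512.file(path).hexdigest  # 128 hex characters
  }
end
```

Comparing the returned strings with the values listed in `checksums.yaml` would show whether an unpacked file matches the published gem.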

data/CHANGELOG.md CHANGED

@@ -1,3 +1,5 @@
-### 0.0.1 / 2019-08-13
-* Everything is new. First release.
+### 1.0.1 / 2024-06-07
+### 0.0.1 / 2019-08-13
+* Everything is new. First release.

data/NOTES.md CHANGED

@@ -1,193 +1,193 @@
-# Notes
-## Todos
-## Terminology
-Use Upcase, Downcase AND Titlecase (!)
-- Example: Ö  -> Upcase: OE, Downcase: oe, Titlecase: Oe (!)
-- Example: Æ  -> Upcase: AE, Downcase: ae, Titlecase: Ae (!)
-## Libraries
-**Ruby**
-- <https://github.com/SixArm/sixarm_ruby_unaccent> - Replace a string's accent characters with ASCII characters. Based on Perl Text::Unaccent from CPAN.
-- <https://github.com/fractalsoft/diacritics> - support downcase, upcase and permanent link with diacritical characters
-**Perl**
-- <https://metacpan.org/pod/Unicode::Diacritic::Strip> - strip diacritics from Unicode text
-**JavaScript**
-- <https://github.com/dundalek/latinize> -  convert accents (diacritics) from strings to latin characters
-- <https://github.com/tyxla/remove-accents> - removes the accents from a string, converting them to their corresponding non-accented ascii characters
-**PostgreSQL**
-- <https://www.postgresql.org/docs/current/unaccent.html> - unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes
-## Links
-**Unicode w/ Ruby - Ruby ♡ Unicode**
-- <https://idiosyncratic-ruby.com/66-ruby-has-character>
-Ruby has Character - Ruby comes with good support for Unicode-related features. Read on if you want to learn more about important Unicode fundamentals and how to use them in Ruby...
-- <https://idiosyncratic-ruby.com/41-proper-unicoding>
-Proper Unicoding - Ruby's Regexp engine has a powerful feature built in: It can match for Unicode character properties. But what exactly are properties you can match for?
-- <https://idiosyncratic-ruby.com/30-regex-with-class>
-Regex with Class - Ruby's regex engine defines a lot of shortcut character classes. Besides the common meta characters (\w, etc.), there is also the POSIX style expressions and the unicode property syntax. This is an overview of all character classes
-**Unicode**
-- <https://unicode.org/reports/tr15/> - Unicode Standard Annex #15 - UNICODE NORMALIZATION FORMS
-**W3C**
-- <https://www.w3.org/TR/charmod-norm/>
-- <https://www.w3.org/International/wiki/Case_folding>
-In Western European languages, the letter 'i' (U+0069) upper cases to a dotless 'I' (U+0049). In Turkish, this letter upper cases to a dotted upper case letter 'İ' (U+0130). Similarly, 'I' (U+0049) lower cases to 'ı' (U+0131), which is a dotless lowercase letter i.
-**Wikipedia**
-- <https://en.wikipedia.org/wiki/Diacritic>
-**More**
-- [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/)
-by Joel Spolsky, 2003
-- [Unicode Normalization in Ruby](https://www.honeybadger.io/blog/ruby-unicode-normalization/) by Starr Horne, 2017
-## Mappings
-Open questions ...
-```
- Þ  =>  TH    ???
- þ  =>  th    ???
-```
-## Alphabets
-Add more alphabets... why? why not?
-- Portuguese [Â, "abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú", "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÂÃÀÇÉÊÍÓÔÕÚ"]
-- Russian [Щ, Ъ, Э, "абвгдеёжзийклмнопрстуфхцчшщъыьэюя", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"]
-- Greek [Β, Μ, Χ, Ω, Ή, Ύ, Ώ, ΐ, ΰ, Ϊ, Ϋ]
-- Slovak ["aáäeéiíoóôuúyýbcčdďfghjklĺľmnňpqrŕsštťvwxzž", "AÁÄEÉIÍOÓÔUÚYÝBCČDĎFGHJKLĹĽMNŇPQRŔSŠTŤVWXZŽ"]
-- Italian ["aàbcdeèéfghiìíîlmnoòópqrstuùúvz", "AÀBCDEÈÉFGHIÌÍÎLMNOÒÓPQRSTUÙÚVZ"]
-- Romanian ["aăâbcdefghiîjklmnopqrsștțuvwxyz", "AĂÂBCDEFGHIÎJKLMNOPQRSȘTȚUVWXYZ"]
-- Danish [å, â, ô, Å, Â, Ô]
-```
-    def de
-      { # German
-        downcase:  %w(ä ö ü ß),
-        upcase:    %w(Ä Ö Ü ẞ),
-        permanent: %w(ae oe ue ss)
-      }
-    end
-    def pl
-      { # Polish
-        downcase:  %w(ą ć ę ł ń ó ś ż ź),
-        upcase:    %w(Ą Ć Ę Ł Ń Ó Ś Ż Ź),
-        permanent: %w(a c e l n o s z z)
-      }
-    end
-    def cs
-      { # Czech uses acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), ring (ů)
-        # aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
-        # AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
-        downcase:  %w(á é í ó ú ý č ď ě ň ř š ť ů ž),
-        upcase:    %w(Á É Í Ó Ú Ý Č Ď Ě Ň Ř Š Ť Ů Ž),
-        permanent: %w(a e i o u y c d e n r s t u z)
-      }
-    end
-    def fr
-      { # French
-        # abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ
-        # ABCDEFGHIJKLMNOPQRSTUVWXYZÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ
-        downcase:  %w(à â é è ë ê ï î ô ù û ü ÿ ç œ æ),
-        upcase:    %w(À Â É È Ë Ê Ï Î Ô Ù Û Ü Ÿ Ç Œ Æ),
-        permanent: %w(a a e e e e i i o u u ue y c oe ae)
-      }
-    end
-    def it
-      { # Italian
-        downcase:  %w(à è é ì î ò ó ù),
-        upcase:    %w(À È É Ì Î Ò Ó Ù),
-        permanent: %w(a e e i i o o u)
-      }
-    end
-    def eo
-      { # Esperantohas the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ
-        downcase:  %w(ĉ ĝ ĥ ĵ ŝ ŭ),
-        upcase:    %w(Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ),
-        permanent: %w(c g h j s u)
-      }
-    end
-    def is
-      { # Iceland
-        downcase:  %w(ð þ),
-        upcase:    %w(Ð Þ),
-        permanent: %w(d p)
-      }
-    end
-    def pt
-      { # Portugal uses á, â, ã, à, ç, é, ê, í, ó, ô, õ and ú
-        downcase:  %w(ã ç),
-        upcase:    %w(Ã Ç),
-        permanent: %w(a c)
-      }
-    end
-    def sp
-      { # Spanish
-        downcase:  ['ñ', 'õ', '¿', '¡'],
-        upcase:    ['Ñ', 'Õ', '¿', '¡'],
-        permanent: ['n', 'o', '', '']
-      }
-    end
-    def hu
-      { # Hungarian
-        downcase:  %w(ő),
-        upcase:    %w(Ő),
-        permanent: %w(oe)
-      }
-    end
-    def nn
-      { # Norwegian
-        downcase:  %w(æ å),
-        upcase:    %w(Æ Å),
-        permanent: %w(ae a)
-      }
-    end
-```
+# Notes
+## Todos
+## Terminology
+Use Upcase, Downcase AND Titlecase (!)
+- Example: Ö  -> Upcase: OE, Downcase: oe, Titlecase: Oe (!)
+- Example: Æ  -> Upcase: AE, Downcase: ae, Titlecase: Ae (!)
+## Libraries
+**Ruby**
+- <https://github.com/SixArm/sixarm_ruby_unaccent> - Replace a string's accent characters with ASCII characters. Based on Perl Text::Unaccent from CPAN.
+- <https://github.com/fractalsoft/diacritics> - support downcase, upcase and permanent link with diacritical characters
+**Perl**
+- <https://metacpan.org/pod/Unicode::Diacritic::Strip> - strip diacritics from Unicode text
+**JavaScript**
+- <https://github.com/dundalek/latinize> -  convert accents (diacritics) from strings to latin characters
+- <https://github.com/tyxla/remove-accents> - removes the accents from a string, converting them to their corresponding non-accented ascii characters
+**PostgreSQL**
+- <https://www.postgresql.org/docs/current/unaccent.html> - unaccent is a text search dictionary that removes accents (diacritic signs) from lexemes
+## Links
+**Unicode w/ Ruby - Ruby ♡ Unicode**
+- <https://idiosyncratic-ruby.com/66-ruby-has-character>
+Ruby has Character - Ruby comes with good support for Unicode-related features. Read on if you want to learn more about important Unicode fundamentals and how to use them in Ruby...
+- <https://idiosyncratic-ruby.com/41-proper-unicoding>
+Proper Unicoding - Ruby's Regexp engine has a powerful feature built in: It can match for Unicode character properties. But what exactly are properties you can match for?
+- <https://idiosyncratic-ruby.com/30-regex-with-class>
+Regex with Class - Ruby's regex engine defines a lot of shortcut character classes. Besides the common meta characters (\w, etc.), there is also the POSIX style expressions and the unicode property syntax. This is an overview of all character classes
+**Unicode**
+- <https://unicode.org/reports/tr15/> - Unicode Standard Annex #15 - UNICODE NORMALIZATION FORMS
+**W3C**
+- <https://www.w3.org/TR/charmod-norm/>
+- <https://www.w3.org/International/wiki/Case_folding>
+In Western European languages, the letter 'i' (U+0069) upper cases to a dotless 'I' (U+0049). In Turkish, this letter upper cases to a dotted upper case letter 'İ' (U+0130). Similarly, 'I' (U+0049) lower cases to 'ı' (U+0131), which is a dotless lowercase letter i.
+**Wikipedia**
+- <https://en.wikipedia.org/wiki/Diacritic>
+**More**
+- [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/)
+by Joel Spolsky, 2003
+- [Unicode Normalization in Ruby](https://www.honeybadger.io/blog/ruby-unicode-normalization/) by Starr Horne, 2017
+## Mappings
+Open questions ...
+```
+ Þ  =>  TH    ???
+ þ  =>  th    ???
+```
+## Alphabets
+Add more alphabets... why? why not?
+- Portuguese [Â, "abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú", "ABCDEFGHIJKLMNOPQRSTUVWXYZÁÂÃÀÇÉÊÍÓÔÕÚ"]
+- Russian [Щ, Ъ, Э, "абвгдеёжзийклмнопрстуфхцчшщъыьэюя", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"]
+- Greek [Β, Μ, Χ, Ω, Ή, Ύ, Ώ, ΐ, ΰ, Ϊ, Ϋ]
+- Slovak ["aáäeéiíoóôuúyýbcčdďfghjklĺľmnňpqrŕsštťvwxzž", "AÁÄEÉIÍOÓÔUÚYÝBCČDĎFGHJKLĹĽMNŇPQRŔSŠTŤVWXZŽ"]
+- Italian ["aàbcdeèéfghiìíîlmnoòópqrstuùúvz", "AÀBCDEÈÉFGHIÌÍÎLMNOÒÓPQRSTUÙÚVZ"]
+- Romanian ["aăâbcdefghiîjklmnopqrsștțuvwxyz", "AĂÂBCDEFGHIÎJKLMNOPQRSȘTȚUVWXYZ"]
+- Danish [å, â, ô, Å, Â, Ô]
+```
+    def de
+      { # German
+        downcase:  %w(ä ö ü ß),
+        upcase:    %w(Ä Ö Ü ẞ),
+        permanent: %w(ae oe ue ss)
+      }
+    end
+    def pl
+      { # Polish
+        downcase:  %w(ą ć ę ł ń ó ś ż ź),
+        upcase:    %w(Ą Ć Ę Ł Ń Ó Ś Ż Ź),
+        permanent: %w(a c e l n o s z z)
+      }
+    end
+    def cs
+      { # Czech uses acute (á é í ó ú ý), caron (č ď ě ň ř š ť ž), ring (ů)
+        # aábcčdďeéěfghiíjklmnňoópqrřsštťuúůvwxyýzž
+        # AÁBCČDĎEÉĚFGHIÍJKLMNŇOÓPQRŘSŠTŤUÚŮVWXYÝZŽ
+        downcase:  %w(á é í ó ú ý č ď ě ň ř š ť ů ž),
+        upcase:    %w(Á É Í Ó Ú Ý Č Ď Ě Ň Ř Š Ť Ů Ž),
+        permanent: %w(a e i o u y c d e n r s t u z)
+      }
+    end
+    def fr
+      { # French
+        # abcdefghijklmnopqrstuvwxyzàâæçéèêëîïôœùûüÿ
+        # ABCDEFGHIJKLMNOPQRSTUVWXYZÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ
+        downcase:  %w(à â é è ë ê ï î ô ù û ü ÿ ç œ æ),
+        upcase:    %w(À Â É È Ë Ê Ï Î Ô Ù Û Ü Ÿ Ç Œ Æ),
+        permanent: %w(a a e e e e i i o u u ue y c oe ae)
+      }
+    end
+    def it
+      { # Italian
+        downcase:  %w(à è é ì î ò ó ù),
+        upcase:    %w(À È É Ì Î Ò Ó Ù),
+        permanent: %w(a e e i i o o u)
+      }
+    end
+    def eo
+      { # Esperantohas the symbols ŭ, ĉ, ĝ, ĥ, ĵ and ŝ
+        downcase:  %w(ĉ ĝ ĥ ĵ ŝ ŭ),
+        upcase:    %w(Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ),
+        permanent: %w(c g h j s u)
+      }
+    end
+    def is
+      { # Iceland
+        downcase:  %w(ð þ),
+        upcase:    %w(Ð Þ),
+        permanent: %w(d p)
+      }
+    end
+    def pt
+      { # Portugal uses á, â, ã, à, ç, é, ê, í, ó, ô, õ and ú
+        downcase:  %w(ã ç),
+        upcase:    %w(Ã Ç),
+        permanent: %w(a c)
+      }
+    end
+    def sp
+      { # Spanish
+        downcase:  ['ñ', 'õ', '¿', '¡'],
+        upcase:    ['Ñ', 'Õ', '¿', '¡'],
+        permanent: ['n', 'o', '', '']
+      }
+    end
+    def hu
+      { # Hungarian
+        downcase:  %w(ő),
+        upcase:    %w(Ő),
+        permanent: %w(oe)
+      }
+    end
+    def nn
+      { # Norwegian
+        downcase:  %w(æ å),
+        upcase:    %w(Æ Å),
+        permanent: %w(ae a)
+      }
+    end
+```
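
Note on the Terminology section in the hunk above: it asks for three case forms per letter (upcase, downcase and titlecase), for example Ö becoming OE, oe or Oe. A minimal sketch of that idea is shown below. The `CASE_MAP` table and the `expand` method are hypothetical names chosen for illustration and are not the gem's actual data or API.

```ruby
# Hypothetical mapping table (not the gem's real data): each accented
# letter maps to its ASCII expansion in all three case forms.
CASE_MAP = {
  'Ö' => { upcase: 'OE', downcase: 'oe', titlecase: 'Oe' },
  'Æ' => { upcase: 'AE', downcase: 'ae', titlecase: 'Ae' }
}.freeze

# Replace every mapped character with the requested case form.
# Characters without an entry in CASE_MAP pass through unchanged.
def expand(text, form)
  text.each_char.map do |ch|
    entry = CASE_MAP[ch]
    entry ? entry[form] : ch
  end.join
end

expand('Österreich', :titlecase)  # => "Oesterreich"
expand('ÖSTERREICH', :upcase)     # => "OESTERREICH"
```

The titlecase form matters at the start of a capitalized word: a plain upcase expansion would turn "Österreich" into "OEsterreich".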

data/README.md CHANGED

@@ -1,26 +1,19 @@
-# alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more
-* home  :: [github.com/sportdb/sport.db](https://github.com/sportdb/sport.db)
-* bugs  :: [github.com/sportdb/sport.db/issues](https://github.com/sportdb/sport.db/issues)
-* gem   :: [rubygems.org/gems/alphabets](https://rubygems.org/gems/alphabets)
-* rdoc  :: [rubydoc.info/gems/alphabets](http://rubydoc.info/gems/alphabets)
-* forum :: [opensport](http://groups.google.com/group/opensport)
-## Usage
-To be done
-## License
-The `alphabets` scripts are dedicated to the public domain.
-Use it as you please with no restrictions whatsoever.
-## Questions? Comments?
-Send them along to the
-[Open Sports & Friends Forum/Mailing List](http://groups.google.com/group/opensport).
-Thanks!
+# alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more
+* home  :: [github.com/rubycoco/core](https://github.com/rubycoco/core)
+* bugs  :: [github.com/rubycoco/core/issues](https://github.com/rubycoco/core/issues)
+* gem   :: [rubygems.org/gems/alphabets](https://rubygems.org/gems/alphabets)
+* rdoc  :: [rubydoc.info/gems/alphabets](http://rubydoc.info/gems/alphabets)
+## Usage
+To be done
+## License
+The `alphabets` scripts are dedicated to the public domain.
+Use it as you please with no restrictions whatsoever.

data/Rakefile CHANGED

@@ -1,28 +1,28 @@
-require 'hoe'
-require './lib/alphabets/version.rb'
-Hoe.spec 'alphabets' do
-  self.version = Alphabet::VERSION
-  self.summary = "alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more"
-  self.description = summary
-  self.urls = ['https://github.com/sportdb/sport.db']
-  self.author = 'Gerald Bauer'
-  self.email = 'opensport@googlegroups.com'
-  # switch extension to .markdown for gihub formatting
-  self.readme_file = 'README.md'
-  self.history_file = 'CHANGELOG.md'
-  self.licenses = ['Public Domain']
-  self.extra_deps = []
-  self.spec_extras = {
-    required_ruby_version: '>= 2.2.2'
-  }
-end
+require 'hoe'
+require './lib/alphabets/version.rb'
+Hoe.spec 'alphabets' do
+  self.version = Alphabet::VERSION
+  self.summary = "alphabets - alphabet (a-z) helpers incl. unaccent, downcase, variants, and more"
+  self.description = summary
+  self.urls = { home: 'https://github.com/rubycoco/core' }
+  self.author = 'Gerald Bauer'
+  self.email  = 'gerald.bauer@gmail.com'
+  # switch extension to .markdown for gihub formatting
+  self.readme_file = 'README.md'
+  self.history_file = 'CHANGELOG.md'
+  self.licenses = ['Public Domain']
+  self.extra_deps = []
+  self.spec_extras = {
+    required_ruby_version: '>= 2.2.2'
+  }
+end