unidecoder 1.1.0 → 1.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- data/Changelog.md +11 -0
- data/README.md +21 -1
- data/lib/unidecoder/data/x00.yml +1 -1
- data/lib/unidecoder/data/x02.yml +32 -32
- data/lib/unidecoder/version.rb +1 -1
- metadata +3 -3
data/Changelog.md
CHANGED
@@ -1,5 +1,16 @@
 # Unidecoder Changelog
 
+## 1.1.1 2010-08-11
+
+* Added some missing transliterations.
+* Fixed some other incorrect transliterations.
+
+## 1.1.0 2010-06-23
+
+* Ability to override transliterations.
+* Better Unicode validation.
+* Cleanups and refactoring.
+
 ## 1.0.0 2009-02-20
 
 * Unidecoder extracted from Stringex and released as independent gem.
data/README.md
CHANGED
@@ -10,7 +10,7 @@ separate library with some added functionality.
 
 The Unidecoder component of Stringex is itself a port of Sean M. Burke's
 [Unidecode](http://search.cpan.org/dist/Text-Unidecode/lib/Text/Unidecode.pm)
-
+Perl module.
 
 ## Installation
 
@@ -29,3 +29,23 @@ If you also install either the [Unicode](http://github.com/blackwinter/unicode)
 **or** [Active Support](http://github.com/rails/rails) gems, Unidecoder will
 also perform Unicode normalization before attempting to transliterate strings
 to ASCII.
+
+## Warnings
+
+While this is a neat trick, in practice many transliterations end up being
+fairly useless. For example, all Chinese characters are transliterated to
+Mandarin Chinese. Since Japanese uses Chinese characters writing, but
+pronounces them differently from Mandarin, this makes the transliteration of
+Japanese with this library useless.
+
+Some languages, like Russian, would most correctly transliterate some letters
+based on context, rather than a 1-1 mapping with ASCII. This library does not
+do that.
+
+Other languages, like Hebrew and Arabic, don't write vowels, but assume them
+from context, so the ASCII representation of these langages given by this
+library will look fairly ugly to native speakers.
+
+Basically, your milage may vary. I don't speak every language used by this
+library, so there are certain to be limitations and errors. Your feedback is
+most appreciated!
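The context-free mapping described in the new Warnings section can be illustrated with a minimal sketch. The table and the `naive_transliterate` helper below are hypothetical and written only for this illustration; they are not the gem's API or its real data, which lives in the `data/lib/unidecoder/data/*.yml` files.

```ruby
# Hypothetical, hand-written table for illustration only. The gem's real
# tables are far larger and are not reproduced here.
TABLE = {
  "日" => "Ri ",
  "本" => "Ben "
}.freeze

# Map every character on its own, with no knowledge of the language or of
# the neighbouring characters. Characters missing from the table pass
# through unchanged.
def naive_transliterate(string)
  string.each_char.map { |char| TABLE.fetch(char, char) }.join
end

# Yields the Mandarin reading, even though a Japanese reader would expect
# "Nihon" for the same two characters.
puts naive_transliterate("日本")
```

Because each character is looked up in isolation, a table like this has no way to pick a Japanese reading or to apply context-dependent rules, which is the limitation the README text warns about.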
data/lib/unidecoder/data/x00.yml
CHANGED
data/lib/unidecoder/data/x02.yml
CHANGED
@@ -31,8 +31,8 @@
 - y
 - H
 - h
--
--
+- N
+- d
 - OU
 - ou
 - Z
@@ -51,34 +51,34 @@
 - o
 - Y
 - y
--
--
--
--
--
--
--
--
--
--
--
--
--
-- '
-- '
--
--
--
--
--
--
--
--
--
--
--
--
--
+- l
+- n
+- t
+- j
+- db
+- qp
+- A
+- C
+- c
+- L
+- T
+- s
+- z
+- '?'
+- '?'
+- B
+- U
+- V
+- E
+- e
+- J
+- j
+- Q
+- q
+- R
+- r
+- Y
+- y
 - a
 - a
 - a
@@ -173,8 +173,8 @@
 - lz
 - WW
 - ']]'
--
--
+- h
+- h
 - k
 - h
 - j
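The file names `x00.yml` and `x02.yml` suggest one table per block of 256 code points, as in the Perl Text::Unidecode module this gem is ported from. The sketch below shows how such a lookup could work. It rests on two assumptions that this diff does not confirm: that the first line of each YAML file is the `---` document marker, so file line 34 holds entry 0x20, and that entry `i` of `x02.yml` maps code point U+0200 + `i`. The helper names and the `X02` array are hypothetical.

```ruby
# Assumption: each data file is a 256-entry list covering one block of code
# points, so x02.yml would cover U+0200..U+02FF.
def table_position(char)
  codepoint = char.ord
  { file: format("x%02x.yml", codepoint >> 8), index: codepoint & 0xFF }
end

# Hypothetical in-memory copy of three of the entries added in this hunk.
# Every other slot holds a placeholder here, not the gem's real data.
X02 = Array.new(256, "?")
X02[0x20] = "N" # file line 34, assumed to be U+0220
X02[0x21] = "d" # file line 35, assumed to be U+0221
X02[0x34] = "l" # file line 54, assumed to be U+0234

# Look up a single character, falling back to the character itself for
# blocks this sketch does not model.
def decode_char(char)
  position = table_position(char)
  position[:file] == "x02.yml" ? X02[position[:index]] : char
end

puts table_position("\u0220").inspect
puts decode_char("\u0221")
```

Under these assumptions, the entries added in 1.1.1 fill positions in the Latin Extended-B range that had no visible replacement in 1.1.0.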
data/lib/unidecoder/version.rb
CHANGED
metadata
CHANGED
@@ -5,8 +5,8 @@ version: !ruby/object:Gem::Version
 segments:
 - 1
 - 1
--
-version: 1.1.
+- 1
+version: 1.1.1
 platform: ruby
 authors:
 - Russell Norris
@@ -15,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2010-
+date: 2010-08-11 00:00:00 -03:00
 default_executable:
 dependencies: []
 