unidecoder 1.1.0 → 1.1.1
- data/Changelog.md +11 -0
- data/README.md +21 -1
- data/lib/unidecoder/data/x00.yml +1 -1
- data/lib/unidecoder/data/x02.yml +32 -32
- data/lib/unidecoder/version.rb +1 -1
- metadata +3 -3
data/Changelog.md
CHANGED
@@ -1,5 +1,16 @@
 # Unidecoder Changelog
 
+## 1.1.1 2010-08-11
+
+* Added some missing transliterations.
+* Fixed some other incorrect transliterations.
+
+## 1.1.0 2010-06-23
+
+* Ability to override transliterations.
+* Better Unicode validation.
+* Cleanups and refactoring.
+
 ## 1.0.0 2009-02-20
 
 * Unidecoder extracted from Stringex and released as independent gem.
data/README.md
CHANGED
@@ -10,7 +10,7 @@ separate library with some added functionality.
 
 The Unidecoder component of Stringex is itself a port of Sean M. Burke's
 [Unidecode](http://search.cpan.org/dist/Text-Unidecode/lib/Text/Unidecode.pm)
-
+Perl module.
 
 ## Installation
 
@@ -29,3 +29,23 @@ If you also install either the [Unicode](http://github.com/blackwinter/unicode)
 **or** [Active Support](http://github.com/rails/rails) gems, Unidecoder will
 also perform Unicode normalization before attempting to transliterate strings
 to ASCII.
+
+## Warnings
+
+While this is a neat trick, in practice many transliterations end up being
+fairly useless. For example, all Chinese characters are transliterated to
+Mandarin Chinese. Since Japanese uses Chinese characters writing, but
+pronounces them differently from Mandarin, this makes the transliteration of
+Japanese with this library useless.
+
+Some languages, like Russian, would most correctly transliterate some letters
+based on context, rather than a 1-1 mapping with ASCII. This library does not
+do that.
+
+Other languages, like Hebrew and Arabic, don't write vowels, but assume them
+from context, so the ASCII representation of these langages given by this
+library will look fairly ugly to native speakers.
+
+Basically, your milage may vary. I don't speak every language used by this
+library, so there are certain to be limitations and errors. Your feedback is
+most appreciated!
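The Japanese warning added to the README above follows from any per-character lookup: each character is replaced on its own, so the output cannot depend on the language of the surrounding text. The sketch below is only an illustration of that limitation. The table `HAN_TABLE` and the method `sketch_transliterate` are hypothetical names, and the two entries are ordinary Mandarin pinyin readings chosen for the example, not values copied from the gem's data files.

```ruby
# Hypothetical two-entry table with Mandarin pinyin readings.
# It is not the gem's data, only an illustration.
HAN_TABLE = {
  "\u65E5" => "Ri",  # U+65E5
  "\u672C" => "Ben"  # U+672C
}.freeze

# Context-free lookup: every character is replaced independently, so the
# result is the same whether the input is Chinese or Japanese text.
def sketch_transliterate(text)
  text.each_char.map { |char| HAN_TABLE.fetch(char, char) }.join(" ")
end

# U+65E5 U+672C is the word for "Japan", read "Nihon" in Japanese,
# but the per-character table can only yield the Mandarin readings.
sketch_transliterate("\u65E5\u672C") # => "Ri Ben"
```

Producing "Nihon" would require knowing the language of the input, which a table keyed on single characters has no way to express.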
data/lib/unidecoder/data/x00.yml
CHANGED
data/lib/unidecoder/data/x02.yml
CHANGED
@@ -31,8 +31,8 @@
 - y
 - H
 - h
--
--
+- N
+- d
 - OU
 - ou
 - Z
@@ -51,34 +51,34 @@
 - o
 - Y
 - y
--
--
--
--
--
--
--
--
--
--
--
--
--
-- '
-- '
--
--
--
--
--
--
--
--
--
--
--
--
--
+- l
+- n
+- t
+- j
+- db
+- qp
+- A
+- C
+- c
+- L
+- T
+- s
+- z
+- '?'
+- '?'
+- B
+- U
+- V
+- E
+- e
+- J
+- j
+- Q
+- q
+- R
+- r
+- Y
+- y
 - a
 - a
 - a
@@ -173,8 +173,8 @@
 - lz
 - WW
 - ']]'
--
--
+- h
+- h
 - k
 - h
 - j
data/lib/unidecoder/version.rb
CHANGED
metadata
CHANGED
@@ -5,8 +5,8 @@ version: !ruby/object:Gem::Version
 segments:
 - 1
 - 1
--
-version: 1.1.
+- 1
+version: 1.1.1
 platform: ruby
 authors:
 - Russell Norris
@@ -15,7 +15,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2010-
+date: 2010-08-11 00:00:00 -03:00
 default_executable:
 dependencies: []
 