interscript 0.1.3 → 0.1.4
- checksums.yaml +4 -4
- data/lib/interscript/version.rb +1 -1
- data/maps/alalc-aze-Cyrl-Latn-1997.yaml +141 -0
- data/maps/alalc-rus-Cyrl-Latn-1997.yaml +222 -0
- data/maps/alalc-rus-Cyrl-Latn-2012.yaml +162 -0
- data/maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml +184 -0
- data/maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml +9 -9
- data/maps/mvd-bel-Cyrl-Latn-2008.yaml +225 -0
- data/maps/mvd-bel-Cyrl-Latn-2010.yaml +63 -0
- data/maps/mvd-rus-Cyrl-Latn-2008.yaml +110 -0
- data/maps/mvd-rus-Cyrl-Latn-2010.yaml +37 -0
- data/maps/odni-bel-Cyrl-Latn-2015.yaml +148 -0
- data/maps/odni-bul-Cyrl-Latn-2015.yaml +96 -0
- data/maps/odni-kat-Geor-Latn-2015.yaml +1 -1
- data/maps/odni-rus-Cyrl-Latn-2015.yaml +77 -0
- data/maps/odni-srp-Cyrl-Latn-2015.yaml +129 -0
- data/maps/odni-ukr-Cyrl-Latn-2015.yaml +1 -1
- data/maps/odni-uzb-Cyrl-Latn-2015.yaml +167 -0
- metadata +15 -2
@@ -0,0 +1,129 @@
+---
+authority_id: odni
+id: 2015
+language: srp
+source_script: Cyrl
+destination_script: Latn
+name: Office of the Director Of National Intelligence Serbian Personal Names 2015, ICS-630-01 Annex L
+# url:
+source: ICS-630-01 Annex L
+creation_date: 2015
+confirmation_date: 2015
+description: |
+  This system is the Intelligence Community (IC) standard for the transliteration of Serbian names
+  written in Cyrillic that will be applied to all final written reports and products for IC consumers. It
+  is not intended to eliminate variations of a name that can contribute forensic information. Rather,
+  it is to provide an IC standard Romanized (English) transliteration from Serbian that can then be
+  linked to forensic information in ways that will help identify the referent of the name.
+
+  In cases where an individual’s name has already been transliterated in a variant spelling, the IC
+  Standard spelling should appear first, followed by the variant spelling(s) in parentheses at the first
+  usage. In addition, if the original Cyrillic spelling is known, that spelling should also appear in
+  parentheses following the name, if possible, following best practices of the issuing organization
+  and taking into consideration information system capabilities. This convention is designed to
+  ensure that vital forensic information is not lost.
+
+  For names of persons who are known to not be part of the Serbian-speaking community, use the
+  relevant IC transliteration standard for names from that language (e.g., Mikhail, Yitzhak). A
+  translator’s note may be used to clarify the known origin of the person. Spell names of
+  individuals from languages that are written in Roman letters as they are spelled in those
+  languages (e.g., George Clooney, Jorge Garcia, Georges Pompidou).
+
+  In the case of active senior government officials in the on-line CIA World Factbook and the online directory of Chiefs of State and Cabinet Members of Foreign Governments, the spellings
+  given in these on-line reference works should be used in place of the IC Standard. For any
+  individual who has at one time been listed in the Factbook or Chiefs of State directory but who no
+  longer appears in those resources (i.e. is no longer a government official), the IC Standard
+  spelling should appear first, with the spelling, if known, as it previously appeared in those
+  resources listed within parentheses at the first usage.
+
+  The primary goal is to produce a consistent Romanized transcription of names that is specifically
+  readable to the English-speaking non-specialist. The system uses the 26 letters of the standard
+  (English) Roman alphabet. Some ambiguities in the Romanized form will occur without the use
+  of diacritics. However, within the context of a report, where additional information about the
+  individual is provided, the referent will be clearly identified. This system will be used in
+  conjunction with on-line tools, name dictionaries, and lists containing conventional spellings of
+  names of well-known individuals.
+
+notes:
+
+tests:
+  - source: Гојко Митић
+    expected: Gojko Mitic
+  - source: Горња Ваганица
+    expected: Gornja Vaganica
+  - source: Довиђења
+    expected: Dovidjenja
+  - source: Ћао! Здраво!
+    expected: Cao! Zdravo!
+  - source: Кључ
+    expected: Kljuc
+  - source: Цигарете
+    expected: Cigarete
+  - source: Пролеће
+    expected: Prolece
+  - source: Понедељак
+    expected: Ponedeljak
+
+map:
+  characters:
+    '\u0410': 'A' # А
+    '\u0411': 'B' # Б
+    '\u0412': 'V' # В
+    '\u0413': 'G' # Г
+    '\u0414': 'D' # Д
+    '\u0402': 'Dj' # Ђ
+    '\u0415': 'E' # Е
+    '\u0416': 'Z' # Ж
+    '\u0417': 'Z' # З
+    '\u0418': 'I' # И
+    '\u0408': 'J' # Ј
+    '\u041A': 'K' # К
+    '\u041B': 'L' # Л
+    '\u0409': 'Lj' # Љ
+    '\u041C': 'M' # М
+    '\u041D': 'N' # Н
+    '\u040A': Nj # Њ
+    '\u041E': 'O' # О
+    '\u041F': 'P' # П
+    '\u0420': 'R' # Р
+    '\u0421': 'S' # С
+    '\u0422': 'T' # Т
+    '\u040B': 'C' # Ћ
+    '\u0423': 'U' # У
+    '\u0424': 'F' # Ф
+    '\u0425': 'H' # Х
+    '\u0426': 'C' # Ц
+    '\u0427': 'C' # Ч
+    '\u040F': 'Dz' # Џ
+    '\u0428': 'S' # Ш
+
+    '\u0430': 'a' # а
+    '\u0431': 'b' # б
+    '\u0432': 'v' # в
+    '\u0433': 'g' # г
+    '\u0434': 'd' # д
+    '\u0452': 'dj' # ђ
+    '\u0435': 'e' # e
+    '\u0436': 'z' # ж
+    '\u0437': 'z' # з
+    '\u0438': 'i' # и
+    '\u0458': 'j' # ј
+    '\u043A': 'k' # к
+    '\u043B': 'l' # л
+    '\u0459': 'lj' # љ
+    '\u043C': 'm' # м
+    '\u043D': 'n' # н
+    '\u045A': 'nj' # њ
+    '\u043E': 'o' # о
+    '\u043F': 'p' # п
+    '\u0440': 'r' # р
+    '\u0441': 's' # с
+    '\u0442': 't' # т
+    '\u045B': 'c' # ћ
+    '\u0443': 'u' # у
+    '\u0444': 'f' # ф
+    '\u0445': 'h' # х
+    '\u0446': 'c' # ц
+    '\u0447': 'c' # ч
+    '\u045F': 'dz' # џ
+    '\u0448': 's' # ш
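The Serbian map is a plain character-for-character table, so its `tests` entries can be checked with a few lines of Ruby. The following is a minimal sketch under stated assumptions: `SRP_CHARACTERS` and `transliterate_srp` are hypothetical names chosen for illustration, the table is only an excerpt of the `characters` section, and none of this is the interscript gem's own API.

```ruby
# Hypothetical excerpt of the `characters` table from the Serbian map.
# Only the letters needed for the sample inputs are listed.
SRP_CHARACTERS = {
  "\u0413" => "G",  # Г
  "\u0417" => "Z",  # З
  "\u041A" => "K",  # К
  "\u041C" => "M",  # М
  "\u040B" => "C",  # Ћ
  "\u0430" => "a",  # а
  "\u0432" => "v",  # в
  "\u0434" => "d",  # д
  "\u0438" => "i",  # и
  "\u0458" => "j",  # ј
  "\u043A" => "k",  # к
  "\u0459" => "lj", # љ
  "\u043E" => "o",  # о
  "\u0440" => "r",  # р
  "\u0442" => "t",  # т
  "\u045B" => "c",  # ћ
  "\u0443" => "u",  # у
  "\u0447" => "c",  # ч
}.freeze

# Replace every character found in the table and pass anything else
# (spaces, punctuation, unmapped letters) through unchanged.
def transliterate_srp(text)
  text.each_char.map { |ch| SRP_CHARACTERS.fetch(ch, ch) }.join
end

puts transliterate_srp("\u041A\u0459\u0443\u0447") # the test input "Кључ"
```

Because the map sends Ћ, Ц and Ч all to `C` (and ћ, ц, ч to `c`), the result cannot be converted back to Cyrillic unambiguously, which matches the description's remark that some ambiguities occur without diacritics.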
@@ -4,7 +4,7 @@ id: 2015
 language: ukr
 source_script: Cyrl
 destination_script: Latn
-name: Office of the Director Of National Intelligence Ukrainian Personal Names 2004 System
+name: Office of the Director Of National Intelligence Ukrainian Personal Names 2004 System, ICS 630-01 Annex M
 # url:
 source: ICS 630-01, Annex M
 creation_date: 2015
@@ -0,0 +1,167 @@
+---
+authority_id: odni
+id: 2015
+language: uzb
+source_script: Cyrl
+destination_script: Latn
+name: Office of the Director Of National Intelligence Uzbek Personal Names 2015, ICS-630-01 Annex V
+# url:
+source: ICS-630-01 Annex V
+creation_date: 2015
+confirmation_date: 2015
+description: |
+  This system is the Intelligence Community standard for the transliteration of Uzbek person names
+  that will be applied to all final written reports and products for IC consumers. It is not intended to
+  eliminate variations of a name that can contribute forensic information. Rather, it is to provide an
+  IC standard Romanized (English) transliteration from Uzbek that can then be linked to forensic
+  information in ways that will help identify the referent of the name.
+
+  In cases where an individual’s name has already been transliterated in a variant spelling, the IC
+  Standard spelling should appear first, followed by the variant spelling(s) in parentheses at the first
+  usage. In addition, if the original Cyrillic-script spelling is known, that spelling should also
+  appear in parentheses following the name, if possible, following best practices of the issuing
+  organization and taking into consideration information system capabilities. For example:
+  Farkhod Tojiev (also seen as Farhod Tadjiyev, Фарход Тожиев). This convention is designed to
+  ensure that vital forensic information is not lost.
+
+  For names of persons who are known to not be part of the Uzbek-speaking community, use the
+  relevant IC transliteration standard for names from that language (e.g., Yitzhak). A translator’s
+  note may be used to clarify the known origin of the person. Spell names of individuals from
+  languages that are written in Roman letters as they are spelled in those languages (e.g., George
+  Clooney, Jorge Garcia, Georges Pompidou).
+
+  In the case of active senior government officials in the on-line CIA World Factbook and the online directory of Chiefs of State and Cabinet Members of Foreign Governments, the spellings
+  given in these on-line reference works should be used in place of the IC Standard. For any
+  individual who has at one time been listed in the Factbook or Chiefs of State directory but who no
+  longer appears in those resources (i.e. is no longer a government official), the IC Standard
+  spelling should appear first, with the spelling, if known, as it previously appeared in those
+  resources listed within parentheses at the first usage.
+
+  The primary goal is to produce a consistent Romanized transcription of names that is specifically
+  readable to the English-speaking non-specialist. The system uses the 26 letters of the standard
+  (English) Roman alphabet. Some ambiguities in the Romanized form will occur without the use
+  of diacritics. However, within the context of a report, where additional information about the
+  individual is provided, the referent will be clearly identified. This system will be used in
+  conjunction with on-line tools, name dictionaries, and lists containing conventional spellings of
+  names of well-known individuals.
+
+notes:
+  - Transliterate double digraphs as a single digraph, i.e. шш -> sh, not shsh
+  - In the Roman, no distinction is made between digraphs such as 'sh' and single contiguous letters, (e.g. 's' followed by 'h').
+  - The Cyrillic ъ and ь are not transliterated, but instead are left out of the transliteration.
+
+tests:
+  - source: Фарход Тожиев
+    expected: Farkhod Tojiev
+  - source: Барча одамлар эркин, қадр-қиммат в ҳуқуқлард тенг бўлиб туғиладилар. Улар ақл в виждон соҳибидирлар в бир-бирлари ила биродарларча муомал қилишларь зарур.
+    expected: Barcha odamlar erkin, qadr-qimat v huquqlard teng bolib tughiladilar. Ular aql v vijdon sohibidirlar v bir-birlari ila birodarlarcha muomal qilishlar zarur.
+  - source: Тутук белгись
+    expected: Tutuk belgis
+  - source: Янги юл
+    expected: Yangi iul
+  - source: Ўзбек ёзуви
+    expected: Ozbek yozuvi
+  - source: Чиғатай гурунги
+    expected: Chighatay gurungi
+  - source: ъ
+    expected: ''
+  - source: шш
+    expected: sh
+  - source: ччччч
+    expected: ch
+
+map:
+  rules:
+    # note[1]
+    - pattern: "(.)\\1{1,}"
+      result: "\\1"
+
+  characters:
+    '\u0410': 'A' # А
+    '\u0411': 'B' # Б
+    '\u0412': 'V' # В
+    '\u0413': 'G' # Г
+    '\u0492': 'Gh' # Ғ
+    '\u0414': 'D' # Д
+    '\u0415': 'E' # Е
+    '\u0401': 'Yo' # Ё
+    '\u0416': 'J' # Ж
+    '\u0417': 'Z' # З
+    '\u0418': 'I' # И
+    '\u0419': 'Y' # Й
+    '\u041A': 'K' # К
+    '\u049A': 'Q' # Қ
+    '\u041B': 'L' # Л
+    '\u041C': 'M' # М
+    '\u041D': 'N' # Н
+    '\u041E': 'O' # О
+    '\u041F': 'P' # П
+    '\u0420': 'R' # Р
+    '\u0421': 'S' # С
+    '\u0422': 'T' # Т
+    '\u0423': 'U' # У
+    '\u040E': 'O' # Ў
+    '\u0424': 'F' # Ф
+    '\u0425': 'Kh' # Х
+    '\u04B2': 'H' # Ҳ
+    '\u0426': 'Ts' # Ц
+    '\u0427': 'Ch' # Ч
+    '\u0428': 'Sh' # Ш
+    '\u042D': 'E' # Э
+    '\u042E': 'Yu' # Ю
+    '\u042F': 'Ya' # Я
+
+    '\u0430': 'a' # а
+    '\u0431': 'b' # б
+    '\u0432': 'v' # в
+    '\u0433': 'g' # г
+    '\u0493': 'gh' # ғ
+    '\u0434': 'd' # д
+    '\u0435': 'e' # e
+    '\u0451': 'yo' # ё
+    '\u0436': 'j' # ж
+    '\u0437': 'z' # з
+    '\u0438': 'i' # и
+    '\u0439': 'y' # й
+    '\u043A': 'k' # к
+    '\u049B': 'q' # қ
+    '\u043B': 'l' # л
+    '\u043C': 'm' # м
+    '\u043D': 'n' # н
+    '\u043E': 'o' # о
+    '\u043F': 'p' # п
+    '\u0440': 'r' # р
+    '\u0441': 's' # с
+    '\u0442': 't' # т
+    '\u0443': 'u' # у
+    '\u045E': 'o' # ў
+    '\u0444': 'f' # ф
+    '\u044B': 'y' # ы
+    '\u0447': 'ch' # ч
+    '\u044F': 'ia' # я
+    '\u044E': 'iu' # ю
+    '\u0445': 'kh' # х
+    '\u04B3': 'h' # ҳ
+    '\u0448': 'sh' # ш
+    '\u044D': 'e' # э
+    '\u0449': 'shch' # щ
+    '\u0446': 'ts' # ц
+    '\u0491': 'g' # ґ
+    '\u046B': 'u' # ѫ
+    '\u0452': 'd' # ђ
+    '\u0455': 'dz' # ѕ
+    '\u0458': 'j' # ј
+    '\u0459': 'lj' # љ
+    '\u045A': 'nj' # њ
+    '\u04BB': 'c' # һ
+    '\u045F': 'dz' # џ
+    '\u0454': 'ie' # є
+    '\u0457': 'i' # ї
+    '\u0453': 'g' # ѓ
+
+    # note[3]
+    '\u042a': '' # Ъ
+    '\u042c': '' # Ь
+    '\u044a': '' # ъ
+    '\u044c': '' # ь
+
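The Uzbek map combines a regex rule with a character table, and its notes and tests suggest how the two interact. The sketch below assumes the `rules` pattern runs on the Cyrillic source before the character table is applied. That ordering is inferred from the tests (`шш` yields `sh`, not `shsh`) and is not confirmed by the diff. `UZB_CHARACTERS` and `transliterate_uzb` are hypothetical names, the table is only an excerpt, and this is not the interscript gem's own implementation.

```ruby
# Hypothetical excerpt of the `characters` table from the Uzbek map.
UZB_CHARACTERS = {
  "\u0424" => "F",  # Ф
  "\u0422" => "T",  # Т
  "\u0430" => "a",  # а
  "\u0432" => "v",  # в
  "\u0434" => "d",  # д
  "\u0435" => "e",  # е
  "\u0436" => "j",  # ж
  "\u0438" => "i",  # и
  "\u043E" => "o",  # о
  "\u0440" => "r",  # р
  "\u0445" => "kh", # х
  "\u0447" => "ch", # ч
  "\u0448" => "sh", # ш
  "\u044A" => "",   # ъ is dropped (note 3)
  "\u044C" => "",   # ь is dropped (note 3)
}.freeze

def transliterate_uzb(text)
  # Rule from the map: collapse any run of one repeated character to a
  # single occurrence (assumed to run before the table lookup).
  collapsed = text.gsub(/(.)\1{1,}/, '\1')
  # Then map character by character, passing unmapped characters through.
  collapsed.each_char.map { |ch| UZB_CHARACTERS.fetch(ch, ch) }.join
end

puts transliterate_uzb("\u0448\u0448") # the test input "шш"
```

Since the pattern captures any character with `(.)`, this sketch would also collapse doubled spaces or punctuation, not only doubled letters.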
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: interscript
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.4
 platform: ruby
 authors:
 - project_contibutors
@@ -143,6 +143,7 @@ files:
 - lib/model-7
 - lib/tha-pt-b-7
 - maps/acadsin-zho-Hani-Latn-2002.yaml
+- maps/alalc-aze-Cyrl-Latn-1997.yaml
 - maps/alalc-bel-cyrl-latn-1997.yaml
 - maps/alalc-ben-Beng-Latn-2017.yaml
 - maps/alalc-bul-Cyrl-Latn-1997.yaml
@@ -153,6 +154,8 @@ files:
 - maps/alalc-kor-Hang-Latn-1997.yaml
 - maps/alalc-mkd-Cyrl-Latn-2013.yaml
 - maps/alalc-mkd-cyrl-latn-1997.yaml
+- maps/alalc-rus-Cyrl-Latn-1997.yaml
+- maps/alalc-rus-Cyrl-Latn-2012.yaml
 - maps/alalc-srp-Cyrl-Latn-1997.yaml
 - maps/alalc-srp-cyrl-latn-2013.yaml
 - maps/alalc-ukr-Cyrl-Latn-1997.yaml
@@ -167,6 +170,7 @@ files:
 - maps/bgna-bul-Cyrl-Latn-2009.yaml
 - maps/bgnpcgn-arm-Armn-Latn-1981.yaml
 - maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml
+- maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml
 - maps/bgnpcgn-bel-cyrl-latn-1979.yaml
 - maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml
 - maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml
@@ -215,9 +219,18 @@ files:
 - maps/mext-jpn-Hrkt-Latn-1954.yaml
 - maps/moct-kor-Hang-Latn-2000.yaml
 - maps/mofa-jpn-Hrkt-Latn-1989.yaml
+- maps/mvd-bel-Cyrl-Latn-2008.yaml
+- maps/mvd-bel-Cyrl-Latn-2010.yaml
+- maps/mvd-rus-Cyrl-Latn-2008.yaml
+- maps/mvd-rus-Cyrl-Latn-2010.yaml
 - maps/nil-kor-Hang-Hang-jamo.yaml
+- maps/odni-bel-Cyrl-Latn-2015.yaml
+- maps/odni-bul-Cyrl-Latn-2015.yaml
 - maps/odni-kat-Geor-Latn-2015.yaml
+- maps/odni-rus-Cyrl-Latn-2015.yaml
+- maps/odni-srp-Cyrl-Latn-2015.yaml
 - maps/odni-ukr-Cyrl-Latn-2015.yaml
+- maps/odni-uzb-Cyrl-Latn-2015.yaml
 - maps/royin-tha-Thai-Latn-1939-generic.yaml
 - maps/royin-tha-Thai-Latn-1968.yaml
 - maps/royin-tha-Thai-Latn-1999-chained.yaml
@@ -263,7 +276,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: 2.4.0
 requirements: []
-rubygems_version: 3.
+rubygems_version: 3.0.3
 signing_key:
 specification_version: 4
 summary: Interoperable script conversion systems