interscript 0.1.3 → 0.1.4

maps/odni-srp-Cyrl-Latn-2015.yaml ADDED
@@ -0,0 +1,129 @@
1
+ ---
2
+ authority_id: odni
3
+ id: 2015
4
+ language: srp
5
+ source_script: Cyrl
6
+ destination_script: Latn
7
+ name: Office of the Director Of National Intelligence Serbian Personal Names 2015, ICS-630-01 Annex L
8
+ # url:
9
+ source: ICS-630-01 Annex L
10
+ creation_date: 2015
11
+ confirmation_date: 2015
12
+ description: |
13
+ This system is the Intelligence Community (IC) standard for the transliteration of Serbian names
14
+ written in Cyrillic that will be applied to all final written reports and products for IC consumers. It
15
+ is not intended to eliminate variations of a name that can contribute forensic information. Rather,
16
+ it is to provide an IC standard Romanized (English) transliteration from Serbian that can then be
17
+ linked to forensic information in ways that will help identify the referent of the name.
18
+
19
+ In cases where an individual’s name has already been transliterated in a variant spelling, the IC
20
+ Standard spelling should appear first, followed by the variant spelling(s) in parentheses at the first
21
+ usage. In addition, if the original Cyrillic spelling is known, that spelling should also appear in
22
+ parentheses following the name, if possible, following best practices of the issuing organization
23
+ and taking into consideration information system capabilities. This convention is designed to
24
+ ensure that vital forensic information is not lost.
25
+
26
+ For names of persons who are known to not be part of the Serbian-speaking community, use the
27
+ relevant IC transliteration standard for names from that language (e.g., Mikhail, Yitzhak). A
28
+ translator’s note may be used to clarify the known origin of the person. Spell names of
29
+ individuals from languages that are written in Roman letters as they are spelled in those
30
+ languages (e.g., George Clooney, Jorge Garcia, Georges Pompidou).
31
+
32
+ In the case of active senior government officials in the on-line CIA World Factbook and the online directory of Chiefs of State and Cabinet Members of Foreign Governments, the spellings
33
+ given in these on-line reference works should be used in place of the IC Standard. For any
34
+ individual who has at one time been listed in the Factbook or Chiefs of State directory but who no
35
+ longer appears in those resources (i.e. is no longer a government official), the IC Standard
36
+ spelling should appear first, with the spelling, if known, as it previously appeared in those
37
+ resources listed within parentheses at the first usage.
38
+
39
+ The primary goal is to produce a consistent Romanized transcription of names that is specifically
40
+ readable to the English-speaking non-specialist. The system uses the 26 letters of the standard
41
+ (English) Roman alphabet. Some ambiguities in the Romanized form will occur without the use
42
+ of diacritics. However, within the context of a report, where additional information about the
43
+ individual is provided, the referent will be clearly identified. This system will be used in
44
+ conjunction with on-line tools, name dictionaries, and lists containing conventional spellings of
45
+ names of well-known individuals.
46
+
47
+ notes:
48
+
49
+ tests:
50
+ - source: Гојко Митић
51
+ expected: Gojko Mitic
52
+ - source: Горња Ваганица
53
+ expected: Gornja Vaganica
54
+ - source: Довиђења
55
+ expected: Dovidjenja
56
+ - source: Ћао! Здраво!
57
+ expected: Cao! Zdravo!
58
+ - source: Кључ
59
+ expected: Kljuc
60
+ - source: Цигарете
61
+ expected: Cigarete
62
+ - source: Пролеће
63
+ expected: Prolece
64
+ - source: Понедељак
65
+ expected: Ponedeljak
66
+
67
+ map:
68
+ characters:
69
+ '\u0410': 'A' # А
70
+ '\u0411': 'B' # Б
71
+ '\u0412': 'V' # В
72
+ '\u0413': 'G' # Г
73
+ '\u0414': 'D' # Д
74
+ '\u0402': 'Dj' # Ђ
75
+ '\u0415': 'E' # Е
76
+ '\u0416': 'Z' # Ж
77
+ '\u0417': 'Z' # З
78
+ '\u0418': 'I' # И
79
+ '\u0408': 'J' # Ј
80
+ '\u041A': 'K' # К
81
+ '\u041B': 'L' # Л
82
+ '\u0409': 'Lj' # Љ
83
+ '\u041C': 'M' # М
84
+ '\u041D': 'N' # Н
85
+ '\u040A': 'Nj' # Њ
86
+ '\u041E': 'O' # О
87
+ '\u041F': 'P' # П
88
+ '\u0420': 'R' # Р
89
+ '\u0421': 'S' # С
90
+ '\u0422': 'T' # Т
91
+ '\u040B': 'C' # Ћ
92
+ '\u0423': 'U' # У
93
+ '\u0424': 'F' # Ф
94
+ '\u0425': 'H' # Х
95
+ '\u0426': 'C' # Ц
96
+ '\u0427': 'C' # Ч
97
+ '\u040F': 'Dz' # Џ
98
+ '\u0428': 'S' # Ш
99
+
100
+ '\u0430': 'a' # а
101
+ '\u0431': 'b' # б
102
+ '\u0432': 'v' # в
103
+ '\u0433': 'g' # г
104
+ '\u0434': 'd' # д
105
+ '\u0452': 'dj' # ђ
106
+ '\u0435': 'e' # e
107
+ '\u0436': 'z' # ж
108
+ '\u0437': 'z' # з
109
+ '\u0438': 'i' # и
110
+ '\u0458': 'j' # ј
111
+ '\u043A': 'k' # к
112
+ '\u043B': 'l' # л
113
+ '\u0459': 'lj' # љ
114
+ '\u043C': 'm' # м
115
+ '\u043D': 'n' # н
116
+ '\u045A': 'nj' # њ
117
+ '\u043E': 'o' # о
118
+ '\u043F': 'p' # п
119
+ '\u0440': 'r' # р
120
+ '\u0441': 's' # с
121
+ '\u0442': 't' # т
122
+ '\u045B': 'c' # ћ
123
+ '\u0443': 'u' # у
124
+ '\u0444': 'f' # ф
125
+ '\u0445': 'h' # х
126
+ '\u0446': 'c' # ц
127
+ '\u0447': 'c' # ч
128
+ '\u045F': 'dz' # џ
129
+ '\u0448': 's' # ш
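The `characters` table above is a plain one-to-one lookup, so its effect on the listed tests can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `SRP_MAP` is a hand-copied subset of the table and `transliterate` is a hypothetical helper, not interscript's actual API or implementation.

```ruby
# Hypothetical sketch, not interscript's implementation.
# SRP_MAP is a small subset of the `characters` table in the diff above.
SRP_MAP = {
  "\u0414" => "D",   # Д
  "\u041A" => "K",   # К
  "\u0430" => "a",   # а
  "\u0432" => "v",   # в
  "\u0435" => "e",   # е
  "\u0438" => "i",   # и
  "\u043E" => "o",   # о
  "\u0443" => "u",   # у
  "\u0452" => "dj",  # ђ
  "\u0459" => "lj",  # љ
  "\u045A" => "nj",  # њ
  "\u0447" => "c"    # ч
}.freeze

# Replace every character that has a table entry; pass everything else through.
def transliterate(text, map)
  text.each_char.map { |ch| map.fetch(ch, ch) }.join
end

puts transliterate("Кључ", SRP_MAP)      # Kljuc
puts transliterate("Довиђења", SRP_MAP)  # Dovidjenja
```

Because the system avoids diacritics, several Cyrillic letters share one Latin output in the table (Ж and З both map to Z; Ћ, Ц and Ч all map to C), so the mapping cannot be reversed without extra context.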
maps/odni-ukr-Cyrl-Latn-2015.yaml CHANGED
@@ -4,7 +4,7 @@ id: 2015
4
4
  language: ukr
5
5
  source_script: Cyrl
6
6
  destination_script: Latn
7
- name: Office of the Director Of National Intelligence Ukrainian Personal Names 2004 System
7
+ name: Office of the Director Of National Intelligence Ukrainian Personal Names 2004 System, ICS 630-01 Annex M
8
8
  # url:
9
9
  source: ICS 630-01, Annex M
10
10
  creation_date: 2015
maps/odni-uzb-Cyrl-Latn-2015.yaml ADDED
@@ -0,0 +1,167 @@
1
+ ---
2
+ authority_id: odni
3
+ id: 2015
4
+ language: uzb
5
+ source_script: Cyrl
6
+ destination_script: Latn
7
+ name: Office of the Director Of National Intelligence Uzbek Personal Names 2015, ICS-630-01 Annex V
8
+ # url:
9
+ source: ICS-630-01 Annex V
10
+ creation_date: 2015
11
+ confirmation_date: 2015
12
+ description: |
13
+ This system is the Intelligence Community standard for the transliteration of Uzbek person names
14
+ that will be applied to all final written reports and products for IC consumers. It is not intended to
15
+ eliminate variations of a name that can contribute forensic information. Rather, it is to provide an
16
+ IC standard Romanized (English) transliteration from Uzbek that can then be linked to forensic
17
+ information in ways that will help identify the referent of the name.
18
+
19
+ In cases where an individual’s name has already been transliterated in a variant spelling, the IC
20
+ Standard spelling should appear first, followed by the variant spelling(s) in parentheses at the first
21
+ usage. In addition, if the original Cyrillic-script spelling is known, that spelling should also
22
+ appear in parentheses following the name, if possible, following best practices of the issuing
23
+ organization and taking into consideration information system capabilities. For example:
24
+ Farkhod Tojiev (also seen as Farhod Tadjiyev, Фарход Тожиев). This convention is designed to
25
+ ensure that vital forensic information is not lost.
26
+
27
+ For names of persons who are known to not be part of the Uzbek-speaking community, use the
28
+ relevant IC transliteration standard for names from that language (e.g., Yitzhak). A translator’s
29
+ note may be used to clarify the known origin of the person. Spell names of individuals from
30
+ languages that are written in Roman letters as they are spelled in those languages (e.g., George
31
+ Clooney, Jorge Garcia, Georges Pompidou).
32
+
33
+ In the case of active senior government officials in the on-line CIA World Factbook and the online directory of Chiefs of State and Cabinet Members of Foreign Governments, the spellings
34
+ given in these on-line reference works should be used in place of the IC Standard. For any
35
+ individual who has at one time been listed in the Factbook or Chiefs of State directory but who no
36
+ longer appears in those resources (i.e. is no longer a government official), the IC Standard
37
+ spelling should appear first, with the spelling, if known, as it previously appeared in those
38
+ resources listed within parentheses at the first usage.
39
+
40
+ The primary goal is to produce a consistent Romanized transcription of names that is specifically
41
+ readable to the English-speaking non-specialist. The system uses the 26 letters of the standard
42
+ (English) Roman alphabet. Some ambiguities in the Romanized form will occur without the use
43
+ of diacritics. However, within the context of a report, where additional information about the
44
+ individual is provided, the referent will be clearly identified. This system will be used in
45
+ conjunction with on-line tools, name dictionaries, and lists containing conventional spellings of
46
+ names of well-known individuals.
47
+
48
+ notes:
49
+ - Transliterate a doubled letter whose romanization is a digraph as a single digraph, e.g. шш -> sh, not shsh
50
+ - In the romanized output, no distinction is made between a digraph such as 'sh' and the same two letters occurring contiguously (e.g. 's' followed by 'h').
51
+ - The Cyrillic ъ and ь are not transliterated, but instead are left out of the transliteration.
52
+
53
+ tests:
54
+ - source: Фарход Тожиев
55
+ expected: Farkhod Tojiev
56
+ - source: Барча одамлар эркин, қадр-қиммат в ҳуқуқлард тенг бўлиб туғиладилар. Улар ақл в виждон соҳибидирлар в бир-бирлари ила биродарларча муомал қилишларь зарур.
57
+ expected: Barcha odamlar erkin, qadr-qimat v huquqlard teng bolib tughiladilar. Ular aql v vijdon sohibidirlar v bir-birlari ila birodarlarcha muomal qilishlar zarur.
58
+ - source: Тутук белгись
59
+ expected: Tutuk belgis
60
+ - source: Янги юл
61
+ expected: Yangi iul
62
+ - source: Ўзбек ёзуви
63
+ expected: Ozbek yozuvi
64
+ - source: Чиғатай гурунги
65
+ expected: Chighatay gurungi
66
+ - source: ъ
67
+ expected: ''
68
+ - source: шш
69
+ expected: sh
70
+ - source: ччччч
71
+ expected: ch
72
+
73
+ map:
74
+ rules:
75
+ # note[1]
76
+ - pattern: "(.)\\1{1,}"
77
+ result: "\\1"
78
+
79
+ characters:
80
+ '\u0410': 'A' # А
81
+ '\u0411': 'B' # Б
82
+ '\u0412': 'V' # В
83
+ '\u0413': 'G' # Г
84
+ '\u0492': 'Gh' # Ғ
85
+ '\u0414': 'D' # Д
86
+ '\u0415': 'E' # Е
87
+ '\u0401': 'Yo' # Ё
88
+ '\u0416': 'J' # Ж
89
+ '\u0417': 'Z' # З
90
+ '\u0418': 'I' # И
91
+ '\u0419': 'Y' # Й
92
+ '\u041A': 'K' # К
93
+ '\u049A': 'Q' # Қ
94
+ '\u041B': 'L' # Л
95
+ '\u041C': 'M' # М
96
+ '\u041D': 'N' # Н
97
+ '\u041E': 'O' # О
98
+ '\u041F': 'P' # П
99
+ '\u0420': 'R' # Р
100
+ '\u0421': 'S' # С
101
+ '\u0422': 'T' # Т
102
+ '\u0423': 'U' # У
103
+ '\u040E': 'O' # Ў
104
+ '\u0424': 'F' # Ф
105
+ '\u0425': 'Kh' # Х
106
+ '\u04B2': 'H' # Ҳ
107
+ '\u0426': 'Ts' # Ц
108
+ '\u0427': 'Ch' # Ч
109
+ '\u0428': 'Sh' # Ш
110
+ '\u042D': 'E' # Э
111
+ '\u042E': 'Yu' # Ю
112
+ '\u042F': 'Ya' # Я
113
+
114
+ '\u0430': 'a' # а
115
+ '\u0431': 'b' # б
116
+ '\u0432': 'v' # в
117
+ '\u0433': 'g' # г
118
+ '\u0493': 'gh' # ғ
119
+ '\u0434': 'd' # д
120
+ '\u0435': 'e' # e
121
+ '\u0451': 'yo' # ё
122
+ '\u0436': 'j' # ж
123
+ '\u0437': 'z' # з
124
+ '\u0438': 'i' # и
125
+ '\u0439': 'y' # й
126
+ '\u043A': 'k' # к
127
+ '\u049B': 'q' # қ
128
+ '\u043B': 'l' # л
129
+ '\u043C': 'm' # м
130
+ '\u043D': 'n' # н
131
+ '\u043E': 'o' # о
132
+ '\u043F': 'p' # п
133
+ '\u0440': 'r' # р
134
+ '\u0441': 's' # с
135
+ '\u0442': 't' # т
136
+ '\u0443': 'u' # у
137
+ '\u045E': 'o' # ў
138
+ '\u0444': 'f' # ф
139
+ '\u044B': 'y' # ы
140
+ '\u0447': 'ch' # ч
141
+ '\u044F': 'ia' # я
142
+ '\u044E': 'iu' # ю
143
+ '\u0445': 'kh' # х
144
+ '\u04B3': 'h' # ҳ
145
+ '\u0448': 'sh' # ш
146
+ '\u044D': 'e' # э
147
+ '\u0449': 'shch' # щ
148
+ '\u0446': 'ts' # ц
149
+ '\u0491': 'g' # ґ
150
+ '\u046B': 'u' # ѫ
151
+ '\u0452': 'd' # ђ
152
+ '\u0455': 'dz' # ѕ
153
+ '\u0458': 'j' # ј
154
+ '\u0459': 'lj' # љ
155
+ '\u045A': 'nj' # њ
156
+ '\u04BB': 'c' # һ
157
+ '\u045F': 'dz' # џ
158
+ '\u0454': 'ie' # є
159
+ '\u0457': 'i' # ї
160
+ '\u0453': 'g' # ѓ
161
+
162
+ # note[3]
163
+ '\u042a': '' # Ъ
164
+ '\u042c': '' # Ь
165
+ '\u044a': '' # ъ
166
+ '\u044c': '' # ь
167
+
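The Uzbek map combines a regex rule (note[1]) with the character table and the empty mappings for ъ and ь (note[3]). A minimal Ruby sketch of how these pieces could interact follows. It assumes the rule runs on the Cyrillic input before the character lookup, which is the only order under which `шш` yields `sh` with this pattern; `UZB_MAP` is a hand-copied subset and `transliterate_uzb` is a hypothetical helper, not interscript's API.

```ruby
# Hypothetical sketch, not interscript's implementation.
# Assumption: `rules` are applied to the source text before `characters`.
UZB_MAP = {
  "\u0422" => "T",   # Т
  "\u0431" => "b",   # б
  "\u0433" => "g",   # г
  "\u0435" => "e",   # е
  "\u0438" => "i",   # и
  "\u043A" => "k",   # к
  "\u043B" => "l",   # л
  "\u0441" => "s",   # с
  "\u0442" => "t",   # т
  "\u0443" => "u",   # у
  "\u0447" => "ch",  # ч
  "\u0448" => "sh",  # ш
  "\u044A" => "",    # ъ, dropped per note[3]
  "\u044C" => ""     # ь, dropped per note[3]
}.freeze

def transliterate_uzb(text, map)
  # note[1]: collapse any run of one repeated character to a single character.
  collapsed = text.gsub(/(.)\1{1,}/, '\1')
  # Then look up each remaining character; unmapped characters pass through.
  collapsed.each_char.map { |ch| map.fetch(ch, ch) }.join
end

puts transliterate_uzb("шш", UZB_MAP)            # sh
puts transliterate_uzb("ччччч", UZB_MAP)         # ch
puts transliterate_uzb("Тутук белгись", UZB_MAP) # Tutuk belgis
```

Note that the pattern `(.)\1{1,}` is not limited to digraph letters: it collapses every doubled character, which is why the longer test above expects `qadr-qimat` for `қадр-қиммат`.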
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: interscript
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.3
4
+ version: 0.1.4
5
5
  platform: ruby
6
6
  authors:
7
7
  - project_contibutors
@@ -143,6 +143,7 @@ files:
143
143
  - lib/model-7
144
144
  - lib/tha-pt-b-7
145
145
  - maps/acadsin-zho-Hani-Latn-2002.yaml
146
+ - maps/alalc-aze-Cyrl-Latn-1997.yaml
146
147
  - maps/alalc-bel-cyrl-latn-1997.yaml
147
148
  - maps/alalc-ben-Beng-Latn-2017.yaml
148
149
  - maps/alalc-bul-Cyrl-Latn-1997.yaml
@@ -153,6 +154,8 @@ files:
153
154
  - maps/alalc-kor-Hang-Latn-1997.yaml
154
155
  - maps/alalc-mkd-Cyrl-Latn-2013.yaml
155
156
  - maps/alalc-mkd-cyrl-latn-1997.yaml
157
+ - maps/alalc-rus-Cyrl-Latn-1997.yaml
158
+ - maps/alalc-rus-Cyrl-Latn-2012.yaml
156
159
  - maps/alalc-srp-Cyrl-Latn-1997.yaml
157
160
  - maps/alalc-srp-cyrl-latn-2013.yaml
158
161
  - maps/alalc-ukr-Cyrl-Latn-1997.yaml
@@ -167,6 +170,7 @@ files:
167
170
  - maps/bgna-bul-Cyrl-Latn-2009.yaml
168
171
  - maps/bgnpcgn-arm-Armn-Latn-1981.yaml
169
172
  - maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml
173
+ - maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml
170
174
  - maps/bgnpcgn-bel-cyrl-latn-1979.yaml
171
175
  - maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml
172
176
  - maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml
@@ -215,9 +219,18 @@ files:
215
219
  - maps/mext-jpn-Hrkt-Latn-1954.yaml
216
220
  - maps/moct-kor-Hang-Latn-2000.yaml
217
221
  - maps/mofa-jpn-Hrkt-Latn-1989.yaml
222
+ - maps/mvd-bel-Cyrl-Latn-2008.yaml
223
+ - maps/mvd-bel-Cyrl-Latn-2010.yaml
224
+ - maps/mvd-rus-Cyrl-Latn-2008.yaml
225
+ - maps/mvd-rus-Cyrl-Latn-2010.yaml
218
226
  - maps/nil-kor-Hang-Hang-jamo.yaml
227
+ - maps/odni-bel-Cyrl-Latn-2015.yaml
228
+ - maps/odni-bul-Cyrl-Latn-2015.yaml
219
229
  - maps/odni-kat-Geor-Latn-2015.yaml
230
+ - maps/odni-rus-Cyrl-Latn-2015.yaml
231
+ - maps/odni-srp-Cyrl-Latn-2015.yaml
220
232
  - maps/odni-ukr-Cyrl-Latn-2015.yaml
233
+ - maps/odni-uzb-Cyrl-Latn-2015.yaml
221
234
  - maps/royin-tha-Thai-Latn-1939-generic.yaml
222
235
  - maps/royin-tha-Thai-Latn-1968.yaml
223
236
  - maps/royin-tha-Thai-Latn-1999-chained.yaml
@@ -263,7 +276,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
263
276
  - !ruby/object:Gem::Version
264
277
  version: 2.4.0
265
278
  requirements: []
266
- rubygems_version: 3.1.2
279
+ rubygems_version: 3.0.3
267
280
  signing_key:
268
281
  specification_version: 4
269
282
  summary: Interoperable script conversion systems
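
The map file names added in the `files:` list appear to follow the header fields of each map: `authority_id`, `language`, `source_script`, `destination_script` and `id`, joined with hyphens. The Ruby sketch below illustrates that observed pattern only; `map_file_name` is a hypothetical helper, and the list also contains names that deviate from it (lowercase `cyrl-latn`, suffixes such as `-generic` or `-chained`).

```ruby
# Hypothetical helper illustrating the naming pattern seen in the files list.
def map_file_name(header)
  parts = header.values_at(:authority_id, :language, :source_script,
                           :destination_script, :id)
  "maps/#{parts.join('-')}.yaml"
end

# Header values copied from the Uzbek map added in this release.
uzbek = {
  authority_id: "odni",
  language: "uzb",
  source_script: "Cyrl",
  destination_script: "Latn",
  id: 2015
}

puts map_file_name(uzbek)  # maps/odni-uzb-Cyrl-Latn-2015.yaml
```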