interscript 0.1.1 → 0.1.2
- checksums.yaml +4 -4
- data/lib/interscript/version.rb +1 -1
- data/maps/bas-rus-Cyrl-Latn-bss.yaml +149 -0
- data/maps/bas-rus-Cyrl-Latn-oss.yaml +149 -0
- data/maps/bgnpcgn-arm-Armn-Latn-1981.yaml +109 -0
- data/maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml +92 -0
- data/maps/bgnpcgn-chn-Hans-Latn-pinyin.yaml +7503 -0
- data/maps/bgnpcgn-per-Arab-Latn-1956.yaml +93 -0
- data/maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml +233 -0
- data/maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml +90 -0
- data/maps/cn-chn-Hans-Latn-pinyin.yaml +24760 -0
- data/maps/historic-jpn-Hrkt-Latn-hepburn.yaml +336 -0
- data/maps/icao-bel-Cyrl-Latn-9303.yaml +125 -0
- data/maps/icao-bul-Cyrl-Latn-9303.yaml +123 -0
- data/maps/icao-gre-Grek-Latn-9303.yaml +101 -0
- data/maps/icao-heb-Hebr-Latn-9303.yaml +157 -0
- data/maps/icao-mkd-Cyrl-Latn-9303.yaml +118 -0
- data/maps/icao-per-Arab-Latn-9303.yaml +105 -0
- data/maps/icao-rus-Cyrl-Latn-9303.yaml +119 -0
- data/maps/icao-srp-Cyrl-Latn-9303.yaml +118 -0
- data/maps/icao-ukr-Cyrl-Latn-9303.yaml +121 -0
- data/maps/iso-rus-Cyrl-Latn-iso9.yaml +273 -0
- data/maps/mext-jpn-Hrkt-Latn-hepburn.yaml +330 -0
- data/maps/mext-jpn-Hrkt-Latn-kunrei.yaml +308 -0
- data/maps/un-jpn-Hrkt-Latn-hepburn.yaml +313 -0
- data/maps/un-jpn-Hrkt-Latn-kunrei.yaml +354 -0
- data/maps/un-mon-Mong-Latn-2013.yaml +80 -0
- data/spec/interscript_spec.rb +11 -0
- data/spec/spec_helper.rb +1 -0
- metadata +32 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 736d2d149984ce550327443c83f4f8b65ad3a46c106bb2c5d30392292b9e2ed6
+  data.tar.gz: 0dd4633aeccdfb1acfc618fe8708aafd4b7dac8b051fc6e6ac0420fdacc46066
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 156e919c03e8e7a7a0ce804b0d5402df833783bc82037d5fd06b75b4464cee12d4c426a88911394963888205ff4e7bc71e15eb19a6096cc1e71cd7d406efc3a1
+  data.tar.gz: c821069c94ba9e06d2a70b7e0b20916d1222fabf0a4daf7f5014999e24aac8d218beeb59b0316c885bfb28d84a11fb50d4b4216b157a6089f282811ef7ffa590
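The digests in checksums.yaml cover the two inner files of the packaged gem. A minimal sketch of checking one of them, assuming the bytes of `data.tar.gz` have already been read into memory; the helper name `sha256_matches?` is hypothetical and is not part of RubyGems or interscript:

```ruby
require "digest"

# Hypothetical helper: compare the SHA256 of some bytes with an expected hex digest.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# SHA256 listed for data.tar.gz in the 0.1.2 checksums.yaml.
expected_data_sha256 =
  "0dd4633aeccdfb1acfc618fe8708aafd4b7dac8b051fc6e6ac0420fdacc46066"
```

With the gem archive unpacked, a call such as `sha256_matches?(File.binread("data.tar.gz"), expected_data_sha256)` would report whether the local file agrees with the published digest.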
data/lib/interscript/version.rb
CHANGED
data/maps/bas-rus-Cyrl-Latn-bss.yaml
ADDED
@@ -0,0 +1,149 @@
+---
+authority_id: bas
+id: bss
+language: rus
+source_script: Cyrl
+destination_script: Latn
+name: Streamlined Romanization of Russian Cyrillic -- Basic Streamlined System
+url: https://www.researchgate.net/publication/318402098
+creation_date: 2017-07
+description: |
+  The streamlined approach to transliteration was initiated by the
+  author with the development of the Streamlined System for the
+  Romanization of Bulgarian, which was eventually codified by the
+  Transliteration Act of 2009 (ДВ 2009) of the Bulgarian Parliament.
+
+  The four purposes of the system below are in order of priority:
+  1. ensure a plausible phonetic approximation of Russian words by English speaking users, including those having no knowledge of the Russian language and no available additional explanations;
+  2. the system should allow for the retrieval of the original Cyrillic spellings as much as feasible;
+  3. transliterated Russian words should fit an English language environment i.e. not be perceived as too ‘un-English’; and
+  4. transliterated word forms should be streamlined and simple. (Ivanov 2003, Ivanov et al. 2010)
+
+notes:
+  - Typical for the streamlined approach is its non-use of diacritics,
+    its use of Latin y for rendering only Cyrillic й rather than both й and
+    ы, its non-use of Latin j, as well as its use of Latin h rather than kh
+    for Cyrillic х.
+
+tests:
+  - source: |
+      Эх, тройка! птица тройка, кто тебя выдумал? знать, у бойкого народа
+      ты могла только родиться, в той земле, что не любит шутить, а
+      ровнем-гладнем разметнулась на полсвета, да и ступай считать версты, пока
+      не зарябит тебе в очи. И не хитрый, кажись, дорожный снаряд, не
+      железным схвачен винтом, а наскоро живьём с одним топором да долотом
+      снарядил и собрал тебя ярославский расторопный мужик. Не в немецких
+      ботфортах ямщик: борода да рукавицы, и сидит чёрт знает на чём; а
+      привстал, да замахнулся, да затянул песню — кони вихрем, спицы в
+      колесах смешались в один гладкий круг, только дрогнула дорога, да вскрикнул
+      в испуге остановившийся пешеход — и вон она понеслась, понеслась,
+      понеслась!
+
+      Н.В. Гоголь
+    expected: |
+      Eh, troyka! ptitsa troyka, kto tebya vidumal? znat, u boykogo
+      naroda ti mogla tolko roditsya, v toy zemle, chto ne lyubit shutit, a
+      rovnem-gladnem razmetnulas na polsveta, da i stupay schitay versti,
+      poka ne zaryabit tebe v ochi. I ne hitriy, kazhis, dorozhniy snaryad,
+      ne zheleznim shvachen vintom, a naskoro zhivyem s odnim toporom da
+      dolotom sobral tebya yaroslavskiy rastoropniy muzhik. Ne v nemetskih
+      botfortah yamshchik: boroda da rukavitsi, i sidit chert znaet na chem;
+      a privstal, da zamahnulsya, da zatyanul pesnyu — koni vihrem, spitsi v
+      kolesah smeshalis v odin gladkiy krug, tolko drognula doroga, da
+      vskriknul v ispuge ostanovivshiysya peshehod — i von ona poneslas,
+      poneslas, poneslas!
+
+      N.V. Gogol
+
+map:
+  characters:
+    "\u0027": "" # '
+    "\u0410": "A" # А
+    "\u0411": "B" # Б
+    "\u0412": "V" # В
+    "\u0413": "G" # Г
+    "\u0414": "D" # Д
+    "\u0401": "E" # Ё
+    "\u0415": "E" # Е
+    "\u0416": "ZH" # Ж
+    "\u0417": "Z" # З
+    "\u042D": "E" # Э
+    "\u0418": "I" # И
+    "\u0419": "Y" # Й
+    "\u041A": "K" # К
+    "\u041B": "L" # Л
+    "\u041C": "M" # М
+    "\u041D": "N" # Н
+    "\u041E": "O" # О
+    "\u041F": "P" # П
+    "\u0420": "R" # Р
+    "\u0421": "S" # С
+    "\u0422": "T" # Т
+    "\u0423": "U" # У
+    "\u0424": "F" # Ф
+    "\u0425": "H" # Х
+    "\u0426": "TS" # Ц
+    "\u0427": "CH" # Ч
+    "\u0428": "SH" # Ш
+    "\u0429": "SHCH" # Щ
+    "\u042B": "I" # Ы
+    "\u042F": "YA" # Я
+    "\u042E": "YU" # Ю
+
+    # Ь (before Е, Ё, И, O, Э)
+    "\u042c\u0401": "YE" # Ё
+    "\u042c\u0415": "YE" # Е
+    "\u042c\u0418": "YI" # И
+    "\u042c\u041E": "YO" # O
+    "\u042c\u0417": "YE" # Э
+
+    # Ъ (otherwise) -> (none)
+    "\u042c": ""
+    # Ъ -> (none)
+    "\u042a": ""
+
+    "\u0430": "a" # а
+    "\u0431": "b" # б
+    "\u0432": "v" # в
+    "\u0433": "g" # г
+    "\u0434": "d" # д
+    "\u0451": "e" # ё
+    "\u0435": "e" # e
+    "\u0436": "zh" # ж
+    "\u0437": "z" # з
+    "\u044D": "e" # э
+    "\u0438": "i" # и
+    "\u0439": "y" # й
+    "\u043A": "k" # к
+    "\u043B": "l" # л
+    "\u043C": "m" # м
+    "\u043D": "n" # н
+    "\u043E": "o" # о
+    "\u043F": "p" # п
+    "\u0440": "r" # р
+    "\u0441": "s" # с
+    "\u0442": "t" # т
+    "\u0443": "u" # у
+    "\u0444": "f" # ф
+    "\u0445": "h" # х
+    "\u0446": "ts" # ц
+    "\u0447": "ch" # ч
+    "\u0448": "sh" # ш
+    "\u0449": "shch" # щ
+    "\u044B": "i" # ы
+    "\u044F": "ya" # я
+    "\u044E": "yu" # ю
+
+    # ь (before е, ё, и, o, э)
+    "\u044c\u0435": "ye" # ё
+    "\u044c\u0451": "ye" # е
+    "\u044c\u0438": "yi" # и
+    "\u044c\u006f": "yo" # o
+    "\u044c\u044d": "ye" # э
+
+    # ь (otherwise) -> (none)
+    "\u044c": ""
+
+    # ъ -> (none)
+    "\u044a": ""
+
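The BSS map mixes single-character keys with two-character keys such as ь followed by ё, so the order in which keys are tried matters. A minimal sketch of one way such a table could be applied, trying longer keys first; `transliterate` and `bss_excerpt` are hypothetical names for illustration and do not describe how the interscript gem itself applies its maps:

```ruby
# Hypothetical helper: replace every key of `table` found in `text`,
# trying longer keys first so that "ьё" wins over a bare "ь".
def transliterate(text, table)
  keys = table.keys.sort_by { |key| -key.length }
  text.gsub(Regexp.union(keys)) { |match| table[match] }
end

# A small excerpt of the BSS table (the full map has many more entries).
bss_excerpt = {
  "ьё" => "ye", "ь" => "",
  "Э" => "E", "х" => "h", "т" => "t", "р" => "r", "о" => "o", "й" => "y",
  "к" => "k", "а" => "a", "ж" => "zh", "и" => "i", "в" => "v", "м" => "m",
  "з" => "z", "н" => "n"
}

puts transliterate("Эх, тройка!", bss_excerpt) # Eh, troyka!
puts transliterate("живьём", bss_excerpt)      # zhivyem
puts transliterate("знать", bss_excerpt)       # znat
```

Characters without a key, such as punctuation and spaces, pass through unchanged because the pattern never matches them.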
data/maps/bas-rus-Cyrl-Latn-oss.yaml
ADDED
@@ -0,0 +1,149 @@
+---
+authority_id: bas
+id: oss
+language: rus
+source_script: Cyrl
+destination_script: Latn
+name: Streamlined Romanization of Russian Cyrillic -- Optimized Streamlined System
+url: https://www.researchgate.net/publication/318402098
+creation_date: 2017-07
+description: |
+  The streamlined approach to transliteration was initiated by the
+  author with the development of the Streamlined System for the
+  Romanization of Bulgarian, which was eventually codified by the
+  Transliteration Act of 2009 (ДВ 2009) of the Bulgarian Parliament.
+
+  The four purposes of the system below are in order of priority:
+  1. ensure a plausible phonetic approximation of Russian words by English speaking users, including those having no knowledge of the Russian language and no available additional explanations;
+  2. the system should allow for the retrieval of the original Cyrillic spellings as much as feasible;
+  3. transliterated Russian words should fit an English language environment i.e. not be perceived as too ‘un-English’; and
+  4. transliterated word forms should be streamlined and simple. (Ivanov 2003, Ivanov et al. 2010)
+
+notes:
+  - Typical for the streamlined approach is its non-use of diacritics,
+    its use of Latin y for rendering only Cyrillic й rather than both й and
+    ы, its non-use of Latin j, as well as its use of Latin h rather than kh
+    for Cyrillic х.
+
+tests:
+  - source: |
+      Эх, тройка! птица тройка, кто тебя выдумал? знать, у бойкого народа
+      ты могла только родиться, в той земле, что не любит шутить, а
+      ровнем-гладнем разметнулась на полсвета, да и ступай считать версты, пока
+      не зарябит тебе в очи. И не хитрый, кажись, дорожный снаряд, не
+      железным схвачен винтом, а наскоро живьём с одним топором да долотом
+      снарядил и собрал тебя ярославский расторопный мужик. Не в немецких
+      ботфортах ямщик: борода да рукавицы, и сидит чёрт знает на чём; а
+      привстал, да замахнулся, да затянул песню — кони вихрем, спицы в
+      колесах смешались в один гладкий круг, только дрогнула дорога, да вскрикнул
+      в испуге остановившийся пешеход — и вон она понеслась, понеслась,
+      понеслась!
+
+      Н.В. Гоголь
+    expected: |
+      `Eh, troyka! ptitsa troyka, kto tebya v`idumal? znat', u boykogo
+      naroda t`i mogla tol'ko rodit'sya, v toy zemle, chto ne lyubit shutit',
+      a rovnem-gladnem razmetnulas' na polsveta, da i stupay schitay verst`i,
+      poka ne zaryabit tebe v ochi. I ne hitr`iy, kazhis', dorozhn`iy
+      snaryad, ne zhelezn`im shvachen vintom, a naskoro zhivy``em s odnim
+      toporom da dolotom sobral tebya yaroslavskiy rastoropn`iy muzhik. Ne v
+      nemetskih botfortah yamshchik: boroda da rukavits`i, i sidit ch``ert
+      znaet na ch``em; a privstal, da zamahnulsya, da zatyanul pesnyu — koni
+      vihrem, spits`i v kolesah smeshalis' v odin gladkiy krug, tol'ko
+      drognula doroga, da vskriknul v ispuge ostanovivshiysya peshehod — i
+      von ona poneslas', poneslas', poneslas'!
+
+      N.V. Gogol'
+
+map:
+  characters:
+    "\u0027": "" # '
+    "\u0410": "A" # А
+    "\u0411": "B" # Б
+    "\u0412": "V" # В
+    "\u0413": "G" # Г
+    "\u0414": "D" # Д
+    "\u0401": "``E" # Ё
+    "\u0415": "E" # Е
+    "\u0416": "ZH" # Ж
+    "\u0417": "Z" # З
+    "\u042D": "`E" # Э
+    "\u0418": "I" # И
+    "\u0419": "Y" # Й
+    "\u041A": "K" # К
+    "\u041B": "L" # Л
+    "\u041C": "M" # М
+    "\u041D": "N" # Н
+    "\u041E": "O" # О
+    "\u041F": "P" # П
+    "\u0420": "R" # Р
+    "\u0421": "S" # С
+    "\u0422": "T" # Т
+    "\u0423": "U" # У
+    "\u0424": "F" # Ф
+    "\u0425": "H" # Х
+    "\u0426": "TS" # Ц
+    "\u0427": "CH" # Ч
+    "\u0428": "SH" # Ш
+    "\u0429": "SHCH" # Щ
+    "\u042B": "`I" # Ы
+    "\u042F": "YA" # Я
+    "\u042E": "YU" # Ю
+
+    # Ь (before Е, Ё, И, O, Э)
+    "\u042c\u0401": "Y``E" # Ё
+    "\u042c\u0415": "YE" # Е
+    "\u042c\u0418": "YI" # И
+    "\u042c\u041E": "YO" # O
+    "\u042c\u0417": "Y`E" # Э
+
+
+    # Ъ (otherwise) -> " (or none)
+    "\u042c": "'"
+    # Ъ -> ' (or none)
+    "\u042a": '"'
+
+    "\u0430": "a" # а
+    "\u0431": "b" # б
+    "\u0432": "v" # в
+    "\u0433": "g" # г
+    "\u0434": "d" # д
+    "\u0451": "``e" # ё
+    "\u0435": "e" # e
+    "\u0436": "zh" # ж
+    "\u0437": "z" # з
+    "\u044D": "`e" # э
+    "\u0438": "i" # и
+    "\u0439": "y" # й
+    "\u043A": "k" # к
+    "\u043B": "l" # л
+    "\u043C": "m" # м
+    "\u043D": "n" # н
+    "\u043E": "o" # о
+    "\u043F": "p" # п
+    "\u0440": "r" # р
+    "\u0441": "s" # с
+    "\u0442": "t" # т
+    "\u0443": "u" # у
+    "\u0444": "f" # ф
+    "\u0445": "h" # х
+    "\u0446": "ts" # ц
+    "\u0447": "ch" # ч
+    "\u0448": "sh" # ш
+    "\u0449": "shch" # щ
+    "\u044B": "`i" # ы
+    "\u044F": "ya" # я
+    "\u044E": "yu" # ю
+
+    # ь (before е, ё, и, o, э)
+    "\u044c\u0435": "ye" # ё
+    "\u044c\u0451": "y``e" # e
+    "\u044c\u0438": "yi" # и
+    "\u044c\u006f": "yo" # o
+    "\u044c\u044d": "y`e" # э
+
+    # Ъ (otherwise) -> " (or none)
+    "\u044c": "'"
+
+    # Ъ -> ' (or none)
+    "\u044a": '"'
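The OSS map differs from the BSS map only in the entries that BSS collapses: ы gains a backtick, and the soft sign becomes an apostrophe instead of disappearing, which keeps more of the Cyrillic spelling recoverable. A short sketch comparing the two on words from the test passage, using a hypothetical `transliterate` helper and hand-picked table excerpts rather than the gem's own loader:

```ruby
# Hypothetical helper: longest-key-first substitution over a mapping table.
def transliterate(text, table)
  keys = table.keys.sort_by { |key| -key.length }
  text.gsub(Regexp.union(keys)) { |match| table[match] }
end

# Entries that are identical in both maps (excerpt only).
shared = {
  "в" => "v", "д" => "d", "у" => "u", "м" => "m", "а" => "a",
  "л" => "l", "з" => "z", "н" => "n", "т" => "t"
}

# Entries where the two systems diverge.
bss = shared.merge("ы" => "i", "ь" => "")
oss = shared.merge("ы" => "`i", "ь" => "'")

puts transliterate("выдумал", bss) # vidumal
puts transliterate("выдумал", oss) # v`idumal
puts transliterate("знать", bss)   # znat
puts transliterate("знать", oss)   # znat'
```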
data/maps/bgnpcgn-arm-Armn-Latn-1981.yaml
ADDED
@@ -0,0 +1,109 @@
+---
+authority_id: bgnpcgn
+id: 1981
+language: arm
+source_script: Armn
+destination_script: Latn
+name: BGN/PCGN 1981 System
+url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/810208/ROMANIZATION_OF_ARMENIAN.pdf
+creation_date: 2013
+confirmation date: 2019-06
+description: |
+  The BGN/PCGN system for Armenian was designed for use in romanizing
+  names written in the Armenian alphabet. The Roman letters and letter
+  combinations shown as equivalents to the Armenian characters reflect
+  the eastern variety of Armenian, i.e. the language spoken in the
+  Republic of Armenia.
+
+notes:
+  - The character ե should be romanized ye initially and after the vowel characters ա, ե, է, ը, ի, ո, ու and օ. In all other instances, it should be romanized e.
+  - The character ո should be romanized vo initially except in the word ով, which should be roman- ized ov. In all other instances, it should be romanized o.
+  - In Soviet-era sources this upper-case digraph character is found as Եի (Unicode encoding 0535+056B).
+  - This lower-case character may be seen either in digraph form as եւ (Unicode encoding 0565+0582) or in single character form as եւ (Unicode encoding 0587).
+  - The characters ԵՎ , եւ and եւ should be romanized yev initially, in isolation, and after the vowel characters ա, ե, է, ը, ի, ո, ու, and օ. In all other instances these characters should be romanized ev.
+  - All apostrophes appearing in Armenian romanization are encoded Unicode 2019.
+  - The Romanization column shows only lowercase forms but, when romanizing, uppercase and lowercase Roman letters as appropriate should be used.
+
+tests:
+  - source:
+    expected:
+map:
+  characters:
+    '\u0531' : 'A'
+    '\u0532' : 'B'
+    '\u0533' : 'G'
+    '\u0534' : 'D'
+    '\u0535' : 'Ye' #treated same as Russian 'ye'
+    '\u0536' : 'Z'
+    '\u0537' : 'E'
+    '\u0538' : 'Y'
+    '\u0539' : 'T\u2019'
+    '\u053a' : 'Zh'
+    '\u053b' : 'I'
+    '\u053c' : 'L'
+    '\u053d' : 'Kh'
+    '\u053e' : 'Ts'
+    '\u053f' : 'K'
+    '\u0540' : 'H'
+    '\u0541' : 'Dz'
+    '\u0542' : 'Gh'
+    '\u0543' : 'Ch'
+    '\u0544' : 'M'
+    '\u0545' : 'Y'
+    '\u0546' : 'N'
+    '\u0547' : 'Sh'
+    '\u0548' : 'O' # VO initially and U when in combination with \u0552
+    '\u0549' : u'Ch\u2019'
+    '\u054a' : 'P'
+    '\u054b' : 'J'
+    '\u054c' : 'Rr'
+    '\u054d' : 'S'
+    '\u054e' : 'V'
+    '\u054f' : 'T'
+    '\u0550' : 'R'
+    '\u0551' : 'Ts\u2019'
+    '\u0548\u0552' : 'U'
+    '\u0548\u0582' : 'U'
+    '\u0553' : 'P\u2019'
+    '\u0554' : 'K\u2019'
+    '\u0555' : 'O'
+    '\u0556' : 'F'
+    '\u0561' : 'a'
+    '\u0562' : 'b'
+    '\u0563' : 'g'
+    '\u0564' : 'd'
+    '\u0565' : 'e' # ye initially
+    '\u0566' : 'z'
+    '\u0567' : 'e'
+    '\u0568' : 'y'
+    '\u0569' : u't\u2019'
+    '\u056a' : 'zh'
+    '\u056b' : 'i'
+    '\u056c' : 'l'
+    '\u056d' : 'kh'
+    '\u056e' : 'ts'
+    '\u056f' : 'k'
+    '\u0570' : 'h'
+    '\u0571' : 'dz'
+    '\u0572' : 'gh'
+    '\u0573' : 'ch'
+    '\u0574' : 'm'
+    '\u0575' : 'y'
+    '\u0576' : 'n'
+    '\u0577' : 'sh'
+    '\u0578' : 'o' # vo initially and u when in combination with \u0582
+    '\u0579' : 'ch\u2019'
+    '\u057a' : 'p'
+    '\u057b' : 'j'
+    '\u057c' : 'rr'
+    '\u057d' : 's'
+    '\u057e' : 'v'
+    '\u057f' : 't'
+    '\u0580' : 'r'
+    '\u0581' : 'ts\u2019'
+    '\u0578\u0582' : 'u'
+    '\u0583' : 'p\u2019'
+    '\u0584' : 'k\u2019'
+    '\u0585' : 'o'
+    '\u0586' : 'f'
+    '\u0587' : 'ev' # yev initially
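The first note in the Armenian map describes a positional rule that a flat character table cannot express: lowercase ե (U+0565) is "ye" at the start of a word or after a vowel, and "e" elsewhere. A minimal sketch of that one rule, assuming a regular-expression pass runs before the table lookup; `romanize_armenian`, `vowels` and `armenian_excerpt` are hypothetical names, only lowercase input is handled, and the similar rules for ո and և are left out:

```ruby
# Vowel letters named in the note; U+0582 stands in for the second half
# of the digraph ու, so "after ու" is detected through its last character.
vowels = "\u0561\u0565\u0567\u0568\u056B\u0578\u0582\u0585"

# Hypothetical helper: apply the positional "ye" rule, then the flat table.
def romanize_armenian(text, table, vowels)
  # ե with no letter before it, or ե directly after one of the vowels.
  ye_context = /(?<![[:alpha:]])\u0565|(?<=[#{vowels}])\u0565/
  keys = table.keys.sort_by { |key| -key.length }
  text.gsub(ye_context, "ye")
      .gsub(Regexp.union(keys)) { |match| table[match] }
end

# Excerpt of the table: ա, ե, ր, ք.
armenian_excerpt = {
  "\u0561" => "a", "\u0565" => "e", "\u0580" => "r", "\u0584" => "k\u2019"
}

puts romanize_armenian("\u0565\u0580\u0565\u0584", armenian_excerpt, vowels) # yerek’
```

In the sample word the first ե is word-initial and becomes "ye", while the second follows the consonant ր and falls through to the plain "e" entry.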
data/maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml
ADDED
@@ -0,0 +1,92 @@
+---
+authority_id: bgnpcgn
+id: 2013
+language: bul
+source_script: Cyrl
+destination_script: Latn
+name: BGN/PCGN 2013 Agreement
+url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/811509/ROMANIZATION_OF_BULGARIAN.pdf
+creation_date: 2013
+confirmation date: 2019-06
+description: |
+  This system reflects the Bulgarian national system officially adopted
+  by state decree in March 2009. It was adopted by BGN and PCGN in 2013,
+  replacing the BGN/PCGN system of 1952.
+
+notes:
+  - When in final position, “ия” is romanized as “ia” (e.g., София = Sofia; София-Град= Sofia-Grad).
+  - An exception to the romanization system is allowed for the name of the state. Thus, България is roman-
+    ized as Bulgaria.
+  - The Romanization column shows only lowercase forms but, when romanizing, uppercase and lowercase Roman letters as appropriate should be used.
+tests:
+  - source: София
+    expected: Sofia
+  - source: София-Град
+    expected: Sofia-Grad
+  - source: България
+    expected: Bulgaria
+
+map:
+  characters:
+    'България': 'Bulgaria'
+    '\u0410': 'A'
+    '\u0411': 'B'
+    '\u0412': 'V'
+    '\u0413': 'G'
+    '\u0414': 'D'
+    '\u0415': 'E'
+    '\u0416': 'ZH'
+    '\u0417': 'Z'
+    '\u0418': 'I'
+    '\u0419': 'Y'
+    '\u041a': 'K'
+    '\u041b': 'L'
+    '\u041c': 'M'
+    '\u041d': 'N'
+    '\u041e': 'O'
+    '\u041f': 'P'
+    '\u0420': 'R'
+    '\u0421': 'S'
+    '\u0422': 'T'
+    '\u0423': 'U'
+    '\u0424': 'F'
+    '\u0425': 'KH'
+    '\u0426': 'TS'
+    '\u0427': 'CH'
+    '\u0428': 'SH'
+    '\u0429': 'SHT'
+    '\u042a': '\u016c'
+    '\u042c': "\'"
+    '\u042e': 'YU'
+    '\u042f': 'YA'
+    '\u0430': 'a'
+    '\u0431': 'b'
+    '\u0432': 'v'
+    '\u0433': 'g'
+    '\u0434': 'd'
+    '\u0435': 'e'
+    '\u0436': 'zh'
+    '\u0437': 'z'
+    '\u0438': 'i'
+    '\u0439': 'y'
+    '\u043a': 'k'
+    '\u043b': 'l'
+    '\u043c': 'm'
+    '\u043d': 'n'
+    '\u043e': 'o'
+    '\u043f': 'p'
+    '\u0440': 'r'
+    '\u0441': 's'
+    '\u0442': 't'
+    '\u0443': 'u'
+    '\u0444': 'f'
+    '\u0445': 'kh'
+    '\u0446': 'ts'
+    '\u0447': 'ch'
+    '\u0448': 'sh'
+    '\u0449': 'sht'
+    '\u044a': '\u016d'
+    '\u044c': "\'"
+    '\u044e': 'yu'
+    '\u044f': 'ya'
+
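The Bulgarian map has a `София` test expecting `Sofia`, but the character table alone maps я to "ya", so the final-position rule from the notes needs separate handling. A minimal sketch of one way to combine the whole-word exception, the word-final "ия" rule and the single-character entries in one pass; `romanize_bulgarian` and `bulgarian_excerpt` are hypothetical names, and this is an illustration rather than the gem's implementation:

```ruby
# Hypothetical helper: one substitution pass that tries, in order,
# multi-character keys (such as the България exception), the word-final
# "ия" rule, and then the single-character keys.
def romanize_bulgarian(text, table)
  final_iya = /ия(?!\p{L})/ # "ия" not followed by another letter
  keys = table.keys.sort_by { |key| -key.length }
  multi, single = keys.partition { |key| key.length > 1 }
  pattern = Regexp.union(*multi, final_iya, *single)
  # The only match that is not a table key is word-final "ия".
  text.gsub(pattern) { |match| table.fetch(match, "ia") }
end

# Excerpt of the table, enough for the three tests in the map file.
bulgarian_excerpt = {
  "България" => "Bulgaria",
  "С" => "S", "о" => "o", "ф" => "f", "и" => "i", "я" => "ya",
  "Г" => "G", "р" => "r", "а" => "a", "д" => "d"
}

puts romanize_bulgarian("София", bulgarian_excerpt)      # Sofia
puts romanize_bulgarian("София-Град", bulgarian_excerpt) # Sofia-Grad
puts romanize_bulgarian("България", bulgarian_excerpt)   # Bulgaria
```

Putting the exception ahead of the final-position rule matters here: България also ends in "ия", so applying the rules the other way round would break the whole-word match.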