interscript 0.1.5 → 2.1.0a8

Files changed (200)
  1. checksums.yaml +4 -4
  2. data/.gitignore +11 -0
  3. data/.rspec +3 -0
  4. data/Gemfile +29 -0
  5. data/LICENSE.adoc +31 -0
  6. data/README.md +3 -0
  7. data/Rakefile +53 -0
  8. data/bin/console +14 -0
  9. data/bin/interscript +3 -39
  10. data/bin/maps_analyze_staging +168 -0
  11. data/bin/maps_debug_compilers +58 -0
  12. data/bin/maps_debug_ordering +88 -0
  13. data/bin/maps_debug_ruby_compile +24 -0
  14. data/bin/maps_debug_step_by_step +44 -0
  15. data/bin/maps_optimize_order +112 -0
  16. data/bin/maps_v1_analyze_regexps +45 -0
  17. data/bin/maps_v1_to_v2 +426 -0
  18. data/exe/interscript +6 -0
  19. data/interscript.gemspec +31 -0
  20. data/lib/interscript.rb +81 -123
  21. data/lib/interscript/command.rb +5 -5
  22. data/lib/interscript/compiler.rb +22 -0
  23. data/lib/interscript/compiler/javascript.rb +292 -0
  24. data/lib/interscript/compiler/ruby.rb +262 -0
  25. data/lib/interscript/dsl.rb +67 -0
  26. data/lib/interscript/dsl/aliases.rb +23 -0
  27. data/lib/interscript/dsl/document.rb +46 -0
  28. data/lib/interscript/dsl/group.rb +45 -0
  29. data/lib/interscript/dsl/group/parallel.rb +6 -0
  30. data/lib/interscript/dsl/items.rb +89 -0
  31. data/lib/interscript/dsl/metadata.rb +26 -0
  32. data/lib/interscript/dsl/stage.rb +6 -0
  33. data/lib/interscript/dsl/symbol_mm.rb +11 -0
  34. data/lib/interscript/dsl/tests.rb +12 -0
  35. data/lib/interscript/interpreter.rb +251 -0
  36. data/lib/interscript/node.rb +25 -0
  37. data/lib/interscript/node/alias_def.rb +15 -0
  38. data/lib/interscript/node/dependency.rb +13 -0
  39. data/lib/interscript/node/document.rb +45 -0
  40. data/lib/interscript/node/group.rb +34 -0
  41. data/lib/interscript/node/group/parallel.rb +9 -0
  42. data/lib/interscript/node/group/sequential.rb +2 -0
  43. data/lib/interscript/node/item.rb +52 -0
  44. data/lib/interscript/node/item/alias.rb +42 -0
  45. data/lib/interscript/node/item/any.rb +61 -0
  46. data/lib/interscript/node/item/capture.rb +50 -0
  47. data/lib/interscript/node/item/group.rb +51 -0
  48. data/lib/interscript/node/item/repeat.rb +40 -0
  49. data/lib/interscript/node/item/stage.rb +23 -0
  50. data/lib/interscript/node/item/string.rb +51 -0
  51. data/lib/interscript/node/metadata.rb +18 -0
  52. data/lib/interscript/node/rule.rb +6 -0
  53. data/lib/interscript/node/rule/funcall.rb +18 -0
  54. data/lib/interscript/node/rule/run.rb +15 -0
  55. data/lib/interscript/node/rule/sub.rb +65 -0
  56. data/lib/interscript/node/stage.rb +19 -0
  57. data/lib/interscript/node/tests.rb +15 -0
  58. data/lib/interscript/stdlib.rb +211 -0
  59. data/lib/interscript/utils/regexp_converter.rb +283 -0
  60. data/lib/interscript/version.rb +1 -1
  61. data/requirements.txt +1 -0
  62. metadata +73 -311
  63. data/README.adoc +0 -298
  64. data/bin/rspec +0 -29
  65. data/lib/__pycache__/g2pwrapper.cpython-38.pyc +0 -0
  66. data/lib/g2pwrapper.py +0 -34
  67. data/lib/interscript-opal.rb +0 -2
  68. data/lib/interscript/fs.rb +0 -69
  69. data/lib/interscript/mapping.rb +0 -142
  70. data/lib/interscript/opal.rb +0 -23
  71. data/lib/interscript/opal/maps.js.erb +0 -7
  72. data/lib/interscript/opal_map_translate.rb +0 -12
  73. data/lib/model-7 +0 -0
  74. data/lib/tha-pt-b-7 +0 -0
  75. data/maps/acadsin-zho-Hani-Latn-2002.yaml +0 -38912
  76. data/maps/alalc-aze-Cyrl-Latn-1997.yaml +0 -141
  77. data/maps/alalc-bel-cyrl-latn-1997.yaml +0 -125
  78. data/maps/alalc-ben-Beng-Latn-2017.yaml +0 -130
  79. data/maps/alalc-bul-Cyrl-Latn-1997.yaml +0 -94
  80. data/maps/alalc-ell-Grek-Latn-1997.yaml +0 -625
  81. data/maps/alalc-ell-Grek-Latn-2010.yaml +0 -628
  82. data/maps/alalc-kat-Geok-Latn-1997.yaml +0 -112
  83. data/maps/alalc-kat-Geor-Latn-1997.yaml +0 -146
  84. data/maps/alalc-kor-Hang-Latn-1997.yaml +0 -94
  85. data/maps/alalc-mkd-Cyrl-Latn-2013.yaml +0 -103
  86. data/maps/alalc-mkd-cyrl-latn-1997.yaml +0 -114
  87. data/maps/alalc-rus-Cyrl-Latn-1997.yaml +0 -222
  88. data/maps/alalc-rus-Cyrl-Latn-2012.yaml +0 -162
  89. data/maps/alalc-srp-Cyrl-Latn-1997.yaml +0 -114
  90. data/maps/alalc-srp-cyrl-latn-2013.yaml +0 -135
  91. data/maps/alalc-ukr-Cyrl-Latn-1997.yaml +0 -141
  92. data/maps/alalc-ukr-Cyrl-Latn-2011.yaml +0 -16
  93. data/maps/apcbg-bul-Cyrl-Latn-1995.yaml +0 -283
  94. data/maps/bas-rus-Cyrl-Latn-2017-bss.yaml +0 -175
  95. data/maps/bas-rus-Cyrl-Latn-2017-oss.yaml +0 -169
  96. data/maps/bgn-jpn-Hrkt-Latn-1962.yaml +0 -294
  97. data/maps/bgn-kor-Hang-Latn-1943.yaml +0 -31
  98. data/maps/bgn-kor-Kore-Latn-1943.yaml +0 -31
  99. data/maps/bgna-bul-Cyrl-Latn-2006.yaml +0 -208
  100. data/maps/bgna-bul-Cyrl-Latn-2009.yaml +0 -208
  101. data/maps/bgnpcgn-arm-Armn-Latn-1981.yaml +0 -108
  102. data/maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml +0 -104
  103. data/maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml +0 -184
  104. data/maps/bgnpcgn-bel-cyrl-latn-1979.yaml +0 -285
  105. data/maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml +0 -115
  106. data/maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml +0 -38
  107. data/maps/bgnpcgn-ell-Grek-Latn-1962.yaml +0 -702
  108. data/maps/bgnpcgn-ell-Grek-Latn-1996.yaml +0 -20
  109. data/maps/bgnpcgn-jpn-Hrkt-Latn-1976.yaml +0 -257
  110. data/maps/bgnpcgn-kat-Geor-Latn-1981.yaml +0 -127
  111. data/maps/bgnpcgn-kat-Geor-Latn-2009.yaml +0 -43
  112. data/maps/bgnpcgn-kor-Hang-Latn-kn-1945.yaml +0 -253
  113. data/maps/bgnpcgn-kor-Hang-Latn-rok-2011.yaml +0 -48
  114. data/maps/bgnpcgn-kor-Kore-Latn-rok-2011.yaml +0 -48
  115. data/maps/bgnpcgn-mkd-Cyrl-Latn-1981.yaml +0 -159
  116. data/maps/bgnpcgn-mkd-Cyrl-Latn-2013.yaml +0 -190
  117. data/maps/bgnpcgn-per-Arab-Latn-1956.yaml +0 -93
  118. data/maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml +0 -314
  119. data/maps/bgnpcgn-srp-Cyrl-Latn-2005.yaml +0 -166
  120. data/maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml +0 -163
  121. data/maps/bgnpcgn-ukr-Cyrl-Latn-2019.yaml +0 -208
  122. data/maps/bgnpcgn-zho-Hans-Latn-1979.yaml +0 -7456
  123. data/maps/by-bel-Cyrl-Latn-1998.yaml +0 -168
  124. data/maps/by-bel-Cyrl-Latn-2007.yaml +0 -115
  125. data/maps/elot-ell-Grek-Latn-743-1982-tl.yaml +0 -685
  126. data/maps/elot-ell-Grek-Latn-743-1982-ts.yaml +0 -681
  127. data/maps/elot-ell-Grek-Latn-743-2001-tl.yaml +0 -20
  128. data/maps/elot-ell-Grek-Latn-743-2001-ts.yaml +0 -32
  129. data/maps/ggg-kat-Geor-Latn-2002.yaml +0 -89
  130. data/maps/gki-bel-cyrl-latn-1992.yaml +0 -33
  131. data/maps/gki-bel-cyrl-latn-2000.yaml +0 -201
  132. data/maps/gost-rus-cyrl-latn-16876-71-1983.yaml +0 -186
  133. data/maps/hk-yue-Hani-Latn-1888.yaml +0 -38497
  134. data/maps/icao-bel-Cyrl-Latn-9303.yaml +0 -141
  135. data/maps/icao-bul-Cyrl-Latn-9303.yaml +0 -122
  136. data/maps/icao-heb-Hebr-Latn-9303.yaml +0 -151
  137. data/maps/icao-mkd-Cyrl-Latn-9303.yaml +0 -117
  138. data/maps/icao-per-Arab-Latn-9303.yaml +0 -104
  139. data/maps/icao-rus-Cyrl-Latn-9303.yaml +0 -118
  140. data/maps/icao-srp-Cyrl-Latn-9303.yaml +0 -117
  141. data/maps/icao-ukr-Cyrl-Latn-9303.yaml +0 -120
  142. data/maps/iso-ell-Grek-Latn-843-1997-t1.yaml +0 -610
  143. data/maps/iso-ell-Grek-Latn-843-1997-t2.yaml +0 -41
  144. data/maps/iso-jpn-Hrkt-Latn-3602-1989.yaml +0 -62
  145. data/maps/iso-rus-Cyrl-Latn-9-1995.yaml +0 -272
  146. data/maps/iso-tha-Thai-Latn-11940-1998.yaml +0 -109
  147. data/maps/kp-kor-Hang-Latn-2002.yaml +0 -901
  148. data/maps/lshk-yue-Hani-Latn-jyutping-1993.yaml +0 -44820
  149. data/maps/mext-jpn-Hrkt-Latn-1954.yaml +0 -411
  150. data/maps/moct-kor-Hang-Latn-2000.yaml +0 -803
  151. data/maps/mofa-jpn-Hrkt-Latn-1989.yaml +0 -541
  152. data/maps/mvd-bel-Cyrl-Latn-2008.yaml +0 -225
  153. data/maps/mvd-bel-Cyrl-Latn-2010.yaml +0 -63
  154. data/maps/mvd-rus-Cyrl-Latn-2008.yaml +0 -110
  155. data/maps/mvd-rus-Cyrl-Latn-2010.yaml +0 -37
  156. data/maps/nil-kor-Hang-Hang-jamo.yaml +0 -11193
  157. data/maps/odni-aze-Cyrl-Latn-2015.yaml +0 -144
  158. data/maps/odni-bel-Cyrl-Latn-2015.yaml +0 -148
  159. data/maps/odni-bul-Cyrl-Latn-2015.yaml +0 -96
  160. data/maps/odni-kat-Geor-Latn-2015.yaml +0 -88
  161. data/maps/odni-kaz-Cyrl-Latn-2015.yaml +0 -148
  162. data/maps/odni-kir-Cyrl-Latn-2015.yaml +0 -136
  163. data/maps/odni-mkd-cyrl-latn-2015.yaml +0 -122
  164. data/maps/odni-rus-Cyrl-Latn-2015.yaml +0 -77
  165. data/maps/odni-srp-Cyrl-Latn-2015.yaml +0 -129
  166. data/maps/odni-tat-Cyrl-Latn-2015.yaml +0 -142
  167. data/maps/odni-tgk-Cyrl-Latn-2015.yaml +0 -148
  168. data/maps/odni-uig-Cyrl-Latn-2015.yaml +0 -138
  169. data/maps/odni-ukr-Cyrl-Latn-2015.yaml +0 -157
  170. data/maps/odni-uzb-Cyrl-Latn-2015.yaml +0 -167
  171. data/maps/royin-tha-Thai-Latn-1939-generic.yaml +0 -90
  172. data/maps/royin-tha-Thai-Latn-1968.yaml +0 -179
  173. data/maps/royin-tha-Thai-Latn-1999-chained.yaml +0 -180
  174. data/maps/royin-tha-Thai-Latn-1999.yaml +0 -76
  175. data/maps/sac-zho-Hans-Latn-1979.yaml +0 -24759
  176. data/maps/ses-ara-arab-latn-1930.yaml +0 -275
  177. data/maps/stategeocadastre-ukr-Cyrl-Latn-1993.yaml +0 -222
  178. data/maps/ua-ukr-Cyrl-Latn-1996.yaml +0 -193
  179. data/maps/un-ara-Arab-Latn-1971.yaml +0 -127
  180. data/maps/un-ara-Arab-Latn-1972.yaml +0 -152
  181. data/maps/un-ara-Arab-Latn-2017.yaml +0 -383
  182. data/maps/un-bel-Cyrl-Latn-2007.yaml +0 -114
  183. data/maps/un-ben-Beng-Latn-2016.yaml +0 -534
  184. data/maps/un-ell-Grek-Latn-1987-tl.yaml +0 -32
  185. data/maps/un-ell-Grek-Latn-1987-ts.yaml +0 -20
  186. data/maps/un-ell-Grek-Latn-phonetic-1987.yaml +0 -780
  187. data/maps/un-mon-Mong-Latn-2013.yaml +0 -93
  188. data/maps/un-rus-Cyrl-Latn-1987.yaml +0 -166
  189. data/maps/un-ukr-cyrl-latn-1998.yaml +0 -30
  190. data/maps/var-jpn-Hrkt-Latn-hepburn-1886.yaml +0 -406
  191. data/maps/var-jpn-Hrkt-Latn-hepburn-1954.yaml +0 -386
  192. data/maps/var-kor-Hang-Latn-mr-1939.yaml +0 -1054
  193. data/maps/var-kor-Kore-Hang-2013.yaml +0 -59754
  194. data/maps/var-kor-Kore-Latn-mr-1939.yaml +0 -37
  195. data/maps/var-tha-Thai-Thai-phonemic.yaml +0 -59
  196. data/maps/var-tha-Thai-Zsym-ipa.yaml +0 -301
  197. data/maps/var-zho-Hani-Latn-1979.yaml +0 -38908
  198. data/spec/interscript/mapping_spec.rb +0 -42
  199. data/spec/interscript_spec.rb +0 -26
  200. data/spec/spec_helper.rb +0 -3
@@ -1,3 +1,3 @@
1
1
  module Interscript
2
- VERSION = "0.1.5"
2
+ VERSION = "2.1.0a8"
3
3
  end
data/requirements.txt ADDED
@@ -0,0 +1 @@
1
+ torch
metadata CHANGED
@@ -1,12 +1,12 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: interscript
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.5
4
+ version: 2.1.0a8
5
5
  platform: ruby
6
6
  authors:
7
- - project_contibutors
7
+ - Ribose Inc.
8
8
  autorequire:
9
- bindir: bin
9
+ bindir: exe
10
10
  cert_chain: []
11
11
  date: 2019-11-17 00:00:00.000000000 Z
12
12
  dependencies:
@@ -25,167 +25,13 @@ dependencies:
25
25
  - !ruby/object:Gem::Version
26
26
  version: '0'
27
27
  - !ruby/object:Gem::Dependency
28
- name: debase
28
+ name: interscript-maps
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
31
  - - ">="
32
32
  - !ruby/object:Gem::Version
33
33
  version: '0'
34
- type: :development
35
- prerelease: false
36
- version_requirements: !ruby/object:Gem::Requirement
37
- requirements:
38
- - - ">="
39
- - !ruby/object:Gem::Version
40
- version: '0'
41
- - !ruby/object:Gem::Dependency
42
- name: pry
43
- requirement: !ruby/object:Gem::Requirement
44
- requirements:
45
- - - ">="
46
- - !ruby/object:Gem::Version
47
- version: '0'
48
- type: :development
49
- prerelease: false
50
- version_requirements: !ruby/object:Gem::Requirement
51
- requirements:
52
- - - ">="
53
- - !ruby/object:Gem::Version
54
- version: '0'
55
- - !ruby/object:Gem::Dependency
56
- name: pycall
57
- requirement: !ruby/object:Gem::Requirement
58
- requirements:
59
- - - ">="
60
- - !ruby/object:Gem::Version
61
- version: '0'
62
- type: :development
63
- prerelease: false
64
- version_requirements: !ruby/object:Gem::Requirement
65
- requirements:
66
- - - ">="
67
- - !ruby/object:Gem::Version
68
- version: '0'
69
- - !ruby/object:Gem::Dependency
70
- name: rambling-trie
71
- requirement: !ruby/object:Gem::Requirement
72
- requirements:
73
- - - ">="
74
- - !ruby/object:Gem::Version
75
- version: '0'
76
- type: :development
77
- prerelease: false
78
- version_requirements: !ruby/object:Gem::Requirement
79
- requirements:
80
- - - ">="
81
- - !ruby/object:Gem::Version
82
- version: '0'
83
- - !ruby/object:Gem::Dependency
84
- name: rake
85
- requirement: !ruby/object:Gem::Requirement
86
- requirements:
87
- - - ">="
88
- - !ruby/object:Gem::Version
89
- version: '0'
90
- type: :development
91
- prerelease: false
92
- version_requirements: !ruby/object:Gem::Requirement
93
- requirements:
94
- - - ">="
95
- - !ruby/object:Gem::Version
96
- version: '0'
97
- - !ruby/object:Gem::Dependency
98
- name: rspec
99
- requirement: !ruby/object:Gem::Requirement
100
- requirements:
101
- - - ">="
102
- - !ruby/object:Gem::Version
103
- version: '0'
104
- type: :development
105
- prerelease: false
106
- version_requirements: !ruby/object:Gem::Requirement
107
- requirements:
108
- - - ">="
109
- - !ruby/object:Gem::Version
110
- version: '0'
111
- - !ruby/object:Gem::Dependency
112
- name: ruby-debug-ide
113
- requirement: !ruby/object:Gem::Requirement
114
- requirements:
115
- - - ">="
116
- - !ruby/object:Gem::Version
117
- version: '0'
118
- type: :development
119
- prerelease: false
120
- version_requirements: !ruby/object:Gem::Requirement
121
- requirements:
122
- - - ">="
123
- - !ruby/object:Gem::Version
124
- version: '0'
125
- - !ruby/object:Gem::Dependency
126
- name: rambling-trie-opal
127
- requirement: !ruby/object:Gem::Requirement
128
- requirements:
129
- - - ">="
130
- - !ruby/object:Gem::Version
131
- version: '0'
132
- type: :development
133
- prerelease: false
134
- version_requirements: !ruby/object:Gem::Requirement
135
- requirements:
136
- - - ">="
137
- - !ruby/object:Gem::Version
138
- version: '0'
139
- - !ruby/object:Gem::Dependency
140
- name: opal
141
- requirement: !ruby/object:Gem::Requirement
142
- requirements:
143
- - - "~>"
144
- - !ruby/object:Gem::Version
145
- version: 1.0.3
146
- type: :development
147
- prerelease: false
148
- version_requirements: !ruby/object:Gem::Requirement
149
- requirements:
150
- - - "~>"
151
- - !ruby/object:Gem::Version
152
- version: 1.0.3
153
- - !ruby/object:Gem::Dependency
154
- name: guard
155
- requirement: !ruby/object:Gem::Requirement
156
- requirements:
157
- - - ">="
158
- - !ruby/object:Gem::Version
159
- version: '0'
160
- type: :development
161
- prerelease: false
162
- version_requirements: !ruby/object:Gem::Requirement
163
- requirements:
164
- - - ">="
165
- - !ruby/object:Gem::Version
166
- version: '0'
167
- - !ruby/object:Gem::Dependency
168
- name: guard-rake
169
- requirement: !ruby/object:Gem::Requirement
170
- requirements:
171
- - - ">="
172
- - !ruby/object:Gem::Version
173
- version: '0'
174
- type: :development
175
- prerelease: false
176
- version_requirements: !ruby/object:Gem::Requirement
177
- requirements:
178
- - - ">="
179
- - !ruby/object:Gem::Version
180
- version: '0'
181
- - !ruby/object:Gem::Dependency
182
- name: closure-compiler
183
- requirement: !ruby/object:Gem::Requirement
184
- requirements:
185
- - - ">="
186
- - !ruby/object:Gem::Version
187
- version: '0'
188
- type: :development
34
+ type: :runtime
189
35
  prerelease: false
190
36
  version_requirements: !ruby/object:Gem::Requirement
191
37
  requirements:
@@ -193,161 +39,80 @@ dependencies:
193
39
  - !ruby/object:Gem::Version
194
40
  version: '0'
195
41
  description: Interoperable script conversion systems
196
- email:
42
+ email:
43
+ - open.source@ribose.com
197
44
  executables:
198
45
  - interscript
199
- - rspec
200
- - setup
201
46
  extensions: []
202
47
  extra_rdoc_files: []
203
48
  files:
204
- - README.adoc
49
+ - ".gitignore"
50
+ - ".rspec"
51
+ - Gemfile
52
+ - LICENSE.adoc
53
+ - README.md
54
+ - Rakefile
55
+ - bin/console
205
56
  - bin/interscript
206
- - bin/rspec
57
+ - bin/maps_analyze_staging
58
+ - bin/maps_debug_compilers
59
+ - bin/maps_debug_ordering
60
+ - bin/maps_debug_ruby_compile
61
+ - bin/maps_debug_step_by_step
62
+ - bin/maps_optimize_order
63
+ - bin/maps_v1_analyze_regexps
64
+ - bin/maps_v1_to_v2
207
65
  - bin/setup
208
- - lib/__pycache__/g2pwrapper.cpython-38.pyc
209
- - lib/g2pwrapper.py
210
- - lib/interscript-opal.rb
66
+ - exe/interscript
67
+ - interscript.gemspec
211
68
  - lib/interscript.rb
212
69
  - lib/interscript/command.rb
213
- - lib/interscript/fs.rb
214
- - lib/interscript/mapping.rb
215
- - lib/interscript/opal.rb
216
- - lib/interscript/opal/maps.js.erb
217
- - lib/interscript/opal_map_translate.rb
70
+ - lib/interscript/compiler.rb
71
+ - lib/interscript/compiler/javascript.rb
72
+ - lib/interscript/compiler/ruby.rb
73
+ - lib/interscript/dsl.rb
74
+ - lib/interscript/dsl/aliases.rb
75
+ - lib/interscript/dsl/document.rb
76
+ - lib/interscript/dsl/group.rb
77
+ - lib/interscript/dsl/group/parallel.rb
78
+ - lib/interscript/dsl/items.rb
79
+ - lib/interscript/dsl/metadata.rb
80
+ - lib/interscript/dsl/stage.rb
81
+ - lib/interscript/dsl/symbol_mm.rb
82
+ - lib/interscript/dsl/tests.rb
83
+ - lib/interscript/interpreter.rb
84
+ - lib/interscript/node.rb
85
+ - lib/interscript/node/alias_def.rb
86
+ - lib/interscript/node/dependency.rb
87
+ - lib/interscript/node/document.rb
88
+ - lib/interscript/node/group.rb
89
+ - lib/interscript/node/group/parallel.rb
90
+ - lib/interscript/node/group/sequential.rb
91
+ - lib/interscript/node/item.rb
92
+ - lib/interscript/node/item/alias.rb
93
+ - lib/interscript/node/item/any.rb
94
+ - lib/interscript/node/item/capture.rb
95
+ - lib/interscript/node/item/group.rb
96
+ - lib/interscript/node/item/repeat.rb
97
+ - lib/interscript/node/item/stage.rb
98
+ - lib/interscript/node/item/string.rb
99
+ - lib/interscript/node/metadata.rb
100
+ - lib/interscript/node/rule.rb
101
+ - lib/interscript/node/rule/funcall.rb
102
+ - lib/interscript/node/rule/run.rb
103
+ - lib/interscript/node/rule/sub.rb
104
+ - lib/interscript/node/stage.rb
105
+ - lib/interscript/node/tests.rb
106
+ - lib/interscript/stdlib.rb
107
+ - lib/interscript/utils/regexp_converter.rb
218
108
  - lib/interscript/version.rb
219
- - lib/model-7
220
- - lib/tha-pt-b-7
221
- - maps/acadsin-zho-Hani-Latn-2002.yaml
222
- - maps/alalc-aze-Cyrl-Latn-1997.yaml
223
- - maps/alalc-bel-cyrl-latn-1997.yaml
224
- - maps/alalc-ben-Beng-Latn-2017.yaml
225
- - maps/alalc-bul-Cyrl-Latn-1997.yaml
226
- - maps/alalc-ell-Grek-Latn-1997.yaml
227
- - maps/alalc-ell-Grek-Latn-2010.yaml
228
- - maps/alalc-kat-Geok-Latn-1997.yaml
229
- - maps/alalc-kat-Geor-Latn-1997.yaml
230
- - maps/alalc-kor-Hang-Latn-1997.yaml
231
- - maps/alalc-mkd-Cyrl-Latn-2013.yaml
232
- - maps/alalc-mkd-cyrl-latn-1997.yaml
233
- - maps/alalc-rus-Cyrl-Latn-1997.yaml
234
- - maps/alalc-rus-Cyrl-Latn-2012.yaml
235
- - maps/alalc-srp-Cyrl-Latn-1997.yaml
236
- - maps/alalc-srp-cyrl-latn-2013.yaml
237
- - maps/alalc-ukr-Cyrl-Latn-1997.yaml
238
- - maps/alalc-ukr-Cyrl-Latn-2011.yaml
239
- - maps/apcbg-bul-Cyrl-Latn-1995.yaml
240
- - maps/bas-rus-Cyrl-Latn-2017-bss.yaml
241
- - maps/bas-rus-Cyrl-Latn-2017-oss.yaml
242
- - maps/bgn-jpn-Hrkt-Latn-1962.yaml
243
- - maps/bgn-kor-Hang-Latn-1943.yaml
244
- - maps/bgn-kor-Kore-Latn-1943.yaml
245
- - maps/bgna-bul-Cyrl-Latn-2006.yaml
246
- - maps/bgna-bul-Cyrl-Latn-2009.yaml
247
- - maps/bgnpcgn-arm-Armn-Latn-1981.yaml
248
- - maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml
249
- - maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml
250
- - maps/bgnpcgn-bel-cyrl-latn-1979.yaml
251
- - maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml
252
- - maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml
253
- - maps/bgnpcgn-ell-Grek-Latn-1962.yaml
254
- - maps/bgnpcgn-ell-Grek-Latn-1996.yaml
255
- - maps/bgnpcgn-jpn-Hrkt-Latn-1976.yaml
256
- - maps/bgnpcgn-kat-Geor-Latn-1981.yaml
257
- - maps/bgnpcgn-kat-Geor-Latn-2009.yaml
258
- - maps/bgnpcgn-kor-Hang-Latn-kn-1945.yaml
259
- - maps/bgnpcgn-kor-Hang-Latn-rok-2011.yaml
260
- - maps/bgnpcgn-kor-Kore-Latn-rok-2011.yaml
261
- - maps/bgnpcgn-mkd-Cyrl-Latn-1981.yaml
262
- - maps/bgnpcgn-mkd-Cyrl-Latn-2013.yaml
263
- - maps/bgnpcgn-per-Arab-Latn-1956.yaml
264
- - maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml
265
- - maps/bgnpcgn-srp-Cyrl-Latn-2005.yaml
266
- - maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml
267
- - maps/bgnpcgn-ukr-Cyrl-Latn-2019.yaml
268
- - maps/bgnpcgn-zho-Hans-Latn-1979.yaml
269
- - maps/by-bel-Cyrl-Latn-1998.yaml
270
- - maps/by-bel-Cyrl-Latn-2007.yaml
271
- - maps/elot-ell-Grek-Latn-743-1982-tl.yaml
272
- - maps/elot-ell-Grek-Latn-743-1982-ts.yaml
273
- - maps/elot-ell-Grek-Latn-743-2001-tl.yaml
274
- - maps/elot-ell-Grek-Latn-743-2001-ts.yaml
275
- - maps/ggg-kat-Geor-Latn-2002.yaml
276
- - maps/gki-bel-cyrl-latn-1992.yaml
277
- - maps/gki-bel-cyrl-latn-2000.yaml
278
- - maps/gost-rus-cyrl-latn-16876-71-1983.yaml
279
- - maps/hk-yue-Hani-Latn-1888.yaml
280
- - maps/icao-bel-Cyrl-Latn-9303.yaml
281
- - maps/icao-bul-Cyrl-Latn-9303.yaml
282
- - maps/icao-heb-Hebr-Latn-9303.yaml
283
- - maps/icao-mkd-Cyrl-Latn-9303.yaml
284
- - maps/icao-per-Arab-Latn-9303.yaml
285
- - maps/icao-rus-Cyrl-Latn-9303.yaml
286
- - maps/icao-srp-Cyrl-Latn-9303.yaml
287
- - maps/icao-ukr-Cyrl-Latn-9303.yaml
288
- - maps/iso-ell-Grek-Latn-843-1997-t1.yaml
289
- - maps/iso-ell-Grek-Latn-843-1997-t2.yaml
290
- - maps/iso-jpn-Hrkt-Latn-3602-1989.yaml
291
- - maps/iso-rus-Cyrl-Latn-9-1995.yaml
292
- - maps/iso-tha-Thai-Latn-11940-1998.yaml
293
- - maps/kp-kor-Hang-Latn-2002.yaml
294
- - maps/lshk-yue-Hani-Latn-jyutping-1993.yaml
295
- - maps/mext-jpn-Hrkt-Latn-1954.yaml
296
- - maps/moct-kor-Hang-Latn-2000.yaml
297
- - maps/mofa-jpn-Hrkt-Latn-1989.yaml
298
- - maps/mvd-bel-Cyrl-Latn-2008.yaml
299
- - maps/mvd-bel-Cyrl-Latn-2010.yaml
300
- - maps/mvd-rus-Cyrl-Latn-2008.yaml
301
- - maps/mvd-rus-Cyrl-Latn-2010.yaml
302
- - maps/nil-kor-Hang-Hang-jamo.yaml
303
- - maps/odni-aze-Cyrl-Latn-2015.yaml
304
- - maps/odni-bel-Cyrl-Latn-2015.yaml
305
- - maps/odni-bul-Cyrl-Latn-2015.yaml
306
- - maps/odni-kat-Geor-Latn-2015.yaml
307
- - maps/odni-kaz-Cyrl-Latn-2015.yaml
308
- - maps/odni-kir-Cyrl-Latn-2015.yaml
309
- - maps/odni-mkd-cyrl-latn-2015.yaml
310
- - maps/odni-rus-Cyrl-Latn-2015.yaml
311
- - maps/odni-srp-Cyrl-Latn-2015.yaml
312
- - maps/odni-tat-Cyrl-Latn-2015.yaml
313
- - maps/odni-tgk-Cyrl-Latn-2015.yaml
314
- - maps/odni-uig-Cyrl-Latn-2015.yaml
315
- - maps/odni-ukr-Cyrl-Latn-2015.yaml
316
- - maps/odni-uzb-Cyrl-Latn-2015.yaml
317
- - maps/royin-tha-Thai-Latn-1939-generic.yaml
318
- - maps/royin-tha-Thai-Latn-1968.yaml
319
- - maps/royin-tha-Thai-Latn-1999-chained.yaml
320
- - maps/royin-tha-Thai-Latn-1999.yaml
321
- - maps/sac-zho-Hans-Latn-1979.yaml
322
- - maps/ses-ara-arab-latn-1930.yaml
323
- - maps/stategeocadastre-ukr-Cyrl-Latn-1993.yaml
324
- - maps/ua-ukr-Cyrl-Latn-1996.yaml
325
- - maps/un-ara-Arab-Latn-1971.yaml
326
- - maps/un-ara-Arab-Latn-1972.yaml
327
- - maps/un-ara-Arab-Latn-2017.yaml
328
- - maps/un-bel-Cyrl-Latn-2007.yaml
329
- - maps/un-ben-Beng-Latn-2016.yaml
330
- - maps/un-ell-Grek-Latn-1987-tl.yaml
331
- - maps/un-ell-Grek-Latn-1987-ts.yaml
332
- - maps/un-ell-Grek-Latn-phonetic-1987.yaml
333
- - maps/un-mon-Mong-Latn-2013.yaml
334
- - maps/un-rus-Cyrl-Latn-1987.yaml
335
- - maps/un-ukr-cyrl-latn-1998.yaml
336
- - maps/var-jpn-Hrkt-Latn-hepburn-1886.yaml
337
- - maps/var-jpn-Hrkt-Latn-hepburn-1954.yaml
338
- - maps/var-kor-Hang-Latn-mr-1939.yaml
339
- - maps/var-kor-Kore-Hang-2013.yaml
340
- - maps/var-kor-Kore-Latn-mr-1939.yaml
341
- - maps/var-tha-Thai-Thai-phonemic.yaml
342
- - maps/var-tha-Thai-Zsym-ipa.yaml
343
- - maps/var-zho-Hani-Latn-1979.yaml
344
- - spec/interscript/mapping_spec.rb
345
- - spec/interscript_spec.rb
346
- - spec/spec_helper.rb
347
- homepage: ''
109
+ - requirements.txt
110
+ homepage: https://www.interscript.com
348
111
  licenses:
349
- - MIT
350
- metadata: {}
112
+ - BSD-2-Clause
113
+ metadata:
114
+ homepage_uri: https://www.interscript.com
115
+ source_code_uri: https://github.com/interscript/interscript
351
116
  post_install_message:
352
117
  rdoc_options: []
353
118
  require_paths:
@@ -356,18 +121,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
356
121
  requirements:
357
122
  - - ">="
358
123
  - !ruby/object:Gem::Version
359
- version: '0'
124
+ version: 2.3.0
360
125
  required_rubygems_version: !ruby/object:Gem::Requirement
361
126
  requirements:
362
- - - ">="
127
+ - - ">"
363
128
  - !ruby/object:Gem::Version
364
- version: 2.4.0
129
+ version: 1.3.1
365
130
  requirements: []
366
- rubygems_version: 3.0.3
131
+ rubygems_version: 3.1.6
367
132
  signing_key:
368
133
  specification_version: 4
369
134
  summary: Interoperable script conversion systems
370
- test_files:
371
- - spec/interscript/mapping_spec.rb
372
- - spec/interscript_spec.rb
373
- - spec/spec_helper.rb
135
+ test_files: []
data/README.adoc DELETED
@@ -1,298 +0,0 @@
1
- = Interscript: Interoperable Script Conversion Systems, with a Ruby implementation
2
-
3
- image:https://github.com/interscript/interscript/workflows/test/badge.svg["Ruby build status", link="https://github.com/interscript/interscript/actions?workflow=test"]
4
- image:https://github.com/interscript/interscript/workflows/js/badge.svg["JavaScript build status", link="https://github.com/interscript/interscript/actions?workflow=js"]
5
-
6
- == Introduction
7
-
8
- This repository contains interoperable transliteration schemes from:
9
-
10
- * ALA-LC
11
- * BGN/PCGN
12
- * ICAO
13
- * ISO
14
- * UN (by UNGEGN)
15
- * Many, many other script conversion system authorities.
16
-
17
- The goal is to achieve interoperable transliteration schemes allowing quality comparisons.
18
-
19
-
20
-
21
- == Demonstration
22
-
23
- These transliteration systems are used in the demo:
24
-
25
- `bgnpcgn-rus-Cyrl-Latn-1947`:: BGN/PCGN Romanization of Russian
26
- `iso-rus-Cyrl-Latn-9-1995`:: ISO 9 Romanization of Russian
27
- `icao-rus-Cyrl-Latn-9303`:: ICAO MRZ Romanization of Russian
28
- `bas-rus-Cyrl-Latn-2017-bss`:: Bulgaria Academy of Science Streamlined System for Russian
29
-
30
- image:demo/20191118-interscript-demo-cast.gif["interscript screencast"]
31
-
32
-
33
- == Installation
34
-
35
- === Prerequisites
36
-
37
- Linux:
38
-
39
- [source,sh]
40
- ----
41
- apt-get install swig python3-setuptools
42
- ----
43
-
44
- Windows:
45
-
46
- [source,sh]
47
- ----
48
- choco install --no-progress swig
49
- ----
50
-
51
- Interscript depends on Python and the https://github.com/sequitur-g2p/sequitur-g2p[`sequitur-g2p`] module
52
-
53
- [source,sh]
54
- ----
55
- pip3 install setuptools numpy
56
- curl -sSL -o sequitur-g2p.zip https://github.com/sequitur-g2p/sequitur-g2p/archive/806273f.zip
57
- pip3 install sequitur-g2p.zip
58
- ----
59
-
60
- Interscript depends on Ruby. Once you manage to install Ruby, it's easy.
61
-
62
- [source,sh]
63
- ----
64
- gem install interscript
65
- ----
66
-
67
- == Usage
68
-
69
- Assume you have a file ready in the source script like this:
70
-
71
- [source,sh]
72
- ----
73
- cat <<EOT > rus-Cyrl.txt
74
- Эх, тройка! птица тройка, кто тебя выдумал? знать, у бойкого народа ты
75
- могла только родиться, в той земле, что не любит шутить, а
76
- ровнем-гладнем разметнулась на полсвета, да и ступай считать версты,
77
- пока не зарябит тебе в очи. И не хитрый, кажись, дорожный снаряд, не
78
- железным схвачен винтом, а наскоро живьём с одним топором да долотом
79
- снарядил и собрал тебя ярославский расторопный мужик. Не в немецких
80
- ботфортах ямщик: борода да рукавицы, и сидит чёрт знает на чём; а
81
- привстал, да замахнулся, да затянул песню — кони вихрем, спицы в
82
- колесах смешались в один гладкий круг, только дрогнула дорога, да
83
- вскрикнул в испуге остановившийся пешеход — и вон она понеслась,
84
- понеслась, понеслась!
85
-
86
- Н.В. Гоголь
87
- EOT
88
- ----
89
-
90
- You can run `interscript` on this text using different transliteration systems.
91
-
92
- [source,sh]
93
- ----
94
- interscript rus-Cyrl.txt \
95
- --system=bgnpcgn-rus-Cyrl-Latn-1947 \
96
- --output=bgnpcgn-rus-Latn.txt
97
-
98
- interscript rus-Cyrl.txt \
99
- --system=iso-rus-Cyrl-Latn-9-1995 \
100
- --output=iso-rus-Latn.txt
101
-
102
- interscript rus-Cyrl.txt \
103
- --system=icao-rus-Cyrl-Latn-9303 \
104
- --output=icao-rus-Latn.txt
105
-
106
- interscript rus-Cyrl.txt \
107
- --system=bas-rus-Cyrl-Latn-2017-bss \
108
- --output=bas-rus-Latn.txt
109
- ----
110
-
111
- It is then easy to see the exact differences in rendering between the systems.
112
-
113
- [source,sh]
114
- ----
115
- diff bgnpcgn-rus-Latn.txt bas-rus-Latn.txt
116
- ----
117
-
118
- == Adding transliteration system
119
-
120
- Transliteration systems stored in a `maps/` directory as YAML files.
121
- You can create a new file and add it to the directory.
122
-
123
- The file should be named as `<system-code>.yaml`, where `system-code`
124
- is in accordance with
125
- http://calconnect.gitlab.io/tc-localization/csd-transcription-systems[ISO/CC 24229].
126
-
127
- === File structure
128
-
129
- [source,yaml]
130
- ----
131
- authority_id: bgnpcgn
132
- id: 1947
133
- language: rus
134
- source_script: Cyrl
135
- destination_script: Latn
136
- name: ROMANIZATION OF RUSSIAN, BGN/PCGN 1947 System
137
- url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/807920/ROMANIZATION_OF_RUSSIAN.pdf
138
- creation_date: 1947
139
- confirmation_date: 2019-06
140
- description: The BGN/PCGN system for Russian was adopted ...
141
-
142
- notes:
143
- - The character e should be romanized ye initially, after the vowel ...
144
-
145
- tests:
146
- - source: ДЛИННОЕ ПОКРЫВАЛО
147
- expected: DLINNOYE POKRYVALO
148
- - source: Еловая шишка
149
- expected: Yelovaya shishka
150
-
151
- map:
152
- rules:
153
- - pattern: (?<=[АаЕеЁёИиОоУуЫыЭэЮюЯяЙйЪъЬь])\u0415 # Е after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь
154
- result: Ye
155
- - pattern: \b\u0415 # Е initially
156
- result: Ye
157
-
158
- characters:
159
- "\u0410": "A"
160
- "\u0411": "B"
161
- "\u0412": "V"
162
- ----
163
-
164
-
165
- === Rules
166
-
167
- The subsection `rules` is placed under the `map` key. All rules are applied in order they are placed before the subsection `characters` applying. Rules apply to an original text, not to a result of previous rules applying.
168
-
169
- Each rule has `pattern` and `result` elements.
170
-
171
- Pattern is a regex expression. It should be representing as a string without `//` or `%r{}` parentheses. For example `\b\u0415`. In case a rule is depend on previous or next content, lookahead or lookbehind could be used. For example a rule with the pattern `(?<=[АаЕеЁёИиОоУуЫыЭэЮюЯяЙйЪъЬь])\u0415` find every Е after upper or lower case symbols a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь.
172
-
173
- Result is a replacement a for pattern's match. It can contain a string, an Unicode characters specified by a hexadecimal number, a captured group reference. String with hexadecimal number or captured group reference should be double quoted. For example `"Y\u00eb"` or `"\\1\u00b7\\2"`. Captured group are referred by double backslash and group's number.
174
-
175
- Because rules are applied in order, multiple rules applicable to the same segment of a string can be addressed by rule ordering, and rules can be used as priority over characters. For example:
176
-
177
- [source,yaml]
178
- ----
179
- map:
180
- rules:
181
- - pattern: \u03B3\u03B3 # γ (before Γ, Ξ, Χ)
182
- result: ng
183
- - pattern: (?<![Γγ])\u03B3(?=[ΕεέΗηήΙιίΥυύ]) # γ (before front vowels)
184
- result: y
185
- ----
186
-
187
- (γι maps to `yi`; but γγ maps to `ng`. In the case of γγι, the first rule takes priority, and the transliteration is `ngi`: it makes the second rule impossible.)
-
- [source,yaml]
- ----
- map:
-   rules:
-     - pattern: (?<=\b)\u03BC[πΠ] # μπ (initially)
-       result: b
-     - pattern: \u03BC[πΠ] # μπ (medially)
-       result: mb
- ----
-
- (The first rule applies at the start of a word; the second rule does not specify a context, as it applies in all other cases not covered by the first rule.)
-
- [source,yaml]
- ----
- map:
-   rules:
-     - pattern: ";"
-       result: "?"
-
-   characters:
-     "\u00B7": ";"
- ----
-
- (This guarantees that every `;` is converted to `?` before any new `;` is introduced. Because `;`, `?` and `·` all occur in Latin script as well, they could otherwise be mixed up in ordering.)
-
- Normally rules "`bleed`" each other: once a rule applies to a segment, that segment cannot trigger other rules, because it has already been converted to Roman script. Occasionally, a rule needs to add or remove characters in the original script rather than transliterate them, so that the same context can be matched by two rules in succession:
-
- [source,yaml]
- ----
- map:
-   rules:
-     - pattern: (?<=[АаЕеЁёИиОоУуЫыЭэЮюЯя])\u042b # Ы after any vowel character
-       result: "\u00b7Ы"
-     - pattern: \u042b(?=[АаУуЫыЭэ]) # Ы before а, у, ы, or э
-       result: "Ы\u00b7"
- ----
-
- (If the result were `\u00B7Y`, the second rule could not be applied afterwards; but we want ОЫУ to transliterate as `O·Y·U`. To make that happen, we preserve the Ы during the rules phase, resulting in О·Ы·У, and only convert the letters to Roman script in the `characters` phase.)
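The two-phase behaviour described above can be sketched in Ruby as follows. This is a simplified model under stated assumptions, not Interscript's actual implementation: `transliterate` is a hypothetical helper that applies each rule in order with `gsub` and then maps the remaining characters one by one, and the `CHARACTERS` table holds only the three letters needed for the example.

```ruby
# Hypothetical helper, not Interscript's API: rules phase first, then characters phase.
def transliterate(text, rules, characters)
  # Rules phase: each rule sees the output of the rules before it.
  after_rules = rules.reduce(text) do |str, rule|
    str.gsub(Regexp.new(rule[:pattern]), rule[:result])
  end
  # Characters phase: map each character, leaving unknown ones untouched.
  after_rules.each_char.map { |ch| characters.fetch(ch, ch) }.join
end

# The two rules from the example above, keeping Ы in the original script.
PRESERVING_RULES = [
  { pattern: '(?<=[АаЕеЁёИиОоУуЫыЭэЮюЯя])\u042b', result: "\u00b7Ы" },
  { pattern: '\u042b(?=[АаУуЫыЭэ])',               result: "Ы\u00b7" },
].freeze

# Variant whose first rule already emits the Roman letter Y.
EAGER_RULES = [
  { pattern: '(?<=[АаЕеЁёИиОоУуЫыЭэЮюЯя])\u042b', result: "\u00b7Y" },
  { pattern: '\u042b(?=[АаУуЫыЭэ])',               result: "Ы\u00b7" },
].freeze

# Assumed minimal character table for this example only.
CHARACTERS = { "О" => "O", "Ы" => "Y", "У" => "U" }.freeze

puts transliterate("ОЫУ", PRESERVING_RULES, CHARACTERS) # O·Y·U
puts transliterate("ОЫУ", EAGER_RULES, CHARACTERS)      # O·YU
```

With the eager variant, the Ы is gone by the time the second rule runs, so the second middle dot is never inserted.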
227
-
228
- === Testing transliteration systems
229
-
230
- To test all transliteration systems in the `maps/` directory, run:
231
-
232
- [source,sh]
233
- ----
234
- bundle exec rspec
235
- ----
236
-
237
- The command takes `source` texts from the `test` section, transforms
238
- them using `rules` and `charmaps` from the `map` key, and compares the
239
- results with `expected:` text from the `source:` section.
240
-
241
- To test a specific transliteration system, set the environment variable
242
- `TRANSLIT_SYSTEM` to the system code of the desired system
243
- (i.e. the "`basename`" of the system's YAML file):
244
-
245
- [source,sh]
246
- ----
247
- TRANSLIT_SYSTEM=bgnpcgn-rus-Cyrl-Latn-1947 bundle exec rspec
248
- ----
249
-
250
-
251
- == ISCS system codes
-
- In accordance with
- http://calconnect.gitlab.io/tc-localization/csd-transcription-systems[ISO/CC 24229],
- the system code identifying a script conversion system has the following components,
- illustrated here with `bgnpcgn-rus-Cyrl-Latn-1947`:
-
- `bgnpcgn`:: the authority identifier
- `rus`:: an ISO 639-{1,2,3,5} language code that this system applies to (for ISO 639-2, use the (T) code)
- `Cyrl`:: an ISO 15924 script code, identifying the source script
- `Latn`:: an ISO 15924 script code, identifying the target script
- `1947`:: an identifier within the authority that identifies this system
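The component list above can be sketched in Ruby as a small parser. `SystemCode` and `parse_system_code` are hypothetical names introduced for illustration and are not part of Interscript; the sketch also assumes that components are separated by hyphens and that only the final identifier may itself contain a hyphen.

```ruby
# Hypothetical helper, not part of Interscript: names the five components of a system code.
SystemCode = Struct.new(:authority, :language, :source_script, :target_script, :id)

def parse_system_code(code)
  # Split into at most five parts so the final identifier keeps any further hyphens (assumption).
  parts = code.split("-", 5)
  raise ArgumentError, "expected 5 components in #{code.inspect}" unless parts.length == 5

  SystemCode.new(*parts)
end

code = parse_system_code("bgnpcgn-rus-Cyrl-Latn-1947")
puts code.authority     # bgnpcgn
puts code.language      # rus
puts code.source_script # Cyrl
puts code.target_script # Latn
puts code.id            # 1947
```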
-
-
- == Covered languages
-
- Currently the schemes cover Cyrillic, Armenian, Greek, Arabic and Hebrew.
-
-
- == Samples to play with
-
- * `rus-Cyrl-1.txt`: Copied from the XLS output from http://www.primorsk.vybory.izbirkom.ru/region/primorsk?action=show&global=true&root=254017025&tvd=4254017212287&vrn=100100067795849&prver=0&pronetvd=0&region=25&sub_region=25&type=242&vibid=4254017212287
-
- * `rus-Cyrl-2.txt`: Copied from the XLS output from http://www.yaroslavl.vybory.izbirkom.ru/region/yaroslavl?action=show&root=764013001&tvd=4764013188704&vrn=4764013188693&prver=0&pronetvd=0&region=76&sub_region=76&type=426&vibid=4764013188704
-
-
- == References
-
- Reference documents are located at the
- https://github.com/interscript/interscript-references[interscript-references repository].
- Some specifications that have distribution limitations may not be reproduced there.
-
-
- == Links to system definitions
-
- * https://www.iso.org/committee/48750.html[ISO/TC 46 (see standards published by WG 3)]
- * http://geonames.nga.mil/gns/html/romanization.html[BGN/PCGN and BGN Romanization systems (BGN)]
- * https://www.gov.uk/government/publications/romanization-systems[BGN/PCGN Romanization systems (PCGN)]
- * https://www.loc.gov/catdir/cpso/roman.html[ALA-LC Romanization systems in current use]
- * http://catdir.loc.gov/catdir/cpso/roman.html[ALA-LC Romanization systems from 1997]
- * http://www.eki.ee/wgrs/[UN Romanization systems]
- * http://www.eki.ee/knab/kblatyl2.htm[EKI KNAB systems]
-
- == Copyright and license
-
- This is a Ribose project. Copyright Ribose.
-