interscript 0.1.4 → 2.0.5

Files changed (183)
  1. checksums.yaml +4 -4
  2. data/.gitignore +11 -0
  3. data/.rspec +3 -0
  4. data/Gemfile +29 -0
  5. data/LICENSE.adoc +31 -0
  6. data/README.md +3 -0
  7. data/Rakefile +53 -0
  8. data/bin/console +14 -0
  9. data/bin/interscript +3 -39
  10. data/bin/maps_analyze_staging +168 -0
  11. data/bin/maps_debug_compilers +58 -0
  12. data/bin/maps_debug_ordering +88 -0
  13. data/bin/maps_debug_ruby_compile +24 -0
  14. data/bin/maps_debug_step_by_step +44 -0
  15. data/bin/maps_optimize_order +112 -0
  16. data/bin/maps_v1_analyze_regexps +45 -0
  17. data/bin/maps_v1_to_v2 +426 -0
  18. data/exe/interscript +6 -0
  19. data/interscript.gemspec +31 -0
  20. data/lib/interscript.rb +76 -128
  21. data/lib/interscript/command.rb +6 -5
  22. data/lib/interscript/compiler.rb +22 -0
  23. data/lib/interscript/compiler/javascript.rb +292 -0
  24. data/lib/interscript/compiler/ruby.rb +262 -0
  25. data/lib/interscript/dsl.rb +67 -0
  26. data/lib/interscript/dsl/aliases.rb +23 -0
  27. data/lib/interscript/dsl/document.rb +46 -0
  28. data/lib/interscript/dsl/group.rb +45 -0
  29. data/lib/interscript/dsl/group/parallel.rb +6 -0
  30. data/lib/interscript/dsl/items.rb +89 -0
  31. data/lib/interscript/dsl/metadata.rb +26 -0
  32. data/lib/interscript/dsl/stage.rb +6 -0
  33. data/lib/interscript/dsl/symbol_mm.rb +11 -0
  34. data/lib/interscript/dsl/tests.rb +12 -0
  35. data/lib/interscript/interpreter.rb +251 -0
  36. data/lib/interscript/node.rb +25 -0
  37. data/lib/interscript/node/alias_def.rb +15 -0
  38. data/lib/interscript/node/dependency.rb +13 -0
  39. data/lib/interscript/node/document.rb +45 -0
  40. data/lib/interscript/node/group.rb +34 -0
  41. data/lib/interscript/node/group/parallel.rb +9 -0
  42. data/lib/interscript/node/group/sequential.rb +2 -0
  43. data/lib/interscript/node/item.rb +52 -0
  44. data/lib/interscript/node/item/alias.rb +42 -0
  45. data/lib/interscript/node/item/any.rb +61 -0
  46. data/lib/interscript/node/item/capture.rb +50 -0
  47. data/lib/interscript/node/item/group.rb +51 -0
  48. data/lib/interscript/node/item/repeat.rb +40 -0
  49. data/lib/interscript/node/item/stage.rb +23 -0
  50. data/lib/interscript/node/item/string.rb +51 -0
  51. data/lib/interscript/node/metadata.rb +18 -0
  52. data/lib/interscript/node/rule.rb +6 -0
  53. data/lib/interscript/node/rule/funcall.rb +18 -0
  54. data/lib/interscript/node/rule/run.rb +15 -0
  55. data/lib/interscript/node/rule/sub.rb +65 -0
  56. data/lib/interscript/node/stage.rb +19 -0
  57. data/lib/interscript/node/tests.rb +15 -0
  58. data/lib/interscript/stdlib.rb +211 -0
  59. data/lib/interscript/utils/regexp_converter.rb +283 -0
  60. data/lib/interscript/version.rb +1 -1
  61. data/requirements.txt +1 -0
  62. metadata +73 -223
  63. data/README.adoc +0 -297
  64. data/bin/rspec +0 -29
  65. data/lib/g2pwrapper.py +0 -34
  66. data/lib/interscript/mapping.rb +0 -125
  67. data/lib/model-7 +0 -0
  68. data/lib/tha-pt-b-7 +0 -0
  69. data/maps/acadsin-zho-Hani-Latn-2002.yaml +0 -38912
  70. data/maps/alalc-aze-Cyrl-Latn-1997.yaml +0 -141
  71. data/maps/alalc-bel-cyrl-latn-1997.yaml +0 -125
  72. data/maps/alalc-ben-Beng-Latn-2017.yaml +0 -130
  73. data/maps/alalc-bul-Cyrl-Latn-1997.yaml +0 -94
  74. data/maps/alalc-ell-Grek-Latn-1997.yaml +0 -625
  75. data/maps/alalc-ell-Grek-Latn-2010.yaml +0 -628
  76. data/maps/alalc-kat-Geok-Latn-1997.yaml +0 -112
  77. data/maps/alalc-kat-Geor-Latn-1997.yaml +0 -146
  78. data/maps/alalc-kor-Hang-Latn-1997.yaml +0 -94
  79. data/maps/alalc-mkd-Cyrl-Latn-2013.yaml +0 -103
  80. data/maps/alalc-mkd-cyrl-latn-1997.yaml +0 -114
  81. data/maps/alalc-rus-Cyrl-Latn-1997.yaml +0 -222
  82. data/maps/alalc-rus-Cyrl-Latn-2012.yaml +0 -162
  83. data/maps/alalc-srp-Cyrl-Latn-1997.yaml +0 -114
  84. data/maps/alalc-srp-cyrl-latn-2013.yaml +0 -135
  85. data/maps/alalc-ukr-Cyrl-Latn-1997.yaml +0 -141
  86. data/maps/alalc-ukr-Cyrl-Latn-2011.yaml +0 -16
  87. data/maps/apcbg-bul-Cyrl-Latn-1995.yaml +0 -283
  88. data/maps/bas-rus-Cyrl-Latn-2017-bss.yaml +0 -175
  89. data/maps/bas-rus-Cyrl-Latn-2017-oss.yaml +0 -169
  90. data/maps/bgn-jpn-Hrkt-Latn-1962.yaml +0 -294
  91. data/maps/bgn-kor-Hang-Latn-1943.yaml +0 -31
  92. data/maps/bgn-kor-Kore-Latn-1943.yaml +0 -31
  93. data/maps/bgna-bul-Cyrl-Latn-2006.yaml +0 -208
  94. data/maps/bgna-bul-Cyrl-Latn-2009.yaml +0 -208
  95. data/maps/bgnpcgn-arm-Armn-Latn-1981.yaml +0 -108
  96. data/maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml +0 -104
  97. data/maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml +0 -184
  98. data/maps/bgnpcgn-bel-cyrl-latn-1979.yaml +0 -285
  99. data/maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml +0 -115
  100. data/maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml +0 -38
  101. data/maps/bgnpcgn-chn-Hans-Latn-1979.yaml +0 -7456
  102. data/maps/bgnpcgn-ell-Grek-Latn-1962.yaml +0 -702
  103. data/maps/bgnpcgn-ell-Grek-Latn-1996.yaml +0 -20
  104. data/maps/bgnpcgn-jpn-Hrkt-Latn-1976.yaml +0 -257
  105. data/maps/bgnpcgn-kat-Geor-Latn-1981.yaml +0 -127
  106. data/maps/bgnpcgn-kat-Geor-Latn-2009.yaml +0 -43
  107. data/maps/bgnpcgn-kor-Hang-Latn-kn-1945.yaml +0 -253
  108. data/maps/bgnpcgn-kor-Hang-Latn-rok-2011.yaml +0 -48
  109. data/maps/bgnpcgn-kor-Kore-Latn-rok-2011.yaml +0 -48
  110. data/maps/bgnpcgn-mkd-Cyrl-Latn-1981.yaml +0 -159
  111. data/maps/bgnpcgn-mkd-Cyrl-Latn-2013.yaml +0 -190
  112. data/maps/bgnpcgn-per-Arab-Latn-1956.yaml +0 -93
  113. data/maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml +0 -314
  114. data/maps/bgnpcgn-srp-Cyrl-Latn-2005.yaml +0 -166
  115. data/maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml +0 -163
  116. data/maps/bgnpcgn-ukr-Cyrl-Latn-2019.yaml +0 -208
  117. data/maps/by-bel-Cyrl-Latn-1998.yaml +0 -168
  118. data/maps/by-bel-Cyrl-Latn-2007.yaml +0 -115
  119. data/maps/elot-ell-Grek-Latn-743-1982-tl.yaml +0 -685
  120. data/maps/elot-ell-Grek-Latn-743-1982-ts.yaml +0 -681
  121. data/maps/elot-ell-Grek-Latn-743-2001-tl.yaml +0 -20
  122. data/maps/elot-ell-Grek-Latn-743-2001-ts.yaml +0 -32
  123. data/maps/ggg-kat-Geor-Latn-2002.yaml +0 -89
  124. data/maps/gki-bel-cyrl-latn-1992.yaml +0 -33
  125. data/maps/gki-bel-cyrl-latn-2000.yaml +0 -201
  126. data/maps/gost-rus-cyrl-latn-16876-71-1983.yaml +0 -186
  127. data/maps/hk-yue-Hani-Latn-1888.yaml +0 -38497
  128. data/maps/icao-bel-Cyrl-Latn-9303.yaml +0 -141
  129. data/maps/icao-bul-Cyrl-Latn-9303.yaml +0 -122
  130. data/maps/icao-heb-Hebr-Latn-9303.yaml +0 -151
  131. data/maps/icao-mkd-Cyrl-Latn-9303.yaml +0 -117
  132. data/maps/icao-per-Arab-Latn-9303.yaml +0 -104
  133. data/maps/icao-rus-Cyrl-Latn-9303.yaml +0 -118
  134. data/maps/icao-srp-Cyrl-Latn-9303.yaml +0 -117
  135. data/maps/icao-ukr-Cyrl-Latn-9303.yaml +0 -120
  136. data/maps/iso-ell-Grek-Latn-843-1997-t1.yaml +0 -610
  137. data/maps/iso-ell-Grek-Latn-843-1997-t2.yaml +0 -41
  138. data/maps/iso-jpn-Hrkt-Latn-3602-1989.yaml +0 -62
  139. data/maps/iso-rus-Cyrl-Latn-9-1995.yaml +0 -272
  140. data/maps/iso-tha-Thai-Latn-11940-1998.yaml +0 -109
  141. data/maps/kp-kor-Hang-Latn-2002.yaml +0 -901
  142. data/maps/lshk-yue-Hani-Latn-jyutping-1993.yaml +0 -44820
  143. data/maps/mext-jpn-Hrkt-Latn-1954.yaml +0 -411
  144. data/maps/moct-kor-Hang-Latn-2000.yaml +0 -803
  145. data/maps/mofa-jpn-Hrkt-Latn-1989.yaml +0 -541
  146. data/maps/mvd-bel-Cyrl-Latn-2008.yaml +0 -225
  147. data/maps/mvd-bel-Cyrl-Latn-2010.yaml +0 -63
  148. data/maps/mvd-rus-Cyrl-Latn-2008.yaml +0 -110
  149. data/maps/mvd-rus-Cyrl-Latn-2010.yaml +0 -37
  150. data/maps/nil-kor-Hang-Hang-jamo.yaml +0 -11193
  151. data/maps/odni-bel-Cyrl-Latn-2015.yaml +0 -148
  152. data/maps/odni-bul-Cyrl-Latn-2015.yaml +0 -96
  153. data/maps/odni-kat-Geor-Latn-2015.yaml +0 -88
  154. data/maps/odni-rus-Cyrl-Latn-2015.yaml +0 -77
  155. data/maps/odni-srp-Cyrl-Latn-2015.yaml +0 -129
  156. data/maps/odni-ukr-Cyrl-Latn-2015.yaml +0 -157
  157. data/maps/odni-uzb-Cyrl-Latn-2015.yaml +0 -167
  158. data/maps/royin-tha-Thai-Latn-1939-generic.yaml +0 -90
  159. data/maps/royin-tha-Thai-Latn-1968.yaml +0 -179
  160. data/maps/royin-tha-Thai-Latn-1999-chained.yaml +0 -180
  161. data/maps/royin-tha-Thai-Latn-1999.yaml +0 -76
  162. data/maps/sac-zho-Hans-Latn-1979.yaml +0 -24759
  163. data/maps/stategeocadastre-ukr-Cyrl-Latn-1993.yaml +0 -222
  164. data/maps/ua-ukr-Cyrl-Latn-1996.yaml +0 -193
  165. data/maps/un-bel-Cyrl-Latn-2007.yaml +0 -114
  166. data/maps/un-ben-Beng-Latn-2016.yaml +0 -534
  167. data/maps/un-ell-Grek-Latn-1987-tl.yaml +0 -32
  168. data/maps/un-ell-Grek-Latn-1987-ts.yaml +0 -20
  169. data/maps/un-ell-Grek-Latn-phonetic-1987.yaml +0 -780
  170. data/maps/un-mon-Mong-Latn-2013.yaml +0 -93
  171. data/maps/un-rus-Cyrl-Latn-1987.yaml +0 -166
  172. data/maps/un-ukr-cyrl-latn-1998.yaml +0 -30
  173. data/maps/var-jpn-Hrkt-Latn-hepburn-1886.yaml +0 -406
  174. data/maps/var-jpn-Hrkt-Latn-hepburn-1954.yaml +0 -386
  175. data/maps/var-kor-Hang-Latn-mr-1939.yaml +0 -1054
  176. data/maps/var-kor-Kore-Hang-2013.yaml +0 -59754
  177. data/maps/var-kor-Kore-Latn-mr-1939.yaml +0 -37
  178. data/maps/var-tha-Thai-Thai-phonemic.yaml +0 -59
  179. data/maps/var-tha-Thai-Zsym-ipa.yaml +0 -301
  180. data/maps/var-zho-Hani-Latn-1979.yaml +0 -38908
  181. data/spec/interscript/mapping_spec.rb +0 -42
  182. data/spec/interscript_spec.rb +0 -26
  183. data/spec/spec_helper.rb +0 -3
data/lib/interscript/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Interscript
- VERSION = "0.1.4"
+ VERSION = "2.0.5"
  end
data/requirements.txt ADDED
@@ -0,0 +1 @@
+ torch
metadata CHANGED
@@ -1,12 +1,12 @@
  --- !ruby/object:Gem::Specification
  name: interscript
  version: !ruby/object:Gem::Version
- version: 0.1.4
+ version: 2.0.5
  platform: ruby
  authors:
- - project_contibutors
+ - Ribose Inc.
  autorequire:
- bindir: bin
+ bindir: exe
  cert_chain: []
  date: 2019-11-17 00:00:00.000000000 Z
  dependencies:
@@ -25,97 +25,13 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
- name: debase
+ name: interscript-maps
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: pry
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: pycall
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: rambling-trie
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: rake
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: rspec
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
- prerelease: false
- version_requirements: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- - !ruby/object:Gem::Dependency
- name: ruby-debug-ide
- requirement: !ruby/object:Gem::Requirement
- requirements:
- - - ">="
- - !ruby/object:Gem::Version
- version: '0'
- type: :development
+ type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
@@ -123,144 +39,80 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0'
  description: Interoperable script conversion systems
- email:
+ email:
+ - open.source@ribose.com
  executables:
  - interscript
- - rspec
- - setup
  extensions: []
  extra_rdoc_files: []
  files:
- - README.adoc
+ - ".gitignore"
+ - ".rspec"
+ - Gemfile
+ - LICENSE.adoc
+ - README.md
+ - Rakefile
+ - bin/console
  - bin/interscript
- - bin/rspec
+ - bin/maps_analyze_staging
+ - bin/maps_debug_compilers
+ - bin/maps_debug_ordering
+ - bin/maps_debug_ruby_compile
+ - bin/maps_debug_step_by_step
+ - bin/maps_optimize_order
+ - bin/maps_v1_analyze_regexps
+ - bin/maps_v1_to_v2
  - bin/setup
- - lib/g2pwrapper.py
+ - exe/interscript
+ - interscript.gemspec
  - lib/interscript.rb
  - lib/interscript/command.rb
- - lib/interscript/mapping.rb
+ - lib/interscript/compiler.rb
+ - lib/interscript/compiler/javascript.rb
+ - lib/interscript/compiler/ruby.rb
+ - lib/interscript/dsl.rb
+ - lib/interscript/dsl/aliases.rb
+ - lib/interscript/dsl/document.rb
+ - lib/interscript/dsl/group.rb
+ - lib/interscript/dsl/group/parallel.rb
+ - lib/interscript/dsl/items.rb
+ - lib/interscript/dsl/metadata.rb
+ - lib/interscript/dsl/stage.rb
+ - lib/interscript/dsl/symbol_mm.rb
+ - lib/interscript/dsl/tests.rb
+ - lib/interscript/interpreter.rb
+ - lib/interscript/node.rb
+ - lib/interscript/node/alias_def.rb
+ - lib/interscript/node/dependency.rb
+ - lib/interscript/node/document.rb
+ - lib/interscript/node/group.rb
+ - lib/interscript/node/group/parallel.rb
+ - lib/interscript/node/group/sequential.rb
+ - lib/interscript/node/item.rb
+ - lib/interscript/node/item/alias.rb
+ - lib/interscript/node/item/any.rb
+ - lib/interscript/node/item/capture.rb
+ - lib/interscript/node/item/group.rb
+ - lib/interscript/node/item/repeat.rb
+ - lib/interscript/node/item/stage.rb
+ - lib/interscript/node/item/string.rb
+ - lib/interscript/node/metadata.rb
+ - lib/interscript/node/rule.rb
+ - lib/interscript/node/rule/funcall.rb
+ - lib/interscript/node/rule/run.rb
+ - lib/interscript/node/rule/sub.rb
+ - lib/interscript/node/stage.rb
+ - lib/interscript/node/tests.rb
+ - lib/interscript/stdlib.rb
+ - lib/interscript/utils/regexp_converter.rb
  - lib/interscript/version.rb
- - lib/model-7
- - lib/tha-pt-b-7
- - maps/acadsin-zho-Hani-Latn-2002.yaml
- - maps/alalc-aze-Cyrl-Latn-1997.yaml
- - maps/alalc-bel-cyrl-latn-1997.yaml
- - maps/alalc-ben-Beng-Latn-2017.yaml
- - maps/alalc-bul-Cyrl-Latn-1997.yaml
- - maps/alalc-ell-Grek-Latn-1997.yaml
- - maps/alalc-ell-Grek-Latn-2010.yaml
- - maps/alalc-kat-Geok-Latn-1997.yaml
- - maps/alalc-kat-Geor-Latn-1997.yaml
- - maps/alalc-kor-Hang-Latn-1997.yaml
- - maps/alalc-mkd-Cyrl-Latn-2013.yaml
- - maps/alalc-mkd-cyrl-latn-1997.yaml
- - maps/alalc-rus-Cyrl-Latn-1997.yaml
- - maps/alalc-rus-Cyrl-Latn-2012.yaml
- - maps/alalc-srp-Cyrl-Latn-1997.yaml
- - maps/alalc-srp-cyrl-latn-2013.yaml
- - maps/alalc-ukr-Cyrl-Latn-1997.yaml
- - maps/alalc-ukr-Cyrl-Latn-2011.yaml
- - maps/apcbg-bul-Cyrl-Latn-1995.yaml
- - maps/bas-rus-Cyrl-Latn-2017-bss.yaml
- - maps/bas-rus-Cyrl-Latn-2017-oss.yaml
- - maps/bgn-jpn-Hrkt-Latn-1962.yaml
- - maps/bgn-kor-Hang-Latn-1943.yaml
- - maps/bgn-kor-Kore-Latn-1943.yaml
- - maps/bgna-bul-Cyrl-Latn-2006.yaml
- - maps/bgna-bul-Cyrl-Latn-2009.yaml
- - maps/bgnpcgn-arm-Armn-Latn-1981.yaml
- - maps/bgnpcgn-aze-Cyrl-Latn-1993.yaml
- - maps/bgnpcgn-bak-Cyrl-Latn-2007.yaml
- - maps/bgnpcgn-bel-cyrl-latn-1979.yaml
- - maps/bgnpcgn-bul-Cyrl-Latn-1952.yaml
- - maps/bgnpcgn-bul-Cyrl-Latn-2013.yaml
- - maps/bgnpcgn-chn-Hans-Latn-1979.yaml
- - maps/bgnpcgn-ell-Grek-Latn-1962.yaml
- - maps/bgnpcgn-ell-Grek-Latn-1996.yaml
- - maps/bgnpcgn-jpn-Hrkt-Latn-1976.yaml
- - maps/bgnpcgn-kat-Geor-Latn-1981.yaml
- - maps/bgnpcgn-kat-Geor-Latn-2009.yaml
- - maps/bgnpcgn-kor-Hang-Latn-kn-1945.yaml
- - maps/bgnpcgn-kor-Hang-Latn-rok-2011.yaml
- - maps/bgnpcgn-kor-Kore-Latn-rok-2011.yaml
- - maps/bgnpcgn-mkd-Cyrl-Latn-1981.yaml
- - maps/bgnpcgn-mkd-Cyrl-Latn-2013.yaml
- - maps/bgnpcgn-per-Arab-Latn-1956.yaml
- - maps/bgnpcgn-rus-Cyrl-Latn-1947.yaml
- - maps/bgnpcgn-srp-Cyrl-Latn-2005.yaml
- - maps/bgnpcgn-ukr-Cyrl-Latn-1965.yaml
- - maps/bgnpcgn-ukr-Cyrl-Latn-2019.yaml
- - maps/by-bel-Cyrl-Latn-1998.yaml
- - maps/by-bel-Cyrl-Latn-2007.yaml
- - maps/elot-ell-Grek-Latn-743-1982-tl.yaml
- - maps/elot-ell-Grek-Latn-743-1982-ts.yaml
- - maps/elot-ell-Grek-Latn-743-2001-tl.yaml
- - maps/elot-ell-Grek-Latn-743-2001-ts.yaml
- - maps/ggg-kat-Geor-Latn-2002.yaml
- - maps/gki-bel-cyrl-latn-1992.yaml
- - maps/gki-bel-cyrl-latn-2000.yaml
- - maps/gost-rus-cyrl-latn-16876-71-1983.yaml
- - maps/hk-yue-Hani-Latn-1888.yaml
- - maps/icao-bel-Cyrl-Latn-9303.yaml
- - maps/icao-bul-Cyrl-Latn-9303.yaml
- - maps/icao-heb-Hebr-Latn-9303.yaml
- - maps/icao-mkd-Cyrl-Latn-9303.yaml
- - maps/icao-per-Arab-Latn-9303.yaml
- - maps/icao-rus-Cyrl-Latn-9303.yaml
- - maps/icao-srp-Cyrl-Latn-9303.yaml
- - maps/icao-ukr-Cyrl-Latn-9303.yaml
- - maps/iso-ell-Grek-Latn-843-1997-t1.yaml
- - maps/iso-ell-Grek-Latn-843-1997-t2.yaml
- - maps/iso-jpn-Hrkt-Latn-3602-1989.yaml
- - maps/iso-rus-Cyrl-Latn-9-1995.yaml
- - maps/iso-tha-Thai-Latn-11940-1998.yaml
- - maps/kp-kor-Hang-Latn-2002.yaml
- - maps/lshk-yue-Hani-Latn-jyutping-1993.yaml
- - maps/mext-jpn-Hrkt-Latn-1954.yaml
- - maps/moct-kor-Hang-Latn-2000.yaml
- - maps/mofa-jpn-Hrkt-Latn-1989.yaml
- - maps/mvd-bel-Cyrl-Latn-2008.yaml
- - maps/mvd-bel-Cyrl-Latn-2010.yaml
- - maps/mvd-rus-Cyrl-Latn-2008.yaml
- - maps/mvd-rus-Cyrl-Latn-2010.yaml
- - maps/nil-kor-Hang-Hang-jamo.yaml
- - maps/odni-bel-Cyrl-Latn-2015.yaml
- - maps/odni-bul-Cyrl-Latn-2015.yaml
- - maps/odni-kat-Geor-Latn-2015.yaml
- - maps/odni-rus-Cyrl-Latn-2015.yaml
- - maps/odni-srp-Cyrl-Latn-2015.yaml
- - maps/odni-ukr-Cyrl-Latn-2015.yaml
- - maps/odni-uzb-Cyrl-Latn-2015.yaml
- - maps/royin-tha-Thai-Latn-1939-generic.yaml
- - maps/royin-tha-Thai-Latn-1968.yaml
- - maps/royin-tha-Thai-Latn-1999-chained.yaml
- - maps/royin-tha-Thai-Latn-1999.yaml
- - maps/sac-zho-Hans-Latn-1979.yaml
- - maps/stategeocadastre-ukr-Cyrl-Latn-1993.yaml
- - maps/ua-ukr-Cyrl-Latn-1996.yaml
- - maps/un-bel-Cyrl-Latn-2007.yaml
- - maps/un-ben-Beng-Latn-2016.yaml
- - maps/un-ell-Grek-Latn-1987-tl.yaml
- - maps/un-ell-Grek-Latn-1987-ts.yaml
- - maps/un-ell-Grek-Latn-phonetic-1987.yaml
- - maps/un-mon-Mong-Latn-2013.yaml
- - maps/un-rus-Cyrl-Latn-1987.yaml
- - maps/un-ukr-cyrl-latn-1998.yaml
- - maps/var-jpn-Hrkt-Latn-hepburn-1886.yaml
- - maps/var-jpn-Hrkt-Latn-hepburn-1954.yaml
- - maps/var-kor-Hang-Latn-mr-1939.yaml
- - maps/var-kor-Kore-Hang-2013.yaml
- - maps/var-kor-Kore-Latn-mr-1939.yaml
- - maps/var-tha-Thai-Thai-phonemic.yaml
- - maps/var-tha-Thai-Zsym-ipa.yaml
- - maps/var-zho-Hani-Latn-1979.yaml
- - spec/interscript/mapping_spec.rb
- - spec/interscript_spec.rb
- - spec/spec_helper.rb
- homepage: ''
+ - requirements.txt
+ homepage: https://www.interscript.com
  licenses:
- - MIT
- metadata: {}
+ - BSD-2-Clause
+ metadata:
+ homepage_uri: https://www.interscript.com
+ source_code_uri: https://github.com/interscript/interscript
  post_install_message:
  rdoc_options: []
  require_paths:
@@ -269,18 +121,16 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
+ version: 2.3.0
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 2.4.0
+ version: '0'
  requirements: []
- rubygems_version: 3.0.3
+ rubyforge_project:
+ rubygems_version: 2.7.8
  signing_key:
  specification_version: 4
  summary: Interoperable script conversion systems
- test_files:
- - spec/interscript/mapping_spec.rb
- - spec/interscript_spec.rb
- - spec/spec_helper.rb
+ test_files: []
data/README.adoc DELETED
@@ -1,297 +0,0 @@
- = Interscript: Interoperable Script Conversion Systems, with a Ruby implementation
-
- image:https://github.com/interscript/interscript/workflows/test/badge.svg["Build Status", link="https://github.com/interscript/interscript/actions?workflow=test"]
-
- == Introduction
-
- This repository contains interoperable transliteration schemes from:
-
- * ALA-LC
- * BGN/PCGN
- * ICAO
- * ISO
- * UN (by UNGEGN)
- * Many, many other script conversion system authorities.
-
- The goal is to achieve interoperable transliteration schemes allowing quality comparisons.
-
-
-
- == Demonstration
-
- These transliteration systems are used in the demo:
-
- `bgnpcgn-rus-Cyrl-Latn-1947`:: BGN/PCGN Romanization of Russian
- `iso-rus-Cyrl-Latn-iso9`:: ISO 9 Romanization of Russian
- `icao-rus-Cyrl-Latn-9303`:: ICAO MRZ Romanization of Russian
- `bas-rus-Cyrl-Latn-bss`:: Bulgaria Academy of Science Streamlined System for Russian
-
- image:demo/20191118-interscript-demo-cast.gif["interscript screencast"]
-
-
- == Installation
-
- === Prerequisites
-
- Linux:
-
- [source,sh]
- ----
- apt-get install swig python3-setuptools
- ----
-
- Windows:
-
- [source,sh]
- ----
- choco install --no-progress swig
- ----
-
- Interscript depends on Python and the https://github.com/sequitur-g2p/sequitur-g2p[`sequitur-g2p`] module
-
- [source,sh]
- ----
- pip3 install setuptools numpy
- curl -sSL -o sequitur-g2p.zip https://github.com/sequitur-g2p/sequitur-g2p/archive/806273f.zip
- pip3 install sequitur-g2p.zip
- ----
-
- Interscript depends on Ruby. Once you manage to install Ruby, it's easy.
-
- [source,sh]
- ----
- gem install interscript
- ----
-
- == Usage
-
- Assume you have a file ready in the source script like this:
-
- [source,sh]
- ----
- cat <<EOT > rus-Cyrl.txt
- Эх, тройка! птица тройка, кто тебя выдумал? знать, у бойкого народа ты
- могла только родиться, в той земле, что не любит шутить, а
- ровнем-гладнем разметнулась на полсвета, да и ступай считать версты,
- пока не зарябит тебе в очи. И не хитрый, кажись, дорожный снаряд, не
- железным схвачен винтом, а наскоро живьём с одним топором да долотом
- снарядил и собрал тебя ярославский расторопный мужик. Не в немецких
- ботфортах ямщик: борода да рукавицы, и сидит чёрт знает на чём; а
- привстал, да замахнулся, да затянул песню — кони вихрем, спицы в
- колесах смешались в один гладкий круг, только дрогнула дорога, да
- вскрикнул в испуге остановившийся пешеход — и вон она понеслась,
- понеслась, понеслась!
-
- Н.В. Гоголь
- EOT
- ----
-
- You can run `interscript` on this text using different transliteration systems.
-
- [source,sh]
- ----
- interscript rus-Cyrl.txt \
- --system=bgnpcgn-rus-Cyrl-Latn-1947 \
- --output=bgnpcgn-rus-Latn.txt
-
- interscript rus-Cyrl.txt \
- --system=iso-rus-Cyrl-Latn-iso9 \
- --output=iso-rus-Latn.txt
-
- interscript rus-Cyrl.txt \
- --system=icao-rus-Cyrl-Latn-9303 \
- --output=icao-rus-Latn.txt
-
- interscript rus-Cyrl.txt \
- --system=bas-rus-Cyrl-Latn-bss \
- --output=bas-rus-Latn.txt
- ----
-
- It is then easy to see the exact differences in rendering between the systems.
-
- [source,sh]
- ----
- diff bgnpcgn-rus-Latn.txt bas-rus-Latn.txt
- ----
-
- == Adding transliteration system
-
- Transliteration systems stored in a `maps/` directory as YAML files.
- You can create a new file and add it to the directory.
-
- The file should be named as `<system-code>.yaml`, where `system-code`
- is in accordance with
- http://calconnect.gitlab.io/tc-localization/csd-transcription-systems[ISO/CC 24229].
-
- === File structure
-
- [source,yaml]
- ----
- authority_id: bgnpcgn
- id: 1947
- language: rus
- source_script: Cyrl
- destination_script: Latn
- name: ROMANIZATION OF RUSSIAN, BGN/PCGN 1947 System
- url: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/807920/ROMANIZATION_OF_RUSSIAN.pdf
- creation_date: 1947
- confirmation_date: 2019-06
- description: The BGN/PCGN system for Russian was adopted ...
-
- notes:
- - The character e should be romanized ye initially, after the vowel ...
-
- tests:
- - source: ДЛИННОЕ ПОКРЫВАЛО
- expected: DLINNOYE POKRYVALO
- - source: Еловая шишка
- expected: Yelovaya shishka
-
- map:
- rules:
- - pattern: (?<=[АаЕеЁёИиОоУуЫыЭэЮюЯяЙйЪъЬь])\u0415 # Е after a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь
- result: Ye
- - pattern: \b\u0415 # Е initially
- result: Ye
-
- characters:
- "\u0410": "A"
- "\u0411": "B"
- "\u0412": "V"
- ----
-
-
- === Rules
-
- The subsection `rules` is placed under the `map` key. All rules are applied in order they are placed before the subsection `characters` applying. Rules apply to an original text, not to a result of previous rules applying.
-
- Each rule has `pattern` and `result` elements.
-
- Pattern is a regex expression. It should be representing as a string without `//` or `%r{}` parentheses. For example `\b\u0415`. In case a rule is depend on previous or next content, lookahead or lookbehind could be used. For example a rule with the pattern `(?<=[АаЕеЁёИиОоУуЫыЭэЮюЯяЙйЪъЬь])\u0415` find every Е after upper or lower case symbols a, e, ё, и, о, у, ы, э, ю, я, й, ъ, ь.
-
- Result is a replacement a for pattern's match. It can contain a string, an Unicode characters specified by a hexadecimal number, a captured group reference. String with hexadecimal number or captured group reference should be double quoted. For example `"Y\u00eb"` or `"\\1\u00b7\\2"`. Captured group are referred by double backslash and group's number.
-
- Because rules are applied in order, multiple rules applicable to the same segment of a string can be addressed by rule ordering, and rules can be used as priority over characters. For example:
-
- [source,yaml]
- ----
- map:
- rules:
- - pattern: \u03B3\u03B3 # γ (before Γ, Ξ, Χ)
- result: ng
- - pattern: (?<![Γγ])\u03B3(?=[ΕεέΗηήΙιίΥυύ]) # γ (before front vowels)
- result: y
- ----
-
- (γι maps to `yi`; but γγ maps to `ng`. In the case of γγι, the first rule takes priority, and the transliteration is `ngi`: it makes the second rule impossible.)
-
- [source,yaml]
- ----
- map:
- rules:
- - pattern: (?<=\b)\u03BC[πΠ] # μπ (initially)
- result: b
- - pattern: \u03BC[πΠ] # μπ (medially)
- result: mb
- ----
-
- (The first rule applies at the start of a word; the second rule does not specify a context, as it applies in all other cases not covered by the first rule.)
-
- [source,yaml]
- ----
- map:
- rules:
- - pattern: ";"
- result: "?"
-
- characters
- "\u00B7": ";
- ----
-
- (This guarantees that any `;` are converted to `?` before any new `;` are introduced; because all three are Latin script, they could be mixed up in ordering.)
-
- Normally rules "`bleed`" each other: once a rule applies to a segment, that segment cannot trigger other rules, because it is already converted to Roman. Exceptionally, it will be necessary to have a rule add or remove characters in the original script, rather than transliterate them, so that the same context can be invoked by two rules in succession:
-
- [source,yaml]
- ----
- map:
- rules:
- - pattern: (?<=[АаЕеЁёИиОоУуЫыЭэЮюЯя])\u042b # Ы after any vowel character
- result: "\u00b7Ы"
- - pattern: \u042b(?=[АаУуЫыЭэ]) # Ы before а, у, ы, or э
- result: "Ы\u00b7"
- ----
-
- (If the result were `\u00B7Y`, the second rule could not be applied afterwards; but we want ОЫУ to transliterate as `O·Y·U`. In order to make that happen, we preserve the Ы during the rules phase, resulting in О·Ы·У; we only convert the letters to Roman script in the `characters` phase.)
-
- === Testing transliteration systems
-
- To test all transliteration systems in the `maps/` directory, run:
-
- [source,sh]
- ----
- bundle exec rspec
- ----
-
- The command takes `source` texts from the `test` section, transforms
- them using `rules` and `charmaps` from the `map` key, and compares the
- results with `expected:` text from the `source:` section.
-
- To test a specific transliteration system, set the environment variable
- `TRANSLIT_SYSTEM` to the system code of the desired system
- (i.e. the "`basename`" of the system's YAML file):
-
- [source,sh]
- ----
- TRANSLIT_SYSTEM=bgnpcgn-rus-Cyrl-Latn-1947 bundle exec rspec
- ----
-
-
- == ISCS system codes
-
- In accordance with
- http://calconnect.gitlab.io/tc-localization/csd-transcription-systems[ISO/CC 24229],
- the system code identifying a script conversion system has the following components:
-
- e.g. `bgnpcgn-rus-Cyrl-Latn-1947`:
-
- `bgnpcgn`:: the authority identifier
- `rus`:: an ISO 639-2 3-letter language code that this system applies to
- `Cyrl`:: an ISO 15924 script code, identifying the source script
- `Latn`:: an ISO 15924 script code, identifying the target script
- `1947`:: an identifier unit within the authority to identify this system
-
-
- == Covered languages
-
- Currently the schemes cover Cyrillic, Armenian, Greek, Arabic and Hebrew.
-
-
- == Samples to play with
-
- * `rus-Cyrl-1.txt`: Copied from the XLS output from http://www.primorsk.vybory.izbirkom.ru/region/primorsk?action=show&global=true&root=254017025&tvd=4254017212287&vrn=100100067795849&prver=0&pronetvd=0&region=25&sub_region=25&type=242&vibid=4254017212287
-
- * `rus-Cyrl-2.txt`: Copied from the XLS output from http://www.yaroslavl.vybory.izbirkom.ru/region/yaroslavl?action=show&root=764013001&tvd=4764013188704&vrn=4764013188693&prver=0&pronetvd=0&region=76&sub_region=76&type=426&vibid=4764013188704
-
-
- == References
-
- Reference documents are located at the
- https://github.com/interscript/interscript-references[interscript-references repository].
- Some specifications that have distribution limitations may not be reproduced there.
-
-
- == Links to system definitions
-
- * https://www.iso.org/committee/48750.html[ISO/TC 46 (see standards published by WG 3)]
- * http://geonames.nga.mil/gns/html/romanization.html[BGN/PCGN and BGN Romanization systems (BGN)]
- * https://www.gov.uk/government/publications/romanization-systems[BGN/PCGN Romanization systems (PCGN)]
- * https://www.loc.gov/catdir/cpso/roman.html[ALA-LC Romanization systems in current use]
- * http://catdir.loc.gov/catdir/cpso/roman.html[ALA-LC Romanization systems from 1997]
- * http://www.eki.ee/wgrs/[UN Romanization systems]
- * http://www.eki.ee/knab/kblatyl2.htm[EKI KNAB systems]
-
- == Copyright and license
-
- This is a Ribose project. Copyright Ribose.
-
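
Reviewer note: the removed README's "Rules" section describes a two-phase model for the old YAML maps, in which ordered regex `rules` run first and the `characters` table is applied afterwards. A minimal Ruby sketch of that ordering, using the Greek γγ / γ example from the README, might look like the following. It is an illustration only, not the gem's implementation: `RULES`, `CHARACTERS`, and `transliterate_sketch` are hypothetical names, and the two-entry character table is an assumed stand-in for a full map.

```ruby
# Hypothetical sketch of the v1 "rules, then characters" ordering.
# None of these names come from the interscript gem.

# Ordered rules, taken from the README's Greek example.
RULES = [
  [/γγ/, "ng"],                          # γγ is handled first
  [/(?<![Γγ])γ(?=[ΕεέΗηήΙιίΥυύ])/, "y"], # γ before a front vowel
].freeze

# Assumed per-character fallback table (a real map would be much larger).
CHARACTERS = { "γ" => "g", "ι" => "i" }.freeze

def transliterate_sketch(text)
  # Phase 1: apply each rule in order to the output of the previous rule.
  after_rules = RULES.reduce(text) do |acc, (pattern, result)|
    acc.gsub(pattern, result)
  end
  # Phase 2: map any remaining source characters one by one.
  after_rules.gsub(/./m) { |ch| CHARACTERS.fetch(ch, ch) }
end

puts transliterate_sketch("γι")
puts transliterate_sketch("γγι")
```

Because the γγ rule runs first, no γ is left in γγι for the second rule to match, which is the priority effect the README describes.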