ssmd 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 6818a5bb8c25e44cf95522fe1a5d0852a9114c9d
4
+ data.tar.gz: ca1ff40b6e7a5f51900ceea642771b959bcdb41b
5
+ SHA512:
6
+ metadata.gz: 9e6ff09b5251da93e455a8731ccf624881abf4e52da4c18369c6bd55d413d059f2e7c77f8aec7070e7d530f1c27cab76a65d73534f70c196d9773070070251f8
7
+ data.tar.gz: 4c8e7b110e9e48ac65e3e957cb4b2b30535fa6e5811503f24ff62b3355003001168f969caa33a9288b9f3b6570c24bdace187eea0008c9fd2939342701b614f6
@@ -0,0 +1,12 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+
11
+ # rspec failure tracking
12
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.3.4
5
+ before_install: gem install bundler -v 1.14.6
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at machisuji@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in ssmd.gemspec
4
+ gemspec
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2017 Markus Kahl
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,73 @@
1
+ # SSMD
2
+
3
+ [![Build Status](https://travis-ci.org/machisuji/ssmd.svg?branch=master)](https://travis-ci.org/machisuji/ssmd)
4
+
5
+ Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for [SSML](https://www.w3.org/TR/speech-synthesis/).
6
+ This repository contains the reference implementation of the SSMD-to-SSML conversion tool (`ssmd`) as well
7
+ as the [specification](SPECIFICATION.md) of the language.
8
+
9
+ ## Requirements
10
+
11
+ The tools and executable specification provided in this repository require **Ruby 2.3.4** or newer.
12
+
13
+ ## Installation
14
+
15
+ Add this line to your application's Gemfile:
16
+
17
+ ```ruby
18
+ gem 'ssmd'
19
+ ```
20
+
21
+ And then execute:
22
+
23
+ $ bundle
24
+
25
+ Or install it yourself as:
26
+
27
+ $ gem install ssmd
28
+
29
+ ## Usage
30
+
31
+ ```ruby
32
+ require 'ssmd'
33
+
34
+ ssmd = "hello *SSMD*!"
35
+ ssml = SSMD.to_ssml ssmd
36
+
37
+ puts ssml
38
+ # Output: <speak>hello <emphasis>SSMD</emphasis>!</speak>
39
+ ```
40
+
41
+ **Note:**
42
+
43
+ This version is still under development. So far only the following conversions
44
+ described in the specification are implemented:
45
+
46
+ * Text
47
+ * Emphasis
48
+ * Mark
49
+ * Language
50
+ * Phoneme
51
+
52
+ ## Development
53
+
54
+ After checking out the repo, run `bin/setup` to install dependencies. You can run `bin/console` for an interactive prompt that will allow you to experiment.
55
+
56
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
57
+
58
+ ### Tests
59
+
60
+ Run `rake spec` to run the tests against a given executable.
61
+
62
+ This implementation and any other can be tested against the SSMD specification.
63
+ Said specification is extracted from `SPECIFICATION.md`.
64
+ The test runner passes each SSMD snippet through the tested tool and compares the output to
65
+ the SSML snippet that follows it. If they match, the test passes.
66
+
67
+ ## Contributing
68
+
69
+ Bug reports and pull requests are welcome on GitHub at https://github.com/machisuji/ssmd. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
70
+
71
+ ## License
72
+
73
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
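The Tests section of the README describes extracting SSMD/SSML snippet pairs from `SPECIFICATION.md`. The spec runner itself is not part of this diff, so the following is only a minimal sketch of how such pairs could be pulled out of the markdown. `extract_spec_pairs`, `SPEC_PAIR`, and `FENCE` are hypothetical names, and the sketch assumes every pair follows the `SSMD:` / `SSML:` layout used in the specification.

```ruby
# Hypothetical helper, not part of the ssmd gem.
# The fence marker is built at runtime so this example can itself live in a fenced block.
FENCE = "`" * 3

# An "SSMD:" block followed by an "SSML:" block, each wrapped in a code fence.
SPEC_PAIR = /^SSMD:\n#{FENCE}\n(.*?)\n#{FENCE}\n*SSML:\n#{FENCE}html\n(.*?)\n#{FENCE}/m

# Returns one hash per SSMD/SSML pair found in the markdown text.
def extract_spec_pairs(markdown)
  markdown.scan(SPEC_PAIR).map { |ssmd, ssml| { ssmd: ssmd, ssml: ssml } }
end

# A tiny stand-in for one section of SPECIFICATION.md.
sample = [
  "### Emphasis", "",
  "SSMD:", FENCE, "*word*", FENCE, "",
  "SSML:", "#{FENCE}html", "<emphasis>word</emphasis>", FENCE
].join("\n")

pairs = extract_spec_pairs(sample)
puts pairs.inspect
```

Each extracted pair could then be turned into one test case: convert `pair[:ssmd]` with the tool under test and compare the result with `pair[:ssml]`.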
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
@@ -0,0 +1,333 @@
1
+ # SSMD Specification
2
+
3
+ Here we specify how Speech Synthesis Markdown (SSMD) works.
4
+
5
+ ## Syntax
6
+
7
+ SSMD is mapped to SSML using the following rules.
8
+
9
+ * [Text](#text)
10
+ * [Emphasis](#emphasis)
11
+ * [Break](#break)
12
+ * [Language](#language)
13
+ * [Mark](#mark)
14
+ * [Paragraph](#paragraph)
15
+ * [Phoneme](#phoneme)
16
+ * [Prosody](#prosody)
17
+ * [Say-as](#say-as)
18
+ * [Substitution](#substitution)
19
+ * [Extensions](#extensions)
20
+
21
+ ***
22
+
23
+ ### Text
24
+
25
+ Any text written is implicitly wrapped in a `<speak>` root element.
26
+ This will be omitted in the rest of the examples shown in this section.
27
+
28
+ SSMD:
29
+ ```
30
+ text
31
+ ```
32
+
33
+ SSML:
34
+ ```html
35
+ <speak>text</speak>
36
+ ```
37
+
38
+ ***
39
+
40
+ ### Emphasis
41
+
42
+ SSMD:
43
+ ```
44
+ *word*
45
+ ```
46
+
47
+ SSML:
48
+ ```html
49
+ <emphasis>word</emphasis>
50
+ ```
51
+
52
+ ***
53
+
54
+ ### Break
55
+
56
+ Pauses can be indicated by using `...`. Several modifications to the duration are allowed as shown below.
57
+
58
+ SSMD:
59
+ ```
60
+ Hello ... world (default: x-strong break like after a paragraph)
61
+ Hello - ...0 world (skip break when there would otherwise be one like after this dash)
62
+ Hello ...c world (medium break like after a comma)
63
+ Hello ...s world (strong break like after a sentence)
64
+ Hello ...p world (extra strong break like after a paragraph)
65
+ Hello ...5s world (5 second break (max 10s))
66
+ Hello ...100ms world (100 millisecond break (max 10000ms))
67
+ Hello ...100 world (100 millisecond break (max 10000ms))
68
+ ```
69
+
70
+ SSML:
71
+ ```html
72
+ Hello <break strength="x-strong"/> world
73
+ Hello - <break strength="none"/> world
74
+ Hello <break strength="medium"/> world
75
+ Hello <break strength="strong"/> world
76
+ Hello <break strength="x-strong"/> world
77
+ Hello <break time="5s"/> world
78
+ Hello <break time="100ms"/> world
79
+ Hello <break time="100ms"/> world
80
+ ```
81
+
82
+ ***
83
+
84
+ ### Language
85
+
86
+ Text passages can be annotated with ISO 639-1 language codes as shown below.
87
+ SSML expects a full code including a country. You can provide the country yourself;
88
+ if it is omitted, SSMD falls back to a sensible default.
89
+ In the examples below, `en` defaults to `en-US` and
90
+ `de` defaults to `de-DE`.
91
+
92
+ SSMD:
93
+ ```
94
+ Ich sah [Guardians of the Galaxy](en) im Kino.
95
+ Ich sah [Guardians of the Galaxy](en-GB) im Kino.
96
+ I saw ["Die Häschenschule"](de) in the cinema.
97
+ ```
98
+
99
+ SSML:
100
+ ```html
101
+ Ich sah <lang xml:lang="en-US">Guardians of the Galaxy</lang> im Kino.
102
+ Ich sah <lang xml:lang="en-GB">Guardians of the Galaxy</lang> im Kino.
103
+ I saw <lang xml:lang="de-DE">"Die Häschenschule"</lang> in the cinema.
104
+ ```
105
+
106
+ ***
107
+
108
+ ### Mark
109
+
110
+ Sections of text can be tagged using marks. They do not affect the synthesis but
111
+ can be returned by SSML processing engines as meta information and to emit
112
+ events during processing based on these marks.
113
+
114
+ SSMD:
115
+ ```
116
+ I always wanted a @animal cat as a pet.
117
+ ```
118
+
119
+ SSML:
120
+ ```html
121
+ I always wanted a <mark name="animal"/> cat as a pet.
122
+ ```
123
+
124
+ ***
125
+
126
+ ### Paragraph
127
+
128
+ Empty lines indicate a paragraph.
129
+
130
+ SSMD:
131
+ ```
132
+ First prepare the ingredients.
133
+ Don't forget to wash them first.
134
+
135
+ Lastly mix them all together.
136
+ ```
137
+
138
+ SSML:
139
+ ```html
140
+ <p>First prepare the ingredients. Don't forget to wash them first.</p>
141
+ <p>Lastly mix them all together.</p>
142
+ ```
143
+
144
+ ***
145
+
146
+ ### Phoneme
147
+
148
+ Sometimes the speech synthesis engine needs to be told how exactly to pronounce a word.
149
+ This can be done via phonemes. While SSML supports IPA, SSMD uses [X-SAMPA](https://en.wikipedia.org/wiki/X-SAMPA) by default.
150
+
151
+ SSMD:
152
+ ```
153
+ The German word ["dich"](ph: dIC) does not sound like dick.
154
+ ```
155
+
156
+ SSML:
157
+ ```html
158
+ The German word <phoneme alphabet="ipa" ph="dɪç">"dich"</phoneme> does not sound like dick.
159
+ ```
160
+
161
+ ***
162
+
163
+ ### Prosody
164
+
165
+ The prosody or rhythm depends on the volume, rate, and pitch of the delivered text.
166
+
167
+ Each of those values can be defined by a number between 1 and 5 (volume additionally accepts 0 for silent), as listed below:
168
+
169
+ | number | volume | rate | pitch |
170
+ | ------ | ------ | ---- | ----- |
171
+ | 0 | silent | | |
172
+ | 1 | x-soft | x-slow | x-low |
173
+ | 2 | soft | slow | low |
174
+ | 3 | medium | medium | medium |
175
+ | 4 | loud | fast | high |
176
+ | 5 | x-loud | x-fast | x-high |
177
+
178
+ SSMD:
179
+ ```
180
+ Volume:
181
+
182
+ ~silent~
183
+ --extra soft--
184
+ -soft-
185
+ medium
186
+ +loud+ or LOUD
187
+ ++extra loud++
188
+
189
+ Rate:
190
+
191
+ <<extra slow<<
192
+ <slow<
193
+ medium
194
+ >fast>
195
+ >>extra fast>>
196
+
197
+ Pitch:
198
+
199
+ __extra low__
200
+ _low_
201
+ medium
202
+ ^high^
203
+ ^^extra high^^
204
+
205
+ ++>>^^extra loud, fast and high^^>>++ or
206
+ [extra loud, fast, and high](vrp: 555) or
207
+ [extra loud, fast, and high](v: 5, r: 5, p: 5)
208
+ ```
209
+
210
+ SSML:
211
+ ```html
212
+ Volume:
213
+
214
+ <prosody volume="silent">silent</prosody>
215
+ <prosody volume="x-soft">extra soft</prosody>
216
+ <prosody volume="soft">soft</prosody>
217
+ medium
218
+ <prosody volume="loud">loud</prosody> or <prosody volume="loud">loud</prosody>
219
+ <prosody volume="x-loud">extra loud</prosody>
220
+
221
+ Rate:
222
+
223
+ <prosody rate="x-slow">extra slow</prosody>
224
+ <prosody rate="slow">slow</prosody>
225
+ medium
226
+ <prosody rate="fast">fast</prosody>
227
+ <prosody rate="x-fast">extra fast</prosody>
228
+
229
+ Pitch:
230
+
231
+ <prosody pitch="x-low">extra low</prosody>
232
+ <prosody pitch="low">low</prosody>
233
+ medium
234
+ <prosody pitch="high">high</prosody>
235
+ <prosody pitch="x-high">extra high</prosody>
236
+
237
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody> or
238
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody> or
239
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody>
240
+ ```
241
+
242
+ The shortcuts are listed first. While they can be combined, sometimes it's easier and shorter to just use
243
+ the explicit form shown in the last 2 lines. All of them can be nested, too.
244
+ Moreover, changes in volume (`[louder](v: +10dB)`) and pitch (`[lower](p: -4%)`) can also be given explicitly as relative values.
245
+
246
+ ***
247
+
248
+ ### Say-as
249
+
250
+ You can give the speech synthesis engine hints as to what it's supposed to read using `as`.
251
+
252
+ Possible values:
253
+
254
+ * character - spell out each single character, e.g. for KGB
255
+ * number - cardinal number, e.g. 100
256
+ * ordinal - ordinal number, e.g. 1st
257
+ * digits - spell out each single digit, e.g. 123 as 1 - 2 - 3
258
+ * fraction - pronounce number as fraction, e.g. 3/4 as three quarters
259
+ * unit - e.g. 1meter
260
+ * date - read content as a date, must provide format
261
+ * time - duration in minutes and seconds
262
+ * address - read as part of an address
263
+ * telephone - read content as a telephone number
264
+ * expletive - beeps out the content
265
+
266
+ SSMD:
267
+ ```
268
+ Today on [29.12.2017](as: date, format: "dd.mm.yyyy") my
269
+ telephone number is [+49 123456](as: telephone).
270
+ You can't say [fuck](as: expletive) on television.
271
+ ```
272
+
273
+ SSML:
274
+ ```html
275
+ Today on <say-as interpret-as="date" format="dd.mm.yyyy">29.12.2017</say-as> my
276
+ telephone number is <say-as interpret-as="telephone">+49 123456</say-as>.
277
+ You can't say <say-as interpret-as="expletive">fuck</say-as> on television.
278
+ ```
279
+
280
+ ***
281
+
282
+ ### Substitution
283
+
284
+ Substitutes the pronunciation of a word, such as an acronym, with an alias.
285
+
286
+ SSMD:
287
+ ```
288
+ I'd like to drink some [H2O](sub: water) now.
289
+ ```
290
+
291
+ SSML:
292
+ ```html
293
+ I'd like to drink some <sub alias="water">H2O</sub> now.
294
+ ```
295
+
296
+ ***
297
+
298
+ ### Extensions
299
+
300
+ It must be possible to extend SSML with constructs specific to certain speech synthesis engines.
301
+ Registered extensions must have a unique name. They can take parameters.
302
+ For instance, let's assume we registered Amazon Polly's whisper effect in some hypothetical SSMD
303
+ library API.
304
+
305
+ ```ruby
306
+ SSMD.register "whisper", "amazon:effect", name: "whispered"
307
+ ```
308
+
309
+ SSMD:
310
+ ```
311
+ If he [whispers](ext: whisper), he lies.
312
+ ```
313
+
314
+ SSML:
315
+ ```html
316
+ If he <amazon:effect name="whispered">whispers</amazon:effect>, he lies.
317
+ ```
318
+
319
+ ***
320
+
321
+ ### Nesting and duplicate annotations
322
+
323
+ Formats can be nested. Duplicate annotations of the same type are ignored.
324
+
325
+ SSMD:
326
+ ```
327
+ Der Film [Guardians of the *Galaxy*](en-GB, de, fr-FR) ist ganz [okay](en-US).
328
+ ```
329
+
330
+ SSML:
331
+ ```html
332
+ Der Film <lang xml:lang="en-GB">Guardians of the <emphasis>Galaxy</emphasis></lang> ist ganz <lang xml:lang="en-US">okay</lang>.
333
+ ```
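The Break rules in the specification are, per the README, not yet implemented in this version. As a rough illustration of the mapping only, the sketch below converts the `...` modifiers into `<break>` elements. `convert_breaks`, `break_tag`, `BREAK_REGEX`, and `BREAK_STRENGTHS` are hypothetical names, not part of the gem's API. The sketch assumes breaks are surrounded by whitespace and does not enforce the 10 second maximum.

```ruby
# Hypothetical sketch, not part of the ssmd gem.
# Single-character modifiers and the SSML strength they stand for.
BREAK_STRENGTHS = {
  "0" => "none",
  "c" => "medium",
  "s" => "strong",
  "p" => "x-strong"
}.freeze

# "..." surrounded by whitespace, optionally followed by a modifier:
# a number with an optional unit (5s, 100ms, 100) or one of c, s, p.
BREAK_REGEX = /(?<=\s)\.\.\.(\d+(?:ms|s)?|[csp])?(?=\s)/

def break_tag(modifier)
  return '<break strength="x-strong"/>' if modifier.nil? # plain "..."

  if BREAK_STRENGTHS.key?(modifier)
    %(<break strength="#{BREAK_STRENGTHS[modifier]}"/>)
  elsif modifier =~ /\A\d+\z/
    %(<break time="#{modifier}ms"/>) # a bare number means milliseconds
  else
    %(<break time="#{modifier}"/>)   # the unit is already part of the modifier
  end
end

def convert_breaks(text)
  text.gsub(BREAK_REGEX) { break_tag(Regexp.last_match(1)) }
end

puts convert_breaks("Hello ...c world")
puts convert_breaks("Hello ...100 world")
```

Checking the strength table before the numeric case keeps `...0` mapped to `strength="none"` rather than to a zero millisecond pause, which is what the specification's second example asks for.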
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "ssmd"
5
+
6
+ require "pry"
7
+ Pry.start
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,14 @@
1
+ require "ssmd/version"
2
+ require "ssmd/converter"
3
+
4
+ module SSMD
5
+ module_function
6
+
7
+ def to_ssml(ssmd)
8
+ Converter.new(ssmd).convert
9
+ end
10
+
11
+ def root_dir
12
+ Gem::Specification.find_by_name("ssmd").gem_dir
13
+ end
14
+ end
@@ -0,0 +1,7 @@
1
+ module SSMD::Annotations
2
+
3
+ end
4
+
5
+ require 'ssmd/annotations/annotation'
6
+ require 'ssmd/annotations/language_annotation'
7
+ require 'ssmd/annotations/phoneme_annotation'
@@ -0,0 +1,29 @@
1
+ module SSMD::Annotations
2
+ class Annotation
3
+ class << self
4
+ def try(text)
5
+ match = /\A#{regex}\Z/.match text
6
+
7
+ if match
8
+ new(*match.captures)
9
+ end
10
+ end
11
+
12
+ def regex
13
+ raise "subclass responsibility"
14
+ end
15
+ end
16
+
17
+ def initialize
18
+ raise "implement expecting one argument for each capture group in the regex"
19
+ end
20
+
21
+ def wrap(text)
22
+ raise "wrap given text in resulting SSML element"
23
+ end
24
+
25
+ def combine(annotation)
26
+ raise "combine this annotation with the given annotation of the same type"
27
+ end
28
+ end
29
+ end
@@ -0,0 +1,37 @@
1
+ require_relative 'annotation'
2
+
3
+ module SSMD::Annotations
4
+ class LanguageAnnotation < Annotation
5
+ attr_reader :language
6
+
7
+ def self.regex
8
+ /([a-z]{2}(?:-[A-Z]{2})?)/
9
+ end
10
+
11
+ def initialize(language)
12
+ @language = complete_language language
13
+ end
14
+
15
+ def wrap(text)
16
+ "<lang xml:lang=\"#{language}\">#{text}</lang>"
17
+ end
18
+
19
+ def combine(annotation)
20
+ self # discard further language annotations
21
+ end
22
+
23
+ def complete_language(language)
24
+ if language.size == 2
25
+ language_completion_table[language] || "#{language}-#{language.upcase}"
26
+ else
27
+ language
28
+ end
29
+ end
30
+
31
+ def language_completion_table
32
+ {
33
+ "en" => "en-US"
34
+ }
35
+ end
36
+ end
37
+ end
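`complete_language` falls back to upcasing the language code, which works for `de` (`de-DE`) but yields codes such as `ja-JA` for languages whose code differs from the country code. One possible extension, shown only as a sketch, is a larger completion table. `DEFAULT_REGIONS` and this standalone `complete_language` are hypothetical, and the chosen default regions are assumptions rather than anything defined by the gem or the specification.

```ruby
# Hypothetical sketch, not part of the ssmd gem.
# Default regions for languages where upcasing the code gives the wrong country.
DEFAULT_REGIONS = {
  "en" => "en-US",
  "ja" => "ja-JP",
  "sv" => "sv-SE",
  "da" => "da-DK"
}.freeze

def complete_language(language)
  # Codes that already carry a region (for example "en-GB") pass through unchanged.
  return language unless language.size == 2

  # Fall back to the upcased code only when the table has no entry.
  DEFAULT_REGIONS.fetch(language) { "#{language}-#{language.upcase}" }
end

puts complete_language("ja")
puts complete_language("de")
```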
@@ -0,0 +1,44 @@
1
+ require_relative 'annotation'
2
+
3
+ require 'pathname'
4
+
5
+ module SSMD::Annotations
6
+ class PhonemeAnnotation < Annotation
7
+ attr_reader :x_sampa, :ipa
8
+
9
+ def self.regex
10
+ /ph: ?(.+)/
11
+ end
12
+
13
+ def initialize(x_sampa)
14
+ @x_sampa = x_sampa
15
+ @ipa = x_sampa_to_ipa x_sampa
16
+ end
17
+
18
+ def wrap(text)
19
+ "<phoneme alphabet=\"ipa\" ph=\"#{ipa}\">#{text}</phoneme>"
20
+ end
21
+
22
+ def combine(annotation)
23
+ self # discard further phoneme annotations
24
+ end
25
+
26
+ def x_sampa_to_ipa(input)
27
+ x_sampa_to_ipa_table.inject(input) do |text, (x_sampa, ipa)|
28
+ text.gsub x_sampa, ipa
29
+ end
30
+ end
31
+
32
+ def x_sampa_to_ipa_table
33
+ @table ||= begin
34
+ lines = File.read(x_sampa_to_ipa_table_file_path).lines
35
+
36
+ lines.map { |line| line.split(" ") }
37
+ end
38
+ end
39
+
40
+ def x_sampa_to_ipa_table_file_path
41
+ Pathname(SSMD.root_dir).join("lib/ssmd/annotations/xsampa_to_ipa_table.txt")
42
+ end
43
+ end
44
+ end
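`x_sampa_to_ipa` applies the table entries one after another with `gsub`. A multi-character symbol whose prefix appears earlier in the table, such as the retroflex approximant (written `r\` plus a trailing backtick), therefore appears to be converted piecewise instead of as one symbol. A common alternative is a single pass that prefers the longest symbol. The sketch below shows that idea on a small excerpt of the table; `XSAMPA_TO_IPA`, `XSAMPA_PATTERN`, and `xsampa_to_ipa` are hypothetical names and this is not the gem's implementation.

```ruby
# Hypothetical sketch, not part of the ssmd gem.
# Excerpt of the X-SAMPA to IPA mapping (the gem reads the full table from a text file).
XSAMPA_TO_IPA = {
  "d"    => "d",
  "I"    => "ɪ",
  "C"    => "ç",
  "r"    => "r",
  "r\\"  => "ɹ",
  "r\\`" => "ɻ",
  "`"    => "˞"
}.freeze

# Longest symbols first, so the alternation tries the three-character
# retroflex symbol before its two-character and one-character prefixes.
XSAMPA_PATTERN = Regexp.union(XSAMPA_TO_IPA.keys.sort_by { |symbol| -symbol.length })

def xsampa_to_ipa(input)
  # One pass over the input; text that was already converted is never scanned again.
  input.gsub(XSAMPA_PATTERN, XSAMPA_TO_IPA)
end

puts xsampa_to_ipa("dIC")
puts xsampa_to_ipa("r\\`")
```

Characters without an entry in the excerpt are left untouched, so the sketch can be tried on partial tables.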
@@ -0,0 +1,174 @@
1
+ a a
2
+ b b
3
+ b_< ɓ
4
+ c c
5
+ d d
6
+ d` ɖ
7
+ d_< ɗ
8
+ e e
9
+ f f
10
+ g ɡ
11
+ g_< ɠ
12
+ h h
13
+ h\ ɦ
14
+ i i
15
+ j j
16
+ j\ ʝ
17
+ k k
18
+ l l
19
+ l` ɭ
20
+ l\ ɺ
21
+ m m
22
+ n n
23
+ n` ɳ
24
+ o o
25
+ p p
26
+ p\ ɸ
27
+ q q
28
+ r r
29
+ r` ɽ
30
+ r\ ɹ
31
+ r\` ɻ
32
+ s s
33
+ s` ʂ
34
+ s\ ɕ
35
+ t t
36
+ t` ʈ
37
+ u u
38
+ v v
39
+ v\ ʋ
40
+ P ʋ
41
+ w w
42
+ x x
43
+ x\ ɧ
44
+ y y
45
+ z z
46
+ z` ʐ
47
+ z\ ʑ
48
+ A ɑ
49
+ B β
50
+ B\ ʙ
51
+ C ç
52
+ D ð
53
+ E ɛ
54
+ F ɱ
55
+ G ɣ
56
+ G\ ɢ
57
+ G\_< ʛ
58
+ H ɥ
59
+ H\ ʜ
60
+ I ɪ
61
+ I\ ɪ̈
62
+ I\ ɨ̞
63
+ J ɲ
64
+ J\ ɟ
65
+ J\_< ʄ
66
+ K ɬ
67
+ K\ ɮ
68
+ L ʎ
69
+ L\ ʟ
70
+ M ɯ
71
+ M\ ɰ
72
+ N ŋ
73
+ N\ ɴ
74
+ O ɔ
75
+ O\ ʘ
76
+ P ʋ
77
+ v\ ʋ
78
+ Q ɒ
79
+ R ʁ
80
+ R\ ʀ
81
+ S ʃ
82
+ T θ
83
+ U ʊ
84
+ U\ ʊ̈
85
+ U\ ʉ̞
86
+ V ʌ
87
+ W ʍ
88
+ X χ
89
+ X\ ħ
90
+ Y ʏ
91
+ Z ʒ
92
+ . .
93
+ " ˈ
94
+ % ˌ
95
+ ' ʲ
96
+ _j ʲ
97
+ : ː
98
+ :\ ˑ
99
+ @ ə
100
+ @\ ɘ
101
+ { æ
102
+ } ʉ
103
+ 1 ɨ
104
+ 2 ø
105
+ 3 ɜ
106
+ 3\ ɞ
107
+ 4 ɾ
108
+ 5 ɫ
109
+ 6 ɐ
110
+ 7 ɤ
111
+ 8 ɵ
112
+ 9 œ
113
+ & ɶ
114
+ ? ʔ
115
+ ?\ ʕ
116
+ <\ ʢ
117
+ >\ ʡ
118
+ ^ ꜛ
119
+ ! ꜜ
120
+ !\ ǃ
121
+ | |
122
+ |\ ǀ
123
+ || ‖
124
+ |\|\ ǁ
125
+ =\ ǂ
126
+ -\ ‿
127
+ _" ̈
128
+ _+ ̟
129
+ _- ̠
130
+ _/ ̌
131
+ _0 ̥
132
+ = ̩
133
+ _= ̩
134
+ _> ʼ
135
+ _?\ ˤ
136
+ _\ ̂
137
+ _^ ̯
138
+ _} ̚
139
+ ` ˞
140
+ ~ ̃
141
+ _~ ̃
142
+ _A ̘
143
+ _a ̺
144
+ _B ̏
145
+ _B_L ᷅
146
+ _c ̜
147
+ _d ̪
148
+ _e ̴
149
+ _F ̂
150
+ _G ˠ
151
+ _H ́
152
+ _H_T ᷄
153
+ _h ʰ
154
+ _j ʲ
155
+ ' ʲ
156
+ _k ̰
157
+ _L ̀
158
+ _l ˡ
159
+ _M ̄
160
+ _m ̻
161
+ _N ̼
162
+ _n ⁿ
163
+ _O ̹
164
+ _o ̞
165
+ _q ̙
166
+ _R ̌
167
+ _R_F ᷈
168
+ _r ̝
169
+ _T ̋
170
+ _t ̤
171
+ _v ̬
172
+ _w ʷ
173
+ _X ̆
174
+ _x ̽
@@ -0,0 +1,35 @@
1
+ require 'ssmd/processors'
2
+
3
+ module SSMD
4
+ class Converter
5
+ attr_reader :input
6
+
7
+ def initialize(input)
8
+ @input = input
9
+ end
10
+
11
+ def convert
12
+ result = processors.inject(input) do |text, processor|
13
+ process processor.new, text
14
+ end
15
+
16
+ "<speak>#{result.strip}</speak>"
17
+ end
18
+
19
+ def processors
20
+ p = SSMD::Processors
21
+
22
+ [
23
+ p::EmphasisProcessor, p::AnnotationProcessor, p::MarkProcessor
24
+ ]
25
+ end
26
+
27
+ def process(processor, input)
28
+ if processor.matches? input
29
+ process processor, processor.substitute(input)
30
+ else
31
+ input
32
+ end
33
+ end
34
+ end
35
+ end
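`Converter#process` re-applies one processor recursively until its regex no longer matches, then hands the text to the next processor. The same fixed-point idea can be written as a loop. The sketch below is a simplified stand-in, not the gem's code: `Rule`, `RULES`, and `to_ssml_sketch` are hypothetical names, and only the emphasis and mark regexes are taken from the diff.

```ruby
# Hypothetical sketch, not part of the ssmd gem.
# A rule pairs a regex with a lambda that renders the SSML for one match.
Rule = Struct.new(:regex, :render)

RULES = [
  Rule.new(/\*([^*]+)\*/, ->(m) { "<emphasis>#{m[1]}</emphasis>" }),
  Rule.new(/@(\w+)/,      ->(m) { %(<mark name="#{m[1]}"/>) })
].freeze

def to_ssml_sketch(input)
  result = RULES.inject(input) do |text, rule|
    # Replace the first match again and again until the rule no longer applies.
    while (m = rule.regex.match(text))
      text = m.pre_match + rule.render.call(m) + m.post_match
    end
    text
  end

  "<speak>#{result.strip}</speak>"
end

puts to_ssml_sketch("hello *SSMD*!")
```

In both the recursive and the iterative form, a rule whose replacement still matches its own regex would never terminate, so each rule has to consume the syntax it matches.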
@@ -0,0 +1,8 @@
1
+ module SSMD::Processors
2
+
3
+ end
4
+
5
+ require 'ssmd/processors/processor'
6
+ require 'ssmd/processors/annotation_processor'
7
+ require 'ssmd/processors/emphasis_processor'
8
+ require 'ssmd/processors/mark_processor'
@@ -0,0 +1,86 @@
1
+ require_relative 'processor'
2
+
3
+ require 'ssmd/annotations'
4
+
5
+ module SSMD::Processors
6
+ class AnnotationProcessor < Processor
7
+ attr_reader :text, :annotations
8
+
9
+ def result
10
+ @text, annotations_text = match.captures
11
+
12
+ if annotations_text
13
+ @annotations = combine_annotations parse_annotations(annotations_text)
14
+
15
+ @annotations.inject(text) do |text, a|
16
+ a.wrap(text)
17
+ end
18
+ end
19
+ end
20
+
21
+ def self.annotations
22
+ a = SSMD::Annotations
23
+
24
+ [a::LanguageAnnotation, a::PhonemeAnnotation]
25
+ end
26
+
27
+ def ok?
28
+ !@text.nil? && !Array(@annotations).empty?
29
+ end
30
+
31
+ def error?
32
+ !ok?
33
+ end
34
+
35
+ private
36
+
37
+ def parse_annotations(text)
38
+ text.split(/, ?/).flat_map do |annotation_text|
39
+ annotation = find_annotation annotation_text
40
+
41
+ if annotation.nil?
42
+ warnings.push "Unknown annotation: #{annotation_text}"
43
+ end
44
+
45
+ [annotation].compact
46
+ end
47
+ end
48
+
49
+ def find_annotation(text)
50
+ self.class.annotations.lazy
51
+ .map { |a| a.try text }
52
+ .find { |a| !a.nil? }
53
+ end
54
+
55
+ def combine_annotations(annotations)
56
+ annotations
57
+ .group_by { |a| a.class }
58
+ .values
59
+ .map { |as| as.reduce { |a, b| a.combine b } }
60
+ end
61
+
62
+ ##
63
+ # Matches explicitly annotated sections.
64
+ # For example:
65
+ #
66
+ # [Guardians of the Galaxy](en-GB, v: +4dB, p: -3%)
67
+ def regex
68
+ %r{
69
+ \[ # opening text
70
+ ([^\]]+) # annotated text
71
+ \] # closing text
72
+ \( # opening annotations
73
+ ((?:
74
+ (?:
75
+ #{annotations_regex}
76
+ )(?:,\s?)?
77
+ )+)
78
+ \) # closing annotations
79
+ }x
80
+ end
81
+
82
+ def annotations_regex
83
+ self.class.annotations.map(&:regex).join("|")
84
+ end
85
+ end
86
+ end
@@ -0,0 +1,15 @@
1
+ require_relative 'processor'
2
+
3
+ module SSMD::Processors
4
+ class EmphasisProcessor < Processor
5
+ def result
6
+ text = match.captures.first
7
+
8
+ "<emphasis>#{text}</emphasis>"
9
+ end
10
+
11
+ def regex
12
+ /\*([^\*]+)\*/
13
+ end
14
+ end
15
+ end
@@ -0,0 +1,15 @@
1
+ require_relative 'processor'
2
+
3
+ module SSMD::Processors
4
+ class MarkProcessor < Processor
5
+ def result
6
+ text = match.captures.first
7
+
8
+ "<mark name=\"#{text}\"/>"
9
+ end
10
+
11
+ def regex
12
+ /@(\w+)/
13
+ end
14
+ end
15
+ end
@@ -0,0 +1,29 @@
1
+ module SSMD::Processors
2
+ class Processor
3
+ attr_reader :match
4
+
5
+ def matches?(input)
6
+ @match = regex.match input
7
+
8
+ !match.nil?
9
+ end
10
+
11
+ def substitute(input)
12
+ if match
13
+ match.pre_match + result + match.post_match
14
+ end
15
+ end
16
+
17
+ def result
18
+ raise "subclass responsibility"
19
+ end
20
+
21
+ def regex
22
+ raise "subclass responsibility"
23
+ end
24
+
25
+ def warnings
26
+ @warnings ||= []
27
+ end
28
+ end
29
+ end
@@ -0,0 +1,3 @@
1
+ module SSMD
2
+ VERSION = "0.1.0"
3
+ end
@@ -0,0 +1,40 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'ssmd/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "ssmd"
8
+ spec.version = SSMD::VERSION
9
+ spec.authors = ["Markus Kahl"]
10
+ spec.email = ["machisuji@gmail.com"]
11
+
12
+ spec.summary = %q{
13
+ Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for SSML
14
+ and the corresponding tool converting SSMD to SSML.
15
+ }
16
+ spec.homepage = "https://github.com/machisuji/ssmd"
17
+ spec.license = "MIT"
18
+
19
+ # Restrict pushes of this gem to RubyGems.org. Change 'allowed_push_host' to push to a different
20
+ # single host, or delete this section to allow pushing to any host.
21
+ if spec.respond_to?(:metadata)
22
+ spec.metadata['allowed_push_host'] = "https://rubygems.org"
23
+ else
24
+ raise "RubyGems 2.0 or newer is required to protect against " \
25
+ "public gem pushes."
26
+ end
27
+
28
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
29
+ f.match(%r{^(test|spec|features)/})
30
+ end
31
+ spec.bindir = "exe"
32
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
33
+ spec.require_paths = ["lib"]
34
+
35
+ spec.add_development_dependency "bundler", "~> 1.14"
36
+ spec.add_development_dependency "rake", "~> 10.0"
37
+ spec.add_development_dependency "rspec", "~> 3.0"
38
+ spec.add_development_dependency "pry", "~> 0.10.4"
39
+ spec.add_development_dependency "pry-byebug", "~> 3.4", ">= 3.4.2"
40
+ end
metadata ADDED
@@ -0,0 +1,147 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ssmd
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Markus Kahl
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2017-05-13 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.14'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.14'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rspec
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '3.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '3.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: pry
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: 0.10.4
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: 0.10.4
69
+ - !ruby/object:Gem::Dependency
70
+ name: pry-byebug
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '3.4'
76
+ - - ">="
77
+ - !ruby/object:Gem::Version
78
+ version: 3.4.2
79
+ type: :development
80
+ prerelease: false
81
+ version_requirements: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - "~>"
84
+ - !ruby/object:Gem::Version
85
+ version: '3.4'
86
+ - - ">="
87
+ - !ruby/object:Gem::Version
88
+ version: 3.4.2
89
+ description:
90
+ email:
91
+ - machisuji@gmail.com
92
+ executables: []
93
+ extensions: []
94
+ extra_rdoc_files: []
95
+ files:
96
+ - ".gitignore"
97
+ - ".rspec"
98
+ - ".travis.yml"
99
+ - CODE_OF_CONDUCT.md
100
+ - Gemfile
101
+ - LICENSE.txt
102
+ - README.md
103
+ - Rakefile
104
+ - SPECIFICATION.md
105
+ - bin/console
106
+ - bin/setup
107
+ - lib/ssmd.rb
108
+ - lib/ssmd/annotations.rb
109
+ - lib/ssmd/annotations/annotation.rb
110
+ - lib/ssmd/annotations/language_annotation.rb
111
+ - lib/ssmd/annotations/phoneme_annotation.rb
112
+ - lib/ssmd/annotations/xsampa_to_ipa_table.txt
113
+ - lib/ssmd/converter.rb
114
+ - lib/ssmd/processors.rb
115
+ - lib/ssmd/processors/annotation_processor.rb
116
+ - lib/ssmd/processors/emphasis_processor.rb
117
+ - lib/ssmd/processors/mark_processor.rb
118
+ - lib/ssmd/processors/processor.rb
119
+ - lib/ssmd/version.rb
120
+ - ssmd.gemspec
121
+ homepage: https://github.com/machisuji/ssmd
122
+ licenses:
123
+ - MIT
124
+ metadata:
125
+ allowed_push_host: https://rubygems.org
126
+ post_install_message:
127
+ rdoc_options: []
128
+ require_paths:
129
+ - lib
130
+ required_ruby_version: !ruby/object:Gem::Requirement
131
+ requirements:
132
+ - - ">="
133
+ - !ruby/object:Gem::Version
134
+ version: '0'
135
+ required_rubygems_version: !ruby/object:Gem::Requirement
136
+ requirements:
137
+ - - ">="
138
+ - !ruby/object:Gem::Version
139
+ version: '0'
140
+ requirements: []
141
+ rubyforge_project:
142
+ rubygems_version: 2.6.12
143
+ signing_key:
144
+ specification_version: 4
145
+ summary: Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for
146
+ SSML and the corresponding tool converting SSMD to SSML.
147
+ test_files: []