ssmd 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 6818a5bb8c25e44cf95522fe1a5d0852a9114c9d
4
+ data.tar.gz: ca1ff40b6e7a5f51900ceea642771b959bcdb41b
5
+ SHA512:
6
+ metadata.gz: 9e6ff09b5251da93e455a8731ccf624881abf4e52da4c18369c6bd55d413d059f2e7c77f8aec7070e7d530f1c27cab76a65d73534f70c196d9773070070251f8
7
+ data.tar.gz: 4c8e7b110e9e48ac65e3e957cb4b2b30535fa6e5811503f24ff62b3355003001168f969caa33a9288b9f3b6570c24bdace187eea0008c9fd2939342701b614f6
@@ -0,0 +1,12 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+
11
+ # rspec failure tracking
12
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.3.4
5
+ before_install: gem install bundler -v 1.14.6
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at machisuji@gmail.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in ssmd.gemspec
4
+ gemspec
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2017 Markus Kahl
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,73 @@
1
+ # SSMD
2
+
3
+ [![Build Status](https://travis-ci.org/machisuji/ssmd.svg?branch=master)](https://travis-ci.org/machisuji/ssmd)
4
+
5
+ Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for [SSML](https://www.w3.org/TR/speech-synthesis/).
6
+ This repository contains both the reference implementation of the SSMD-to-SSML conversion tool (`ssmd`) as well
7
+ as the [specification](SPECIFICATION.md) of the language.
8
+
9
+ ## Requirements
10
+
11
+ The tools and executable specification provided in this repository require **Ruby 2.3.4** or newer.
12
+
13
+ ## Installation
14
+
15
+ Add this line to your application's Gemfile:
16
+
17
+ ```ruby
18
+ gem 'ssmd'
19
+ ```
20
+
21
+ And then execute:
22
+
23
+ $ bundle
24
+
25
+ Or install it yourself as:
26
+
27
+ $ gem install ssmd
28
+
29
+ ## Usage
30
+
31
+ ```ruby
32
+ require 'ssmd'
33
+
34
+ ssmd = "hello *SSMD*!"
35
+ ssmd = SSMD.to_ssml ssmd
36
+
37
+ puts ssmd
38
+ # Output: <speak>hello <emphasis>SSMD</emphasis>!</speak>
39
+ ```
40
+
41
+ **Note:**
42
+
43
+ This version is still under development. So far only the following conversions
44
+ described in the specification are implemented:
45
+
46
+ * Text
47
+ * Emphasis
48
+ * Mark
49
+ * Language
50
+ * Phoneme
51
+
52
+ ## Development
53
+
54
+ After checking out the repo, run `bin/setup` to install dependencies. You can run `bin/console` for an interactive prompt that will allow you to experiment.
55
+
56
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
57
+
58
+ ### Tests
59
+
60
+ Run `rake spec` to run the tests against a given executable.
61
+
62
+ This implementation, like any other, can be tested against the SSMD specification.
63
+ The test cases are extracted from `SPECIFICATION.md`.
64
+ Each SSMD snippet is run through the tested tool and its output is compared to
65
+ the SSML snippet that follows it. If they match, the test passes.
66
+
67
+ ## Contributing
68
+
69
+ Bug reports and pull requests are welcome on GitHub at https://github.com/machisuji/ssmd. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
70
+
71
+ ## License
72
+
73
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
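Reviewer note on the "Tests" section of the README: the package excludes `spec/` (see the gemspec's `spec.files` filter), so the real extraction code is not visible in this diff. The following is only a minimal sketch of how SSMD/SSML pairs could be pulled out of `SPECIFICATION.md` and checked, assuming each example is an `SSMD:` fence followed by an `SSML:` fence. The names `extract_examples`, `failing_examples` and `EMPHASIS_ONLY` are hypothetical.

```ruby
FENCE = "`" * 3 # a Markdown code fence, built here instead of typed literally

# Collects every "SSMD:" block together with the "SSML:" block that follows it.
def extract_examples(markdown)
  pattern = /^SSMD:\n#{FENCE}[^\n]*\n(.*?)\n#{FENCE}\n+SSML:\n#{FENCE}[^\n]*\n(.*?)\n#{FENCE}/m
  markdown.scan(pattern).map { |ssmd, ssml| { ssmd: ssmd, ssml: ssml } }
end

# Returns the examples whose converted output differs from the expected SSML.
def failing_examples(markdown, converter)
  extract_examples(markdown).reject do |example|
    converter.call(example[:ssmd]) == example[:ssml]
  end
end

# A tiny stand-in for SPECIFICATION.md.
SAMPLE = [
  "### Emphasis", "",
  "SSMD:", FENCE, "*word*", FENCE, "",
  "SSML:", "#{FENCE}html", "<emphasis>word</emphasis>", FENCE
].join("\n")

# Stand-in converter that only knows emphasis (assumption, not the gem's code).
EMPHASIS_ONLY = ->(ssmd) { ssmd.gsub(/\*([^*]+)\*/, '<emphasis>\1</emphasis>') }

puts failing_examples(SAMPLE, EMPHASIS_ONLY).empty? ? "all examples pass" : "failures found"
```

Passing the converter in as a callable keeps the check independent of any one implementation, which matches the README's statement that other tools can be tested against the same specification.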
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
@@ -0,0 +1,333 @@
1
+ # SSMD Specification
2
+
3
+ Here we specify how Speech Synthesis Markdown (SSMD) works.
4
+
5
+ ## Syntax
6
+
7
+ SSMD is mapped to SSML using the following rules.
8
+
9
+ * [Text](#text)
10
+ * [Emphasis](#emphasis)
11
+ * [Break](#break)
12
+ * [Language](#language)
13
+ * [Mark](#mark)
14
+ * [Paragraph](#paragraph)
15
+ * [Phoneme](#phoneme)
16
+ * [Prosody](#prosody)
17
+ * [Say-as](#say-as)
18
+ * [Substitution](#substitution)
19
+ * [Extensions](#extensions)
20
+
21
+ ***
22
+
23
+ ### Text
24
+
25
+ Any text written is implicitly wrapped in a `<speak>` root element.
26
+ This will be omitted in the rest of the examples shown in this section.
27
+
28
+ SSMD:
29
+ ```
30
+ text
31
+ ```
32
+
33
+ SSML:
34
+ ```html
35
+ <speak>text</speak>
36
+ ```
37
+
38
+ ***
39
+
40
+ ### Emphasis
41
+
42
+ SSMD:
43
+ ```
44
+ *word*
45
+ ```
46
+
47
+ SSML:
48
+ ```html
49
+ <emphasis>word</emphasis>
50
+ ```
51
+
52
+ ***
53
+
54
+ ### Break
55
+
56
+ Pauses can be indicated by using `...`. Several modifications to the duration are allowed as shown below.
57
+
58
+ SSMD:
59
+ ```
60
+ Hello ... world (default: x-strong break like after a paragraph)
61
+ Hello - ...0 world (skip break when there would otherwise be one like after this dash)
62
+ Hello ...c world (medium break like after a comma)
63
+ Hello ...s world (strong break like after a sentence)
64
+ Hello ...p world (extra strong break like after a paragraph)
65
+ Hello ...5s world (5 second break (max 10s))
66
+ Hello ...100ms world (100 millisecond break (max 10000ms))
67
+ Hello ...100 world (100 millisecond break (max 10000ms))
68
+ ```
69
+
70
+ SSML:
71
+ ```html
72
+ Hello <break strength="x-strong"/> world
73
+ Hello - <break strength="none"/> world
74
+ Hello <break strength="medium"/> world
75
+ Hello <break strength="strong"/> world
76
+ Hello <break strength="x-strong"/> world
77
+ Hello <break time="5s"/> world
78
+ Hello <break time="100ms"/> world
79
+ Hello <break time="100ms"/> world
80
+ ```
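Reviewer note: breaks are not implemented in this version of the gem. The mapping above could be sketched roughly as follows. `BREAK_STRENGTHS`, `break_tag` and `convert_breaks` are hypothetical names, and the 10 second maximum mentioned in the examples is not enforced here.

```ruby
# Hypothetical helper, not part of the ssmd gem: maps the modifier that
# follows "..." to an SSML <break/> tag, following the examples above.
BREAK_STRENGTHS = {
  ""  => "x-strong", # plain "..."
  "0" => "none",
  "c" => "medium",
  "s" => "strong",
  "p" => "x-strong"
}.freeze

def break_tag(modifier)
  if BREAK_STRENGTHS.key?(modifier)
    %(<break strength="#{BREAK_STRENGTHS[modifier]}"/>)
  elsif (m = /\A(\d+)(s|ms)?\z/.match(modifier))
    # A bare number is read as milliseconds, as in "...100".
    %(<break time="#{m[1]}#{m[2] || 'ms'}"/>)
  end
end

# Replaces every "..." (with its optional modifier) in the text.
def convert_breaks(text)
  text.gsub(/\.\.\.(\d+(?:ms|s)?|[csp])?/) { break_tag($1.to_s) }
end

puts convert_breaks("Hello ...5s world")
```

The pattern is deliberately naive: it would also treat the first letter of a word directly after `...` as a modifier, so a real implementation would need a stricter boundary rule.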
81
+
82
+ ***
83
+
84
+ ### Language
85
+
86
+ Text passages can be annotated with ISO 639-1 language codes as shown below.
87
+ SSML expects a full code including a country. While you can provide one,
88
+ SSMD will use a sensible default where it is omitted,
89
+ as can be seen in the examples below, where `en` defaults to `en-US` and
90
+ `de` defaults to `de-DE`.
91
+
92
+ SSMD:
93
+ ```
94
+ Ich sah [Guardians of the Galaxy](en) im Kino.
95
+ Ich sah [Guardians of the Galaxy](en-GB) im Kino.
96
+ I saw ["Die Häschenschule"](de) in the cinema.
97
+ ```
98
+
99
+ SSML:
100
+ ```html
101
+ Ich sah <lang xml:lang="en-US">Guardians of the Galaxy</lang> im Kino.
102
+ Ich sah <lang xml:lang="en-GB">Guardians of the Galaxy</lang> im Kino.
103
+ I saw <lang xml:lang="de-DE">"Die Häschenschule"</lang> in the cinema.
104
+ ```
105
+
106
+ ***
107
+
108
+ ### Mark
109
+
110
+ Sections of text can be tagged using marks. They do not affect the synthesis, but
111
+ SSML processing engines can return them as meta information and emit
112
+ events during processing based on these marks.
113
+
114
+ SSMD:
115
+ ```
116
+ I always wanted a @animal cat as a pet.
117
+ ```
118
+
119
+ SSML:
120
+ ```html
121
+ I always wanted a <mark name="animal"/> cat as a pet.
122
+ ```
123
+
124
+ ***
125
+
126
+ ### Paragraph
127
+
128
+ Empty lines indicate a paragraph.
129
+
130
+ SSMD:
131
+ ```
132
+ First prepare the ingredients.
133
+ Don't forget to wash them first.
134
+
135
+ Lastly mix them all together.
136
+ ```
137
+
138
+ SSML:
139
+ ```html
140
+ <p>First prepare the ingredients. Don't forget to wash them first.</p>
141
+ <p>Lastly mix them all together.</p>
142
+ ```
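Reviewer note: paragraphs are not implemented in this version of the gem. A minimal sketch of the rule, assuming that blank lines separate paragraphs and that line breaks inside a paragraph collapse to single spaces, might look like this. `convert_paragraphs` is a hypothetical name.

```ruby
# Hypothetical helper, not part of the ssmd gem.
# Splits the text on blank lines and wraps each chunk in <p>...</p>.
def convert_paragraphs(text)
  text.strip.split(/\n\s*\n/).map do |paragraph|
    # Join the lines of one paragraph with single spaces.
    "<p>#{paragraph.split("\n").map(&:strip).join(' ')}</p>"
  end.join("\n")
end

puts convert_paragraphs(<<~SSMD)
  First prepare the ingredients.
  Don't forget to wash them first.

  Lastly mix them all together.
SSMD
```

This sketch wraps a lone paragraph in `<p>` as well. The specification does not say whether that is intended, so treat it as an open design choice.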
143
+
144
+ ***
145
+
146
+ ### Phoneme
147
+
148
+ Sometimes the speech synthesis engine needs to be told how exactly to pronounce a word.
149
+ This can be done via phonemes. While SSML supports IPA, SSMD uses [X-SAMPA](https://en.wikipedia.org/wiki/X-SAMPA) by default.
150
+
151
+ SSMD:
152
+ ```
153
+ The German word ["dich"](ph: dIC) does not sound like dick.
154
+ ```
155
+
156
+ SSML:
157
+ ```html
158
+ The German word <phoneme alphabet="ipa" ph="dɪç">"dich"</phoneme> does not sound like dick.
159
+ ```
160
+
161
+ ***
162
+
163
+ ### Prosody
164
+
165
+ The prosody or rhythm depends on the volume, rate and pitch of the delivered text.
166
+
167
+ Each of those values can be defined by a number between 1 and 5 (volume also accepts 0 for silent), where those mean:
168
+
169
+ | number | volume | rate | pitch |
170
+ | ------ | ------ | ---- | ----- |
171
+ | 0 | silent | | |
172
+ | 1 | x-soft | x-slow | x-low |
173
+ | 2 | soft | slow | low |
174
+ | 3 | medium | medium | medium |
175
+ | 4 | loud | fast | high |
176
+ | 5 | x-loud | x-fast | x-high |
177
+
178
+ SSMD:
179
+ ```
180
+ Volume:
181
+
182
+ ~silent~
183
+ --extra soft--
184
+ -soft-
185
+ medium
186
+ +loud+ or LOUD
187
+ ++extra loud++
188
+
189
+ Rate:
190
+
191
+ <<extra slow<<
192
+ <slow<
193
+ medium
194
+ >fast>
195
+ >>extra fast>>
196
+
197
+ Pitch:
198
+
199
+ __extra low__
200
+ _low_
201
+ medium
202
+ ^high^
203
+ ^^extra high^^
204
+
205
+ ++>>^^extra loud, fast and high^^>>++ or
206
+ [extra loud, fast, and high](vrp: 555) or
207
+ [extra loud, fast, and high](v: 5, r: 5, p: 5)
208
+ ```
209
+
210
+ SSML:
211
+ ```html
212
+ Volume:
213
+
214
+ <prosody volume="silent">silent</prosody>
215
+ <prosody volume="x-soft">extra soft</prosody>
216
+ <prosody volume="soft">soft</prosody>
217
+ medium
218
+ <prosody volume="loud">loud</prosody> or <prosody volume="loud">loud</prosody>
219
+ <prosody volume="x-loud">extra loud</prosody>
220
+
221
+ Rate:
222
+
223
+ <prosody rate="x-slow">extra slow</prosody>
224
+ <prosody rate="slow">slow</prosody>
225
+ medium
226
+ <prosody rate="fast">fast</prosody>
227
+ <prosody rate="x-fast">extra fast</prosody>
228
+
229
+ Pitch:
230
+
231
+ <prosody pitch="x-low">extra low</prosody>
232
+ <prosody pitch="low">low</prosody>
233
+ medium
234
+ <prosody pitch="high">high</prosody>
235
+ <prosody pitch="x-high">extra high</prosody>
236
+
237
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody> or
238
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody> or
239
+ <prosody volume="x-loud" rate="x-fast" pitch="x-high">extra loud, fast and high</prosody>
240
+ ```
241
+
242
+ The shortcuts are listed first. While they can be combined, sometimes it's easier and shorter to just use
243
+ the explicit form shown in the last two lines. All of them can be nested, too.
244
+ Moreover, changes in volume (`[louder](v: +10dB)`) and pitch (`[lower](p: -4%)`) can also be given explicitly as relative values.
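Reviewer note: prosody is not implemented in this version of the gem. The numeric table could be turned into attributes roughly as below. `PROSODY_LEVELS` and `prosody` are hypothetical names, and relative values such as `+10dB` are left out of the sketch.

```ruby
# Index = numeric level from the table; nil marks levels the table leaves empty.
PROSODY_LEVELS = {
  volume: ["silent", "x-soft", "soft", "medium", "loud", "x-loud"],
  rate:   [nil, "x-slow", "slow", "medium", "fast", "x-fast"],
  pitch:  [nil, "x-low", "low", "medium", "high", "x-high"]
}.freeze

# Hypothetical helper: builds a <prosody> element from numeric levels.
def prosody(text, volume: nil, rate: nil, pitch: nil)
  attributes = { volume: volume, rate: rate, pitch: pitch }.map do |name, level|
    next if level.nil?

    value = PROSODY_LEVELS[name][level] if (0..5).cover?(level)
    raise ArgumentError, "invalid #{name} level: #{level}" if value.nil?

    %(#{name}="#{value}")
  end.compact

  return text if attributes.empty?

  "<prosody #{attributes.join(' ')}>#{text}</prosody>"
end

puts prosody("extra loud, fast and high", volume: 5, rate: 5, pitch: 5)
```

Keeping the three scales in one table makes the asymmetry explicit: level 0 is only defined for volume, so `rate: 0` or `pitch: 0` raises an error.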
245
+
246
+ ***
247
+
248
+ ### Say-as
249
+
250
+ You can give the speech synthesis engine hints as to what it's supposed to read using `as`.
251
+
252
+ Possible values:
253
+
254
+ * character - spell out each single character, e.g. for KGB
255
+ * number - cardinal number, e.g. 100
256
+ * ordinal - ordinal number, e.g. 1st
257
+ * digits - spell out each single digit, e.g. 123 as 1 - 2 - 3
258
+ * fraction - pronounce number as fraction, e.g. 3/4 as three quarters
259
+ * unit - e.g. 1meter
260
+ * date - read content as a date, must provide format
261
+ * time - duration in minutes and seconds
262
+ * address - read as part of an address
263
+ * telephone - read content as a telephone number
264
+ * expletive - beeps out the content
265
+
266
+ SSMD:
267
+ ```
268
+ Today on [29.12.2017](as: date, format: "dd.mm.yyyy") my
269
+ telephone number is [+49 123456](as: telephone).
270
+ You can't say [fuck](as: expletive) on television.
271
+ ```
272
+
273
+ SSML:
274
+ ```html
275
+ Today on <say-as interpret-as="date" format="dd.mm.yyyy">29.12.2017</say-as> my
276
+ telephone number is <say-as interpret-as="telephone">+49 123456</say-as>.
277
+ You can't say <say-as interpret-as="expletive">fuck</say-as> on television.
278
+ ```
279
+
280
+ ***
281
+
282
+ ### Substitution
283
+
284
+ Allows substituting the pronunciation of a word, such as an acronym, with an alias.
285
+
286
+ SSMD:
287
+ ```
288
+ I'd like to drink some [H2O](sub: water) now.
289
+ ```
290
+
291
+ SSML:
292
+ ```html
293
+ I'd like to drink some <sub alias="water">H2O</sub> now.
294
+ ```
295
+
296
+ ***
297
+
298
+ ### Extensions
299
+
300
+ It must be possible to extend SSML with constructs specific to certain speech synthesis engines.
301
+ Registered extensions must have a unique name. They can take parameters.
302
+ For instance, let's assume we registered Amazon Polly's whisper effect in some hypothetical SSMD
303
+ library API.
304
+
305
+ ```ruby
306
+ SSMD.register "whisper", "amazon:effect", name: "whispered"
307
+ ```
308
+
309
+ SSMD:
310
+ ```
311
+ If he [whispers](ext: whisper), he lies.
312
+ ```
313
+
314
+ SSML:
315
+ ```html
316
+ If he <amazon:effect name="whispered">whispers</amazon:effect>, he lies.
317
+ ```
318
+
319
+ ***
320
+
321
+ ### Nesting and duplicate annotations
322
+
323
+ Formats can be nested. Duplicate annotations of the same type are ignored.
324
+
325
+ SSMD:
326
+ ```
327
+ Der Film [Guardians of the *Galaxy*](en-GB, de, fr-FR) ist ganz [okay](en-US).
328
+ ```
329
+
330
+ SSML:
331
+ ```html
332
+ Der Film <lang xml:lang="en-GB">Guardians of the <emphasis>Galaxy</emphasis></lang> ist ganz <lang xml:lang="en-US">okay</lang>.
333
+ ```
@@ -0,0 +1,7 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "ssmd"
5
+
6
+ require "pry"
7
+ Pry.start
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,14 @@
1
+ require "ssmd/version"
2
+ require "ssmd/converter"
3
+
4
+ module SSMD
5
+ module_function
6
+
7
+ def to_ssml(ssmd)
8
+ Converter.new(ssmd).convert
9
+ end
10
+
11
+ def root_dir
12
+ Gem::Specification.find_by_name("ssmd").gem_dir
13
+ end
14
+ end
@@ -0,0 +1,7 @@
1
+ module SSMD::Annotations
2
+
3
+ end
4
+
5
+ require 'ssmd/annotations/annotation'
6
+ require 'ssmd/annotations/language_annotation'
7
+ require 'ssmd/annotations/phoneme_annotation'
@@ -0,0 +1,29 @@
1
+ module SSMD::Annotations
2
+ class Annotation
3
+ class << self
4
+ def try(text)
5
+ match = /\A#{regex}\Z/.match text
6
+
7
+ if match
8
+ new(*match.captures)
9
+ end
10
+ end
11
+
12
+ def regex
13
+ raise "subclass responsibility"
14
+ end
15
+ end
16
+
17
+ def initialize
18
+ raise "implement expecting one argument for each capture group in the regex"
19
+ end
20
+
21
+ def wrap(text)
22
+ raise "wrap given text in resulting SSML element"
23
+ end
24
+
25
+ def combine(annotation)
26
+ raise "combine this annotation with the given annotation of the same type"
27
+ end
28
+ end
29
+ end
@@ -0,0 +1,37 @@
1
+ require_relative 'annotation'
2
+
3
+ module SSMD::Annotations
4
+ class LanguageAnnotation < Annotation
5
+ attr_reader :language
6
+
7
+ def self.regex
8
+ /([a-z]{2}(?:-[A-Z]{2})?)/
9
+ end
10
+
11
+ def initialize(language)
12
+ @language = complete_language language
13
+ end
14
+
15
+ def wrap(text)
16
+ "<lang xml:lang=\"#{language}\">#{text}</lang>"
17
+ end
18
+
19
+ def combine(annotation)
20
+ self # discard further language annotations
21
+ end
22
+
23
+ def complete_language(language)
24
+ if language.size == 2
25
+ language_completion_table[language] || "#{language}-#{language.upcase}"
26
+ else
27
+ language
28
+ end
29
+ end
30
+
31
+ def language_completion_table
32
+ {
33
+ "en" => "en-US"
34
+ }
35
+ end
36
+ end
37
+ end
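Reviewer note on `complete_language`: the fallback `"#{language}-#{language.upcase}"` yields `de-DE` as the specification requires, but it would also produce codes such as `ja-JA`, where the usual region is `JP`. A sketch of the same lookup with a larger table follows. Only the `en` entry comes from the gem; the `ja` and `sv` entries and the constant name `LANGUAGE_DEFAULTS` are illustrative assumptions.

```ruby
# Illustrative defaults; only "en" => "en-US" is taken from the gem itself.
LANGUAGE_DEFAULTS = {
  "en" => "en-US",
  "ja" => "ja-JP",
  "sv" => "sv-SE"
}.freeze

# Completes a bare two-letter code; full codes are returned unchanged.
def complete_language(language)
  return language unless language.size == 2

  # Fall back to repeating the language as the region, e.g. "de" -> "de-DE".
  LANGUAGE_DEFAULTS.fetch(language) { "#{language}-#{language.upcase}" }
end

puts complete_language("de")
puts complete_language("ja")
```

Which defaults belong in the table is a product decision. The point of the sketch is only that the upcase fallback is a guess that does not hold for every language.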
@@ -0,0 +1,44 @@
1
+ require_relative 'annotation'
2
+
3
+ require 'pathname'
4
+
5
+ module SSMD::Annotations
6
+ class PhonemeAnnotation < Annotation
7
+ attr_reader :x_sampa, :ipa
8
+
9
+ def self.regex
10
+ /ph: ?(.+)/
11
+ end
12
+
13
+ def initialize(x_sampa)
14
+ @x_sampa = x_sampa
15
+ @ipa = x_sampa_to_ipa x_sampa
16
+ end
17
+
18
+ def wrap(text)
19
+ "<phoneme alphabet=\"ipa\" ph=\"#{ipa}\">#{text}</phoneme>"
20
+ end
21
+
22
+ def combine(annotation)
23
+ self # discard further phoneme annotations
24
+ end
25
+
26
+ def x_sampa_to_ipa(input)
27
+ x_sampa_to_ipa_table.inject(input) do |text, (x_sampa, ipa)|
28
+ text.gsub x_sampa, ipa
29
+ end
30
+ end
31
+
32
+ def x_sampa_to_ipa_table
33
+ @table ||= begin
34
+ lines = File.read(x_sampa_to_ipa_table_file_path).lines
35
+
36
+ lines.map { |line| line.split(" ") }
37
+ end
38
+ end
39
+
40
+ def x_sampa_to_ipa_table_file_path
41
+ Pathname(SSMD.root_dir).join("lib/ssmd/annotations/xsampa_to_ipa_table.txt")
42
+ end
43
+ end
44
+ end
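Reviewer note on `x_sampa_to_ipa`: the method applies the table rows one after another in file order. That appears to mean a short symbol listed earlier (for instance `I`) is replaced before a longer symbol that starts with it (`I\`) gets a chance to match. One possible alternative, shown here as a sketch with a small excerpt of the table, is a single pass that tries the longest symbols first. `X_SAMPA_TO_IPA` and `X_SAMPA_PATTERN` are hypothetical names.

```ruby
# A small excerpt of the table; the gem reads the full one from a file.
X_SAMPA_TO_IPA = {
  "d"    => "d",
  "I"    => "ɪ",
  "C"    => "ç",
  "r"    => "r",
  "r\\"  => "ɹ",
  "r\\`" => "ɻ"
}.freeze

# Longest symbols first, so that "r\`" wins over "r\" and "r\" over "r".
X_SAMPA_PATTERN = Regexp.union(
  X_SAMPA_TO_IPA.keys.sort_by { |symbol| -symbol.size }
)

# Single pass: each matched symbol is looked up once, and replaced text is
# never scanned again, so one replacement cannot corrupt another.
def x_sampa_to_ipa(input)
  input.gsub(X_SAMPA_PATTERN) { |symbol| X_SAMPA_TO_IPA[symbol] }
end

puts x_sampa_to_ipa("dIC")
```

Characters without a table entry pass through unchanged, which is the same behaviour as the sequential version.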
@@ -0,0 +1,174 @@
1
+ a a
2
+ b b
3
+ b_< ɓ
4
+ c c
5
+ d d
6
+ d` ɖ
7
+ d_< ɗ
8
+ e e
9
+ f f
10
+ g ɡ
11
+ g_< ɠ
12
+ h h
13
+ h\ ɦ
14
+ i i
15
+ j j
16
+ j\ ʝ
17
+ k k
18
+ l l
19
+ l` ɭ
20
+ l\ ɺ
21
+ m m
22
+ n n
23
+ n` ɳ
24
+ o o
25
+ p p
26
+ p\ ɸ
27
+ q q
28
+ r r
29
+ r` ɽ
30
+ r\ ɹ
31
+ r\` ɻ
32
+ s s
33
+ s` ʂ
34
+ s\ ɕ
35
+ t t
36
+ t` ʈ
37
+ u u
38
+ v v
39
+ v\ ʋ
40
+ P ʋ
41
+ w w
42
+ x x
43
+ x\ ɧ
44
+ y y
45
+ z z
46
+ z` ʐ
47
+ z\ ʑ
48
+ A ɑ
49
+ B β
50
+ B\ ʙ
51
+ C ç
52
+ D ð
53
+ E ɛ
54
+ F ɱ
55
+ G ɣ
56
+ G\ ɢ
57
+ G\_< ʛ
58
+ H ɥ
59
+ H\ ʜ
60
+ I ɪ
61
+ I\ ɪ̈
62
+ I\ ɨ̞
63
+ J ɲ
64
+ J\ ɟ
65
+ J\_< ʄ
66
+ K ɬ
67
+ K\ ɮ
68
+ L ʎ
69
+ L\ ʟ
70
+ M ɯ
71
+ M\ ɰ
72
+ N ŋ
73
+ N\ ɴ
74
+ O ɔ
75
+ O\ ʘ
76
+ P ʋ
77
+ v\ ʋ
78
+ Q ɒ
79
+ R ʁ
80
+ R\ ʀ
81
+ S ʃ
82
+ T θ
83
+ U ʊ
84
+ U\ ʊ̈
85
+ U\ ʉ̞
86
+ V ʌ
87
+ W ʍ
88
+ X χ
89
+ X\ ħ
90
+ Y ʏ
91
+ Z ʒ
92
+ . .
93
+ " ˈ
94
+ % ˌ
95
+ ' ʲ
96
+ _j ʲ
97
+ : ː
98
+ :\ ˑ
99
+ @ ə
100
+ @\ ɘ
101
+ { æ
102
+ } ʉ
103
+ 1 ɨ
104
+ 2 ø
105
+ 3 ɜ
106
+ 3\ ɞ
107
+ 4 ɾ
108
+ 5 ɫ
109
+ 6 ɐ
110
+ 7 ɤ
111
+ 8 ɵ
112
+ 9 œ
113
+ & ɶ
114
+ ? ʔ
115
+ ?\ ʕ
116
+ <\ ʢ
117
+ >\ ʡ
118
+ ^ ꜛ
119
+ ! ꜜ
120
+ !\ ǃ
121
+ | |
122
+ |\ ǀ
123
+ || ‖
124
+ |\|\ ǁ
125
+ =\ ǂ
126
+ -\ ‿
127
+ _" ̈
128
+ _+ ̟
129
+ _- ̠
130
+ _/ ̌
131
+ _0 ̥
132
+ = ̩
133
+ _= ̩
134
+ _> ʼ
135
+ _?\ ˤ
136
+ _\ ̂
137
+ _^ ̯
138
+ _} ̚
139
+ ` ˞
140
+ ~ ̃
141
+ _~ ̃
142
+ _A ̘
143
+ _a ̺
144
+ _B ̏
145
+ _B_L ᷅
146
+ _c ̜
147
+ _d ̪
148
+ _e ̴
149
+ _F ̂
150
+ _G ˠ
151
+ _H ́
152
+ _H_T ᷄
153
+ _h ʰ
154
+ _j ʲ
155
+ ' ʲ
156
+ _k ̰
157
+ _L ̀
158
+ _l ˡ
159
+ _M ̄
160
+ _m ̻
161
+ _N ̼
162
+ _n ⁿ
163
+ _O ̹
164
+ _o ̞
165
+ _q ̙
166
+ _R ̌
167
+ _R_F ᷈
168
+ _r ̝
169
+ _T ̋
170
+ _t ̤
171
+ _v ̬
172
+ _w ʷ
173
+ _X ̆
174
+ _x ̽
@@ -0,0 +1,35 @@
1
+ require 'ssmd/processors'
2
+
3
+ module SSMD
4
+ class Converter
5
+ attr_reader :input
6
+
7
+ def initialize(input)
8
+ @input = input
9
+ end
10
+
11
+ def convert
12
+ result = processors.inject(input) do |text, processor|
13
+ process processor.new, text
14
+ end
15
+
16
+ "<speak>#{result.strip}</speak>"
17
+ end
18
+
19
+ def processors
20
+ p = SSMD::Processors
21
+
22
+ [
23
+ p::EmphasisProcessor, p::AnnotationProcessor, p::MarkProcessor
24
+ ]
25
+ end
26
+
27
+ def process(processor, input)
28
+ if processor.matches? input
29
+ process processor, processor.substitute(input)
30
+ else
31
+ input
32
+ end
33
+ end
34
+ end
35
+ end
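Reviewer note on `Converter#convert`: each processor is applied repeatedly until its pattern no longer matches, and the result is handed to the next processor. The same flow is sketched below without the gem's classes, using pattern and block pairs and a loop in place of the recursion. `RULES`, `apply_rule` and this standalone `to_ssml` are stand-ins and cover emphasis and marks only.

```ruby
# Stand-ins for the processor classes: a pattern plus a block that builds
# the replacement from the match.
RULES = [
  [/\*([^*]+)\*/, ->(m) { "<emphasis>#{m[1]}</emphasis>" }],
  [/@(\w+)/,      ->(m) { %(<mark name="#{m[1]}"/>) }]
].freeze

# Replace the first match, then look again, until nothing matches.
def apply_rule(text, pattern, builder)
  while (match = pattern.match(text))
    text = match.pre_match + builder.call(match) + match.post_match
  end
  text
end

# Run every rule in order and wrap the result in the <speak> root element.
def to_ssml(input)
  body = RULES.inject(input) do |text, (pattern, builder)|
    apply_rule(text, pattern, builder)
  end
  "<speak>#{body.strip}</speak>"
end

puts to_ssml("hello *SSMD*!")
```

The loop only terminates because no replacement contains text that its own pattern matches again. Any new rule has to keep that property.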
@@ -0,0 +1,8 @@
1
+ module SSMD::Processors
2
+
3
+ end
4
+
5
+ require 'ssmd/processors/processor'
6
+ require 'ssmd/processors/annotation_processor'
7
+ require 'ssmd/processors/emphasis_processor'
8
+ require 'ssmd/processors/mark_processor'
@@ -0,0 +1,86 @@
1
+ require_relative 'processor'
2
+
3
+ require 'ssmd/annotations'
4
+
5
+ module SSMD::Processors
6
+ class AnnotationProcessor < Processor
7
+ attr_reader :text, :annotations
8
+
9
+ def result
10
+ @text, annotations_text = match.captures
11
+
12
+ if annotations_text
13
+ @annotations = combine_annotations parse_annotations(annotations_text)
14
+
15
+ @annotations.inject(text) do |text, a|
16
+ a.wrap(text)
17
+ end
18
+ end
19
+ end
20
+
21
+ def self.annotations
22
+ a = SSMD::Annotations
23
+
24
+ [a::LanguageAnnotation, a::PhonemeAnnotation]
25
+ end
26
+
27
+ def ok?
28
+ !@text.nil? && Array(@annotations).size > 0
29
+ end
30
+
31
+ def error?
32
+ !ok?
33
+ end
34
+
35
+ private
36
+
37
+ def parse_annotations(text)
38
+ text.split(/, ?/).flat_map do |annotation_text|
39
+ annotation = find_annotation annotation_text
40
+
41
+ if annotation.nil?
42
+ warnings.push "Unknown annotation: #{annotation_text}"
43
+ end
44
+
45
+ [annotation].compact
46
+ end
47
+ end
48
+
49
+ def find_annotation(text)
50
+ self.class.annotations.lazy
51
+ .map { |a| a.try text }
52
+ .find { |a| !a.nil? }
53
+ end
54
+
55
+ def combine_annotations(annotations)
56
+ annotations
57
+ .group_by { |a| a.class }
58
+ .values
59
+ .map { |as| as.reduce { |a, b| a.combine b } }
60
+ end
61
+
62
+ ##
63
+ # Matches explicitly annotated sections.
64
+ # For example:
65
+ #
66
+ # [Guardians of the Galaxy](en-GB, v: +4dB, p: -3%)
67
+ def regex
68
+ %r{
69
+ \[ # opening text
70
+ ([^\]]+) # annotated text
71
+ \] # closing text
72
+ \( # opening annotations
73
+ ((?:
74
+ (?:
75
+ #{annotations_regex}
76
+ )(?:,\s?)?
77
+ )+)
78
+ \) # closing annotations
79
+ }x
80
+ end
81
+
82
+ def annotations_regex
83
+ self.class.annotations.map(&:regex).join("|")
84
+ end
85
+ end
86
+ end
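Reviewer note on `parse_annotations` and `combine_annotations`: the annotation list is split on commas, each part is matched against the known annotation types, and duplicates of one type are reduced to a single annotation. The sketch below reproduces that flow with minimal stand-in classes. `Language`, `Phoneme`, `parse_annotation` and the returned warnings array are assumptions made for illustration.

```ruby
# Minimal stand-ins for the gem's annotation classes.
Language = Struct.new(:code) do
  def combine(_other)
    self # the first language annotation wins
  end
end

Phoneme = Struct.new(:x_sampa) do
  def combine(_other)
    self # the first phoneme annotation wins
  end
end

# Returns an annotation object, or nil when the text is not recognised.
def parse_annotation(text)
  case text
  when /\A[a-z]{2}(?:-[A-Z]{2})?\z/ then Language.new(text)
  when /\Aph: ?(.+)\z/              then Phoneme.new($1)
  end
end

# Returns [combined annotations, warnings] for a list such as "en-GB, de".
def parse_annotations(text)
  warnings = []
  parsed = text.split(/, ?/).map do |part|
    annotation = parse_annotation(part)
    warnings << "Unknown annotation: #{part}" if annotation.nil?
    annotation
  end.compact

  combined = parsed.group_by(&:class).values.map do |group|
    group.reduce { |first, other| first.combine(other) }
  end
  [combined, warnings]
end

combined, warnings = parse_annotations("en-GB, de, fr-FR")
puts combined.map(&:code).inspect
puts warnings.inspect
```

Returning the warnings next to the result, instead of storing them on the processor, makes it easier to see which part of the input was ignored.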
@@ -0,0 +1,15 @@
1
+ require_relative 'processor'
2
+
3
+ module SSMD::Processors
4
+ class EmphasisProcessor < Processor
5
+ def result
6
+ text = match.captures.first
7
+
8
+ "<emphasis>#{text}</emphasis>"
9
+ end
10
+
11
+ def regex
12
+ /\*([^\*]+)\*/
13
+ end
14
+ end
15
+ end
@@ -0,0 +1,15 @@
1
+ require_relative 'processor'
2
+
3
+ module SSMD::Processors
4
+ class MarkProcessor < Processor
5
+ def result
6
+ text = match.captures.first
7
+
8
+ "<mark name=\"#{text}\"/>"
9
+ end
10
+
11
+ def regex
12
+ /@(\w+)/
13
+ end
14
+ end
15
+ end
@@ -0,0 +1,29 @@
1
+ module SSMD::Processors
2
+ class Processor
3
+ attr_reader :match
4
+
5
+ def matches?(input)
6
+ @match = regex.match input
7
+
8
+ !match.nil?
9
+ end
10
+
11
+ def substitute(input)
12
+ if match
13
+ match.pre_match + result + match.post_match
14
+ end
15
+ end
16
+
17
+ def result
18
+ raise "subclass responsibility"
19
+ end
20
+
21
+ def regex
22
+ raise "subclass responsibility"
23
+ end
24
+
25
+ def warnings
26
+ @warnings ||= []
27
+ end
28
+ end
29
+ end
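Reviewer note on `Processor`: a subclass supplies `regex` and `result`, and the base class does the matching and splicing. To illustrate, here is a sketch of a processor for the specification's substitution syntax, which this version of the gem does not implement. `SubstitutionProcessor` is hypothetical, and the base class is repeated in condensed form so that the example runs on its own.

```ruby
# Condensed copy of the base class, so the example is self-contained.
class Processor
  attr_reader :match

  def matches?(input)
    @match = regex.match(input)
    !match.nil?
  end

  def substitute(input)
    match.pre_match + result + match.post_match if match
  end
end

# Hypothetical processor for "[H2O](sub: water)".
class SubstitutionProcessor < Processor
  def regex
    /\[([^\]]+)\]\(sub: ?([^)]+)\)/
  end

  def result
    text, aliased = match.captures
    %(<sub alias="#{aliased}">#{text}</sub>)
  end
end

processor = SubstitutionProcessor.new
input = "I'd like to drink some [H2O](sub: water) now."
puts processor.matches?(input) ? processor.substitute(input) : input
```

The sketch does not escape XML special characters in the alias, and it bypasses `AnnotationProcessor`. Whether substitution belongs there as an annotation type is a design question the diff does not answer.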
@@ -0,0 +1,3 @@
1
+ module SSMD
2
+ VERSION = "0.1.0"
3
+ end
@@ -0,0 +1,40 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'ssmd/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "ssmd"
8
+ spec.version = SSMD::VERSION
9
+ spec.authors = ["Markus Kahl"]
10
+ spec.email = ["machisuji@gmail.com"]
11
+
12
+ spec.summary = %q{
13
+ Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for SSML
14
+ and the corresponding tool converting SSMD to SSML.
15
+ }
16
+ spec.homepage = "https://github.com/machisuji/ssmd"
17
+ spec.license = "MIT"
18
+
19
+ # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
20
+ # to allow pushing to a single host or delete this section to allow pushing to any host.
21
+ if spec.respond_to?(:metadata)
22
+ spec.metadata['allowed_push_host'] = "https://rubygems.org"
23
+ else
24
+ raise "RubyGems 2.0 or newer is required to protect against " \
25
+ "public gem pushes."
26
+ end
27
+
28
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
29
+ f.match(%r{^(test|spec|features)/})
30
+ end
31
+ spec.bindir = "exe"
32
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
33
+ spec.require_paths = ["lib"]
34
+
35
+ spec.add_development_dependency "bundler", "~> 1.14"
36
+ spec.add_development_dependency "rake", "~> 10.0"
37
+ spec.add_development_dependency "rspec", "~> 3.0"
38
+ spec.add_development_dependency "pry", "~> 0.10.4"
39
+ spec.add_development_dependency "pry-byebug", "~> 3.4", ">= 3.4.2"
40
+ end
metadata ADDED
@@ -0,0 +1,147 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ssmd
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Markus Kahl
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2017-05-13 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: bundler
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.14'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.14'
27
+ - !ruby/object:Gem::Dependency
28
+ name: rake
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '10.0'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '10.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rspec
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '3.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '3.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: pry
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: 0.10.4
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: 0.10.4
69
+ - !ruby/object:Gem::Dependency
70
+ name: pry-byebug
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '3.4'
76
+ - - ">="
77
+ - !ruby/object:Gem::Version
78
+ version: 3.4.2
79
+ type: :development
80
+ prerelease: false
81
+ version_requirements: !ruby/object:Gem::Requirement
82
+ requirements:
83
+ - - "~>"
84
+ - !ruby/object:Gem::Version
85
+ version: '3.4'
86
+ - - ">="
87
+ - !ruby/object:Gem::Version
88
+ version: 3.4.2
89
+ description:
90
+ email:
91
+ - machisuji@gmail.com
92
+ executables: []
93
+ extensions: []
94
+ extra_rdoc_files: []
95
+ files:
96
+ - ".gitignore"
97
+ - ".rspec"
98
+ - ".travis.yml"
99
+ - CODE_OF_CONDUCT.md
100
+ - Gemfile
101
+ - LICENSE.txt
102
+ - README.md
103
+ - Rakefile
104
+ - SPECIFICATION.md
105
+ - bin/console
106
+ - bin/setup
107
+ - lib/ssmd.rb
108
+ - lib/ssmd/annotations.rb
109
+ - lib/ssmd/annotations/annotation.rb
110
+ - lib/ssmd/annotations/language_annotation.rb
111
+ - lib/ssmd/annotations/phoneme_annotation.rb
112
+ - lib/ssmd/annotations/xsampa_to_ipa_table.txt
113
+ - lib/ssmd/converter.rb
114
+ - lib/ssmd/processors.rb
115
+ - lib/ssmd/processors/annotation_processor.rb
116
+ - lib/ssmd/processors/emphasis_processor.rb
117
+ - lib/ssmd/processors/mark_processor.rb
118
+ - lib/ssmd/processors/processor.rb
119
+ - lib/ssmd/version.rb
120
+ - ssmd.gemspec
121
+ homepage: https://github.com/machisuji/ssmd
122
+ licenses:
123
+ - MIT
124
+ metadata:
125
+ allowed_push_host: https://rubygems.org
126
+ post_install_message:
127
+ rdoc_options: []
128
+ require_paths:
129
+ - lib
130
+ required_ruby_version: !ruby/object:Gem::Requirement
131
+ requirements:
132
+ - - ">="
133
+ - !ruby/object:Gem::Version
134
+ version: '0'
135
+ required_rubygems_version: !ruby/object:Gem::Requirement
136
+ requirements:
137
+ - - ">="
138
+ - !ruby/object:Gem::Version
139
+ version: '0'
140
+ requirements: []
141
+ rubyforge_project:
142
+ rubygems_version: 2.6.12
143
+ signing_key:
144
+ specification_version: 4
145
+ summary: Speech Synthesis Markdown (SSMD) is a lightweight alternative syntax for
146
+ SSML and the corresponding tool converting SSMD to SSML.
147
+ test_files: []