ibm_watson 2.0.2 → 2.1.3
- checksums.yaml +4 -4
- data/README.md +6 -29
- data/lib/ibm_watson/assistant_v1.rb +114 -79
- data/lib/ibm_watson/assistant_v2.rb +83 -59
- data/lib/ibm_watson/compare_comply_v1.rb +11 -4
- data/lib/ibm_watson/discovery_v1.rb +5 -12
- data/lib/ibm_watson/discovery_v2.rb +201 -110
- data/lib/ibm_watson/language_translator_v3.rb +1 -2
- data/lib/ibm_watson/natural_language_classifier_v1.rb +14 -6
- data/lib/ibm_watson/natural_language_understanding_v1.rb +690 -3
- data/lib/ibm_watson/personality_insights_v3.rb +13 -11
- data/lib/ibm_watson/speech_to_text_v1.rb +582 -333
- data/lib/ibm_watson/text_to_speech_v1.rb +617 -35
- data/lib/ibm_watson/tone_analyzer_v3.rb +1 -2
- data/lib/ibm_watson/version.rb +1 -1
- data/lib/ibm_watson/visual_recognition_v3.rb +1 -2
- data/lib/ibm_watson/visual_recognition_v4.rb +11 -8
- data/test/integration/test_discovery_v2.rb +15 -0
- data/test/integration/test_natural_language_understanding_v1.rb +134 -1
- data/test/integration/test_text_to_speech_v1.rb +57 -0
- data/test/unit/test_discovery_v2.rb +29 -0
- data/test/unit/test_natural_language_understanding_v1.rb +231 -0
- data/test/unit/test_text_to_speech_v1.rb +145 -0
- metadata +3 -3
@@ -14,17 +14,20 @@
|
|
14
14
|
# See the License for the specific language governing permissions and
|
15
15
|
# limitations under the License.
|
16
16
|
#
|
17
|
-
# IBM OpenAPI SDK Code Generator Version: 3.
|
17
|
+
# IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
|
18
18
|
#
|
19
|
-
# IBM
|
20
|
-
#
|
21
|
-
#
|
22
|
-
#
|
23
|
-
# Watson™ Natural Language
|
24
|
-
#
|
25
|
-
#
|
26
|
-
#
|
27
|
-
#
|
19
|
+
# IBM Watson™ Personality Insights is discontinued. Existing instances are
|
20
|
+
# supported until 1 December 2021, but as of 1 December 2020, you cannot create new
|
21
|
+
# instances. Any instance that exists on 1 December 2021 will be deleted.<br/><br/>No
|
22
|
+
# direct replacement exists for Personality Insights. However, you can consider using [IBM
|
23
|
+
# Watson™ Natural Language
|
24
|
+
# Understanding](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-about)
|
25
|
+
# on IBM Cloud® as part of a replacement analytic workflow for your Personality
|
26
|
+
# Insights use cases. You can use Natural Language Understanding to extract data and
|
27
|
+
# insights from text, such as keywords, categories, sentiment, emotion, and syntax. For
|
28
|
+
# more information about the personality models in Personality Insights, see [The science
|
29
|
+
# behind the
|
30
|
+
# service](https://cloud.ibm.com/docs/personality-insights?topic=personality-insights-science).
|
28
31
|
# {: deprecated}
|
29
32
|
#
|
30
33
|
# The IBM Watson Personality Insights service enables applications to derive insights from
|
@@ -54,7 +57,6 @@ require "json"
|
|
54
57
|
require "ibm_cloud_sdk_core"
|
55
58
|
require_relative "./common.rb"
|
56
59
|
|
57
|
-
# Module for the Watson APIs
|
58
60
|
module IBMWatson
|
59
61
|
##
|
60
62
|
# The Personality Insights V3 service.
|
@@ -14,14 +14,20 @@
|
|
14
14
|
# See the License for the specific language governing permissions and
|
15
15
|
# limitations under the License.
|
16
16
|
#
|
17
|
-
# IBM OpenAPI SDK Code Generator Version: 3.
|
17
|
+
# IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
|
18
18
|
#
|
19
19
|
# The IBM Watson™ Speech to Text service provides APIs that use IBM's
|
20
20
|
# speech-recognition capabilities to produce transcripts of spoken audio. The service can
|
21
21
|
# transcribe speech from various languages and audio formats. In addition to basic
|
22
22
|
# transcription, the service can produce detailed information about many different aspects
|
23
|
-
# of the audio.
|
24
|
-
#
|
23
|
+
# of the audio. It returns all JSON response content in the UTF-8 character set.
|
24
|
+
#
|
25
|
+
# The service supports two types of models: previous-generation models that include the
|
26
|
+
# terms `Broadband` and `Narrowband` in their names, and next-generation models that
|
27
|
+
# include the terms `Multimedia` and `Telephony` in their names. Broadband and multimedia
|
28
|
+
# models have minimum sampling rates of 16 kHz. Narrowband and telephony models have
|
29
|
+
# minimum sampling rates of 8 kHz. The next-generation models offer high throughput and
|
30
|
+
# greater transcription accuracy.
|
25
31
|
#
|
26
32
|
# For speech recognition, the service supports synchronous and asynchronous HTTP
|
27
33
|
# Representational State Transfer (REST) interfaces. It also supports a WebSocket
|
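The header comment added in this hunk distinguishes the two model families by the terms in their names and gives a minimum sampling rate for each. A minimal sketch of that naming rule follows. The helper `describe_model` and its constants are hypothetical and are not part of the gem; the model names in the usage note are only illustrative.

```ruby
# Hypothetical helper (not part of ibm_watson): classify a Speech to Text
# model name by the terms described in the header comment.
MIN_SAMPLING_RATE_HZ = {
  "Broadband"  => 16_000, # previous-generation, 16 kHz minimum
  "Multimedia" => 16_000, # next-generation, 16 kHz minimum
  "Narrowband" => 8_000,  # previous-generation, 8 kHz minimum
  "Telephony"  => 8_000   # next-generation, 8 kHz minimum
}.freeze

PREVIOUS_GENERATION_TERMS = %w[Broadband Narrowband].freeze

# Returns a hash with the generation and minimum sampling rate,
# or nil when the name contains none of the four terms.
def describe_model(name)
  term = MIN_SAMPLING_RATE_HZ.keys.find { |t| name.include?(t) }
  return nil if term.nil?

  {
    generation: PREVIOUS_GENERATION_TERMS.include?(term) ? :previous : :next,
    min_sampling_rate_hz: MIN_SAMPLING_RATE_HZ[term]
  }
end
```

For example, `describe_model("en-US_BroadbandModel")` reports a previous-generation model with a 16 kHz minimum, while a name containing `Telephony` is reported as next-generation with an 8 kHz minimum.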
@@ -36,9 +42,10 @@
|
|
36
42
|
# is a formal language specification that lets you restrict the phrases that the service
|
37
43
|
# can recognize.
|
38
44
|
#
|
39
|
-
# Language model customization
|
40
|
-
#
|
41
|
-
# beta functionality for all
|
45
|
+
# Language model customization is available for most previous- and next-generation models.
|
46
|
+
# Acoustic model customization is available for all previous-generation models. Grammars
|
47
|
+
# are beta functionality that is available for all previous-generation models that support
|
48
|
+
# language model customization.
|
42
49
|
|
43
50
|
require "concurrent"
|
44
51
|
require "erb"
|
@@ -46,7 +53,6 @@ require "json"
|
|
46
53
|
require "ibm_cloud_sdk_core"
|
47
54
|
require_relative "./common.rb"
|
48
55
|
|
49
|
-
# Module for the Watson APIs
|
50
56
|
module IBMWatson
|
51
57
|
##
|
52
58
|
# The Speech to Text V1 service.
|
@@ -89,8 +95,8 @@ module IBMWatson
|
|
89
95
|
# among other things. The ordering of the list of models can change from call to
|
90
96
|
# call; do not rely on an alphabetized or static list of models.
|
91
97
|
#
|
92
|
-
# **See also:** [
|
93
|
-
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
|
98
|
+
# **See also:** [Listing
|
99
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
|
94
100
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
95
101
|
def list_models
|
96
102
|
headers = {
|
@@ -116,10 +122,11 @@ module IBMWatson
|
|
116
122
|
# with the service. The information includes the name of the model and its minimum
|
117
123
|
# sampling rate in Hertz, among other things.
|
118
124
|
#
|
119
|
-
# **See also:** [
|
120
|
-
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
|
121
|
-
# @param model_id [String] The identifier of the model in the form of its name from the output of the
|
122
|
-
#
|
125
|
+
# **See also:** [Listing
|
126
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
|
127
|
+
# @param model_id [String] The identifier of the model in the form of its name from the output of the [List
|
128
|
+
# models](#listmodels) method. (**Note:** The model `ar-AR_BroadbandModel` is
|
129
|
+
# deprecated; use `ar-MS_BroadbandModel` instead.).
|
123
130
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
124
131
|
def get_model(model_id:)
|
125
132
|
raise ArgumentError.new("model_id must be provided") if model_id.nil?
|
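The updated `model_id` documentation notes that `ar-AR_BroadbandModel` is deprecated in favor of `ar-MS_BroadbandModel`. One way a caller might handle that before invoking `get_model` is sketched below. The helper `resolve_model_id` and the `DEPRECATED_MODELS` table are hypothetical and are not part of the gem.

```ruby
# Hypothetical helper (not part of ibm_watson): swap a deprecated model
# name for its documented replacement before calling get_model or recognize.
DEPRECATED_MODELS = {
  "ar-AR_BroadbandModel" => "ar-MS_BroadbandModel"
}.freeze

# Returns the replacement for a deprecated model name, or the name
# unchanged when it is not listed as deprecated.
def resolve_model_id(model_id)
  DEPRECATED_MODELS.fetch(model_id, model_id)
end
```

The result could then be passed as `get_model(model_id: resolve_model_id(name))`, assuming a configured service instance.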
@@ -144,7 +151,7 @@ module IBMWatson
|
|
144
151
|
#########################
|
145
152
|
|
146
153
|
##
|
147
|
-
# @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
154
|
+
# @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
148
155
|
# Recognize audio.
|
149
156
|
# Sends audio and returns transcription results for a recognition request. You can
|
150
157
|
# pass a maximum of 100 MB and a minimum of 100 bytes of audio with a request. The
|
@@ -211,8 +218,31 @@ module IBMWatson
|
|
211
218
|
# sampling rate of the audio is lower than the minimum required rate, the request
|
212
219
|
# fails.
|
213
220
|
#
|
214
|
-
# **See also:** [
|
215
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
221
|
+
# **See also:** [Supported audio
|
222
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
223
|
+
#
|
224
|
+
#
|
225
|
+
# ### Next-generation models
|
226
|
+
#
|
227
|
+
# The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8
|
228
|
+
# kHz) models for many languages. Next-generation models have higher throughput than
|
229
|
+
# the service's previous generation of `Broadband` and `Narrowband` models. When you
|
230
|
+
# use next-generation models, the service can return transcriptions more quickly and
|
231
|
+
# also provide noticeably better transcription accuracy.
|
232
|
+
#
|
233
|
+
# You specify a next-generation model by using the `model` query parameter, as you
|
234
|
+
# do a previous-generation model. Many next-generation models also support the
|
235
|
+
# `low_latency` parameter, which is not available with previous-generation models.
|
236
|
+
#
|
237
|
+
# But next-generation models do not support all of the parameters that are available
|
238
|
+
# for use with previous-generation models. For more information about all parameters
|
239
|
+
# that are supported for use with next-generation models, see [Supported features
|
240
|
+
# for next-generation
|
241
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
|
242
|
+
#
|
243
|
+
#
|
244
|
+
# **See also:** [Next-generation languages and
|
245
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
216
246
|
#
|
217
247
|
#
|
218
248
|
# ### Multipart speech recognition
|
@@ -235,15 +265,19 @@ module IBMWatson
|
|
235
265
|
# @param audio [File] The audio to transcribe.
|
236
266
|
# @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
|
237
267
|
# audio format, see **Audio formats (content types)** in the method description.
|
238
|
-
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
239
|
-
#
|
240
|
-
#
|
268
|
+
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
269
|
+
# (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
270
|
+
# `ar-MS_BroadbandModel` instead.) See [Previous-generation languages and
|
271
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
|
272
|
+
# [Next-generation languages and
|
273
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
241
274
|
# @param language_customization_id [String] The customization ID (GUID) of a custom language model that is to be used with the
|
242
275
|
# recognition request. The base model of the specified custom language model must
|
243
276
|
# match the model specified with the `model` parameter. You must make the request
|
244
277
|
# with credentials for the instance of the service that owns the custom model. By
|
245
|
-
# default, no custom language model is used. See [
|
246
|
-
#
|
278
|
+
# default, no custom language model is used. See [Using a custom language model for
|
279
|
+
# speech
|
280
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
|
247
281
|
#
|
248
282
|
#
|
249
283
|
# **Note:** Use this parameter instead of the deprecated `customization_id`
|
@@ -252,14 +286,16 @@ module IBMWatson
|
|
252
286
|
# recognition request. The base model of the specified custom acoustic model must
|
253
287
|
# match the model specified with the `model` parameter. You must make the request
|
254
288
|
# with credentials for the instance of the service that owns the custom model. By
|
255
|
-
# default, no custom acoustic model is used. See [
|
256
|
-
#
|
289
|
+
# default, no custom acoustic model is used. See [Using a custom acoustic model for
|
290
|
+
# speech
|
291
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
|
257
292
|
# @param base_model_version [String] The version of the specified base model that is to be used with the recognition
|
258
293
|
# request. Multiple versions of a base model can exist when a model is updated for
|
259
294
|
# internal improvements. The parameter is intended primarily for use with custom
|
260
295
|
# models that have been upgraded for a new base model. The default value depends on
|
261
|
-
# whether the parameter is used with or without a custom model. See [
|
262
|
-
#
|
296
|
+
# whether the parameter is used with or without a custom model. See [Making speech
|
297
|
+
# recognition requests with upgraded custom
|
298
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
|
263
299
|
# @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
|
264
300
|
# recognition request, the customization weight tells the service how much weight to
|
265
301
|
# give to words from the custom language model compared to those from the base model
|
@@ -276,8 +312,8 @@ module IBMWatson
|
|
276
312
|
# custom model's domain, but it can negatively affect performance on non-domain
|
277
313
|
# phrases.
|
278
314
|
#
|
279
|
-
# See [
|
280
|
-
#
|
315
|
+
# See [Using customization
|
316
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
281
317
|
# @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
|
282
318
|
# streaming audio, the connection is closed with a 400 error. The parameter is
|
283
319
|
# useful for stopping audio submission from a live microphone when a user simply
|
@@ -294,56 +330,61 @@ module IBMWatson
|
|
294
330
|
# for double-byte languages might be shorter. Keywords are case-insensitive.
|
295
331
|
#
|
296
332
|
# See [Keyword
|
297
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
333
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
298
334
|
# @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
|
299
335
|
# considered to match a keyword if its confidence is greater than or equal to the
|
300
336
|
# threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
|
301
337
|
# you must also specify one or more keywords. The service performs no keyword
|
302
338
|
# spotting if you omit either parameter. See [Keyword
|
303
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
339
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
304
340
|
# @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
|
305
341
|
# default, the service returns a single transcript. If you specify a value of `0`,
|
306
342
|
# the service uses the default value, `1`. See [Maximum
|
307
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
343
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
|
308
344
|
# @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
|
309
345
|
# possible word alternative (also known as "Confusion Networks"). An alternative
|
310
346
|
# word is considered if its confidence is greater than or equal to the threshold.
|
311
347
|
# Specify a probability between 0.0 and 1.0. By default, the service computes no
|
312
348
|
# alternative words. See [Word
|
313
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
349
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
|
314
350
|
# @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
|
315
351
|
# each word. By default, the service returns no word confidence scores. See [Word
|
316
|
-
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
352
|
+
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
|
317
353
|
# @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
|
318
354
|
# timestamps are returned. See [Word
|
319
|
-
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
355
|
+
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
|
320
356
|
# @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
|
321
357
|
# results by replacing inappropriate words with a series of asterisks. Set the
|
322
358
|
# parameter to `false` to return results with no censoring. Applies to US English
|
323
|
-
# transcription only. See [Profanity
|
324
|
-
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
359
|
+
# and Japanese transcription only. See [Profanity
|
360
|
+
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
|
325
361
|
# @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
|
326
362
|
# numbers, currency values, and internet addresses into more readable, conventional
|
327
363
|
# representations in the final transcript of a recognition request. For US English,
|
328
364
|
# the service also converts certain keyword strings to punctuation symbols. By
|
329
365
|
# default, the service performs no smart formatting.
|
330
366
|
#
|
331
|
-
# **
|
367
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
368
|
+
# and Spanish transcription only.
|
332
369
|
#
|
333
370
|
# See [Smart
|
334
|
-
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
371
|
+
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
|
335
372
|
# @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
|
336
373
|
# which participants in a multi-person exchange. By default, the service returns no
|
337
374
|
# speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
|
338
375
|
# parameter to be `true`, regardless of whether you specify `false` for the
|
339
376
|
# parameter.
|
340
377
|
#
|
341
|
-
# **
|
342
|
-
#
|
343
|
-
#
|
378
|
+
# **Beta:** The parameter is beta functionality.
|
379
|
+
# * For previous-generation models, the parameter can be used for Australian
|
380
|
+
# English, US English, German, Japanese, Korean, and Spanish (both broadband and
|
381
|
+
# narrowband models) and UK English (narrowband model) transcription only.
|
382
|
+
# * For next-generation models, the parameter can be used for English (Australian,
|
383
|
+
# Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
|
344
384
|
#
|
345
|
-
#
|
346
|
-
#
|
385
|
+
# Restrictions and limitations apply to the use of speaker labels for both types of
|
386
|
+
# models. See [Speaker
|
387
|
+
# labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
|
347
388
|
# @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
|
348
389
|
# customization ID (GUID) of a custom language model that is to be used with the
|
349
390
|
# recognition request. Do not specify both parameters with a request.
|
@@ -351,8 +392,12 @@ module IBMWatson
|
|
351
392
|
# specify a grammar, you must also use the `language_customization_id` parameter to
|
352
393
|
# specify the name of the custom language model for which the grammar is defined.
|
353
394
|
# The service recognizes only strings that are recognized by the specified grammar;
|
354
|
-
# it does not recognize other custom words from the model's words resource.
|
355
|
-
#
|
395
|
+
# it does not recognize other custom words from the model's words resource.
|
396
|
+
#
|
397
|
+
# **Beta:** The parameter is beta functionality.
|
398
|
+
#
|
399
|
+
# See [Using a grammar for speech
|
400
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
|
356
401
|
# @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
|
357
402
|
# feature redacts any number that has three or more consecutive digits by replacing
|
358
403
|
# each digit with an `X` character. It is intended to redact sensitive numeric data,
|
@@ -364,16 +409,17 @@ module IBMWatson
|
|
364
409
|
# `keywords_threshold` parameters) and returns only a single final transcript
|
365
410
|
# (forces the `max_alternatives` parameter to be `1`).
|
366
411
|
#
|
367
|
-
# **
|
412
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
413
|
+
# and Korean transcription only.
|
368
414
|
#
|
369
415
|
# See [Numeric
|
370
|
-
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
416
|
+
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
|
371
417
|
# @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
|
372
418
|
# input audio. The service returns audio metrics with the final transcription
|
373
419
|
# results. By default, the service returns no audio metrics.
|
374
420
|
#
|
375
421
|
# See [Audio
|
376
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
422
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
|
377
423
|
# @param end_of_phrase_silence_time [Float] If `true`, specifies the duration of the pause interval at which the service
|
378
424
|
# splits a transcript into multiple final results. If the service detects pauses or
|
379
425
|
# extended silence before it reaches the end of the audio stream, its response can
|
@@ -390,7 +436,7 @@ module IBMWatson
|
|
390
436
|
# Chinese is 0.6 seconds.
|
391
437
|
#
|
392
438
|
# See [End of phrase silence
|
393
|
-
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
439
|
+
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
|
394
440
|
# @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
|
395
441
|
# based on semantic features of the input, for example, at the conclusion of
|
396
442
|
# meaningful phrases such as sentences. The service bases its understanding of
|
@@ -400,7 +446,7 @@ module IBMWatson
|
|
400
446
|
# interval.
|
401
447
|
#
|
402
448
|
# See [Split transcript at phrase
|
403
|
-
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
449
|
+
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
|
404
450
|
# @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
|
405
451
|
# the parameter to suppress word insertions from music, coughing, and other
|
406
452
|
# non-speech events. The service biases the audio it passes for speech recognition
|
@@ -412,8 +458,8 @@ module IBMWatson
|
|
412
458
|
# * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
|
413
459
|
# * 1.0 suppresses no audio (speech detection sensitivity is disabled).
|
414
460
|
#
|
415
|
-
# The values increase on a monotonic curve. See [Speech
|
416
|
-
#
|
461
|
+
# The values increase on a monotonic curve. See [Speech detector
|
462
|
+
# sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
|
417
463
|
# @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
|
418
464
|
# to prevent it from being transcribed as speech. Use the parameter to suppress side
|
419
465
|
# conversations or background noise.
|
@@ -424,10 +470,24 @@ module IBMWatson
|
|
424
470
|
# * 0.5 provides a reasonable level of audio suppression for general usage.
|
425
471
|
# * 1.0 suppresses all audio (no audio is transcribed).
|
426
472
|
#
|
427
|
-
# The values increase on a monotonic curve. See [
|
428
|
-
#
|
473
|
+
# The values increase on a monotonic curve. See [Background audio
|
474
|
+
# suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
|
475
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
476
|
+
# latency, directs the service to produce results even more quickly than it usually
|
477
|
+
# does. Next-generation models produce transcription results faster than
|
478
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
479
|
+
# produce results even more quickly, though the results might be less accurate when
|
480
|
+
# the parameter is used.
|
481
|
+
#
|
482
|
+
# The parameter is not available for previous-generation `Broadband` and
|
483
|
+
# `Narrowband` models. It is available only for some next-generation models. For a
|
484
|
+
# list of next-generation models that support low latency, see [Supported
|
485
|
+
# next-generation language
|
486
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
|
487
|
+
# * For more information about the `low_latency` parameter, see [Low
|
488
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
429
489
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
430
|
-
def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
490
|
+
def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
431
491
|
raise ArgumentError.new("audio must be provided") if audio.nil?
|
432
492
|
|
433
493
|
headers = {
|
@@ -460,7 +520,8 @@ module IBMWatson
|
|
460
520
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
461
521
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
462
522
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
463
|
-
"background_audio_suppression" => background_audio_suppression
|
523
|
+
"background_audio_suppression" => background_audio_suppression,
|
524
|
+
"low_latency" => low_latency
|
464
525
|
}
|
465
526
|
|
466
527
|
data = audio
|
@@ -479,7 +540,7 @@ module IBMWatson
|
|
479
540
|
end
|
480
541
|
|
481
542
|
##
|
482
|
-
# @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
543
|
+
# @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
483
544
|
# Sends audio for speech recognition using web sockets.
|
484
545
|
# @param content_type [String] The type of the input: audio/basic, audio/flac, audio/l16, audio/mp3, audio/mpeg, audio/mulaw, audio/ogg, audio/ogg;codecs=opus, audio/ogg;codecs=vorbis, audio/wav, audio/webm, audio/webm;codecs=opus, audio/webm;codecs=vorbis, or multipart/form-data.
|
485
546
|
# @param recognize_callback [RecognizeCallback] The instance handling events returned from the service.
|
@@ -596,6 +657,23 @@ module IBMWatson
|
|
596
657
|
#
|
597
658
|
# The values increase on a monotonic curve. See [Speech Activity
|
598
659
|
# Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
|
660
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
661
|
+
# latency, directs the service to produce results even more quickly than it usually
|
662
|
+
# does. Next-generation models produce transcription results faster than
|
663
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
664
|
+
# produce results even more quickly, though the results might be less accurate when
|
665
|
+
# the parameter is used.
|
666
|
+
#
|
667
|
+
# **Note:** The parameter is beta functionality. It is not available for
|
668
|
+
# previous-generation `Broadband` and `Narrowband` models. It is available only for
|
669
|
+
# some next-generation models.
|
670
|
+
#
|
671
|
+
# * For a list of next-generation models that support low latency, see [Supported
|
672
|
+
# language
|
673
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported)
|
674
|
+
# for next-generation models.
|
675
|
+
# * For more information about the `low_latency` parameter, see [Low
|
676
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
599
677
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
600
678
|
def recognize_using_websocket(
|
601
679
|
content_type: nil,
|
@@ -627,7 +705,8 @@ module IBMWatson
|
|
627
705
|
end_of_phrase_silence_time: nil,
|
628
706
|
split_transcript_at_phrase_end: nil,
|
629
707
|
speech_detector_sensitivity: nil,
|
630
|
-
background_audio_suppression: nil
|
708
|
+
background_audio_suppression: nil,
|
709
|
+
low_latency: nil
|
631
710
|
)
|
632
711
|
raise ArgumentError, "Audio must be provided" if audio.nil? && !chunk_data
|
633
712
|
raise ArgumentError, "Recognize callback must be provided" if recognize_callback.nil?
|
@@ -669,7 +748,8 @@ module IBMWatson
|
|
669
748
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
670
749
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
671
750
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
672
|
-
"background_audio_suppression" => background_audio_suppression
|
751
|
+
"background_audio_suppression" => background_audio_suppression,
|
752
|
+
"low_latency" => low_latency
|
673
753
|
}
|
674
754
|
options.delete_if { |_, v| v.nil? }
|
675
755
|
WebSocketClient.new(audio: audio, chunk_data: chunk_data, options: options, recognize_callback: recognize_callback, service_url: service_url, headers: headers, disable_ssl_verification: @disable_ssl_verification)
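The hunk above adds `low_latency` to the options hash and relies on the existing `delete_if` call to drop unset values. A minimal sketch of that behavior follows; `build_recognition_options` is a hypothetical helper written for illustration and is not part of the gem.

```ruby
# Hypothetical helper (not in ibm_watson): mirrors how the diff builds the
# options hash and then removes every entry whose value is nil.
def build_recognition_options(background_audio_suppression: nil, low_latency: nil)
  options = {
    "background_audio_suppression" => background_audio_suppression,
    "low_latency" => low_latency
  }
  # Only nil is treated as "not set"; an explicit false is kept.
  options.delete_if { |_, v| v.nil? }
end

p build_recognition_options                      # nothing set
p build_recognition_options(low_latency: true)   # only low_latency remains
p build_recognition_options(low_latency: false)  # false survives the filter
```

Because only `nil` is filtered, passing `low_latency: false` still puts the key in the hash, which may matter if the service treats an explicit `false` differently from an omitted parameter.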
|
@@ -697,9 +777,9 @@ module IBMWatson
|
|
697
777
|
# The service sends only a single `GET` request to the callback URL. If the service
|
698
778
|
# does not receive a reply with a response code of 200 and a body that echoes the
|
699
779
|
# challenge string sent by the service within five seconds, it does not allowlist
|
700
|
-
# the URL; it instead sends status code 400 in response to the
|
701
|
-
# callback
|
702
|
-
#
|
780
|
+
# the URL; it instead sends status code 400 in response to the request to register a
|
781
|
+
# callback. If the requested callback URL is already allowlisted, the service
|
782
|
+
# responds to the initial registration request with response code 200.
|
703
783
|
#
|
704
784
|
# If you specify a user secret with the request, the service uses it as a key to
|
705
785
|
# calculate an HMAC-SHA1 signature of the challenge string in its response to the
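The paragraph above is cut off by the hunk boundary, but it describes an HMAC-SHA1 signature keyed with the user secret. The following sketch shows how a callback endpoint might check such a signature. Two points are assumptions not stated in this diff: that the signature arrives Base64-encoded, and the helper names `expected_signature` and `valid_signature?`, which are hypothetical.

```ruby
require "openssl"

# Hypothetical helper: HMAC-SHA1 of the challenge string, keyed with the
# user secret. Assumption: the service Base64-encodes the raw digest.
def expected_signature(user_secret, challenge_string)
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha1"), user_secret, challenge_string)
  [digest].pack("m0") # "m0" is strict Base64 without line breaks
end

# Compares the received signature with the expected one without
# short-circuiting on the first differing byte.
def valid_signature?(user_secret, challenge_string, received_signature)
  expected = expected_signature(user_secret, challenge_string)
  return false unless expected.bytesize == received_signature.bytesize

  diff = 0
  expected.bytes.zip(received_signature.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end
```

An endpoint would call `valid_signature?` with the secret it registered, the challenge string it received, and the signature value sent by the service before echoing the challenge back.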
|
@@ -754,9 +834,10 @@ module IBMWatson
|
|
754
834
|
##
|
755
835
|
# @!method unregister_callback(callback_url:)
|
756
836
|
# Unregister a callback.
|
757
|
-
# Unregisters a callback URL that was previously allowlisted with a
|
758
|
-
# callback
|
759
|
-
# URL can no longer be used with asynchronous recognition
|
837
|
+
# Unregisters a callback URL that was previously allowlisted with a [Register a
|
838
|
+
# callback](#registercallback) request for use with the asynchronous interface. Once
|
839
|
+
# unregistered, the URL can no longer be used with asynchronous recognition
|
840
|
+
# requests.
|
760
841
|
#
|
761
842
|
# **See also:** [Unregistering a callback
|
762
843
|
# URL](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#unregister).
|
@@ -787,7 +868,7 @@ module IBMWatson
|
|
787
868
|
end
|
788
869
|
|
789
870
|
##
|
790
|
-
# @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
871
|
+
# @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
791
872
|
# Create a job.
|
792
873
|
# Creates a job for a new asynchronous recognition request. The job is owned by the
|
793
874
|
# instance of the service whose credentials are used to create it. How you learn the
|
@@ -799,17 +880,17 @@ module IBMWatson
|
|
799
880
|
# to subscribe to specific events and to specify a string that is to be included
|
800
881
|
# with each notification for the job.
|
801
882
|
# * By polling the service: Omit the `callback_url`, `events`, and `user_token`
|
802
|
-
# parameters. You must then use the
|
803
|
-
# check the status of the job, using the latter to
|
804
|
-
# is complete.
|
883
|
+
# parameters. You must then use the [Check jobs](#checkjobs) or [Check a
|
884
|
+
# job](#checkjob) methods to check the status of the job, using the latter to
|
885
|
+
# retrieve the results when the job is complete.
|
805
886
|
#
|
806
887
|
# The two approaches are not mutually exclusive. You can poll the service for job
|
807
888
|
# status or obtain results from the service manually even if you include a callback
|
808
889
|
# URL. In both cases, you can include the `results_ttl` parameter to specify how
|
809
890
|
# long the results are to remain available after the job is complete. Using the
|
810
|
-
# HTTPS
|
811
|
-
# them via callback notification over HTTP because it provides
|
812
|
-
# addition to authentication and data integrity.
|
891
|
+
# HTTPS [Check a job](#checkjob) method to retrieve results is more secure than
|
892
|
+
# receiving them via callback notification over HTTP because it provides
|
893
|
+
# confidentiality in addition to authentication and data integrity.
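The polling approach described above can be sketched as a small loop. `wait_for_job` is a hypothetical helper, not part of the gem, and the status strings `completed` and `failed` are assumed final states; in real use the block would wrap a call such as the SDK's check-a-job method and return the job's status.

```ruby
# Hypothetical polling helper (not in ibm_watson). The block must return the
# job's current status string each time it is called.
def wait_for_job(max_attempts: 10, interval: 5, sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do
    status = yield
    # Assumption: "completed" and "failed" are the final job states.
    return status if %w[completed failed].include?(status)

    sleeper.call(interval)
  end
  nil # gave up before the job reached a final state
end

# Usage with canned statuses instead of a live service call.
statuses = %w[waiting processing completed]
puts wait_for_job(interval: 0) { statuses.shift }
```

The `sleeper` argument exists only so the wait can be stubbed in tests; a caller would normally leave it at its default.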
|
813
894
|
#
|
814
895
|
# The method supports the same basic parameters as other HTTP and WebSocket
|
815
896
|
# recognition requests. It also supports the following parameters specific to the
|
@@ -883,18 +964,44 @@ module IBMWatson
|
|
883
964
|
# sampling rate of the audio is lower than the minimum required rate, the request
|
884
965
|
# fails.
|
885
966
|
#
|
886
|
-
# **See also:** [
|
887
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
967
|
+
# **See also:** [Supported audio
|
968
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
969
|
+
#
|
970
|
+
#
|
971
|
+
# ### Next-generation models
|
972
|
+
#
|
973
|
+
# The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8
|
974
|
+
# kHz) models for many languages. Next-generation models have higher throughput than
|
975
|
+
# the service's previous generation of `Broadband` and `Narrowband` models. When you
|
976
|
+
# use next-generation models, the service can return transcriptions more quickly and
|
977
|
+
# also provide noticeably better transcription accuracy.
|
978
|
+
#
|
979
|
+
# You specify a next-generation model by using the `model` query parameter, as you
|
980
|
+
# do a previous-generation model. Many next-generation models also support the
|
981
|
+
# `low_latency` parameter, which is not available with previous-generation models.
|
982
|
+
#
|
983
|
+
# But next-generation models do not support all of the parameters that are available
|
984
|
+
# for use with previous-generation models. For more information about all parameters
|
985
|
+
# that are supported for use with next-generation models, see [Supported features
|
986
|
+
# for next-generation
|
987
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
|
988
|
+
#
|
989
|
+
#
|
990
|
+
# **See also:** [Next-generation languages and
|
991
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
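Since `low_latency` is documented as unavailable for `Broadband` and `Narrowband` models, a caller may want a cheap client-side check before setting it. The sketch below assumes that next-generation model identifiers end in `_Multimedia` or `_Telephony` (for example `en-US_Telephony`); that naming pattern and the helper name are assumptions, and a matching name does not guarantee that the model supports low latency.

```ruby
# Assumed suffixes of next-generation model identifiers.
NEXT_GENERATION_SUFFIXES = %w[_Multimedia _Telephony].freeze

# Hypothetical helper (not in ibm_watson): true when the model name looks like
# a next-generation model. This is a necessary condition only; the list of
# models that actually support low latency lives in the service documentation.
def next_generation_model_name?(model)
  NEXT_GENERATION_SUFFIXES.any? { |suffix| model.end_with?(suffix) }
end

puts next_generation_model_name?("en-US_Telephony")
puts next_generation_model_name?("ar-MS_BroadbandModel")
```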
|
888
992
|
# @param audio [File] The audio to transcribe.
|
889
993
|
# @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
|
890
994
|
# audio format, see **Audio formats (content types)** in the method description.
|
891
|
-
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
892
|
-
#
|
893
|
-
#
|
995
|
+
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
996
|
+
# (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
997
|
+
# `ar-MS_BroadbandModel` instead.) See [Previous-generation languages and
|
998
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
|
999
|
+
# [Next-generation languages and
|
1000
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
894
1001
|
# @param callback_url [String] A URL to which callback notifications are to be sent. The URL must already be
|
895
|
-
# successfully allowlisted by using the
|
896
|
-
# include the same callback URL with any number of job creation
|
897
|
-
# parameter to poll the service for job completion and results.
|
1002
|
+
# successfully allowlisted by using the [Register a callback](#registercallback)
|
1003
|
+
# method. You can include the same callback URL with any number of job creation
|
1004
|
+
# requests. Omit the parameter to poll the service for job completion and results.
|
898
1005
|
#
|
899
1006
|
# Use the `user_token` parameter to specify a unique user-specified string with each
|
900
1007
|
# job to differentiate the callback notifications for the jobs.
|
@@ -903,8 +1010,8 @@ module IBMWatson
|
|
903
1010
|
# * `recognitions.started` generates a callback notification when the service begins
|
904
1011
|
# to process the job.
|
905
1012
|
# * `recognitions.completed` generates a callback notification when the job is
|
906
|
-
# complete. You must use the
|
907
|
-
# they time out or are deleted.
|
1013
|
+
# complete. You must use the [Check a job](#checkjob) method to retrieve the results
|
1014
|
+
# before they time out or are deleted.
|
908
1015
|
# * `recognitions.completed_with_results` generates a callback notification when the
|
909
1016
|
# job is complete. The notification includes the results of the request.
|
910
1017
|
# * `recognitions.failed` generates a callback notification if the service
|
@@ -929,8 +1036,9 @@ module IBMWatson
|
|
929
1036
|
# recognition request. The base model of the specified custom language model must
|
930
1037
|
# match the model specified with the `model` parameter. You must make the request
|
931
1038
|
# with credentials for the instance of the service that owns the custom model. By
|
932
|
-
# default, no custom language model is used. See [
|
933
|
-
#
|
1039
|
+
# default, no custom language model is used. See [Using a custom language model for
|
1040
|
+
# speech
|
1041
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
|
934
1042
|
#
|
935
1043
|
#
|
936
1044
|
# **Note:** Use this parameter instead of the deprecated `customization_id`
|
@@ -939,14 +1047,16 @@ module IBMWatson
|
|
939
1047
|
# recognition request. The base model of the specified custom acoustic model must
|
940
1048
|
# match the model specified with the `model` parameter. You must make the request
|
941
1049
|
# with credentials for the instance of the service that owns the custom model. By
|
942
|
-
# default, no custom acoustic model is used. See [
|
943
|
-
#
|
1050
|
+
# default, no custom acoustic model is used. See [Using a custom acoustic model for
|
1051
|
+
# speech
|
1052
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
|
944
1053
|
# @param base_model_version [String] The version of the specified base model that is to be used with the recognition
|
945
1054
|
# request. Multiple versions of a base model can exist when a model is updated for
|
946
1055
|
# internal improvements. The parameter is intended primarily for use with custom
|
947
1056
|
# models that have been upgraded for a new base model. The default value depends on
|
948
|
-
# whether the parameter is used with or without a custom model. See [
|
949
|
-
#
|
1057
|
+
# whether the parameter is used with or without a custom model. See [Making speech
|
1058
|
+
# recognition requests with upgraded custom
|
1059
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
|
950
1060
|
# @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
|
951
1061
|
# recognition request, the customization weight tells the service how much weight to
|
952
1062
|
# give to words from the custom language model compared to those from the base model
|
@@ -963,8 +1073,8 @@ module IBMWatson
|
|
963
1073
|
# custom model's domain, but it can negatively affect performance on non-domain
|
964
1074
|
# phrases.
|
965
1075
|
#
|
966
|
-
# See [
|
967
|
-
#
|
1076
|
+
# See [Using customization
|
1077
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
968
1078
|
# @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
|
969
1079
|
# streaming audio, the connection is closed with a 400 error. The parameter is
|
970
1080
|
# useful for stopping audio submission from a live microphone when a user simply
|
@@ -981,56 +1091,61 @@ module IBMWatson
|
|
981
1091
|
# for double-byte languages might be shorter. Keywords are case-insensitive.
|
982
1092
|
#
|
983
1093
|
# See [Keyword
|
984
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1094
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
985
1095
|
# @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
|
986
1096
|
# considered to match a keyword if its confidence is greater than or equal to the
|
987
1097
|
# threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
|
988
1098
|
# you must also specify one or more keywords. The service performs no keyword
|
989
1099
|
# spotting if you omit either parameter. See [Keyword
|
990
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1100
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
991
1101
|
# @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
|
992
1102
|
# default, the service returns a single transcript. If you specify a value of `0`,
|
993
1103
|
# the service uses the default value, `1`. See [Maximum
|
994
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1104
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
|
995
1105
|
# @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
|
996
1106
|
# possible word alternative (also known as "Confusion Networks"). An alternative
|
997
1107
|
# word is considered if its confidence is greater than or equal to the threshold.
|
998
1108
|
# Specify a probability between 0.0 and 1.0. By default, the service computes no
|
999
1109
|
# alternative words. See [Word
|
1000
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1110
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
|
1001
1111
|
# @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
|
1002
1112
|
# each word. By default, the service returns no word confidence scores. See [Word
|
1003
|
-
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1113
|
+
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
|
1004
1114
|
# @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
|
1005
1115
|
# timestamps are returned. See [Word
|
1006
|
-
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1116
|
+
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
|
1007
1117
|
# @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
|
1008
1118
|
# results by replacing inappropriate words with a series of asterisks. Set the
|
1009
1119
|
# parameter to `false` to return results with no censoring. Applies to US English
|
1010
|
-
# transcription only. See [Profanity
|
1011
|
-
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1120
|
+
# and Japanese transcription only. See [Profanity
|
1121
|
+
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
|
1012
1122
|
# @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
|
1013
1123
|
# numbers, currency values, and internet addresses into more readable, conventional
|
1014
1124
|
# representations in the final transcript of a recognition request. For US English,
|
1015
1125
|
# the service also converts certain keyword strings to punctuation symbols. By
|
1016
1126
|
# default, the service performs no smart formatting.
|
1017
1127
|
#
|
1018
|
-
# **
|
1128
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
1129
|
+
# and Spanish transcription only.
|
1019
1130
|
#
|
1020
1131
|
# See [Smart
|
1021
|
-
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1132
|
+
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
|
1022
1133
|
# @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
|
1023
1134
|
# which participants in a multi-person exchange. By default, the service returns no
|
1024
1135
|
# speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
|
1025
1136
|
# parameter to be `true`, regardless of whether you specify `false` for the
|
1026
1137
|
# parameter.
|
1027
1138
|
#
|
1028
|
-
# **
|
1029
|
-
#
|
1030
|
-
#
|
1139
|
+
# **Beta:** The parameter is beta functionality.
|
1140
|
+
# * For previous-generation models, the parameter can be used for Australian
|
1141
|
+
# English, US English, German, Japanese, Korean, and Spanish (both broadband and
|
1142
|
+
# narrowband models) and UK English (narrowband model) transcription only.
|
1143
|
+
# * For next-generation models, the parameter can be used for English (Australian,
|
1144
|
+
# Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
|
1031
1145
|
#
|
1032
|
-
#
|
1033
|
-
#
|
1146
|
+
# Restrictions and limitations apply to the use of speaker labels for both types of
|
1147
|
+
# models. See [Speaker
|
1148
|
+
# labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
|
1034
1149
|
# @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
|
1035
1150
|
# customization ID (GUID) of a custom language model that is to be used with the
|
1036
1151
|
# recognition request. Do not specify both parameters with a request.
|
@@ -1038,8 +1153,12 @@ module IBMWatson
|
|
1038
1153
|
# specify a grammar, you must also use the `language_customization_id` parameter to
|
1039
1154
|
# specify the name of the custom language model for which the grammar is defined.
|
1040
1155
|
# The service recognizes only strings that are recognized by the specified grammar;
|
1041
|
-
# it does not recognize other custom words from the model's words resource.
|
1042
|
-
#
|
1156
|
+
# it does not recognize other custom words from the model's words resource.
|
1157
|
+
#
|
1158
|
+
# **Beta:** The parameter is beta functionality.
|
1159
|
+
#
|
1160
|
+
# See [Using a grammar for speech
|
1161
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
|
1043
1162
|
# @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
|
1044
1163
|
# feature redacts any number that has three or more consecutive digits by replacing
|
1045
1164
|
# each digit with an `X` character. It is intended to redact sensitive numeric data,
|
@@ -1051,10 +1170,11 @@ module IBMWatson
|
|
1051
1170
|
# `keywords_threshold` parameters) and returns only a single final transcript
|
1052
1171
|
# (forces the `max_alternatives` parameter to be `1`).
|
1053
1172
|
#
|
1054
|
-
# **
|
1173
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
1174
|
+
# and Korean transcription only.
|
1055
1175
|
#
|
1056
1176
|
# See [Numeric
|
1057
|
-
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1177
|
+
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
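The redaction rule described above (any run of three or more consecutive digits has each digit replaced with `X`) can be illustrated locally. This is only a sketch of the stated rule; `redact_numbers` is a hypothetical helper, and the service's own handling of spoken numbers and formatting may differ.

```ruby
# Hypothetical local illustration of the documented rule: every run of three
# or more consecutive digits has each digit replaced with "X".
def redact_numbers(transcript)
  transcript.gsub(/\d{3,}/) { |digits| "X" * digits.length }
end

puts redact_numbers("call 5551234 or extension 12")
```

Runs of one or two digits are left untouched, which matches the three-digit minimum in the parameter description.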
|
1058
1178
|
# @param processing_metrics [Boolean] If `true`, requests processing metrics about the service's transcription of the
|
1059
1179
|
# input audio. The service returns processing metrics at the interval specified by
|
1060
1180
|
# the `processing_metrics_interval` parameter. It also returns processing metrics
|
@@ -1062,7 +1182,7 @@ module IBMWatson
|
|
1062
1182
|
# the service returns no processing metrics.
|
1063
1183
|
#
|
1064
1184
|
# See [Processing
|
1065
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1185
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
|
1066
1186
|
# @param processing_metrics_interval [Float] Specifies the interval in real wall-clock seconds at which the service is to
|
1067
1187
|
# return processing metrics. The parameter is ignored unless the
|
1068
1188
|
# `processing_metrics` parameter is set to `true`.
|
@@ -1076,13 +1196,13 @@ module IBMWatson
|
|
1076
1196
|
# the service returns processing metrics only for transcription events.
|
1077
1197
|
#
|
1078
1198
|
# See [Processing
|
1079
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1199
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
|
1080
1200
|
# @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
|
1081
1201
|
# input audio. The service returns audio metrics with the final transcription
|
1082
1202
|
# results. By default, the service returns no audio metrics.
|
1083
1203
|
#
|
1084
1204
|
# See [Audio
|
1085
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1205
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
|
1086
1206
|
# @param end_of_phrase_silence_time [Float] Specifies the duration of the pause interval at which the service
|
1087
1207
|
# splits a transcript into multiple final results. If the service detects pauses or
|
1088
1208
|
# extended silence before it reaches the end of the audio stream, its response can
|
@@ -1099,7 +1219,7 @@ module IBMWatson
|
|
1099
1219
|
# Chinese is 0.6 seconds.
|
1100
1220
|
#
|
1101
1221
|
# See [End of phrase silence
|
1102
|
-
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1222
|
+
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
|
1103
1223
|
# @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
|
1104
1224
|
# based on semantic features of the input, for example, at the conclusion of
|
1105
1225
|
# meaningful phrases such as sentences. The service bases its understanding of
|
@@ -1109,7 +1229,7 @@ module IBMWatson
|
|
1109
1229
|
# interval.
|
1110
1230
|
#
|
1111
1231
|
# See [Split transcript at phrase
|
1112
|
-
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1232
|
+
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
|
1113
1233
|
# @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
|
1114
1234
|
# the parameter to suppress word insertions from music, coughing, and other
|
1115
1235
|
# non-speech events. The service biases the audio it passes for speech recognition
|
@@ -1121,8 +1241,8 @@ module IBMWatson
|
|
1121
1241
|
# * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
|
1122
1242
|
# * 1.0 suppresses no audio (speech detection sensitivity is disabled).
|
1123
1243
|
#
|
1124
|
-
# The values increase on a monotonic curve. See [Speech
|
1125
|
-
#
|
1244
|
+
# The values increase on a monotonic curve. See [Speech detector
|
1245
|
+
# sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
|
1126
1246
|
# @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
|
1127
1247
|
# to prevent it from being transcribed as speech. Use the parameter to suppress side
|
1128
1248
|
# conversations or background noise.
|
@@ -1133,10 +1253,24 @@ module IBMWatson
|
|
1133
1253
|
# * 0.5 provides a reasonable level of audio suppression for general usage.
|
1134
1254
|
# * 1.0 suppresses all audio (no audio is transcribed).
|
1135
1255
|
#
|
1136
|
-
# The values increase on a monotonic curve. See [
|
1137
|
-
#
|
1256
|
+
# The values increase on a monotonic curve. See [Background audio
|
1257
|
+
# suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
|
1258
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
1259
|
+
# latency, directs the service to produce results even more quickly than it usually
|
1260
|
+
# does. Next-generation models produce transcription results faster than
|
1261
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
1262
|
+
# produce results even more quickly, though the results might be less accurate when
|
1263
|
+
# the parameter is used.
|
1264
|
+
#
|
1265
|
+
# The parameter is not available for previous-generation `Broadband` and
|
1266
|
+
# `Narrowband` models. It is available only for some next-generation models. For a
|
1267
|
+
# list of next-generation models that support low latency, see [Supported
|
1268
|
+
# next-generation language
|
1269
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
|
1270
|
+
# For more information about the `low_latency` parameter, see [Low
|
1271
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
1138
1272
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1139
|
-
def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
1273
|
+
def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
1140
1274
|
raise ArgumentError.new("audio must be provided") if audio.nil?
|
1141
1275
|
|
1142
1276
|
headers = {
|
@@ -1175,7 +1309,8 @@ module IBMWatson
|
|
1175
1309
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
1176
1310
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
1177
1311
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
1178
|
-
"background_audio_suppression" => background_audio_suppression
|
1312
|
+
"background_audio_suppression" => background_audio_suppression,
|
1313
|
+
"low_latency" => low_latency
|
1179
1314
|
}
|
1180
1315
|
|
1181
1316
|
data = audio
|
@@ -1200,10 +1335,10 @@ module IBMWatson
|
|
1200
1335
|
# credentials with which it is called. The method also returns the creation and
|
1201
1336
|
# update times of each job, and, if a job was created with a callback URL and a user
|
1202
1337
|
# token, the user token for the job. To obtain the results for a job whose status is
|
1203
|
-
# `completed` or not one of the latest 100 outstanding jobs, use the
|
1204
|
-
# method. A job and its results remain available until you delete
|
1205
|
-
#
|
1206
|
-
# first.
|
1338
|
+
# `completed` or not one of the latest 100 outstanding jobs, use the [Check a
|
1339
|
+
# job](#checkjob) method. A job and its results remain available until you delete
|
1340
|
+
# them with the [Delete a job](#deletejob) method or until the job's time to live
|
1341
|
+
# expires, whichever comes first.
|
1207
1342
|
#
|
1208
1343
|
# **See also:** [Checking the status of the latest
|
1209
1344
|
# jobs](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#jobs).
|
@@ -1237,8 +1372,8 @@ module IBMWatson
|
|
1237
1372
|
# You can use the method to retrieve the results of any job, regardless of whether
|
1238
1373
|
# it was submitted with a callback URL and the `recognitions.completed_with_results`
|
1239
1374
|
# event, and you can retrieve the results multiple times for as long as they remain
|
1240
|
-
# available. Use the
|
1241
|
-
# recent jobs associated with the calling credentials.
|
1375
|
+
# available. Use the [Check jobs](#checkjobs) method to request information about
|
1376
|
+
# the most recent jobs associated with the calling credentials.
|
1242
1377
|
#
|
1243
1378
|
# **See also:** [Checking the status and retrieving the results of a
|
1244
1379
|
# job](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#job).
|
@@ -1326,28 +1461,28 @@ module IBMWatson
|
|
1326
1461
|
# customizes.
|
1327
1462
|
#
|
1328
1463
|
# To determine whether a base model supports language model customization, use the
|
1329
|
-
#
|
1330
|
-
# to `true`. You can also refer to [Language support
|
1331
|
-
#
|
1464
|
+
# [Get a model](#getmodel) method and check that the attribute
|
1465
|
+
# `custom_language_model` is set to `true`. You can also refer to [Language support
|
1466
|
+
# for
|
1467
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
1332
1468
|
# @param dialect [String] The dialect of the specified language that is to be used with the custom language
|
1333
1469
|
# model. For most languages, the dialect matches the language of the base model by
|
1334
|
-
# default. For example, `en-US` is used for
|
1335
|
-
#
|
1470
|
+
# default. For example, `en-US` is used for the US English language models. All
|
1471
|
+
# dialect values are case-insensitive.
|
1336
1472
|
#
|
1337
|
-
#
|
1338
|
-
#
|
1473
|
+
# The parameter is meaningful only for Spanish language models, for which you can
|
1474
|
+
# always safely omit the parameter to have the service create the correct mapping.
|
1475
|
+
# For Spanish, the service creates a custom language model that is suited for speech
|
1476
|
+
# in one of the following dialects:
|
1339
1477
|
# * `es-ES` for Castilian Spanish (`es-ES` models)
|
1340
1478
|
# * `es-LA` for Latin American Spanish (`es-AR`, `es-CL`, `es-CO`, and `es-PE`
|
1341
1479
|
# models)
|
1342
1480
|
# * `es-US` for Mexican (North American) Spanish (`es-MX` models)
|
1343
1481
|
#
|
1344
|
-
#
|
1345
|
-
#
|
1346
|
-
#
|
1347
|
-
#
|
1348
|
-
# must match the language of the base model. If you specify the `dialect` for
|
1349
|
-
# Spanish language models, its value must match one of the defined mappings as
|
1350
|
-
# indicated (`es-ES`, `es-LA`, or `es-MX`). All dialect values are case-insensitive.
|
1482
|
+
# If you specify the `dialect` parameter for a non-Spanish language model, its value
|
1483
|
+
# must match the language of the base model. If you specify the `dialect` for a
|
1484
|
+
# Spanish language model, its value must match one of the defined mappings (`es-ES`,
|
1485
|
+
# `es-LA`, or `es-MX`).
|
1351
1486
|
# @param description [String] A description of the new custom language model. Use a localized description that
|
1352
1487
|
# matches the language of the custom model.
|
1353
1488
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
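The Spanish dialect mapping listed in the `dialect` description can be sketched as a lookup table. This is an illustration only: `default_dialect` is a hypothetical helper, the model names are example inputs, and the sketch assumes that a base model name starts with a five-character language code.

```ruby
# Mapping copied from the documentation comment above.
SPANISH_DIALECTS = {
  "es-ES" => "es-ES",
  "es-AR" => "es-LA",
  "es-CL" => "es-LA",
  "es-CO" => "es-LA",
  "es-PE" => "es-LA",
  "es-MX" => "es-US"
}.freeze

# Hypothetical helper: returns the dialect the comment describes for a base
# model name such as "es-AR_BroadbandModel". Non-Spanish models fall back to
# the language of the base model.
def default_dialect(base_model_name)
  language = base_model_name[0, 5]
  SPANISH_DIALECTS.fetch(language, language)
end

puts default_dialect("es-AR_BroadbandModel") # prints "es-LA"
puts default_dialect("en-US_BroadbandModel") # prints "en-US"
```

Because the service creates the correct mapping when the parameter is omitted, a lookup like this is useful mainly for validating a value before sending it.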
|
@@ -1393,11 +1528,12 @@ module IBMWatson
|
|
1393
1528
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageLanguageModels#listModels-language).
|
1394
1529
|
# @param language [String] The identifier of the language for which custom language or custom acoustic models
|
1395
1530
|
# are to be returned. Omit the parameter to see all custom language or custom
|
1396
|
-
# acoustic models that are owned by the requesting credentials.
|
1531
|
+
# acoustic models that are owned by the requesting credentials. (**Note:** The
|
1532
|
+
# identifier `ar-AR` is deprecated; use `ar-MS` instead.)
|
1397
1533
|
#
|
1398
1534
|
# To determine the languages for which customization is available, see [Language
|
1399
1535
|
# support for
|
1400
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1536
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
1401
1537
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1402
1538
|
def list_language_models(language: nil)
|
1403
1539
|
headers = {
|
@@ -1501,12 +1637,13 @@ module IBMWatson
|
|
1501
1637
|
# the current load on the service. The method returns an HTTP 200 response code to
|
1502
1638
|
# indicate that the training process has begun.
|
1503
1639
|
#
|
1504
|
-
# You can monitor the status of the training by using the
|
1505
|
-
# model
|
1506
|
-
# seconds. The method returns a `LanguageModel` object that
|
1507
|
-
# `progress` fields. A status of `available` means that the
|
1508
|
-
# and ready to use. The service cannot accept subsequent
|
1509
|
-
# requests to add new resources until the existing request
|
1640
|
+
# You can monitor the status of the training by using the [Get a custom language
|
1641
|
+
# model](#getlanguagemodel) method to poll the model's status. Use a loop to check
|
1642
|
+
# the status every 10 seconds. The method returns a `LanguageModel` object that
|
1643
|
+
# includes `status` and `progress` fields. A status of `available` means that the
|
1644
|
+
# custom model is trained and ready to use. The service cannot accept subsequent
|
1645
|
+
# training requests or requests to add new resources until the existing request
|
1646
|
+
# completes.
|
1510
1647
|
#
|
1511
1648
|
# **See also:** [Train the custom language
|
1512
1649
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#trainModel-language).
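The comment recommends polling the model's status in a loop every 10 seconds. A minimal sketch of such a loop follows. `wait_for_status` is a hypothetical helper, `max_attempts` is an added safety limit that the documentation does not mention, and the status lookup is replaced by a stub so that the example runs without credentials.

```ruby
# Polls until the block returns the wanted status. `interval` defaults to the
# 10 seconds suggested in the comment; `max_attempts` is an assumed safety limit.
def wait_for_status(wanted, interval: 10, max_attempts: 60)
  max_attempts.times do
    status = yield
    return status if status == wanted
    sleep(interval)
  end
  raise "status did not become #{wanted} after #{max_attempts} checks"
end

# Stand-in for a real status lookup.
statuses = %w[training training available]
result = wait_for_status("available", interval: 0) { statuses.shift }
puts result # prints "available"
```

In real use the block would presumably call `get_language_model(customization_id: ...)` and read the `status` field from the returned `LanguageModel` data.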
|
@@ -1526,14 +1663,18 @@ module IBMWatson
|
|
1526
1663
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1527
1664
|
# the request. You must make the request with credentials for the instance of the
|
1528
1665
|
# service that owns the custom model.
|
1529
|
-
# @param word_type_to_add [String]
|
1530
|
-
# train the model:
|
1666
|
+
# @param word_type_to_add [String] _For custom models that are based on previous-generation models_, the type of
|
1667
|
+
# words from the custom language model's words resource on which to train the model:
|
1531
1668
|
# * `all` (the default) trains the model on all new words, regardless of whether
|
1532
1669
|
# they were extracted from corpora or grammars or were added or modified by the
|
1533
1670
|
# user.
|
1534
|
-
# * `user` trains the model only on
|
1671
|
+
# * `user` trains the model only on custom words that were added or modified by the
|
1535
1672
|
# user directly. The model is not trained on new words extracted from corpora or
|
1536
1673
|
# grammars.
|
1674
|
+
#
|
1675
|
+
# _For custom models that are based on next-generation models_, the service ignores
|
1676
|
+
# the parameter. The words resource contains only custom words that the user adds or
|
1677
|
+
# modifies directly, so the parameter is unnecessary.
|
1537
1678
|
# @param customization_weight [Float] Specifies a customization weight for the custom language model. The customization
|
1538
1679
|
# weight tells the service how much weight to give to words from the custom language
|
1539
1680
|
# model compared to those from the base model for speech recognition. Specify a
|
@@ -1548,6 +1689,9 @@ module IBMWatson
|
|
1548
1689
|
# The value that you assign is used for all recognition requests that use the model.
|
1549
1690
|
# You can override it for any recognition request by specifying a customization
|
1550
1691
|
# weight for that request.
|
1692
|
+
#
|
1693
|
+
# See [Using customization
|
1694
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
1551
1695
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1552
1696
|
def train_language_model(customization_id:, word_type_to_add: nil, customization_weight: nil)
|
1553
1697
|
raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
|
@@ -1621,15 +1765,19 @@ module IBMWatson
|
|
1621
1765
|
#
|
1622
1766
|
# The method returns an HTTP 200 response code to indicate that the upgrade process
|
1623
1767
|
# has begun successfully. You can monitor the status of the upgrade by using the
|
1624
|
-
#
|
1625
|
-
# returns a `LanguageModel` object that includes `status` and
|
1626
|
-
# a loop to check the status every 10 seconds. While it is
|
1627
|
-
# custom model has the status `upgrading`. When the upgrade is
|
1628
|
-
# resumes the status that it had prior to upgrade. The service
|
1629
|
-
# subsequent requests for the model until the upgrade completes.
|
1768
|
+
# [Get a custom language model](#getlanguagemodel) method to poll the model's
|
1769
|
+
# status. The method returns a `LanguageModel` object that includes `status` and
|
1770
|
+
# `progress` fields. Use a loop to check the status every 10 seconds. While it is
|
1771
|
+
# being upgraded, the custom model has the status `upgrading`. When the upgrade is
|
1772
|
+
# complete, the model resumes the status that it had prior to upgrade. The service
|
1773
|
+
# cannot accept subsequent requests for the model until the upgrade completes.
|
1774
|
+
#
|
1775
|
+
# **Note:** Upgrading is necessary only for custom language models that are based on
|
1776
|
+
# previous-generation models. Only a single version of a custom model that is based
|
1777
|
+
# on a next-generation model is ever available.
|
1630
1778
|
#
|
1631
1779
|
# **See also:** [Upgrading a custom language
|
1632
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1780
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-language).
|
1633
1781
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1634
1782
|
# the request. You must make the request with credentials for the instance of the
|
1635
1783
|
# service that owns the custom model.
|
@@ -1660,9 +1808,10 @@ module IBMWatson
|
|
1660
1808
|
# @!method list_corpora(customization_id:)
|
1661
1809
|
# List corpora.
|
1662
1810
|
# Lists information about all corpora from a custom language model. The information
|
1663
|
-
# includes the total number of words
|
1664
|
-
#
|
1665
|
-
#
|
1811
|
+
# includes the name, status, and total number of words for each corpus. _For custom
|
1812
|
+
# models that are based on previous-generation models_, it also includes the number
|
1813
|
+
# of out-of-vocabulary (OOV) words from the corpus. You must use credentials for the
|
1814
|
+
# instance of the service that owns a model to list its corpora.
|
1666
1815
|
#
|
1667
1816
|
# **See also:** [Listing corpora for a custom language
|
1668
1817
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#listCorpora).
|
@@ -1696,51 +1845,60 @@ module IBMWatson
|
|
1696
1845
|
# Use multiple requests to submit multiple corpus text files. You must use
|
1697
1846
|
# credentials for the instance of the service that owns a model to add a corpus to
|
1698
1847
|
# it. Adding a corpus does not affect the custom language model until you train the
|
1699
|
-
# model for the new data by using the
|
1848
|
+
# model for the new data by using the [Train a custom language
|
1849
|
+
# model](#trainlanguagemodel) method.
|
1700
1850
|
#
|
1701
1851
|
# Submit a plain text file that contains sample sentences from the domain of
|
1702
|
-
# interest to enable the service to
|
1703
|
-
# add that represent the context in which speakers use words from the domain,
|
1704
|
-
# better the service's recognition accuracy.
|
1852
|
+
# interest to enable the service to parse the words in context. The more sentences
|
1853
|
+
# you add that represent the context in which speakers use words from the domain,
|
1854
|
+
# the better the service's recognition accuracy.
|
1705
1855
|
#
|
1706
1856
|
# The call returns an HTTP 201 response code if the corpus is valid. The service
|
1707
|
-
# then asynchronously processes
|
1708
|
-
#
|
1709
|
-
#
|
1710
|
-
#
|
1711
|
-
#
|
1712
|
-
#
|
1713
|
-
#
|
1714
|
-
#
|
1715
|
-
#
|
1716
|
-
#
|
1717
|
-
#
|
1718
|
-
#
|
1719
|
-
#
|
1720
|
-
#
|
1721
|
-
#
|
1857
|
+
# then asynchronously processes and automatically extracts data from the contents of
|
1858
|
+
# the corpus. This operation can take on the order of minutes to complete depending
|
1859
|
+
# on the current load on the service, the total number of words in the corpus, and,
|
1860
|
+
# _for custom models that are based on previous-generation models_, the number of
|
1861
|
+
# new (out-of-vocabulary) words in the corpus. You cannot submit requests to add
|
1862
|
+
# additional resources to the custom model or to train the model until the service's
|
1863
|
+
# analysis of the corpus for the current request completes. Use the [Get a
|
1864
|
+
# corpus](#getcorpus) method to check the status of the analysis.
|
1865
|
+
#
|
1866
|
+
# _For custom models that are based on previous-generation models_, the service
|
1867
|
+
# auto-populates the model's words resource with words from the corpus that are not
|
1868
|
+
# found in its base vocabulary. These words are referred to as out-of-vocabulary
|
1869
|
+
# (OOV) words. After adding a corpus, you must validate the words resource to ensure
|
1870
|
+
# that each OOV word's definition is complete and valid. You can use the [List
|
1871
|
+
# custom words](#listwords) method to examine the words resource. You can use other
|
1872
|
+
# words-related methods to eliminate typos and modify how words are pronounced as needed.
|
1722
1873
|
#
|
1723
1874
|
# To add a corpus file that has the same name as an existing corpus, set the
|
1724
1875
|
# `allow_overwrite` parameter to `true`; otherwise, the request fails. Overwriting
|
1725
1876
|
# an existing corpus causes the service to process the corpus text file and extract
|
1726
|
-
#
|
1727
|
-
#
|
1728
|
-
#
|
1729
|
-
#
|
1877
|
+
# its data anew. _For a custom model that is based on a previous-generation model_,
|
1878
|
+
# the service first removes any OOV words that are associated with the existing
|
1879
|
+
# corpus from the model's words resource unless they were also added by another
|
1880
|
+
# corpus or grammar, or they have been modified in some way with the [Add custom
|
1881
|
+
# words](#addwords) or [Add a custom word](#addword) method.
|
1730
1882
|
#
|
1731
1883
|
# The service limits the overall amount of data that you can add to a custom model
|
1732
|
-
# to a maximum of 10 million total words from all sources combined.
|
1733
|
-
#
|
1734
|
-
#
|
1735
|
-
# directly.
|
1884
|
+
# to a maximum of 10 million total words from all sources combined. _For a custom
|
1885
|
+
# model that is based on a previous-generation model_, you can add no more than 90
|
1886
|
+
# thousand custom (OOV) words to a model. This includes words that the service
|
1887
|
+
# extracts from corpora and grammars, and words that you add directly.
|
1736
1888
|
#
|
1737
1889
|
# **See also:**
|
1738
1890
|
# * [Add a corpus to the custom language
|
1739
1891
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addCorpus)
|
1740
|
-
# * [Working with
|
1741
|
-
#
|
1742
|
-
# * [
|
1743
|
-
#
|
1892
|
+
# * [Working with corpora for previous-generation
|
1893
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingCorpora)
|
1894
|
+
# * [Working with corpora for next-generation
|
1895
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingCorpora-ng)
|
1896
|
+
#
|
1897
|
+
# * [Validating a words resource for previous-generation
|
1898
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
1899
|
+
#
|
1900
|
+
# * [Validating a words resource for next-generation
|
1901
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
1744
1902
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1745
1903
|
# the request. You must make the request with credentials for the instance of the
|
1746
1904
|
# service that owns the custom model.
|
@@ -1763,10 +1921,10 @@ module IBMWatson
|
|
1763
1921
|
# in UTF-8 if it contains non-ASCII characters; the service assumes UTF-8 encoding
|
1764
1922
|
# if it encounters non-ASCII characters.
|
1765
1923
|
#
|
1766
|
-
# Make sure that you know the character encoding of the file. You must use that
|
1924
|
+
# Make sure that you know the character encoding of the file. You must use that same
|
1767
1925
|
# encoding when working with the words in the custom language model. For more
|
1768
|
-
# information, see [Character
|
1769
|
-
#
|
1926
|
+
# information, see [Character encoding for custom
|
1927
|
+
# words](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageWords#charEncoding).
|
1770
1928
|
#
|
1771
1929
|
#
|
1772
1930
|
# With the `curl` command, use the `--data-binary` option to upload the file for the
|
@@ -1815,9 +1973,10 @@ module IBMWatson
|
|
1815
1973
|
# @!method get_corpus(customization_id:, corpus_name:)
|
1816
1974
|
# Get a corpus.
|
1817
1975
|
# Gets information about a corpus from a custom language model. The information
|
1818
|
-
# includes the total number of words
|
1819
|
-
#
|
1820
|
-
#
|
1976
|
+
# includes the name, status, and total number of words for the corpus. _For custom
|
1977
|
+
# models that are based on previous-generation models_, it also includes the number
|
1978
|
+
# of out-of-vocabulary (OOV) words from the corpus. You must use credentials for the
|
1979
|
+
# instance of the service that owns a model to list its corpora.
|
1821
1980
|
#
|
1822
1981
|
# **See also:** [Listing corpora for a custom language
|
1823
1982
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#listCorpora).
|
@@ -1850,14 +2009,18 @@ module IBMWatson
|
|
1850
2009
|
##
|
1851
2010
|
# @!method delete_corpus(customization_id:, corpus_name:)
|
1852
2011
|
# Delete a corpus.
|
1853
|
-
# Deletes an existing corpus from a custom language model.
|
1854
|
-
#
|
1855
|
-
# model
|
1856
|
-
#
|
1857
|
-
#
|
1858
|
-
#
|
1859
|
-
#
|
1860
|
-
#
|
2012
|
+
# Deletes an existing corpus from a custom language model. Removing a corpus does
|
2013
|
+
# not affect the custom model until you train the model with the [Train a custom
|
2014
|
+
# language model](#trainlanguagemodel) method. You must use credentials for the
|
2015
|
+
# instance of the service that owns a model to delete its corpora.
|
2016
|
+
#
|
2017
|
+
# _For custom models that are based on previous-generation models_, the service
|
2018
|
+
# removes any out-of-vocabulary (OOV) words that are associated with the corpus from
|
2019
|
+
# the custom model's words resource unless they were also added by another corpus or
|
2020
|
+
# grammar, or they were modified in some way with the [Add custom words](#addwords)
|
2021
|
+
# or [Add a custom word](#addword) method.
|
2022
|
+
#
|
2023
|
+
#
|
1861
2024
|
#
|
1862
2025
|
# **See also:** [Deleting a corpus from a custom language
|
1863
2026
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#deleteCorpus).
|
@@ -1895,10 +2058,11 @@ module IBMWatson
|
|
1895
2058
|
# List custom words.
|
1896
2059
|
# Lists information about custom words from a custom language model. You can list
|
1897
2060
|
# all words from the custom model's words resource, only custom words that were
|
1898
|
-
# added or modified by the user, or
|
1899
|
-
#
|
1900
|
-
#
|
1901
|
-
#
|
2061
|
+
# added or modified by the user, or, _for a custom model that is based on a
|
2062
|
+
# previous-generation model_, only out-of-vocabulary (OOV) words that were extracted
|
2063
|
+
# from corpora or are recognized by grammars. You can also indicate the order in
|
2064
|
+
# which the service is to return words; by default, the service lists words in
|
2065
|
+
# ascending alphabetical order. You must use credentials for the instance of the
|
1902
2066
|
# service that owns a model to list information about its words.
|
1903
2067
|
#
|
1904
2068
|
# **See also:** [Listing words from a custom language
|
@@ -1911,6 +2075,10 @@ module IBMWatson
|
|
1911
2075
|
# * `user` shows only custom words that were added or modified by the user directly.
|
1912
2076
|
# * `corpora` shows only OOV words that were extracted from corpora.
|
1913
2077
|
# * `grammars` shows only OOV words that are recognized by grammars.
|
2078
|
+
#
|
2079
|
+
# _For a custom model that is based on a next-generation model_, only `all` and
|
2080
|
+
# `user` apply. Both options return the same results. Words from other sources are
|
2081
|
+
# not added to custom models that are based on next-generation models.
|
1914
2082
|
# @param sort [String] Indicates the order in which the words are to be listed, `alphabetical` or by
|
1915
2083
|
# `count`. You can prepend an optional `+` or `-` to an argument to indicate whether
|
1916
2084
|
# the results are to be sorted in ascending or descending order. By default, words
|
@@ -1947,10 +2115,14 @@ module IBMWatson
|
|
1947
2115
|
##
|
1948
2116
|
# @!method add_words(customization_id:, words:)
|
1949
2117
|
# Add custom words.
|
1950
|
-
# Adds one or more custom words to a custom language model.
|
2118
|
+
# Adds one or more custom words to a custom language model. You can use this method
|
2119
|
+
# to add words or to modify existing words in a custom model's words resource. _For
|
2120
|
+
# custom models that are based on previous-generation models_, the service populates
|
1951
2121
|
# the words resource for a custom model with out-of-vocabulary (OOV) words from each
|
1952
|
-
# corpus or grammar that is added to the model. You can use this method to
|
1953
|
-
#
|
2122
|
+
# corpus or grammar that is added to the model. You can use this method to modify
|
2123
|
+
# OOV words in the model's words resource.
|
2124
|
+
#
|
2125
|
+
# _For a custom model that is based on a previous-generation model_, the words
|
1954
2126
|
# resource for a model can contain a maximum of 90 thousand custom (OOV) words. This
|
1955
2127
|
# includes words that the service extracts from corpora and grammars and words that
|
1956
2128
|
# you add directly.
|
@@ -1958,25 +2130,26 @@ module IBMWatson
|
|
1958
2130
|
# You must use credentials for the instance of the service that owns a model to add
|
1959
2131
|
# or modify custom words for the model. Adding or modifying custom words does not
|
1960
2132
|
# affect the custom model until you train the model for the new data by using the
|
1961
|
-
#
|
2133
|
+
# [Train a custom language model](#trainlanguagemodel) method.
|
1962
2134
|
#
|
1963
2135
|
# You add custom words by providing a `CustomWords` object, which is an array of
|
1964
|
-
# `CustomWord` objects, one per word.
|
1965
|
-
#
|
1966
|
-
#
|
1967
|
-
# * The `sounds_like` field provides an array of one or more pronunciations for the
|
1968
|
-
# word. Use the parameter to specify how the word can be pronounced by users. Use
|
1969
|
-
# the parameter for words that are difficult to pronounce, foreign words, acronyms,
|
1970
|
-
# and so on. For example, you might specify that the word `IEEE` can sound like `i
|
1971
|
-
# triple e`. You can specify a maximum of five sounds-like pronunciations for a
|
1972
|
-
# word. If you omit the `sounds_like` field, the service attempts to set the field
|
1973
|
-
# to its pronunciation of the word. It cannot generate a pronunciation for all
|
1974
|
-
# words, so you must review the word's definition to ensure that it is complete and
|
1975
|
-
# valid.
|
2136
|
+
# `CustomWord` objects, one per word. Use the object's `word` parameter to identify
|
2137
|
+
# the word that is to be added. You can also provide one or both of the optional
|
2138
|
+
# `display_as` or `sounds_like` fields for each word.
|
1976
2139
|
# * The `display_as` field provides a different way of spelling the word in a
|
1977
2140
|
# transcript. Use the parameter when you want the word to appear different from its
|
1978
2141
|
# usual representation or from its spelling in training data. For example, you might
|
1979
|
-
# indicate that the word `IBM
|
2142
|
+
# indicate that the word `IBM` is to be displayed as `IBM™`.
|
2143
|
+
# * The `sounds_like` field, _which can be used only with a custom model that is
|
2144
|
+
# based on a previous-generation model_, provides an array of one or more
|
2145
|
+
# pronunciations for the word. Use the parameter to specify how the word can be
|
2146
|
+
# pronounced by users. Use the parameter for words that are difficult to pronounce,
|
2147
|
+
# foreign words, acronyms, and so on. For example, you might specify that the word
|
2148
|
+
# `IEEE` can sound like `i triple e`. You can specify a maximum of five sounds-like
|
2149
|
+
# pronunciations for a word. If you omit the `sounds_like` field, the service
|
2150
|
+
# attempts to set the field to its pronunciation of the word. It cannot generate a
|
2151
|
+
# pronunciation for all words, so you must review the word's definition to ensure
|
2152
|
+
# that it is complete and valid.
|
1980
2153
|
#
|
1981
2154
|
# If you add a custom word that already exists in the words resource for the custom
|
1982
2155
|
# model, the new definition overwrites the existing data for the word. If the
|
@@ -1988,26 +2161,30 @@ module IBMWatson
|
|
1988
2161
|
# time that it takes for the analysis to complete depends on the number of new words
|
1989
2162
|
# that you add but is generally faster than adding a corpus or grammar.
|
1990
2163
|
#
|
1991
|
-
# You can monitor the status of the request by using the
|
1992
|
-
# model
|
1993
|
-
# seconds. The method returns a `Customization` object that
|
1994
|
-
# field. A status of `ready` means that the words have been
|
1995
|
-
# model. The service cannot accept requests to add new data or
|
1996
|
-
# until the existing request completes.
|
1997
|
-
#
|
1998
|
-
# You can use the **List custom words** or **List a custom word** method to review
|
1999
|
-
# the words that you add. Words with an invalid `sounds_like` field include an
|
2000
|
-
# `error` field that describes the problem. You can use other words-related methods
|
2001
|
-
# to correct errors, eliminate typos, and modify how words are pronounced as needed.
|
2164
|
+
# You can monitor the status of the request by using the [Get a custom language
|
2165
|
+
# model](#getlanguagemodel) method to poll the model's status. Use a loop to check
|
2166
|
+
# the status every 10 seconds. The method returns a `Customization` object that
|
2167
|
+
# includes a `status` field. A status of `ready` means that the words have been
|
2168
|
+
# added to the custom model. The service cannot accept requests to add new data or
|
2169
|
+
# to train the model until the existing request completes.
|
2002
2170
|
#
|
2171
|
+
# You can use the [List custom words](#listwords) or [Get a custom word](#getword)
|
2172
|
+
# method to review the words that you add. Words with an invalid `sounds_like` field
|
2173
|
+
# include an `error` field that describes the problem. You can use other
|
2174
|
+
# words-related methods to correct errors, eliminate typos, and modify how words are
|
2175
|
+
# pronounced as needed.
|
2003
2176
|
#
|
2004
2177
|
# **See also:**
|
2005
2178
|
# * [Add words to the custom language
|
2006
2179
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
|
2007
|
-
# * [Working with custom
|
2008
|
-
#
|
2009
|
-
# * [
|
2010
|
-
#
|
2180
|
+
# * [Working with custom words for previous-generation
|
2181
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
|
2182
|
+
# * [Working with custom words for next-generation
|
2183
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
|
2184
|
+
# * [Validating a words resource for previous-generation
|
2185
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
2186
|
+
# * [Validating a words resource for next-generation
|
2187
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
2011
2188
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
2012
2189
|
# the request. You must make the request with credentials for the instance of the
|
2013
2190
|
# service that owns the custom model.
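The `CustomWord` entries described above can be sketched as plain hashes. This is an illustration under assumptions: `custom_word` is a hypothetical helper, the sketch assumes that the `words:` argument accepts an array of hashes with these keys, and the only rule it enforces is the five-pronunciation limit stated in the comment.

```ruby
MAX_SOUNDS_LIKE = 5 # limit stated in the documentation comment

# Hypothetical helper: builds one entry for the `words:` array.
def custom_word(word, display_as: nil, sounds_like: nil)
  if sounds_like && sounds_like.length > MAX_SOUNDS_LIKE
    raise ArgumentError, "at most #{MAX_SOUNDS_LIKE} sounds_like entries are allowed"
  end

  entry = { "word" => word }
  # Both fields are optional, so add them only when the caller provides them.
  entry["display_as"] = display_as if display_as
  entry["sounds_like"] = sounds_like if sounds_like
  entry
end

words = [
  custom_word("IEEE", sounds_like: ["i triple e"]),
  custom_word("IBM", display_as: "IBM™")
]
puts words.length # prints 2
```

For a custom model that is based on a next-generation model, the comment says `sounds_like` cannot be used, so only the `display_as` form would apply there.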
|
@@ -2043,47 +2220,57 @@ module IBMWatson
|
|
2043
2220
|
##
|
2044
2221
|
# @!method add_word(customization_id:, word_name:, word: nil, sounds_like: nil, display_as: nil)
|
2045
2222
|
# Add a custom word.
|
2046
|
-
# Adds a custom word to a custom language model.
|
2047
|
-
#
|
2048
|
-
#
|
2049
|
-
#
|
2050
|
-
#
|
2051
|
-
#
|
2223
|
+
# Adds a custom word to a custom language model. You can use this method to add a
|
2224
|
+
# word or to modify an existing word in the words resource. _For custom models that
|
2225
|
+
# are based on previous-generation models_, the service populates the words resource
|
2226
|
+
# for a custom model with out-of-vocabulary (OOV) words from each corpus or grammar
|
2227
|
+
# that is added to the model. You can use this method to modify OOV words in the
|
2228
|
+
# model's words resource.
|
2229
|
+
#
|
2230
|
+
# _For a custom model that is based on a previous-generation model_, the words
|
2231
|
+
# resource for a model can contain a maximum of 90 thousand custom (OOV) words. This
|
2232
|
+
# includes words that the service extracts from corpora and grammars and words that
|
2233
|
+
# you add directly.
|
2052
2234
|
#
|
2053
2235
|
# You must use credentials for the instance of the service that owns a model to add
|
2054
2236
|
# or modify a custom word for the model. Adding or modifying a custom word does not
|
2055
2237
|
# affect the custom model until you train the model for the new data by using the
|
2056
|
-
#
|
2238
|
+
# [Train a custom language model](#trainlanguagemodel) method.
|
2057
2239
|
#
|
2058
2240
|
# Use the `word_name` parameter to specify the custom word that is to be added or
|
2059
2241
|
# modified. Use the `CustomWord` object to provide one or both of the optional
|
2060
|
-
# `
|
2061
|
-
# * The `sounds_like` field provides an array of one or more pronunciations for the
|
2062
|
-
# word. Use the parameter to specify how the word can be pronounced by users. Use
|
2063
|
-
# the parameter for words that are difficult to pronounce, foreign words, acronyms,
|
2064
|
-
# and so on. For example, you might specify that the word `IEEE` can sound like `i
|
2065
|
-
# triple e`. You can specify a maximum of five sounds-like pronunciations for a
|
2066
|
-
# word. If you omit the `sounds_like` field, the service attempts to set the field
|
2067
|
-
# to its pronunciation of the word. It cannot generate a pronunciation for all
|
2068
|
-
# words, so you must review the word's definition to ensure that it is complete and
|
2069
|
-
# valid.
|
2242
|
+
# `display_as` or `sounds_like` fields for the word.
|
2070
2243
|
# * The `display_as` field provides a different way of spelling the word in a
|
2071
2244
|
# transcript. Use the parameter when you want the word to appear different from its
|
2072
2245
|
# usual representation or from its spelling in training data. For example, you might
|
2073
|
-
# indicate that the word `IBM
|
2246
|
+
# indicate that the word `IBM` is to be displayed as `IBM™`.
|
2247
|
+
# * The `sounds_like` field, _which can be used only with a custom model that is
|
2248
|
+
# based on a previous-generation model_, provides an array of one or more
|
2249
|
+
# pronunciations for the word. Use the parameter to specify how the word can be
|
2250
|
+
# pronounced by users. Use the parameter for words that are difficult to pronounce,
|
2251
|
+
# foreign words, acronyms, and so on. For example, you might specify that the word
|
2252
|
+
# `IEEE` can sound like `i triple e`. You can specify a maximum of five sounds-like
|
2253
|
+
# pronunciations for a word. If you omit the `sounds_like` field, the service
|
2254
|
+
# attempts to set the field to its pronunciation of the word. It cannot generate a
|
2255
|
+
# pronunciation for all words, so you must review the word's definition to ensure
|
2256
|
+
# that it is complete and valid.
|
2074
2257
|
#
|
2075
2258
|
# If you add a custom word that already exists in the words resource for the custom
|
2076
2259
|
# model, the new definition overwrites the existing data for the word. If the
|
2077
2260
|
# service encounters an error, it does not add the word to the words resource. Use
|
2078
|
-
# the
|
2261
|
+
# the [Get a custom word](#getword) method to review the word that you add.
|
2079
2262
|
#
|
2080
2263
|
# **See also:**
|
2081
2264
|
# * [Add words to the custom language
|
2082
2265
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
|
2083
|
-
# * [Working with custom
|
2084
|
-
#
|
2085
|
-
# * [
|
2086
|
-
#
|
2266
|
+
# * [Working with custom words for previous-generation
|
2267
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
|
2268
|
+
# * [Working with custom words for next-generation
|
2269
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
|
2270
|
+
# * [Validating a words resource for previous-generation
|
2271
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
2272
|
+
# * [Validating a words resource for next-generation
|
2273
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
2087
2274
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
2088
2275
|
# the request. You must make the request with credentials for the instance of the
|
2089
2276
|
# service that owns the custom model.
|
@@ -2092,14 +2279,16 @@ module IBMWatson
|
|
2092
2279
|
# the tokens of compound words. URL-encode the word if it includes non-ASCII
|
2093
2280
|
# characters. For more information, see [Character
|
2094
2281
|
# encoding](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#charEncoding).
|
2095
|
-
# @param word [String] For the
|
2096
|
-
# be added to or updated in the custom model. Do not include spaces in
|
2097
|
-
# a `-` (dash) or `_` (underscore) to connect the tokens of compound
|
2098
|
-
#
|
2099
|
-
#
|
2100
|
-
#
|
2101
|
-
#
|
2102
|
-
#
|
2282
|
+
# @param word [String] For the [Add custom words](#addwords) method, you must specify the custom word
|
2283
|
+
# that is to be added to or updated in the custom model. Do not include spaces in
|
2284
|
+
# the word. Use a `-` (dash) or `_` (underscore) to connect the tokens of compound
|
2285
|
+
# words.
|
2286
|
+
#
|
2287
|
+
# Omit this parameter for the [Add a custom word](#addword) method.
|
2288
|
+
# @param sounds_like [Array[String]] _For a custom model that is based on a previous-generation model_, an array of
|
2289
|
+
# sounds-like pronunciations for the custom word. Specify how words that are
|
2290
|
+
# difficult to pronounce, foreign words, acronyms, and so on can be pronounced by
|
2291
|
+
# users.
|
2103
2292
|
# * For a word that is not in the service's base vocabulary, omit the parameter to
|
2104
2293
|
# have the service automatically generate a sounds-like pronunciation for the word.
|
2105
2294
|
# * For a word that is in the service's base vocabulary, use the parameter to
|
@@ -2109,6 +2298,10 @@ module IBMWatson
|
|
2109
2298
|
#
|
2110
2299
|
# A word can have at most five sounds-like pronunciations. A pronunciation can
|
2111
2300
|
# include at most 40 characters not including spaces.
|
2301
|
+
#
|
2302
|
+
# _For a custom model that is based on a next-generation model_, omit this field.
|
2303
|
+
# Custom models based on next-generation models do not support the `sounds_like`
|
2304
|
+
# field. The service ignores the field.
|
2112
2305
|
# @param display_as [String] An alternative spelling for the custom word when it appears in a transcript. Use
|
2113
2306
|
# the parameter when you want the word to have a spelling that is different from its
|
2114
2307
|
# usual representation or from its spelling in corpora training data.
|
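The limits quoted in the `word` and `sounds_like` parameter descriptions above (no spaces in the word, at most five sounds-like pronunciations, at most 40 non-space characters per pronunciation) can be checked on the client before a request is sent. The following is a minimal sketch only: `custom_word_problems` and its keyword arguments are hypothetical names and are not part of the `ibm_watson` gem. The service remains the authority on what it accepts.

```ruby
# Hypothetical helper (not part of ibm_watson): returns a list of problems
# found in a custom word entry, based on the limits stated in the comments.
def custom_word_problems(word:, sounds_like: nil, max_pronunciations: 5, max_chars: 40)
  problems = []

  # The documentation says not to include spaces in the word itself;
  # compound words are joined with "-" or "_" instead.
  problems << "word must not contain spaces" if word.match?(/\s/)

  pronunciations = Array(sounds_like)
  if pronunciations.length > max_pronunciations
    problems << "at most #{max_pronunciations} sounds-like pronunciations are allowed"
  end

  pronunciations.each do |pronunciation|
    # The 40-character limit does not count spaces.
    length = pronunciation.delete(" ").length
    if length > max_chars
      problems << "pronunciation #{pronunciation.inspect} has #{length} non-space characters"
    end
  end

  problems
end
```

An empty array means that the entry passes these local checks. For a custom model based on a next-generation model, the `sounds_like` field is ignored by the service, so the pronunciation checks are relevant only for previous-generation models.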
@@ -2183,11 +2376,12 @@ module IBMWatson
|
|
2183
2376
|
# Delete a custom word.
|
2184
2377
|
# Deletes a custom word from a custom language model. You can remove any word that
|
2185
2378
|
# you added to the custom model's words resource via any means. However, if the word
|
2186
|
-
# also exists in the service's base vocabulary, the service removes
|
2187
|
-
#
|
2379
|
+
# also exists in the service's base vocabulary, the service removes the word only
|
2380
|
+
# from the words resource; the word remains in the base vocabulary. Removing a
|
2188
2381
|
# custom word does not affect the custom model until you train the model with the
|
2189
|
-
#
|
2190
|
-
# instance of the service that owns a model to delete its words.
|
2382
|
+
# [Train a custom language model](#trainlanguagemodel) method. You must use
|
2383
|
+
# credentials for the instance of the service that owns a model to delete its words.
|
2384
|
+
#
|
2191
2385
|
#
|
2192
2386
|
# **See also:** [Deleting a word from a custom language
|
2193
2387
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageWords#deleteWord).
|
@@ -2228,7 +2422,11 @@ module IBMWatson
|
|
2228
2422
|
# Lists information about all grammars from a custom language model. The information
|
2229
2423
|
# includes the total number of out-of-vocabulary (OOV) words, name, and status of
|
2230
2424
|
# each grammar. You must use credentials for the instance of the service that owns a
|
2231
|
-
# model to list its grammars.
|
2425
|
+
# model to list its grammars. Grammars are available for all languages and models
|
2426
|
+
# that support language customization.
|
2427
|
+
#
|
2428
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2429
|
+
# They are not supported for next-generation models.
|
2232
2430
|
#
|
2233
2431
|
# **See also:** [Listing grammars from a custom language
|
2234
2432
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#listGrammars).
|
@@ -2262,8 +2460,8 @@ module IBMWatson
|
|
2262
2460
|
# UTF-8 format that defines the grammar. Use multiple requests to submit multiple
|
2263
2461
|
# grammar files. You must use credentials for the instance of the service that owns
|
2264
2462
|
# a model to add a grammar to it. Adding a grammar does not affect the custom
|
2265
|
-
# language model until you train the model for the new data by using the
|
2266
|
-
# custom language model
|
2463
|
+
# language model until you train the model for the new data by using the [Train a
|
2464
|
+
# custom language model](#trainlanguagemodel) method.
|
2267
2465
|
#
|
2268
2466
|
# The call returns an HTTP 201 response code if the grammar is valid. The service
|
2269
2467
|
# then asynchronously processes the contents of the grammar and automatically
|
@@ -2271,27 +2469,33 @@ module IBMWatson
|
|
2271
2469
|
# to complete depending on the size and complexity of the grammar, as well as the
|
2272
2470
|
# current load on the service. You cannot submit requests to add additional
|
2273
2471
|
# resources to the custom model or to train the model until the service's analysis
|
2274
|
-
# of the grammar for the current request completes. Use the
|
2275
|
-
# to check the status of the analysis.
|
2472
|
+
# of the grammar for the current request completes. Use the [Get a
|
2473
|
+
# grammar](#getgrammar) method to check the status of the analysis.
|
2276
2474
|
#
|
2277
2475
|
# The service populates the model's words resource with any word that is recognized
|
2278
2476
|
# by the grammar that is not found in the model's base vocabulary. These are
|
2279
|
-
# referred to as out-of-vocabulary (OOV) words. You can use the
|
2280
|
-
# words
|
2281
|
-
# to eliminate typos and modify how words are pronounced as
|
2477
|
+
# referred to as out-of-vocabulary (OOV) words. You can use the [List custom
|
2478
|
+
# words](#listwords) method to examine the words resource and use other
|
2479
|
+
# words-related methods to eliminate typos and modify how words are pronounced as
|
2480
|
+
# needed.
|
2282
2481
|
#
|
2283
2482
|
# To add a grammar that has the same name as an existing grammar, set the
|
2284
2483
|
# `allow_overwrite` parameter to `true`; otherwise, the request fails. Overwriting
|
2285
2484
|
# an existing grammar causes the service to process the grammar file and extract OOV
|
2286
2485
|
# words anew. Before doing so, it removes any OOV words associated with the existing
|
2287
2486
|
# grammar from the model's words resource unless they were also added by another
|
2288
|
-
# resource or they have been modified in some way with the
|
2289
|
-
#
|
2487
|
+
# resource or they have been modified in some way with the [Add custom
|
2488
|
+
# words](#addwords) or [Add a custom word](#addword) method.
|
2290
2489
|
#
|
2291
2490
|
# The service limits the overall amount of data that you can add to a custom model
|
2292
2491
|
# to a maximum of 10 million total words from all sources combined. Also, you can
|
2293
2492
|
# add no more than 90 thousand OOV words to a model. This includes words that the
|
2294
2493
|
# service extracts from corpora and grammars and words that you add directly.
|
2494
|
+
# Grammars are available for all languages and models that support language
|
2495
|
+
# customization.
|
2496
|
+
#
|
2497
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2498
|
+
# They are not supported for next-generation models.
|
2295
2499
|
#
|
2296
2500
|
# **See also:**
|
2297
2501
|
# * [Understanding
|
@@ -2374,7 +2578,11 @@ module IBMWatson
|
|
2374
2578
|
# Gets information about a grammar from a custom language model. The information
|
2375
2579
|
# includes the total number of out-of-vocabulary (OOV) words, name, and status of
|
2376
2580
|
# the grammar. You must use credentials for the instance of the service that owns a
|
2377
|
-
# model to list its grammars.
|
2581
|
+
# model to list its grammars. Grammars are available for all languages and models
|
2582
|
+
# that support language customization.
|
2583
|
+
#
|
2584
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2585
|
+
# They are not supported for next-generation models.
|
2378
2586
|
#
|
2379
2587
|
# **See also:** [Listing grammars from a custom language
|
2380
2588
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#listGrammars).
|
@@ -2410,10 +2618,15 @@ module IBMWatson
|
|
2410
2618
|
# Deletes an existing grammar from a custom language model. The service removes any
|
2411
2619
|
# out-of-vocabulary (OOV) words associated with the grammar from the custom model's
|
2412
2620
|
# words resource unless they were also added by another resource or they were
|
2413
|
-
# modified in some way with the
|
2414
|
-
# method. Removing a grammar does not affect the custom model until
|
2415
|
-
# model with the
|
2416
|
-
# for the instance of the service that owns a model
|
2621
|
+
# modified in some way with the [Add custom words](#addwords) or [Add a custom
|
2622
|
+
# word](#addword) method. Removing a grammar does not affect the custom model until
|
2623
|
+
# you train the model with the [Train a custom language model](#trainlanguagemodel)
|
2624
|
+
# method. You must use credentials for the instance of the service that owns a model
|
2625
|
+
# to delete its grammar. Grammars are available for all languages and models that
|
2626
|
+
# support language customization.
|
2627
|
+
#
|
2628
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2629
|
+
# They are not supported for next-generation models.
|
2417
2630
|
#
|
2418
2631
|
# **See also:** [Deleting a grammar from a custom language
|
2419
2632
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#deleteGrammar).
|
@@ -2459,6 +2672,9 @@ module IBMWatson
|
|
2459
2672
|
# do not lose any models, but you cannot create any more until your model count is
|
2460
2673
|
# below the limit.
|
2461
2674
|
#
|
2675
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2676
|
+
# previous-generation models. It is not supported for next-generation models.
|
2677
|
+
#
|
2462
2678
|
# **See also:** [Create a custom acoustic
|
2463
2679
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#createModel-acoustic).
|
2464
2680
|
# @param name [String] A user-defined name for the new custom acoustic model. Use a name that is unique
|
@@ -2468,11 +2684,12 @@ module IBMWatson
|
|
2468
2684
|
# custom model`.
|
2469
2685
|
# @param base_model_name [String] The name of the base language model that is to be customized by the new custom
|
2470
2686
|
# acoustic model. The new custom model can be used only with the base model that it
|
2471
|
-
# customizes.
|
2687
|
+
# customizes. (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
2688
|
+
# `ar-MS_BroadbandModel` instead.)
|
2472
2689
|
#
|
2473
2690
|
# To determine whether a base model supports acoustic model customization, refer to
|
2474
2691
|
# [Language support for
|
2475
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
2692
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
2476
2693
|
# @param description [String] A description of the new custom acoustic model. Use a localized description that
|
2477
2694
|
# matches the language of the custom model.
|
2478
2695
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
@@ -2513,15 +2730,19 @@ module IBMWatson
|
|
2513
2730
|
# all languages. You must use credentials for the instance of the service that owns
|
2514
2731
|
# a model to list information about it.
|
2515
2732
|
#
|
2733
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2734
|
+
# previous-generation models. It is not supported for next-generation models.
|
2735
|
+
#
|
2516
2736
|
# **See also:** [Listing custom acoustic
|
2517
2737
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
|
2518
2738
|
# @param language [String] The identifier of the language for which custom language or custom acoustic models
|
2519
2739
|
# are to be returned. Omit the parameter to see all custom language or custom
|
2520
|
-
# acoustic models that are owned by the requesting credentials.
|
2740
|
+
# acoustic models that are owned by the requesting credentials. (**Note:** The
|
2741
|
+
# identifier `ar-AR` is deprecated; use `ar-MS` instead.)
|
2521
2742
|
#
|
2522
2743
|
# To determine the languages for which customization is available, see [Language
|
2523
2744
|
# support for
|
2524
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
2745
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
2525
2746
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
2526
2747
|
def list_acoustic_models(language: nil)
|
2527
2748
|
headers = {
|
@@ -2551,6 +2772,9 @@ module IBMWatson
|
|
2551
2772
|
# Gets information about a specified custom acoustic model. You must use credentials
|
2552
2773
|
# for the instance of the service that owns a model to list information about it.
|
2553
2774
|
#
|
2775
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2776
|
+
# previous-generation models. It is not supported for next-generation models.
|
2777
|
+
#
|
2554
2778
|
# **See also:** [Listing custom acoustic
|
2555
2779
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
|
2556
2780
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2584,6 +2808,9 @@ module IBMWatson
|
|
2584
2808
|
# processed. You must use credentials for the instance of the service that owns a
|
2585
2809
|
# model to delete it.
|
2586
2810
|
#
|
2811
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2812
|
+
# previous-generation models. It is not supported for next-generation models.
|
2813
|
+
#
|
2587
2814
|
# **See also:** [Deleting a custom acoustic
|
2588
2815
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#deleteModel-acoustic).
|
2589
2816
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2628,14 +2855,14 @@ module IBMWatson
|
|
2628
2855
|
# audio. The method returns an HTTP 200 response code to indicate that the training
|
2629
2856
|
# process has begun.
|
2630
2857
|
#
|
2631
|
-
# You can monitor the status of the training by using the
|
2632
|
-
# model
|
2633
|
-
# minute. The method returns an `AcousticModel` object that
|
2634
|
-
# `progress` fields. A status of `available` indicates that
|
2635
|
-
# trained and ready to use. The service cannot train a model
|
2636
|
-
# another request for the model. The service cannot accept
|
2637
|
-
# requests, or requests to add new audio resources, until the
|
2638
|
-
# request completes.
|
2858
|
+
# You can monitor the status of the training by using the [Get a custom acoustic
|
2859
|
+
# model](#getacousticmodel) method to poll the model's status. Use a loop to check
|
2860
|
+
# the status once a minute. The method returns an `AcousticModel` object that
|
2861
|
+
# includes `status` and `progress` fields. A status of `available` indicates that
|
2862
|
+
# the custom model is trained and ready to use. The service cannot train a model
|
2863
|
+
# while it is handling another request for the model. The service cannot accept
|
2864
|
+
# subsequent training requests, or requests to add new audio resources, until the
|
2865
|
+
# existing training request completes.
|
2639
2866
|
#
|
2640
2867
|
# You can use the optional `custom_language_model_id` parameter to specify the GUID
|
2641
2868
|
# of a separately created custom language model that is to be used during training.
|
@@ -2646,6 +2873,9 @@ module IBMWatson
|
|
2646
2873
|
# same version of the same base model, and the custom language model must be fully
|
2647
2874
|
# trained and available.
|
2648
2875
|
#
|
2876
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2877
|
+
# previous-generation models. It is not supported for next-generation models.
|
2878
|
+
#
|
2649
2879
|
# **See also:**
|
2650
2880
|
# * [Train the custom acoustic
|
2651
2881
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#trainModel-acoustic)
|
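The training description above recommends polling the model's status in a loop, once a minute, until it reports `available`. A minimal sketch of such a loop follows. `wait_for_status`, its keyword arguments, and the injectable `sleeper` are hypothetical names introduced for illustration; the block stands in for whatever call fetches the current status string.

```ruby
# Hypothetical helper (not part of ibm_watson): calls the given block until it
# returns the wanted status or the attempt limit is reached.
# Returns the status on success and nil if the limit is exhausted.
def wait_for_status(wanted: "available", interval: 60, max_attempts: 30,
                    sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do |attempt|
    status = yield
    return status if status == wanted

    # Wait between attempts, but not after the final one.
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  nil
end
```

With the SDK, the block might wrap a call to `get_acoustic_model(customization_id: ...)` and read the `status` field from the response; the exact shape of the response object is an assumption here and should be checked against the SDK. Bounding the number of attempts avoids looping forever if training fails or stalls.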
@@ -2717,6 +2947,9 @@ module IBMWatson
|
|
2717
2947
|
# request completes. You must use credentials for the instance of the service that
|
2718
2948
|
# owns a model to reset it.
|
2719
2949
|
#
|
2950
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2951
|
+
# previous-generation models. It is not supported for next-generation models.
|
2952
|
+
#
|
2720
2953
|
# **See also:** [Resetting a custom acoustic
|
2721
2954
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#resetModel-acoustic).
|
2722
2955
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2755,14 +2988,14 @@ module IBMWatson
|
|
2755
2988
|
#
|
2756
2989
|
# The method returns an HTTP 200 response code to indicate that the upgrade process
|
2757
2990
|
# has begun successfully. You can monitor the status of the upgrade by using the
|
2758
|
-
#
|
2759
|
-
# returns an `AcousticModel` object that includes `status` and
|
2760
|
-
# Use a loop to check the status once a minute. While it is being
|
2761
|
-
# custom model has the status `upgrading`. When the upgrade is
|
2762
|
-
# resumes the status that it had prior to upgrade. The service
|
2763
|
-
# model while it is handling another request for the model. The
|
2764
|
-
# accept subsequent requests for the model until the existing upgrade
|
2765
|
-
# completes.
|
2991
|
+
# [Get a custom acoustic model](#getacousticmodel) method to poll the model's
|
2992
|
+
# status. The method returns an `AcousticModel` object that includes `status` and
|
2993
|
+
# `progress` fields. Use a loop to check the status once a minute. While it is being
|
2994
|
+
# upgraded, the custom model has the status `upgrading`. When the upgrade is
|
2995
|
+
# complete, the model resumes the status that it had prior to upgrade. The service
|
2996
|
+
# cannot upgrade a model while it is handling another request for the model. The
|
2997
|
+
# service cannot accept subsequent requests for the model until the existing upgrade
|
2998
|
+
# request completes.
|
2766
2999
|
#
|
2767
3000
|
# If the custom acoustic model was trained with a separately created custom language
|
2768
3001
|
# model, you must use the `custom_language_model_id` parameter to specify the GUID
|
@@ -2770,8 +3003,11 @@ module IBMWatson
|
|
2770
3003
|
# the custom acoustic model can be upgraded. Omit the parameter if the custom
|
2771
3004
|
# acoustic model was not trained with a custom language model.
|
2772
3005
|
#
|
3006
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3007
|
+
# previous-generation models. It is not supported for next-generation models.
|
3008
|
+
#
|
2773
3009
|
# **See also:** [Upgrading a custom acoustic
|
2774
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
3010
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
|
2775
3011
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
2776
3012
|
# the request. You must make the request with credentials for the instance of the
|
2777
3013
|
# service that owns the custom model.
|
@@ -2785,7 +3021,7 @@ module IBMWatson
|
|
2785
3021
|
# upgrade of a custom acoustic model that is trained with a custom language model,
|
2786
3022
|
# and only if you receive a 400 response code and the message `No input data
|
2787
3023
|
# modified since last training`. See [Upgrading a custom acoustic
|
2788
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
3024
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
|
2789
3025
|
# @return [nil]
|
2790
3026
|
def upgrade_acoustic_model(customization_id:, custom_language_model_id: nil, force: nil)
|
2791
3027
|
raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
|
@@ -2825,6 +3061,9 @@ module IBMWatson
|
|
2825
3061
|
# to a request to add it to the custom acoustic model. You must use credentials for
|
2826
3062
|
# the instance of the service that owns a model to list its audio resources.
|
2827
3063
|
#
|
3064
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3065
|
+
# previous-generation models. It is not supported for next-generation models.
|
3066
|
+
#
|
2828
3067
|
# **See also:** [Listing audio resources for a custom acoustic
|
2829
3068
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
|
2830
3069
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2857,8 +3096,8 @@ module IBMWatson
|
|
2857
3096
|
# the acoustic characteristics of the audio that you plan to transcribe. You must
|
2858
3097
|
# use credentials for the instance of the service that owns a model to add an audio
|
2859
3098
|
# resource to it. Adding audio data does not affect the custom acoustic model until
|
2860
|
-
# you train the model for the new data by using the
|
2861
|
-
# model
|
3099
|
+
# you train the model for the new data by using the [Train a custom acoustic
|
3100
|
+
# model](#trainacousticmodel) method.
|
2862
3101
|
#
|
2863
3102
|
# You can add individual audio files or an archive file that contains multiple audio
|
2864
3103
|
# files. Adding multiple audio files via a single archive file is significantly more
|
@@ -2883,11 +3122,14 @@ module IBMWatson
|
|
2883
3122
|
# upgrade the model until the service's analysis of all audio resources for current
|
2884
3123
|
# requests completes.
|
2885
3124
|
#
|
2886
|
-
# To determine the status of the service's analysis of the audio, use the
|
2887
|
-
# audio resource
|
2888
|
-
# customization ID of the custom model and the name of the audio
|
2889
|
-
# returns the status of the resource. Use a loop to check the
|
2890
|
-
# every few seconds until it becomes `ok`.
|
3125
|
+
# To determine the status of the service's analysis of the audio, use the [Get an
|
3126
|
+
# audio resource](#getaudio) method to poll the status of the audio. The method
|
3127
|
+
# accepts the customization ID of the custom model and the name of the audio
|
3128
|
+
# resource, and it returns the status of the resource. Use a loop to check the
|
3129
|
+
# status of the audio every few seconds until it becomes `ok`.
|
3130
|
+
#
|
3131
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3132
|
+
# previous-generation models. It is not supported for next-generation models.
|
2891
3133
|
#
|
2892
3134
|
# **See also:** [Add audio to the custom acoustic
|
2893
3135
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#addAudio).
|
@@ -2923,8 +3165,8 @@ module IBMWatson
|
|
2923
3165
|
# If the sampling rate of the audio is lower than the minimum required rate, the
|
2924
3166
|
# service labels the audio file as `invalid`.
|
2925
3167
|
#
|
2926
|
-
# **See also:** [
|
2927
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
3168
|
+
# **See also:** [Supported audio
|
3169
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
2928
3170
|
#
|
2929
3171
|
#
|
2930
3172
|
# ### Content types for archive-type resources
|
@@ -2982,7 +3224,7 @@ module IBMWatson
|
|
2982
3224
|
# For an archive-type resource, the media type of the archive file. For more
|
2983
3225
|
# information, see **Content types for archive-type resources** in the method
|
2984
3226
|
# description.
|
2985
|
-
# @param contained_content_type [String]
|
3227
|
+
# @param contained_content_type [String] _For an archive-type resource_, specify the format of the audio files that are
|
2986
3228
|
# contained in the archive file if they are of type `audio/alaw`, `audio/basic`,
|
2987
3229
|
# `audio/l16`, or `audio/mulaw`. Include the `rate`, `channels`, and `endianness`
|
2988
3230
|
# parameters where necessary. In this case, all audio files that are contained in
|
@@ -2996,7 +3238,7 @@ module IBMWatson
|
|
2996
3238
|
# speech recognition. For more information, see **Content types for audio-type
|
2997
3239
|
# resources** in the method description.
|
2998
3240
|
#
|
2999
|
-
#
|
3241
|
+
# _For an audio-type resource_, omit the header.
|
3000
3242
|
# @param allow_overwrite [Boolean] If `true`, the specified audio resource overwrites an existing audio resource with
|
3001
3243
|
# the same name. If `false`, the request fails if an audio resource with the same
|
3002
3244
|
# name already exists. The parameter has no effect if an audio resource with the
|
@@ -3041,9 +3283,9 @@ module IBMWatson
|
|
3041
3283
|
# Gets information about an audio resource from a custom acoustic model. The method
|
3042
3284
|
# returns an `AudioListing` object whose fields depend on the type of audio resource
|
3043
3285
|
# that you specify with the method's `audio_name` parameter:
|
3044
|
-
# *
|
3286
|
+
# * _For an audio-type resource_, the object's fields match those of an
|
3045
3287
|
# `AudioResource` object: `duration`, `name`, `details`, and `status`.
|
3046
|
-
# *
|
3288
|
+
# * _For an archive-type resource_, the object includes a `container` field whose
|
3047
3289
|
# fields match those of an `AudioResource` object. It also includes an `audio`
|
3048
3290
|
# field, which contains an array of `AudioResource` objects that provides
|
3049
3291
|
# information about the audio files that are contained in the archive.
|
@@ -3051,14 +3293,17 @@ module IBMWatson
|
|
3051
3293
|
# The information includes the status of the specified audio resource. The status is
|
3052
3294
|
# important for checking the service's analysis of a resource that you add to the
|
3053
3295
|
# custom model.
|
3054
|
-
# *
|
3055
|
-
# object.
|
3056
|
-
# *
|
3296
|
+
# * _For an audio-type resource_, the `status` field is located in the
|
3297
|
+
# `AudioListing` object.
|
3298
|
+
# * _For an archive-type resource_, the `status` field is located in the
|
3057
3299
|
# `AudioResource` object that is returned in the `container` field.
|
3058
3300
|
#
|
3059
3301
|
# You must use credentials for the instance of the service that owns a model to list
|
3060
3302
|
# its audio resources.
|
3061
3303
|
#
|
3304
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3305
|
+
# previous-generation models. It is not supported for next-generation models.
|
3306
|
+
#
|
3062
3307
|
# **See also:** [Listing audio resources for a custom acoustic
|
3063
3308
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
|
3064
3309
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
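As the description above notes, the `status` field sits in a different place depending on the resource type: at the top level for an audio-type resource, and inside `container` for an archive-type resource. A small sketch of reading it uniformly follows. `audio_listing_status` is a hypothetical helper, and it assumes the `AudioListing` response has already been parsed into a Hash with string keys.

```ruby
# Hypothetical helper (not part of ibm_watson): extracts the status from an
# AudioListing-style Hash regardless of the resource type.
def audio_listing_status(listing)
  container = listing["container"]
  # Archive-type resources report status inside `container`;
  # audio-type resources report it at the top level of the listing.
  container ? container["status"] : listing["status"]
end
```

Keeping this lookup in one place lets polling code treat both resource types the same way.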
@@ -3095,10 +3340,14 @@ module IBMWatson
|
|
3095
3340
|
# not allow deletion of individual files from an archive resource.
|
3096
3341
|
#
|
3097
3342
|
# Removing an audio resource does not affect the custom model until you train the
|
3098
|
-
# model on its updated data by using the
|
3099
|
-
# You can delete an existing audio resource from
|
3100
|
-
# is being added to the model. You must use
|
3101
|
-
# service that owns a model to delete its audio
|
3343
|
+
# model on its updated data by using the [Train a custom acoustic
|
3344
|
+
# model](#trainacousticmodel) method. You can delete an existing audio resource from
|
3345
|
+
# a model while a different resource is being added to the model. You must use
|
3346
|
+
# credentials for the instance of the service that owns a model to delete its audio
|
3347
|
+
# resources.
|
3348
|
+
#
|
3349
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3350
|
+
# previous-generation models. It is not supported for next-generation models.
|
3102
3351
|
#
|
3103
3352
|
# **See also:** [Deleting an audio resource from a custom acoustic
|
3104
3353
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#deleteAudio).
|