ibm_watson 2.0.2 → 2.1.3
- checksums.yaml +4 -4
- data/README.md +6 -29
- data/lib/ibm_watson/assistant_v1.rb +114 -79
- data/lib/ibm_watson/assistant_v2.rb +83 -59
- data/lib/ibm_watson/compare_comply_v1.rb +11 -4
- data/lib/ibm_watson/discovery_v1.rb +5 -12
- data/lib/ibm_watson/discovery_v2.rb +201 -110
- data/lib/ibm_watson/language_translator_v3.rb +1 -2
- data/lib/ibm_watson/natural_language_classifier_v1.rb +14 -6
- data/lib/ibm_watson/natural_language_understanding_v1.rb +690 -3
- data/lib/ibm_watson/personality_insights_v3.rb +13 -11
- data/lib/ibm_watson/speech_to_text_v1.rb +582 -333
- data/lib/ibm_watson/text_to_speech_v1.rb +617 -35
- data/lib/ibm_watson/tone_analyzer_v3.rb +1 -2
- data/lib/ibm_watson/version.rb +1 -1
- data/lib/ibm_watson/visual_recognition_v3.rb +1 -2
- data/lib/ibm_watson/visual_recognition_v4.rb +11 -8
- data/test/integration/test_discovery_v2.rb +15 -0
- data/test/integration/test_natural_language_understanding_v1.rb +134 -1
- data/test/integration/test_text_to_speech_v1.rb +57 -0
- data/test/unit/test_discovery_v2.rb +29 -0
- data/test/unit/test_natural_language_understanding_v1.rb +231 -0
- data/test/unit/test_text_to_speech_v1.rb +145 -0
- metadata +3 -3
@@ -14,17 +14,20 @@
|
|
14
14
|
# See the License for the specific language governing permissions and
|
15
15
|
# limitations under the License.
|
16
16
|
#
|
17
|
-
# IBM OpenAPI SDK Code Generator Version: 3.
|
17
|
+
# IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
|
18
18
|
#
|
19
|
-
# IBM
|
20
|
-
#
|
21
|
-
#
|
22
|
-
#
|
23
|
-
# Watson™ Natural Language
|
24
|
-
#
|
25
|
-
#
|
26
|
-
#
|
27
|
-
#
|
19
|
+
# IBM Watson™ Personality Insights is discontinued. Existing instances are
|
20
|
+
# supported until 1 December 2021, but as of 1 December 2020, you cannot create new
|
21
|
+
# instances. Any instance that exists on 1 December 2021 will be deleted.<br/><br/>No
|
22
|
+
# direct replacement exists for Personality Insights. However, you can consider using [IBM
|
23
|
+
# Watson™ Natural Language
|
24
|
+
# Understanding](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-about)
|
25
|
+
# on IBM Cloud® as part of a replacement analytic workflow for your Personality
|
26
|
+
# Insights use cases. You can use Natural Language Understanding to extract data and
|
27
|
+
# insights from text, such as keywords, categories, sentiment, emotion, and syntax. For
|
28
|
+
# more information about the personality models in Personality Insights, see [The science
|
29
|
+
# behind the
|
30
|
+
# service](https://cloud.ibm.com/docs/personality-insights?topic=personality-insights-science).
|
28
31
|
# {: deprecated}
|
29
32
|
#
|
30
33
|
# The IBM Watson Personality Insights service enables applications to derive insights from
|
@@ -54,7 +57,6 @@ require "json"
|
|
54
57
|
require "ibm_cloud_sdk_core"
|
55
58
|
require_relative "./common.rb"
|
56
59
|
|
57
|
-
# Module for the Watson APIs
|
58
60
|
module IBMWatson
|
59
61
|
##
|
60
62
|
# The Personality Insights V3 service.
|
@@ -14,14 +14,20 @@
|
|
14
14
|
# See the License for the specific language governing permissions and
|
15
15
|
# limitations under the License.
|
16
16
|
#
|
17
|
-
# IBM OpenAPI SDK Code Generator Version: 3.
|
17
|
+
# IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
|
18
18
|
#
|
19
19
|
# The IBM Watson™ Speech to Text service provides APIs that use IBM's
|
20
20
|
# speech-recognition capabilities to produce transcripts of spoken audio. The service can
|
21
21
|
# transcribe speech from various languages and audio formats. In addition to basic
|
22
22
|
# transcription, the service can produce detailed information about many different aspects
|
23
|
-
# of the audio.
|
24
|
-
#
|
23
|
+
# of the audio. It returns all JSON response content in the UTF-8 character set.
|
24
|
+
#
|
25
|
+
# The service supports two types of models: previous-generation models that include the
|
26
|
+
# terms `Broadband` and `Narrowband` in their names, and next-generation models that
|
27
|
+
# include the terms `Multimedia` and `Telephony` in their names. Broadband and multimedia
|
28
|
+
# models have minimum sampling rates of 16 kHz. Narrowband and telephony models have
|
29
|
+
# minimum sampling rates of 8 kHz. The next-generation models offer high throughput and
|
30
|
+
# greater transcription accuracy.
|
25
31
|
#
|
26
32
|
# For speech recognition, the service supports synchronous and asynchronous HTTP
|
27
33
|
# Representational State Transfer (REST) interfaces. It also supports a WebSocket
|
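The model naming rules in the header comment above (Broadband and Multimedia at 16 kHz, Narrowband and Telephony at 8 kHz) can be sketched as a small lookup. This is only an illustration: `MODEL_FAMILIES` and `model_family` are hypothetical names and are not part of the `ibm_watson` gem, and `en-US_Telephony` is used purely as an example model name.

```ruby
# Hypothetical helper, not part of the ibm_watson gem.
# Maps the family term found in a model name to its generation and
# minimum sampling rate, as described in the Speech to Text header comment.
MODEL_FAMILIES = {
  "Broadband"  => { generation: :previous, min_sampling_rate_hz: 16_000 },
  "Narrowband" => { generation: :previous, min_sampling_rate_hz: 8_000 },
  "Multimedia" => { generation: :next, min_sampling_rate_hz: 16_000 },
  "Telephony"  => { generation: :next, min_sampling_rate_hz: 8_000 }
}.freeze

# Returns a hash describing the model family, or nil when the name
# contains none of the four known terms.
def model_family(model_name)
  term, info = MODEL_FAMILIES.find { |name, _| model_name.include?(name) }
  return nil if term.nil?

  info.merge(family: term)
end

puts model_family("ar-MS_BroadbandModel").inspect
puts model_family("en-US_Telephony").inspect
```

A lookup like this could be used to reject audio whose sampling rate is below the model's minimum before sending a request, since the service fails such requests.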
@@ -36,9 +42,10 @@
|
|
36
42
|
# is a formal language specification that lets you restrict the phrases that the service
|
37
43
|
# can recognize.
|
38
44
|
#
|
39
|
-
# Language model customization
|
40
|
-
#
|
41
|
-
# beta functionality for all
|
45
|
+
# Language model customization is available for most previous- and next-generation models.
|
46
|
+
# Acoustic model customization is available for all previous-generation models. Grammars
|
47
|
+
# are beta functionality that is available for all previous-generation models that support
|
48
|
+
# language model customization.
|
42
49
|
|
43
50
|
require "concurrent"
|
44
51
|
require "erb"
|
@@ -46,7 +53,6 @@ require "json"
|
|
46
53
|
require "ibm_cloud_sdk_core"
|
47
54
|
require_relative "./common.rb"
|
48
55
|
|
49
|
-
# Module for the Watson APIs
|
50
56
|
module IBMWatson
|
51
57
|
##
|
52
58
|
# The Speech to Text V1 service.
|
@@ -89,8 +95,8 @@ module IBMWatson
|
|
89
95
|
# among other things. The ordering of the list of models can change from call to
|
90
96
|
# call; do not rely on an alphabetized or static list of models.
|
91
97
|
#
|
92
|
-
# **See also:** [
|
93
|
-
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
|
98
|
+
# **See also:** [Listing
|
99
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
|
94
100
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
95
101
|
def list_models
|
96
102
|
headers = {
|
@@ -116,10 +122,11 @@ module IBMWatson
|
|
116
122
|
# with the service. The information includes the name of the model and its minimum
|
117
123
|
# sampling rate in Hertz, among other things.
|
118
124
|
#
|
119
|
-
# **See also:** [
|
120
|
-
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
|
121
|
-
# @param model_id [String] The identifier of the model in the form of its name from the output of the
|
122
|
-
#
|
125
|
+
# **See also:** [Listing
|
126
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
|
127
|
+
# @param model_id [String] The identifier of the model in the form of its name from the output of the [List
|
128
|
+
# models](#listmodels) method. (**Note:** The model `ar-AR_BroadbandModel` is
|
129
|
+
# deprecated; use `ar-MS_BroadbandModel` instead.).
|
123
130
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
124
131
|
def get_model(model_id:)
|
125
132
|
raise ArgumentError.new("model_id must be provided") if model_id.nil?
|
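The deprecation note added to `get_model` (and to the `model` parameter of `recognize`) suggests a client-side substitution before calling the service. A minimal sketch, assuming a caller wants to rewrite the deprecated identifier automatically; `DEPRECATED_MODELS` and `resolve_model_id` are hypothetical names, not SDK API:

```ruby
# Hypothetical helper, not part of the ibm_watson gem.
# The single mapping below comes from the documentation note in this diff.
DEPRECATED_MODELS = {
  "ar-AR_BroadbandModel" => "ar-MS_BroadbandModel"
}.freeze

# Returns the replacement for a deprecated model ID, or the ID unchanged.
def resolve_model_id(model_id)
  replacement = DEPRECATED_MODELS[model_id]
  return model_id if replacement.nil?

  # Warn on stderr so the substitution is visible to the caller.
  warn "Model #{model_id} is deprecated; using #{replacement} instead."
  replacement
end

puts resolve_model_id("ar-AR_BroadbandModel")
```

The resolved value would then be passed as `model_id:` to `get_model` or as `model:` to `recognize`.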
@@ -144,7 +151,7 @@ module IBMWatson
|
|
144
151
|
#########################
|
145
152
|
|
146
153
|
##
|
147
|
-
# @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
154
|
+
# @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
148
155
|
# Recognize audio.
|
149
156
|
# Sends audio and returns transcription results for a recognition request. You can
|
150
157
|
# pass a maximum of 100 MB and a minimum of 100 bytes of audio with a request. The
|
@@ -211,8 +218,31 @@ module IBMWatson
|
|
211
218
|
# sampling rate of the audio is lower than the minimum required rate, the request
|
212
219
|
# fails.
|
213
220
|
#
|
214
|
-
# **See also:** [
|
215
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
221
|
+
# **See also:** [Supported audio
|
222
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
223
|
+
#
|
224
|
+
#
|
225
|
+
# ### Next-generation models
|
226
|
+
#
|
227
|
+
# The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8
|
228
|
+
# kHz) models for many languages. Next-generation models have higher throughput than
|
229
|
+
# the service's previous generation of `Broadband` and `Narrowband` models. When you
|
230
|
+
# use next-generation models, the service can return transcriptions more quickly and
|
231
|
+
# also provide noticeably better transcription accuracy.
|
232
|
+
#
|
233
|
+
# You specify a next-generation model by using the `model` query parameter, as you
|
234
|
+
# do a previous-generation model. Many next-generation models also support the
|
235
|
+
# `low_latency` parameter, which is not available with previous-generation models.
|
236
|
+
#
|
237
|
+
# But next-generation models do not support all of the parameters that are available
|
238
|
+
# for use with previous-generation models. For more information about all parameters
|
239
|
+
# that are supported for use with next-generation models, see [Supported features
|
240
|
+
# for next-generation
|
241
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
|
242
|
+
#
|
243
|
+
#
|
244
|
+
# **See also:** [Next-generation languages and
|
245
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
216
246
|
#
|
217
247
|
#
|
218
248
|
# ### Multipart speech recognition
|
@@ -235,15 +265,19 @@ module IBMWatson
|
|
235
265
|
# @param audio [File] The audio to transcribe.
|
236
266
|
# @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
|
237
267
|
# audio format, see **Audio formats (content types)** in the method description.
|
238
|
-
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
239
|
-
#
|
240
|
-
#
|
268
|
+
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
269
|
+
# (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
270
|
+
# `ar-MS_BroadbandModel` instead.) See [Previous-generation languages and
|
271
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
|
272
|
+
# [Next-generation languages and
|
273
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
241
274
|
# @param language_customization_id [String] The customization ID (GUID) of a custom language model that is to be used with the
|
242
275
|
# recognition request. The base model of the specified custom language model must
|
243
276
|
# match the model specified with the `model` parameter. You must make the request
|
244
277
|
# with credentials for the instance of the service that owns the custom model. By
|
245
|
-
# default, no custom language model is used. See [
|
246
|
-
#
|
278
|
+
# default, no custom language model is used. See [Using a custom language model for
|
279
|
+
# speech
|
280
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
|
247
281
|
#
|
248
282
|
#
|
249
283
|
# **Note:** Use this parameter instead of the deprecated `customization_id`
|
@@ -252,14 +286,16 @@ module IBMWatson
|
|
252
286
|
# recognition request. The base model of the specified custom acoustic model must
|
253
287
|
# match the model specified with the `model` parameter. You must make the request
|
254
288
|
# with credentials for the instance of the service that owns the custom model. By
|
255
|
-
# default, no custom acoustic model is used. See [
|
256
|
-
#
|
289
|
+
# default, no custom acoustic model is used. See [Using a custom acoustic model for
|
290
|
+
# speech
|
291
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
|
257
292
|
# @param base_model_version [String] The version of the specified base model that is to be used with the recognition
|
258
293
|
# request. Multiple versions of a base model can exist when a model is updated for
|
259
294
|
# internal improvements. The parameter is intended primarily for use with custom
|
260
295
|
# models that have been upgraded for a new base model. The default value depends on
|
261
|
-
# whether the parameter is used with or without a custom model. See [
|
262
|
-
#
|
296
|
+
# whether the parameter is used with or without a custom model. See [Making speech
|
297
|
+
# recognition requests with upgraded custom
|
298
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
|
263
299
|
# @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
|
264
300
|
# recognition request, the customization weight tells the service how much weight to
|
265
301
|
# give to words from the custom language model compared to those from the base model
|
@@ -276,8 +312,8 @@ module IBMWatson
|
|
276
312
|
# custom model's domain, but it can negatively affect performance on non-domain
|
277
313
|
# phrases.
|
278
314
|
#
|
279
|
-
# See [
|
280
|
-
#
|
315
|
+
# See [Using customization
|
316
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
281
317
|
# @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
|
282
318
|
# streaming audio, the connection is closed with a 400 error. The parameter is
|
283
319
|
# useful for stopping audio submission from a live microphone when a user simply
|
@@ -294,56 +330,61 @@ module IBMWatson
|
|
294
330
|
# for double-byte languages might be shorter. Keywords are case-insensitive.
|
295
331
|
#
|
296
332
|
# See [Keyword
|
297
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
333
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
298
334
|
# @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
|
299
335
|
# considered to match a keyword if its confidence is greater than or equal to the
|
300
336
|
# threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
|
301
337
|
# you must also specify one or more keywords. The service performs no keyword
|
302
338
|
# spotting if you omit either parameter. See [Keyword
|
303
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
339
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
304
340
|
# @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
|
305
341
|
# default, the service returns a single transcript. If you specify a value of `0`,
|
306
342
|
# the service uses the default value, `1`. See [Maximum
|
307
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
343
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
|
308
344
|
# @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
|
309
345
|
# possible word alternative (also known as "Confusion Networks"). An alternative
|
310
346
|
# word is considered if its confidence is greater than or equal to the threshold.
|
311
347
|
# Specify a probability between 0.0 and 1.0. By default, the service computes no
|
312
348
|
# alternative words. See [Word
|
313
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
349
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
|
314
350
|
# @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
|
315
351
|
# each word. By default, the service returns no word confidence scores. See [Word
|
316
|
-
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
352
|
+
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
|
317
353
|
# @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
|
318
354
|
# timestamps are returned. See [Word
|
319
|
-
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
355
|
+
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
|
320
356
|
# @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
|
321
357
|
# results by replacing inappropriate words with a series of asterisks. Set the
|
322
358
|
# parameter to `false` to return results with no censoring. Applies to US English
|
323
|
-
# transcription only. See [Profanity
|
324
|
-
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
359
|
+
# and Japanese transcription only. See [Profanity
|
360
|
+
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
|
325
361
|
# @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
|
326
362
|
# numbers, currency values, and internet addresses into more readable, conventional
|
327
363
|
# representations in the final transcript of a recognition request. For US English,
|
328
364
|
# the service also converts certain keyword strings to punctuation symbols. By
|
329
365
|
# default, the service performs no smart formatting.
|
330
366
|
#
|
331
|
-
# **
|
367
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
368
|
+
# and Spanish transcription only.
|
332
369
|
#
|
333
370
|
# See [Smart
|
334
|
-
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
371
|
+
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
|
335
372
|
# @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
|
336
373
|
# which participants in a multi-person exchange. By default, the service returns no
|
337
374
|
# speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
|
338
375
|
# parameter to be `true`, regardless of whether you specify `false` for the
|
339
376
|
# parameter.
|
340
377
|
#
|
341
|
-
# **
|
342
|
-
#
|
343
|
-
#
|
378
|
+
# **Beta:** The parameter is beta functionality.
|
379
|
+
# * For previous-generation models, the parameter can be used for Australian
|
380
|
+
# English, US English, German, Japanese, Korean, and Spanish (both broadband and
|
381
|
+
# narrowband models) and UK English (narrowband model) transcription only.
|
382
|
+
# * For next-generation models, the parameter can be used for English (Australian,
|
383
|
+
# Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
|
344
384
|
#
|
345
|
-
#
|
346
|
-
#
|
385
|
+
# Restrictions and limitations apply to the use of speaker labels for both types of
|
386
|
+
# models. See [Speaker
|
387
|
+
# labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
|
347
388
|
# @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
|
348
389
|
# customization ID (GUID) of a custom language model that is to be used with the
|
349
390
|
# recognition request. Do not specify both parameters with a request.
|
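The `keywords` and `keywords_threshold` documentation in this hunk states three rules: the threshold is a probability between 0.0 and 1.0, a threshold requires at least one keyword, and omitting either parameter disables keyword spotting. A sketch of a pre-flight check under those rules follows; `keyword_spotting_params` is a hypothetical helper, and the shape of the returned hash is an assumption for illustration only.

```ruby
# Hypothetical helper, not part of the ibm_watson gem.
# Validates keyword-spotting arguments against the rules in the parameter docs.
def keyword_spotting_params(keywords: nil, keywords_threshold: nil)
  keywords = Array(keywords)

  # The service performs no keyword spotting if either parameter is omitted,
  # so there is nothing useful to send without a threshold.
  return {} if keywords_threshold.nil?

  unless (0.0..1.0).cover?(keywords_threshold)
    raise ArgumentError, "keywords_threshold must be between 0.0 and 1.0"
  end
  if keywords.empty?
    raise ArgumentError, "keywords must be provided with keywords_threshold"
  end

  { keywords: keywords, keywords_threshold: keywords_threshold }
end

puts keyword_spotting_params(keywords: ["colorado"], keywords_threshold: 0.5).inspect
```

Failing early on the client avoids a round trip for a request the documentation says is invalid.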
@@ -351,8 +392,12 @@ module IBMWatson
|
|
351
392
|
# specify a grammar, you must also use the `language_customization_id` parameter to
|
352
393
|
# specify the name of the custom language model for which the grammar is defined.
|
353
394
|
# The service recognizes only strings that are recognized by the specified grammar;
|
354
|
-
# it does not recognize other custom words from the model's words resource.
|
355
|
-
#
|
395
|
+
# it does not recognize other custom words from the model's words resource.
|
396
|
+
#
|
397
|
+
# **Beta:** The parameter is beta functionality.
|
398
|
+
#
|
399
|
+
# See [Using a grammar for speech
|
400
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
|
356
401
|
# @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
|
357
402
|
# feature redacts any number that has three or more consecutive digits by replacing
|
358
403
|
# each digit with an `X` character. It is intended to redact sensitive numeric data,
|
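The `redaction` description above (any number with three or more consecutive digits has each digit replaced by `X`) can be illustrated locally. This is a sketch of the documented rule only, not the service's implementation; `redact_numeric` is a hypothetical name.

```ruby
# Hypothetical illustration of the documented redaction rule.
# Runs of three or more consecutive digits are masked digit by digit;
# shorter runs are left untouched.
def redact_numeric(transcript)
  transcript.gsub(/\d{3,}/) { |digits| "X" * digits.length }
end

puts redact_numeric("call 5551234 ext 42")
```

The service applies redaction to final transcripts itself when `redaction: true` is passed; a local version like this is mainly useful for tests that need to predict the shape of the output.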
@@ -364,16 +409,17 @@ module IBMWatson
|
|
364
409
|
# `keywords_threshold` parameters) and returns only a single final transcript
|
365
410
|
# (forces the `max_alternatives` parameter to be `1`).
|
366
411
|
#
|
367
|
-
# **
|
412
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
413
|
+
# and Korean transcription only.
|
368
414
|
#
|
369
415
|
# See [Numeric
|
370
|
-
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
416
|
+
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
|
371
417
|
# @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
|
372
418
|
# input audio. The service returns audio metrics with the final transcription
|
373
419
|
# results. By default, the service returns no audio metrics.
|
374
420
|
#
|
375
421
|
# See [Audio
|
376
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
422
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
|
377
423
|
# @param end_of_phrase_silence_time [Float] If `true`, specifies the duration of the pause interval at which the service
|
378
424
|
# splits a transcript into multiple final results. If the service detects pauses or
|
379
425
|
# extended silence before it reaches the end of the audio stream, its response can
|
@@ -390,7 +436,7 @@ module IBMWatson
|
|
390
436
|
# Chinese is 0.6 seconds.
|
391
437
|
#
|
392
438
|
# See [End of phrase silence
|
393
|
-
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
439
|
+
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
|
394
440
|
# @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
|
395
441
|
# based on semantic features of the input, for example, at the conclusion of
|
396
442
|
# meaningful phrases such as sentences. The service bases its understanding of
|
@@ -400,7 +446,7 @@ module IBMWatson
|
|
400
446
|
# interval.
|
401
447
|
#
|
402
448
|
# See [Split transcript at phrase
|
403
|
-
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
449
|
+
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
|
404
450
|
# @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
|
405
451
|
# the parameter to suppress word insertions from music, coughing, and other
|
406
452
|
# non-speech events. The service biases the audio it passes for speech recognition
|
@@ -412,8 +458,8 @@ module IBMWatson
|
|
412
458
|
# * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
|
413
459
|
# * 1.0 suppresses no audio (speech detection sensitivity is disabled).
|
414
460
|
#
|
415
|
-
# The values increase on a monotonic curve. See [Speech
|
416
|
-
#
|
461
|
+
# The values increase on a monotonic curve. See [Speech detector
|
462
|
+
# sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
|
417
463
|
# @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
|
418
464
|
# to prevent it from being transcribed as speech. Use the parameter to suppress side
|
419
465
|
# conversations or background noise.
|
@@ -424,10 +470,24 @@ module IBMWatson
|
|
424
470
|
# * 0.5 provides a reasonable level of audio suppression for general usage.
|
425
471
|
# * 1.0 suppresses all audio (no audio is transcribed).
|
426
472
|
#
|
427
|
-
# The values increase on a monotonic curve. See [
|
428
|
-
#
|
473
|
+
# The values increase on a monotonic curve. See [Background audio
|
474
|
+
# suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
|
475
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
476
|
+
# latency, directs the service to produce results even more quickly than it usually
|
477
|
+
# does. Next-generation models produce transcription results faster than
|
478
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
479
|
+
# produce results even more quickly, though the results might be less accurate when
|
480
|
+
# the parameter is used.
|
481
|
+
#
|
482
|
+
# The parameter is not available for previous-generation `Broadband` and
|
483
|
+
# `Narrowband` models. It is available only for some next-generation models. For a
|
484
|
+
# list of next-generation models that support low latency, see [Supported
|
485
|
+
# next-generation language
|
486
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
|
487
|
+
# * For more information about the `low_latency` parameter, see [Low
|
488
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
429
489
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
430
|
-
def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
490
|
+
def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
431
491
|
raise ArgumentError.new("audio must be provided") if audio.nil?
|
432
492
|
|
433
493
|
headers = {
|
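Because `low_latency` is documented as unavailable for previous-generation `Broadband` and `Narrowband` models, a caller may want to drop it before building a request. The sketch below assumes a name-based check; `NEXT_GENERATION_TERMS` and `recognize_options` are hypothetical, and the check is necessary but not sufficient, since only some next-generation models support low latency.

```ruby
# Hypothetical helper, not part of the ibm_watson gem.
NEXT_GENERATION_TERMS = %w[Multimedia Telephony].freeze

# Builds keyword arguments for a recognition request, keeping low_latency
# only when the model name looks like a next-generation model.
def recognize_options(model:, low_latency: nil, **rest)
  options = rest.merge(model: model)
  next_generation = NEXT_GENERATION_TERMS.any? { |term| model.include?(term) }
  options[:low_latency] = low_latency if next_generation && !low_latency.nil?
  options
end

puts recognize_options(model: "en-US_Telephony", low_latency: true).inspect
puts recognize_options(model: "en-US_BroadbandModel", low_latency: true).inspect
```

The resulting hash could be splatted into a call such as `recognize(audio: file, **options)`; the model names here are examples only.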
@@ -460,7 +520,8 @@ module IBMWatson
|
|
460
520
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
461
521
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
462
522
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
463
|
-
"background_audio_suppression" => background_audio_suppression
|
523
|
+
"background_audio_suppression" => background_audio_suppression,
|
524
|
+
"low_latency" => low_latency
|
464
525
|
}
|
465
526
|
|
466
527
|
data = audio
|
@@ -479,7 +540,7 @@ module IBMWatson
|
|
479
540
|
end
|
480
541
|
|
481
542
|
##
|
482
|
-
# @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
543
|
+
# @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
483
544
|
# Sends audio for speech recognition using web sockets.
|
484
545
|
# @param content_type [String] The type of the input: audio/basic, audio/flac, audio/l16, audio/mp3, audio/mpeg, audio/mulaw, audio/ogg, audio/ogg;codecs=opus, audio/ogg;codecs=vorbis, audio/wav, audio/webm, audio/webm;codecs=opus, audio/webm;codecs=vorbis, or multipart/form-data.
|
485
546
|
# @param recognize_callback [RecognizeCallback] The instance handling events returned from the service.
|
@@ -596,6 +657,23 @@ module IBMWatson
|
|
596
657
|
#
|
597
658
|
# The values increase on a monotonic curve. See [Speech Activity
|
598
659
|
# Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
|
660
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
661
|
+
# latency, directs the service to produce results even more quickly than it usually
|
662
|
+
# does. Next-generation models produce transcription results faster than
|
663
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
664
|
+
# produce results even more quickly, though the results might be less accurate when
|
665
|
+
# the parameter is used.
|
666
|
+
#
|
667
|
+
# **Note:** The parameter is beta functionality. It is not available for
|
668
|
+
# previous-generation `Broadband` and `Narrowband` models. It is available only for
|
669
|
+
# some next-generation models.
|
670
|
+
#
|
671
|
+
# * For a list of next-generation models that support low latency, see [Supported
|
672
|
+
# language
|
673
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported)
|
674
|
+
# for next-generation models.
|
675
|
+
# * For more information about the `low_latency` parameter, see [Low
|
676
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
599
677
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
600
678
|
def recognize_using_websocket(
|
601
679
|
content_type: nil,
|
@@ -627,7 +705,8 @@ module IBMWatson
|
|
627
705
|
end_of_phrase_silence_time: nil,
|
628
706
|
split_transcript_at_phrase_end: nil,
|
629
707
|
speech_detector_sensitivity: nil,
|
630
|
-
background_audio_suppression: nil
|
708
|
+
background_audio_suppression: nil,
|
709
|
+
low_latency: nil
|
631
710
|
)
|
632
711
|
raise ArgumentError.new("Audio must be provided") if audio.nil? && !chunk_data
|
633
712
|
raise ArgumentError.new("Recognize callback must be provided") if recognize_callback.nil?
|
@@ -669,7 +748,8 @@ module IBMWatson
|
|
669
748
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
670
749
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
671
750
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
672
|
-
"background_audio_suppression" => background_audio_suppression
|
751
|
+
"background_audio_suppression" => background_audio_suppression,
|
752
|
+
"low_latency" => low_latency
|
673
753
|
}
|
674
754
|
options.delete_if { |_, v| v.nil? }
|
675
755
|
WebSocketClient.new(audio: audio, chunk_data: chunk_data, options: options, recognize_callback: recognize_callback, service_url: service_url, headers: headers, disable_ssl_verification: @disable_ssl_verification)
|
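The hunk above adds `low_latency` to the options hash and then relies on the existing `options.delete_if { |_, v| v.nil? }` call to drop unset values. A minimal sketch of that behaviour, assuming a hypothetical helper `build_recognize_options` that is not part of the gem and covers only two of the keys:

```ruby
# Hypothetical helper (not part of ibm_watson); it mirrors only the
# nil-filtering step shown in the hunk above, for two of the option keys.
def build_recognize_options(background_audio_suppression: nil, low_latency: nil)
  options = {
    "background_audio_suppression" => background_audio_suppression,
    "low_latency" => low_latency
  }
  # Same idiom as the SDK code: remove every key whose value is nil.
  options.delete_if { |_, v| v.nil? }
end

p build_recognize_options                      # no keys survive
p build_recognize_options(low_latency: true)   # only "low_latency" is kept
p build_recognize_options(low_latency: false)  # false is not nil, so it is kept
```

Because only `nil` is filtered, an explicit `low_latency: false` would still be passed along, while omitting the argument leaves the key out entirely.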
@@ -697,9 +777,9 @@ module IBMWatson
|
|
697
777
|
# The service sends only a single `GET` request to the callback URL. If the service
|
698
778
|
# does not receive a reply with a response code of 200 and a body that echoes the
|
699
779
|
# challenge string sent by the service within five seconds, it does not allowlist
|
700
|
-
# the URL; it instead sends status code 400 in response to the
|
701
|
-
# callback
|
702
|
-
#
|
780
|
+
# the URL; it instead sends status code 400 in response to the request to register a
|
781
|
+
# callback. If the requested callback URL is already allowlisted, the service
|
782
|
+
# responds to the initial registration request with response code 200.
|
703
783
|
#
|
704
784
|
# If you specify a user secret with the request, the service uses it as a key to
|
705
785
|
# calculate an HMAC-SHA1 signature of the challenge string in its response to the
|
@@ -754,9 +834,10 @@ module IBMWatson
|
|
754
834
|
##
|
755
835
|
# @!method unregister_callback(callback_url:)
|
756
836
|
# Unregister a callback.
|
757
|
-
# Unregisters a callback URL that was previously allowlisted with a
|
758
|
-
# callback
|
759
|
-
# URL can no longer be used with asynchronous recognition
|
837
|
+
# Unregisters a callback URL that was previously allowlisted with a [Register a
|
838
|
+
# callback](#registercallback) request for use with the asynchronous interface. Once
|
839
|
+
# unregistered, the URL can no longer be used with asynchronous recognition
|
840
|
+
# requests.
|
760
841
|
#
|
761
842
|
# **See also:** [Unregistering a callback
|
762
843
|
# URL](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#unregister).
|
@@ -787,7 +868,7 @@ module IBMWatson
|
|
787
868
|
end
|
788
869
|
|
789
870
|
##
|
790
|
-
# @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
871
|
+
# @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
791
872
|
# Create a job.
|
792
873
|
# Creates a job for a new asynchronous recognition request. The job is owned by the
|
793
874
|
# instance of the service whose credentials are used to create it. How you learn the
|
@@ -799,17 +880,17 @@ module IBMWatson
|
|
799
880
|
# to subscribe to specific events and to specify a string that is to be included
|
800
881
|
# with each notification for the job.
|
801
882
|
# * By polling the service: Omit the `callback_url`, `events`, and `user_token`
|
802
|
-
# parameters. You must then use the
|
803
|
-
# check the status of the job, using the latter to
|
804
|
-
# is complete.
|
883
|
+
# parameters. You must then use the [Check jobs](#checkjobs) or [Check a
|
884
|
+
# job](#checkjob) methods to check the status of the job, using the latter to
|
885
|
+
# retrieve the results when the job is complete.
|
805
886
|
#
|
806
887
|
# The two approaches are not mutually exclusive. You can poll the service for job
|
807
888
|
# status or obtain results from the service manually even if you include a callback
|
808
889
|
# URL. In both cases, you can include the `results_ttl` parameter to specify how
|
809
890
|
# long the results are to remain available after the job is complete. Using the
|
810
|
-
# HTTPS
|
811
|
-
# them via callback notification over HTTP because it provides
|
812
|
-
# addition to authentication and data integrity.
|
891
|
+
# HTTPS [Check a job](#checkjob) method to retrieve results is more secure than
|
892
|
+
# receiving them via callback notification over HTTP because it provides
|
893
|
+
# confidentiality in addition to authentication and data integrity.
|
813
894
|
#
|
814
895
|
# The method supports the same basic parameters as other HTTP and WebSocket
|
815
896
|
# recognition requests. It also supports the following parameters specific to the
|
@@ -883,18 +964,44 @@ module IBMWatson
|
|
883
964
|
# sampling rate of the audio is lower than the minimum required rate, the request
|
884
965
|
# fails.
|
885
966
|
#
|
886
|
-
# **See also:** [
|
887
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
967
|
+
# **See also:** [Supported audio
|
968
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
969
|
+
#
|
970
|
+
#
|
971
|
+
# ### Next-generation models
|
972
|
+
#
|
973
|
+
# The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8
|
974
|
+
# kHz) models for many languages. Next-generation models have higher throughput than
|
975
|
+
# the service's previous generation of `Broadband` and `Narrowband` models. When you
|
976
|
+
# use next-generation models, the service can return transcriptions more quickly and
|
977
|
+
# also provide noticeably better transcription accuracy.
|
978
|
+
#
|
979
|
+
# You specify a next-generation model by using the `model` query parameter, as you
|
980
|
+
# do a previous-generation model. Many next-generation models also support the
|
981
|
+
# `low_latency` parameter, which is not available with previous-generation models.
|
982
|
+
#
|
983
|
+
# But next-generation models do not support all of the parameters that are available
|
984
|
+
# for use with previous-generation models. For more information about all parameters
|
985
|
+
# that are supported for use with next-generation models, see [Supported features
|
986
|
+
# for next-generation
|
987
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-features).
|
988
|
+
#
|
989
|
+
#
|
990
|
+
# **See also:** [Next-generation languages and
|
991
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
888
992
|
# @param audio [File] The audio to transcribe.
|
889
993
|
# @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
|
890
994
|
# audio format, see **Audio formats (content types)** in the method description.
|
891
|
-
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
892
|
-
#
|
893
|
-
#
|
995
|
+
# @param model [String] The identifier of the model that is to be used for the recognition request.
|
996
|
+
# (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
997
|
+
# `ar-MS_BroadbandModel` instead.) See [Previous-generation languages and
|
998
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
|
999
|
+
# [Next-generation languages and
|
1000
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
|
894
1001
|
# @param callback_url [String] A URL to which callback notifications are to be sent. The URL must already be
|
895
|
-
# successfully allowlisted by using the
|
896
|
-
# include the same callback URL with any number of job creation
|
897
|
-
# parameter to poll the service for job completion and results.
|
1002
|
+
# successfully allowlisted by using the [Register a callback](#registercallback)
|
1003
|
+
# method. You can include the same callback URL with any number of job creation
|
1004
|
+
# requests. Omit the parameter to poll the service for job completion and results.
|
898
1005
|
#
|
899
1006
|
# Use the `user_token` parameter to specify a unique user-specified string with each
|
900
1007
|
# job to differentiate the callback notifications for the jobs.
|
@@ -903,8 +1010,8 @@ module IBMWatson
|
|
903
1010
|
# * `recognitions.started` generates a callback notification when the service begins
|
904
1011
|
# to process the job.
|
905
1012
|
# * `recognitions.completed` generates a callback notification when the job is
|
906
|
-
# complete. You must use the
|
907
|
-
# they time out or are deleted.
|
1013
|
+
# complete. You must use the [Check a job](#checkjob) method to retrieve the results
|
1014
|
+
# before they time out or are deleted.
|
908
1015
|
# * `recognitions.completed_with_results` generates a callback notification when the
|
909
1016
|
# job is complete. The notification includes the results of the request.
|
910
1017
|
# * `recognitions.failed` generates a callback notification if the service
|
@@ -929,8 +1036,9 @@ module IBMWatson
|
|
929
1036
|
# recognition request. The base model of the specified custom language model must
|
930
1037
|
# match the model specified with the `model` parameter. You must make the request
|
931
1038
|
# with credentials for the instance of the service that owns the custom model. By
|
932
|
-
# default, no custom language model is used. See [
|
933
|
-
#
|
1039
|
+
# default, no custom language model is used. See [Using a custom language model for
|
1040
|
+
# speech
|
1041
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
|
934
1042
|
#
|
935
1043
|
#
|
936
1044
|
# **Note:** Use this parameter instead of the deprecated `customization_id`
|
@@ -939,14 +1047,16 @@ module IBMWatson
|
|
939
1047
|
# recognition request. The base model of the specified custom acoustic model must
|
940
1048
|
# match the model specified with the `model` parameter. You must make the request
|
941
1049
|
# with credentials for the instance of the service that owns the custom model. By
|
942
|
-
# default, no custom acoustic model is used. See [
|
943
|
-
#
|
1050
|
+
# default, no custom acoustic model is used. See [Using a custom acoustic model for
|
1051
|
+
# speech
|
1052
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
|
944
1053
|
# @param base_model_version [String] The version of the specified base model that is to be used with the recognition
|
945
1054
|
# request. Multiple versions of a base model can exist when a model is updated for
|
946
1055
|
# internal improvements. The parameter is intended primarily for use with custom
|
947
1056
|
# models that have been upgraded for a new base model. The default value depends on
|
948
|
-
# whether the parameter is used with or without a custom model. See [
|
949
|
-
#
|
1057
|
+
# whether the parameter is used with or without a custom model. See [Making speech
|
1058
|
+
# recognition requests with upgraded custom
|
1059
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
|
950
1060
|
# @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
|
951
1061
|
# recognition request, the customization weight tells the service how much weight to
|
952
1062
|
# give to words from the custom language model compared to those from the base model
|
@@ -963,8 +1073,8 @@ module IBMWatson
|
|
963
1073
|
# custom model's domain, but it can negatively affect performance on non-domain
|
964
1074
|
# phrases.
|
965
1075
|
#
|
966
|
-
# See [
|
967
|
-
#
|
1076
|
+
# See [Using customization
|
1077
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
968
1078
|
# @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
|
969
1079
|
# streaming audio, the connection is closed with a 400 error. The parameter is
|
970
1080
|
# useful for stopping audio submission from a live microphone when a user simply
|
@@ -981,56 +1091,61 @@ module IBMWatson
|
|
981
1091
|
# for double-byte languages might be shorter. Keywords are case-insensitive.
|
982
1092
|
#
|
983
1093
|
# See [Keyword
|
984
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1094
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
985
1095
|
# @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
|
986
1096
|
# considered to match a keyword if its confidence is greater than or equal to the
|
987
1097
|
# threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
|
988
1098
|
# you must also specify one or more keywords. The service performs no keyword
|
989
1099
|
# spotting if you omit either parameter. See [Keyword
|
990
|
-
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1100
|
+
# spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
|
991
1101
|
# @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
|
992
1102
|
# default, the service returns a single transcript. If you specify a value of `0`,
|
993
1103
|
# the service uses the default value, `1`. See [Maximum
|
994
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1104
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
|
995
1105
|
# @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
|
996
1106
|
# possible word alternative (also known as "Confusion Networks"). An alternative
|
997
1107
|
# word is considered if its confidence is greater than or equal to the threshold.
|
998
1108
|
# Specify a probability between 0.0 and 1.0. By default, the service computes no
|
999
1109
|
# alternative words. See [Word
|
1000
|
-
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1110
|
+
# alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
|
1001
1111
|
# @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
|
1002
1112
|
# each word. By default, the service returns no word confidence scores. See [Word
|
1003
|
-
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1113
|
+
# confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
|
1004
1114
|
# @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
|
1005
1115
|
# timestamps are returned. See [Word
|
1006
|
-
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1116
|
+
# timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
|
1007
1117
|
# @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
|
1008
1118
|
# results by replacing inappropriate words with a series of asterisks. Set the
|
1009
1119
|
# parameter to `false` to return results with no censoring. Applies to US English
|
1010
|
-
# transcription only. See [Profanity
|
1011
|
-
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1120
|
+
# and Japanese transcription only. See [Profanity
|
1121
|
+
# filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
|
1012
1122
|
# @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
|
1013
1123
|
# numbers, currency values, and internet addresses into more readable, conventional
|
1014
1124
|
# representations in the final transcript of a recognition request. For US English,
|
1015
1125
|
# the service also converts certain keyword strings to punctuation symbols. By
|
1016
1126
|
# default, the service performs no smart formatting.
|
1017
1127
|
#
|
1018
|
-
# **
|
1128
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
1129
|
+
# and Spanish transcription only.
|
1019
1130
|
#
|
1020
1131
|
# See [Smart
|
1021
|
-
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1132
|
+
# formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
|
1022
1133
|
# @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
|
1023
1134
|
# which participants in a multi-person exchange. By default, the service returns no
|
1024
1135
|
# speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
|
1025
1136
|
# parameter to be `true`, regardless of whether you specify `false` for the
|
1026
1137
|
# parameter.
|
1027
1138
|
#
|
1028
|
-
# **
|
1029
|
-
#
|
1030
|
-
#
|
1139
|
+
# **Beta:** The parameter is beta functionality.
|
1140
|
+
# * For previous-generation models, the parameter can be used for Australian
|
1141
|
+
# English, US English, German, Japanese, Korean, and Spanish (both broadband and
|
1142
|
+
# narrowband models) and UK English (narrowband model) transcription only.
|
1143
|
+
# * For next-generation models, the parameter can be used for English (Australian,
|
1144
|
+
# Indian, UK, and US), German, Japanese, Korean, and Spanish transcription only.
|
1031
1145
|
#
|
1032
|
-
#
|
1033
|
-
#
|
1146
|
+
# Restrictions and limitations apply to the use of speaker labels for both types of
|
1147
|
+
# models. See [Speaker
|
1148
|
+
# labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
|
1034
1149
|
# @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
|
1035
1150
|
# customization ID (GUID) of a custom language model that is to be used with the
|
1036
1151
|
# recognition request. Do not specify both parameters with a request.
|
@@ -1038,8 +1153,12 @@ module IBMWatson
|
|
1038
1153
|
# specify a grammar, you must also use the `language_customization_id` parameter to
|
1039
1154
|
# specify the name of the custom language model for which the grammar is defined.
|
1040
1155
|
# The service recognizes only strings that are recognized by the specified grammar;
|
1041
|
-
# it does not recognize other custom words from the model's words resource.
|
1042
|
-
#
|
1156
|
+
# it does not recognize other custom words from the model's words resource.
|
1157
|
+
#
|
1158
|
+
# **Beta:** The parameter is beta functionality.
|
1159
|
+
#
|
1160
|
+
# See [Using a grammar for speech
|
1161
|
+
# recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
|
1043
1162
|
# @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
|
1044
1163
|
# feature redacts any number that has three or more consecutive digits by replacing
|
1045
1164
|
# each digit with an `X` character. It is intended to redact sensitive numeric data,
|
@@ -1051,10 +1170,11 @@ module IBMWatson
|
|
1051
1170
|
# `keywords_threshold` parameters) and returns only a single final transcript
|
1052
1171
|
# (forces the `max_alternatives` parameter to be `1`).
|
1053
1172
|
#
|
1054
|
-
# **
|
1173
|
+
# **Beta:** The parameter is beta functionality. Applies to US English, Japanese,
|
1174
|
+
# and Korean transcription only.
|
1055
1175
|
#
|
1056
1176
|
# See [Numeric
|
1057
|
-
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1177
|
+
# redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
|
1058
1178
|
# @param processing_metrics [Boolean] If `true`, requests processing metrics about the service's transcription of the
|
1059
1179
|
# input audio. The service returns processing metrics at the interval specified by
|
1060
1180
|
# the `processing_metrics_interval` parameter. It also returns processing metrics
|
@@ -1062,7 +1182,7 @@ module IBMWatson
|
|
1062
1182
|
# the service returns no processing metrics.
|
1063
1183
|
#
|
1064
1184
|
# See [Processing
|
1065
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1185
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
|
1066
1186
|
# @param processing_metrics_interval [Float] Specifies the interval in real wall-clock seconds at which the service is to
|
1067
1187
|
# return processing metrics. The parameter is ignored unless the
|
1068
1188
|
# `processing_metrics` parameter is set to `true`.
|
@@ -1076,13 +1196,13 @@ module IBMWatson
|
|
1076
1196
|
# the service returns processing metrics only for transcription events.
|
1077
1197
|
#
|
1078
1198
|
# See [Processing
|
1079
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1199
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
|
1080
1200
|
# @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
|
1081
1201
|
# input audio. The service returns audio metrics with the final transcription
|
1082
1202
|
# results. By default, the service returns no audio metrics.
|
1083
1203
|
#
|
1084
1204
|
# See [Audio
|
1085
|
-
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#
|
1205
|
+
# metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
|
1086
1206
|
# @param end_of_phrase_silence_time [Float] Specifies the duration of the pause interval at which the service
|
1087
1207
|
# splits a transcript into multiple final results. If the service detects pauses or
|
1088
1208
|
# extended silence before it reaches the end of the audio stream, its response can
|
@@ -1099,7 +1219,7 @@ module IBMWatson
|
|
1099
1219
|
# Chinese is 0.6 seconds.
|
1100
1220
|
#
|
1101
1221
|
# See [End of phrase silence
|
1102
|
-
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1222
|
+
# time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
|
1103
1223
|
# @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
|
1104
1224
|
# based on semantic features of the input, for example, at the conclusion of
|
1105
1225
|
# meaningful phrases such as sentences. The service bases its understanding of
|
@@ -1109,7 +1229,7 @@ module IBMWatson
|
|
1109
1229
|
# interval.
|
1110
1230
|
#
|
1111
1231
|
# See [Split transcript at phrase
|
1112
|
-
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1232
|
+
# end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
|
1113
1233
|
# @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
|
1114
1234
|
# the parameter to suppress word insertions from music, coughing, and other
|
1115
1235
|
# non-speech events. The service biases the audio it passes for speech recognition
|
@@ -1121,8 +1241,8 @@ module IBMWatson
|
|
1121
1241
|
# * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
|
1122
1242
|
# * 1.0 suppresses no audio (speech detection sensitivity is disabled).
|
1123
1243
|
#
|
1124
|
-
# The values increase on a monotonic curve. See [Speech
|
1125
|
-
#
|
1244
|
+
# The values increase on a monotonic curve. See [Speech detector
|
1245
|
+
# sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
|
1126
1246
|
# @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
|
1127
1247
|
# to prevent it from being transcribed as speech. Use the parameter to suppress side
|
1128
1248
|
# conversations or background noise.
|
@@ -1133,10 +1253,24 @@ module IBMWatson
|
|
1133
1253
|
# * 0.5 provides a reasonable level of audio suppression for general usage.
|
1134
1254
|
# * 1.0 suppresses all audio (no audio is transcribed).
|
1135
1255
|
#
|
1136
|
-
# The values increase on a monotonic curve. See [
|
1137
|
-
#
|
1256
|
+
# The values increase on a monotonic curve. See [Background audio
|
1257
|
+
# suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
|
1258
|
+
# @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
|
1259
|
+
# latency, directs the service to produce results even more quickly than it usually
|
1260
|
+
# does. Next-generation models produce transcription results faster than
|
1261
|
+
# previous-generation models. The `low_latency` parameter causes the models to
|
1262
|
+
# produce results even more quickly, though the results might be less accurate when
|
1263
|
+
# the parameter is used.
|
1264
|
+
#
|
1265
|
+
# The parameter is not available for previous-generation `Broadband` and
|
1266
|
+
# `Narrowband` models. It is available only for some next-generation models. For a
|
1267
|
+
# list of next-generation models that support low latency, see [Supported
|
1268
|
+
# next-generation language
|
1269
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported).
|
1270
|
+
# * For more information about the `low_latency` parameter, see [Low
|
1271
|
+
# latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
|
1138
1272
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1139
|
-
def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
|
1273
|
+
def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
|
1140
1274
|
raise ArgumentError.new("audio must be provided") if audio.nil?
|
1141
1275
|
|
1142
1276
|
headers = {
|
@@ -1175,7 +1309,8 @@ module IBMWatson
|
|
1175
1309
|
"end_of_phrase_silence_time" => end_of_phrase_silence_time,
|
1176
1310
|
"split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
|
1177
1311
|
"speech_detector_sensitivity" => speech_detector_sensitivity,
|
1178
|
-
"background_audio_suppression" => background_audio_suppression
|
1312
|
+
"background_audio_suppression" => background_audio_suppression,
|
1313
|
+
"low_latency" => low_latency
|
1179
1314
|
}
|
1180
1315
|
|
1181
1316
|
data = audio
|
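With the signature change above, a caller could opt in to low latency when creating an asynchronous job. The following is a sketch only: `submit_low_latency_job` is a hypothetical wrapper, `RecordingClient` is a stand-in that lets the block run without credentials, and `en-US_Telephony` is an assumed example of a next-generation model whose low-latency support should be checked against the supported-models list.

```ruby
# Stand-in for a Speech to Text client (hypothetical; it only records arguments).
class RecordingClient
  attr_reader :last_call

  def create_job(audio:, **params)
    @last_call = { audio: audio }.merge(params)
    :submitted
  end
end

# Hypothetical wrapper: forwards low_latency: true with the usual arguments.
# The default model name is an assumption for illustration.
def submit_low_latency_job(client, audio:, model: "en-US_Telephony")
  client.create_job(
    audio: audio,
    content_type: "audio/wav",
    model: model,
    low_latency: true
  )
end

client = RecordingClient.new
submit_low_latency_job(client, audio: "raw-bytes-placeholder")
puts client.last_call[:low_latency]
```

Passing the client as an argument keeps the wrapper independent of how the real service object is constructed and authenticated.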
@@ -1200,10 +1335,10 @@ module IBMWatson
|
|
1200
1335
|
# credentials with which it is called. The method also returns the creation and
|
1201
1336
|
# update times of each job, and, if a job was created with a callback URL and a user
|
1202
1337
|
# token, the user token for the job. To obtain the results for a job whose status is
|
1203
|
-
# `completed` or not one of the latest 100 outstanding jobs, use the
|
1204
|
-
# method. A job and its results remain available until you delete
|
1205
|
-
#
|
1206
|
-
# first.
|
1338
|
+
# `completed` or not one of the latest 100 outstanding jobs, use the [Check a
|
1339
|
+
# job](#checkjob) method. A job and its results remain available until you delete
|
1340
|
+
# them with the [Delete a job](#deletejob) method or until the job's time to live
|
1341
|
+
# expires, whichever comes first.
|
1207
1342
|
#
|
1208
1343
|
# **See also:** [Checking the status of the latest
|
1209
1344
|
# jobs](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#jobs).
|
@@ -1237,8 +1372,8 @@ module IBMWatson
|
|
1237
1372
|
# You can use the method to retrieve the results of any job, regardless of whether
|
1238
1373
|
# it was submitted with a callback URL and the `recognitions.completed_with_results`
|
1239
1374
|
# event, and you can retrieve the results multiple times for as long as they remain
|
1240
|
-
# available. Use the
|
1241
|
-
# recent jobs associated with the calling credentials.
|
1375
|
+
# available. Use the [Check jobs](#checkjobs) method to request information about
|
1376
|
+
# the most recent jobs associated with the calling credentials.
|
1242
1377
|
#
|
1243
1378
|
# **See also:** [Checking the status and retrieving the results of a
|
1244
1379
|
# job](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-async#job).
|
@@ -1326,28 +1461,28 @@ module IBMWatson
|
|
1326
1461
|
# customizes.
|
1327
1462
|
#
|
1328
1463
|
# To determine whether a base model supports language model customization, use the
|
1329
|
-
#
|
1330
|
-
# to `true`. You can also refer to [Language support
|
1331
|
-
#
|
1464
|
+
# [Get a model](#getmodel) method and check that the attribute
|
1465
|
+
# `custom_language_model` is set to `true`. You can also refer to [Language support
|
1466
|
+
# for
|
1467
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
1332
1468
|
# @param dialect [String] The dialect of the specified language that is to be used with the custom language
|
1333
1469
|
# model. For most languages, the dialect matches the language of the base model by
|
1334
|
-
# default. For example, `en-US` is used for
|
1335
|
-
#
|
1470
|
+
# default. For example, `en-US` is used for the US English language models. All
|
1471
|
+
# dialect values are case-insensitive.
|
1336
1472
|
#
|
1337
|
-
#
|
1338
|
-
#
|
1473
|
+
# The parameter is meaningful only for Spanish language models, for which you can
|
1474
|
+
# always safely omit the parameter to have the service create the correct mapping.
|
1475
|
+
# For Spanish, the service creates a custom language model that is suited for speech
|
1476
|
+
# in one of the following dialects:
|
1339
1477
|
# * `es-ES` for Castilian Spanish (`es-ES` models)
|
1340
1478
|
# * `es-LA` for Latin American Spanish (`es-AR`, `es-CL`, `es-CO`, and `es-PE`
|
1341
1479
|
# models)
|
1342
1480
|
# * `es-US` for Mexican (North American) Spanish (`es-MX` models)
|
1343
1481
|
#
|
1344
|
-
#
|
1345
|
-
#
|
1346
|
-
#
|
1347
|
-
#
|
1348
|
-
# must match the language of the base model. If you specify the `dialect` for
|
1349
|
-
# Spanish language models, its value must match one of the defined mappings as
|
1350
|
-
# indicated (`es-ES`, `es-LA`, or `es-MX`). All dialect values are case-insensitive.
|
1482
|
+
# If you specify the `dialect` parameter for a non-Spanish language model, its value
|
1483
|
+
# must match the language of the base model. If you specify the `dialect` for a
|
1484
|
+
# Spanish language model, its value must match one of the defined mappings (`es-ES`,
|
1485
|
+
# `es-LA`, or `es-MX`).
|
1351
1486
|
# @param description [String] A description of the new custom language model. Use a localized description that
|
1352
1487
|
# matches the language of the custom model.
|
1353
1488
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
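The Spanish dialect mapping in the `dialect` parameter description above can be sketched as a simple lookup. This is a minimal illustration only: the constant `SPANISH_DIALECTS` and the helper `dialect_for` are hypothetical names, not part of the `ibm_watson` gem, and the sketch assumes base model names begin with a five-character language identifier such as `es-MX`. It follows the bulleted mapping as written (which lists `es-US` for `es-MX` models); for non-Spanish models it returns the language of the base model.

```ruby
# Hypothetical helper; not part of the ibm_watson gem.
# Maps the language prefix of a base model name to the dialect
# listed in the bulleted mapping of the `dialect` parameter.
SPANISH_DIALECTS = {
  "es-ES" => "es-ES",
  "es-AR" => "es-LA",
  "es-CL" => "es-LA",
  "es-CO" => "es-LA",
  "es-PE" => "es-LA",
  "es-MX" => "es-US"
}.freeze

def dialect_for(base_model_name)
  # Assumption: the first five characters are the language identifier.
  language = base_model_name[0, 5]
  # Non-Spanish languages fall through to the language itself.
  SPANISH_DIALECTS.fetch(language, language)
end
```

Because the documentation says the parameter can always be safely omitted for Spanish models, a lookup like this is only useful when the dialect has to be set explicitly.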
|
@@ -1393,11 +1528,12 @@ module IBMWatson
|
|
1393
1528
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageLanguageModels#listModels-language).
|
1394
1529
|
# @param language [String] The identifier of the language for which custom language or custom acoustic models
|
1395
1530
|
# are to be returned. Omit the parameter to see all custom language or custom
|
1396
|
-
# acoustic models that are owned by the requesting credentials.
|
1531
|
+
# acoustic models that are owned by the requesting credentials. (**Note:** The
|
1532
|
+
# identifier `ar-AR` is deprecated; use `ar-MS` instead.)
|
1397
1533
|
#
|
1398
1534
|
# To determine the languages for which customization is available, see [Language
|
1399
1535
|
# support for
|
1400
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1536
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
1401
1537
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1402
1538
|
def list_language_models(language: nil)
|
1403
1539
|
headers = {
|
@@ -1501,12 +1637,13 @@ module IBMWatson
|
|
1501
1637
|
# the current load on the service. The method returns an HTTP 200 response code to
|
1502
1638
|
# indicate that the training process has begun.
|
1503
1639
|
#
|
1504
|
-
# You can monitor the status of the training by using the
|
1505
|
-
# model
|
1506
|
-
# seconds. The method returns a `LanguageModel` object that
|
1507
|
-
# `progress` fields. A status of `available` means that the
|
1508
|
-
# and ready to use. The service cannot accept subsequent
|
1509
|
-
# requests to add new resources until the existing request
|
1640
|
+
# You can monitor the status of the training by using the [Get a custom language
|
1641
|
+
# model](#getlanguagemodel) method to poll the model's status. Use a loop to check
|
1642
|
+
# the status every 10 seconds. The method returns a `LanguageModel` object that
|
1643
|
+
# includes `status` and `progress` fields. A status of `available` means that the
|
1644
|
+
# custom model is trained and ready to use. The service cannot accept subsequent
|
1645
|
+
# training requests or requests to add new resources until the existing request
|
1646
|
+
# completes.
|
1510
1647
|
#
|
1511
1648
|
# **See also:** [Train the custom language
|
1512
1649
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#trainModel-language).
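The polling loop that the comment recommends (check the status every 10 seconds until it is `available`) might look like the sketch below. The helper `wait_for_status` and its keyword arguments are hypothetical and not part of the `ibm_watson` gem; the block is assumed to return the model's current status string, however the caller obtains it.

```ruby
# Hypothetical polling helper; not part of the ibm_watson gem.
# The block must return the current status string of the model.
# `sleeper` is injectable so the loop can be exercised without waiting.
def wait_for_status(target, interval: 10, max_attempts: 60, sleeper: Kernel.method(:sleep))
  max_attempts.times do |attempt|
    status = yield
    return status if status == target

    # Do not sleep after the final check.
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  raise "status did not reach #{target.inspect} after #{max_attempts} checks"
end
```

In real use the block would presumably call the SDK's get-language-model method and read the `status` field from the returned `DetailedResponse`; that wiring is left out here because it depends on service credentials. The `max_attempts` cap is a design choice of this sketch so that a model that never becomes `available` does not loop forever.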
|
@@ -1526,14 +1663,18 @@ module IBMWatson
|
|
1526
1663
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1527
1664
|
# the request. You must make the request with credentials for the instance of the
|
1528
1665
|
# service that owns the custom model.
|
1529
|
-
# @param word_type_to_add [String]
|
1530
|
-
# train the model:
|
1666
|
+
# @param word_type_to_add [String] _For custom models that are based on previous-generation models_, the type of
|
1667
|
+
# words from the custom language model's words resource on which to train the model:
|
1531
1668
|
# * `all` (the default) trains the model on all new words, regardless of whether
|
1532
1669
|
# they were extracted from corpora or grammars or were added or modified by the
|
1533
1670
|
# user.
|
1534
|
-
# * `user` trains the model only on
|
1671
|
+
# * `user` trains the model only on custom words that were added or modified by the
|
1535
1672
|
# user directly. The model is not trained on new words extracted from corpora or
|
1536
1673
|
# grammars.
|
1674
|
+
#
|
1675
|
+
# _For custom models that are based on next-generation models_, the service ignores
|
1676
|
+
# the parameter. The words resource contains only custom words that the user adds or
|
1677
|
+
# modifies directly, so the parameter is unnecessary.
|
1537
1678
|
# @param customization_weight [Float] Specifies a customization weight for the custom language model. The customization
|
1538
1679
|
# weight tells the service how much weight to give to words from the custom language
|
1539
1680
|
# model compared to those from the base model for speech recognition. Specify a
|
@@ -1548,6 +1689,9 @@ module IBMWatson
|
|
1548
1689
|
# The value that you assign is used for all recognition requests that use the model.
|
1549
1690
|
# You can override it for any recognition request by specifying a customization
|
1550
1691
|
# weight for that request.
|
1692
|
+
#
|
1693
|
+
# See [Using customization
|
1694
|
+
# weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
|
1551
1695
|
# @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
1552
1696
|
def train_language_model(customization_id:, word_type_to_add: nil, customization_weight: nil)
|
1553
1697
|
raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
|
@@ -1621,15 +1765,19 @@ module IBMWatson
|
|
1621
1765
|
#
|
1622
1766
|
# The method returns an HTTP 200 response code to indicate that the upgrade process
|
1623
1767
|
# has begun successfully. You can monitor the status of the upgrade by using the
|
1624
|
-
#
|
1625
|
-
# returns a `LanguageModel` object that includes `status` and
|
1626
|
-
# a loop to check the status every 10 seconds. While it is
|
1627
|
-
# custom model has the status `upgrading`. When the upgrade is
|
1628
|
-
# resumes the status that it had prior to upgrade. The service
|
1629
|
-
# subsequent requests for the model until the upgrade completes.
|
1768
|
+
# [Get a custom language model](#getlanguagemodel) method to poll the model's
|
1769
|
+
# status. The method returns a `LanguageModel` object that includes `status` and
|
1770
|
+
# `progress` fields. Use a loop to check the status every 10 seconds. While it is
|
1771
|
+
# being upgraded, the custom model has the status `upgrading`. When the upgrade is
|
1772
|
+
# complete, the model resumes the status that it had prior to upgrade. The service
|
1773
|
+
# cannot accept subsequent requests for the model until the upgrade completes.
|
1774
|
+
#
|
1775
|
+
# **Note:** Upgrading is necessary only for custom language models that are based on
|
1776
|
+
# previous-generation models. Only a single version of a custom model that is based
|
1777
|
+
# on a next-generation model is ever available.
|
1630
1778
|
#
|
1631
1779
|
# **See also:** [Upgrading a custom language
|
1632
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
1780
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-language).
|
1633
1781
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1634
1782
|
# the request. You must make the request with credentials for the instance of the
|
1635
1783
|
# service that owns the custom model.
|
@@ -1660,9 +1808,10 @@ module IBMWatson
|
|
1660
1808
|
# @!method list_corpora(customization_id:)
|
1661
1809
|
# List corpora.
|
1662
1810
|
# Lists information about all corpora from a custom language model. The information
|
1663
|
-
# includes the total number of words
|
1664
|
-
#
|
1665
|
-
#
|
1811
|
+
# includes the name, status, and total number of words for each corpus. _For custom
|
1812
|
+
# models that are based on previous-generation models_, it also includes the number
|
1813
|
+
# of out-of-vocabulary (OOV) words from the corpus. You must use credentials for the
|
1814
|
+
# instance of the service that owns a model to list its corpora.
|
1666
1815
|
#
|
1667
1816
|
# **See also:** [Listing corpora for a custom language
|
1668
1817
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#listCorpora).
|
@@ -1696,51 +1845,60 @@ module IBMWatson
|
|
1696
1845
|
# Use multiple requests to submit multiple corpus text files. You must use
|
1697
1846
|
# credentials for the instance of the service that owns a model to add a corpus to
|
1698
1847
|
# it. Adding a corpus does not affect the custom language model until you train the
|
1699
|
-
# model for the new data by using the
|
1848
|
+
# model for the new data by using the [Train a custom language
|
1849
|
+
# model](#trainlanguagemodel) method.
|
1700
1850
|
#
|
1701
1851
|
# Submit a plain text file that contains sample sentences from the domain of
|
1702
|
-
# interest to enable the service to
|
1703
|
-
# add that represent the context in which speakers use words from the domain,
|
1704
|
-
# better the service's recognition accuracy.
|
1852
|
+
# interest to enable the service to parse the words in context. The more sentences
|
1853
|
+
# you add that represent the context in which speakers use words from the domain,
|
1854
|
+
# the better the service's recognition accuracy.
|
1705
1855
|
#
|
1706
1856
|
# The call returns an HTTP 201 response code if the corpus is valid. The service
|
1707
|
-
# then asynchronously processes
|
1708
|
-
#
|
1709
|
-
#
|
1710
|
-
#
|
1711
|
-
#
|
1712
|
-
#
|
1713
|
-
#
|
1714
|
-
#
|
1715
|
-
#
|
1716
|
-
#
|
1717
|
-
#
|
1718
|
-
#
|
1719
|
-
#
|
1720
|
-
#
|
1721
|
-
#
|
1857
|
+
# then asynchronously processes and automatically extracts data from the contents of
|
1858
|
+
# the corpus. This operation can take on the order of minutes to complete depending
|
1859
|
+
# on the current load on the service, the total number of words in the corpus, and,
|
1860
|
+
# _for custom models that are based on previous-generation models_, the number of
|
1861
|
+
# new (out-of-vocabulary) words in the corpus. You cannot submit requests to add
|
1862
|
+
# additional resources to the custom model or to train the model until the service's
|
1863
|
+
# analysis of the corpus for the current request completes. Use the [Get a
|
1864
|
+
# corpus](#getcorpus) method to check the status of the analysis.
|
1865
|
+
#
|
1866
|
+
# _For custom models that are based on previous-generation models_, the service
|
1867
|
+
# auto-populates the model's words resource with words from the corpus that are not
|
1868
|
+
# found in its base vocabulary. These words are referred to as out-of-vocabulary
|
1869
|
+
# (OOV) words. After adding a corpus, you must validate the words resource to ensure
|
1870
|
+
# that each OOV word's definition is complete and valid. You can use the [List
|
1871
|
+
# custom words](#listwords) method to examine the words resource. You can use other
|
1872
|
+
# words methods to eliminate typos and modify how words are pronounced as needed.
|
1722
1873
|
#
|
1723
1874
|
# To add a corpus file that has the same name as an existing corpus, set the
|
1724
1875
|
# `allow_overwrite` parameter to `true`; otherwise, the request fails. Overwriting
|
1725
1876
|
# an existing corpus causes the service to process the corpus text file and extract
|
1726
|
-
#
|
1727
|
-
#
|
1728
|
-
#
|
1729
|
-
#
|
1877
|
+
# its data anew. _For a custom model that is based on a previous-generation model_,
|
1878
|
+
# the service first removes any OOV words that are associated with the existing
|
1879
|
+
# corpus from the model's words resource unless they were also added by another
|
1880
|
+
# corpus or grammar, or they have been modified in some way with the [Add custom
|
1881
|
+
# words](#addwords) or [Add a custom word](#addword) method.
|
1730
1882
|
#
|
1731
1883
|
# The service limits the overall amount of data that you can add to a custom model
|
1732
|
-
# to a maximum of 10 million total words from all sources combined.
|
1733
|
-
#
|
1734
|
-
#
|
1735
|
-
# directly.
|
1884
|
+
# to a maximum of 10 million total words from all sources combined. _For a custom
|
1885
|
+
# model that is based on a previous-generation model_, you can add no more than 90
|
1886
|
+
# thousand custom (OOV) words to a model. This includes words that the service
|
1887
|
+
# extracts from corpora and grammars, and words that you add directly.
|
1736
1888
|
#
|
1737
1889
|
# **See also:**
|
1738
1890
|
# * [Add a corpus to the custom language
|
1739
1891
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addCorpus)
|
1740
|
-
# * [Working with
|
1741
|
-
#
|
1742
|
-
# * [
|
1743
|
-
#
|
1892
|
+
# * [Working with corpora for previous-generation
|
1893
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingCorpora)
|
1894
|
+
# * [Working with corpora for next-generation
|
1895
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingCorpora-ng)
|
1896
|
+
#
|
1897
|
+
# * [Validating a words resource for previous-generation
|
1898
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
1899
|
+
#
|
1900
|
+
# * [Validating a words resource for next-generation
|
1901
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
1744
1902
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
1745
1903
|
# the request. You must make the request with credentials for the instance of the
|
1746
1904
|
# service that owns the custom model.
|
@@ -1763,10 +1921,10 @@ module IBMWatson
|
|
1763
1921
|
# in UTF-8 if it contains non-ASCII characters; the service assumes UTF-8 encoding
|
1764
1922
|
# if it encounters non-ASCII characters.
|
1765
1923
|
#
|
1766
|
-
# Make sure that you know the character encoding of the file. You must use that
|
1924
|
+
# Make sure that you know the character encoding of the file. You must use that same
|
1767
1925
|
# encoding when working with the words in the custom language model. For more
|
1768
|
-
# information, see [Character
|
1769
|
-
#
|
1926
|
+
# information, see [Character encoding for custom
|
1927
|
+
# words](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageWords#charEncoding).
|
1770
1928
|
#
|
1771
1929
|
#
|
1772
1930
|
# With the `curl` command, use the `--data-binary` option to upload the file for the
|
@@ -1815,9 +1973,10 @@ module IBMWatson
|
|
1815
1973
|
# @!method get_corpus(customization_id:, corpus_name:)
|
1816
1974
|
# Get a corpus.
|
1817
1975
|
# Gets information about a corpus from a custom language model. The information
|
1818
|
-
# includes the total number of words
|
1819
|
-
#
|
1820
|
-
#
|
1976
|
+
# includes the name, status, and total number of words for the corpus. _For custom
|
1977
|
+
# models that are based on previous-generation models_, it also includes the number
|
1978
|
+
# of out-of-vocabulary (OOV) words from the corpus. You must use credentials for the
|
1979
|
+
# instance of the service that owns a model to list its corpora.
|
1821
1980
|
#
|
1822
1981
|
# **See also:** [Listing corpora for a custom language
|
1823
1982
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#listCorpora).
|
@@ -1850,14 +2009,18 @@ module IBMWatson
|
|
1850
2009
|
##
|
1851
2010
|
# @!method delete_corpus(customization_id:, corpus_name:)
|
1852
2011
|
# Delete a corpus.
|
1853
|
-
# Deletes an existing corpus from a custom language model.
|
1854
|
-
#
|
1855
|
-
# model
|
1856
|
-
#
|
1857
|
-
#
|
1858
|
-
#
|
1859
|
-
#
|
1860
|
-
#
|
2012
|
+
# Deletes an existing corpus from a custom language model. Removing a corpus does
|
2013
|
+
# not affect the custom model until you train the model with the [Train a custom
|
2014
|
+
# language model](#trainlanguagemodel) method. You must use credentials for the
|
2015
|
+
# instance of the service that owns a model to delete its corpora.
|
2016
|
+
#
|
2017
|
+
# _For custom models that are based on previous-generation models_, the service
|
2018
|
+
# removes any out-of-vocabulary (OOV) words that are associated with the corpus from
|
2019
|
+
# the custom model's words resource unless they were also added by another corpus or
|
2020
|
+
# grammar, or they were modified in some way with the [Add custom words](#addwords)
|
2021
|
+
# or [Add a custom word](#addword) method.
|
2022
|
+
#
|
2023
|
+
#
|
1861
2024
|
#
|
1862
2025
|
# **See also:** [Deleting a corpus from a custom language
|
1863
2026
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageCorpora#deleteCorpus).
|
@@ -1895,10 +2058,11 @@ module IBMWatson
|
|
1895
2058
|
# List custom words.
|
1896
2059
|
# Lists information about custom words from a custom language model. You can list
|
1897
2060
|
# all words from the custom model's words resource, only custom words that were
|
1898
|
-
# added or modified by the user, or
|
1899
|
-
#
|
1900
|
-
#
|
1901
|
-
#
|
2061
|
+
# added or modified by the user, or, _for a custom model that is based on a
|
2062
|
+
# previous-generation model_, only out-of-vocabulary (OOV) words that were extracted
|
2063
|
+
# from corpora or are recognized by grammars. You can also indicate the order in
|
2064
|
+
# which the service is to return words; by default, the service lists words in
|
2065
|
+
# ascending alphabetical order. You must use credentials for the instance of the
|
1902
2066
|
# service that owns a model to list information about its words.
|
1903
2067
|
#
|
1904
2068
|
# **See also:** [Listing words from a custom language
|
@@ -1911,6 +2075,10 @@ module IBMWatson
|
|
1911
2075
|
# * `user` shows only custom words that were added or modified by the user directly.
|
1912
2076
|
# * `corpora` shows only OOV words that were extracted from corpora.
|
1913
2077
|
# * `grammars` shows only OOV words that are recognized by grammars.
|
2078
|
+
#
|
2079
|
+
# _For a custom model that is based on a next-generation model_, only `all` and
|
2080
|
+
# `user` apply. Both options return the same results. Words from other sources are
|
2081
|
+
# not added to custom models that are based on next-generation models.
|
1914
2082
|
# @param sort [String] Indicates the order in which the words are to be listed, `alphabetical` or by
|
1915
2083
|
# `count`. You can prepend an optional `+` or `-` to an argument to indicate whether
|
1916
2084
|
# the results are to be sorted in ascending or descending order. By default, words
|
@@ -1947,10 +2115,14 @@ module IBMWatson
|
|
1947
2115
|
##
|
1948
2116
|
# @!method add_words(customization_id:, words:)
|
1949
2117
|
# Add custom words.
|
1950
|
-
# Adds one or more custom words to a custom language model.
|
2118
|
+
# Adds one or more custom words to a custom language model. You can use this method
|
2119
|
+
# to add words or to modify existing words in a custom model's words resource. _For
|
2120
|
+
# custom models that are based on previous-generation models_, the service populates
|
1951
2121
|
# the words resource for a custom model with out-of-vocabulary (OOV) words from each
|
1952
|
-
# corpus or grammar that is added to the model. You can use this method to
|
1953
|
-
#
|
2122
|
+
# corpus or grammar that is added to the model. You can use this method to modify
|
2123
|
+
# OOV words in the model's words resource.
|
2124
|
+
#
|
2125
|
+
# _For a custom model that is based on a previous-generation model_, the words
|
1954
2126
|
# resource for a model can contain a maximum of 90 thousand custom (OOV) words. This
|
1955
2127
|
# includes words that the service extracts from corpora and grammars and words that
|
1956
2128
|
# you add directly.
|
@@ -1958,25 +2130,26 @@ module IBMWatson
|
|
1958
2130
|
# You must use credentials for the instance of the service that owns a model to add
|
1959
2131
|
# or modify custom words for the model. Adding or modifying custom words does not
|
1960
2132
|
# affect the custom model until you train the model for the new data by using the
|
1961
|
-
#
|
2133
|
+
# [Train a custom language model](#trainlanguagemodel) method.
|
1962
2134
|
#
|
1963
2135
|
# You add custom words by providing a `CustomWords` object, which is an array of
|
1964
|
-
# `CustomWord` objects, one per word.
|
1965
|
-
#
|
1966
|
-
#
|
1967
|
-
# * The `sounds_like` field provides an array of one or more pronunciations for the
|
1968
|
-
# word. Use the parameter to specify how the word can be pronounced by users. Use
|
1969
|
-
# the parameter for words that are difficult to pronounce, foreign words, acronyms,
|
1970
|
-
# and so on. For example, you might specify that the word `IEEE` can sound like `i
|
1971
|
-
# triple e`. You can specify a maximum of five sounds-like pronunciations for a
|
1972
|
-
# word. If you omit the `sounds_like` field, the service attempts to set the field
|
1973
|
-
# to its pronunciation of the word. It cannot generate a pronunciation for all
|
1974
|
-
# words, so you must review the word's definition to ensure that it is complete and
|
1975
|
-
# valid.
|
2136
|
+
# `CustomWord` objects, one per word. Use the object's `word` parameter to identify
|
2137
|
+
# the word that is to be added. You can also provide one or both of the optional
|
2138
|
+
# `display_as` or `sounds_like` fields for each word.
|
1976
2139
|
# * The `display_as` field provides a different way of spelling the word in a
|
1977
2140
|
# transcript. Use the parameter when you want the word to appear different from its
|
1978
2141
|
# usual representation or from its spelling in training data. For example, you might
|
1979
|
-
# indicate that the word `IBM
|
2142
|
+
# indicate that the word `IBM` is to be displayed as `IBM™`.
|
2143
|
+
# * The `sounds_like` field, _which can be used only with a custom model that is
|
2144
|
+
# based on a previous-generation model_, provides an array of one or more
|
2145
|
+
# pronunciations for the word. Use the parameter to specify how the word can be
|
2146
|
+
# pronounced by users. Use the parameter for words that are difficult to pronounce,
|
2147
|
+
# foreign words, acronyms, and so on. For example, you might specify that the word
|
2148
|
+
# `IEEE` can sound like `i triple e`. You can specify a maximum of five sounds-like
|
2149
|
+
# pronunciations for a word. If you omit the `sounds_like` field, the service
|
2150
|
+
# attempts to set the field to its pronunciation of the word. It cannot generate a
|
2151
|
+
# pronunciation for all words, so you must review the word's definition to ensure
|
2152
|
+
# that it is complete and valid.
|
1980
2153
|
#
|
1981
2154
|
# If you add a custom word that already exists in the words resource for the custom
|
1982
2155
|
# model, the new definition overwrites the existing data for the word. If the
|
@@ -1988,26 +2161,30 @@ module IBMWatson
|
|
1988
2161
|
# time that it takes for the analysis to complete depends on the number of new words
|
1989
2162
|
# that you add but is generally faster than adding a corpus or grammar.
|
1990
2163
|
#
|
1991
|
-
# You can monitor the status of the request by using the
|
1992
|
-
# model
|
1993
|
-
# seconds. The method returns a `Customization` object that
|
1994
|
-
# field. A status of `ready` means that the words have been
|
1995
|
-
# model. The service cannot accept requests to add new data or
|
1996
|
-
# until the existing request completes.
|
1997
|
-
#
|
1998
|
-
# You can use the **List custom words** or **List a custom word** method to review
|
1999
|
-
# the words that you add. Words with an invalid `sounds_like` field include an
|
2000
|
-
# `error` field that describes the problem. You can use other words-related methods
|
2001
|
-
# to correct errors, eliminate typos, and modify how words are pronounced as needed.
|
2164
|
+
# You can monitor the status of the request by using the [Get a custom language
|
2165
|
+
# model](#getlanguagemodel) method to poll the model's status. Use a loop to check
|
2166
|
+
# the status every 10 seconds. The method returns a `Customization` object that
|
2167
|
+
# includes a `status` field. A status of `ready` means that the words have been
|
2168
|
+
# added to the custom model. The service cannot accept requests to add new data or
|
2169
|
+
# to train the model until the existing request completes.
|
2002
2170
|
#
|
2171
|
+
# You can use the [List custom words](#listwords) or [Get a custom word](#getword)
|
2172
|
+
# method to review the words that you add. Words with an invalid `sounds_like` field
|
2173
|
+
# include an `error` field that describes the problem. You can use other
|
2174
|
+
# words-related methods to correct errors, eliminate typos, and modify how words are
|
2175
|
+
# pronounced as needed.
|
2003
2176
|
#
|
2004
2177
|
# **See also:**
|
2005
2178
|
# * [Add words to the custom language
|
2006
2179
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
|
2007
|
-
# * [Working with custom
|
2008
|
-
#
|
2009
|
-
# * [
|
2010
|
-
#
|
2180
|
+
# * [Working with custom words for previous-generation
|
2181
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
|
2182
|
+
# * [Working with custom words for next-generation
|
2183
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
|
2184
|
+
# * [Validating a words resource for previous-generation
|
2185
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
2186
|
+
# * [Validating a words resource for next-generation
|
2187
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
2011
2188
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
2012
2189
|
# the request. You must make the request with credentials for the instance of the
|
2013
2190
|
# service that owns the custom model.
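The `CustomWords` payload described above (an array of word entries with optional `display_as` and `sounds_like` fields, and at most five sounds-like pronunciations per word) can be sketched with plain hashes. The helper `custom_word` and the constant `MAX_SOUNDS_LIKE` are hypothetical names for illustration, not part of the `ibm_watson` gem, and the client-side length check is an assumption of this sketch rather than documented SDK behavior.

```ruby
# Hypothetical helper; not part of the ibm_watson gem.
MAX_SOUNDS_LIKE = 5

# Builds one word entry, omitting optional fields that were not given.
def custom_word(word, display_as: nil, sounds_like: nil)
  if sounds_like && sounds_like.length > MAX_SOUNDS_LIKE
    raise ArgumentError, "at most #{MAX_SOUNDS_LIKE} sounds_like pronunciations per word"
  end

  entry = { "word" => word }
  entry["display_as"] = display_as if display_as
  entry["sounds_like"] = sounds_like if sounds_like
  entry
end

# The two examples used in the documentation comment.
words = [
  custom_word("IEEE", sounds_like: ["i triple e"]),
  custom_word("IBM", display_as: "IBM™")
]
```

An array built this way could then be passed as the `words:` argument of `add_words(customization_id:, words:)`. Note that, per the comment, `sounds_like` applies only to custom models based on previous-generation models.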
|
@@ -2043,47 +2220,57 @@ module IBMWatson
|
|
2043
2220
|
##
|
2044
2221
|
# @!method add_word(customization_id:, word_name:, word: nil, sounds_like: nil, display_as: nil)
|
2045
2222
|
# Add a custom word.
|
2046
|
-
# Adds a custom word to a custom language model.
|
2047
|
-
#
|
2048
|
-
#
|
2049
|
-
#
|
2050
|
-
#
|
2051
|
-
#
|
2223
|
+
# Adds a custom word to a custom language model. You can use this method to add a
|
2224
|
+
# word or to modify an existing word in the words resource. _For custom models that
|
2225
|
+
# are based on previous-generation models_, the service populates the words resource
|
2226
|
+
# for a custom model with out-of-vocabulary (OOV) words from each corpus or grammar
|
2227
|
+
# that is added to the model. You can use this method to modify OOV words in the
|
2228
|
+
# model's words resource.
|
2229
|
+
#
|
2230
|
+
# _For a custom model that is based on a previous-generation model_, the words
|
2231
|
+
# resource for a model can contain a maximum of 90 thousand custom (OOV) words. This
|
2232
|
+
# includes words that the service extracts from corpora and grammars and words that
|
2233
|
+
# you add directly.
|
2052
2234
|
#
|
2053
2235
|
# You must use credentials for the instance of the service that owns a model to add
|
2054
2236
|
# or modify a custom word for the model. Adding or modifying a custom word does not
|
2055
2237
|
# affect the custom model until you train the model for the new data by using the
|
2056
|
-
#
|
2238
|
+
# [Train a custom language model](#trainlanguagemodel) method.
|
2057
2239
|
#
|
2058
2240
|
# Use the `word_name` parameter to specify the custom word that is to be added or
|
2059
2241
|
# modified. Use the `CustomWord` object to provide one or both of the optional
|
2060
|
-
# `
|
2061
|
-
# * The `sounds_like` field provides an array of one or more pronunciations for the
|
2062
|
-
# word. Use the parameter to specify how the word can be pronounced by users. Use
|
2063
|
-
# the parameter for words that are difficult to pronounce, foreign words, acronyms,
|
2064
|
-
# and so on. For example, you might specify that the word `IEEE` can sound like `i
|
2065
|
-
# triple e`. You can specify a maximum of five sounds-like pronunciations for a
|
2066
|
-
# word. If you omit the `sounds_like` field, the service attempts to set the field
|
2067
|
-
# to its pronunciation of the word. It cannot generate a pronunciation for all
|
2068
|
-
# words, so you must review the word's definition to ensure that it is complete and
|
2069
|
-
# valid.
|
2242
|
+
# `display_as` or `sounds_like` fields for the word.
|
2070
2243
|
# * The `display_as` field provides a different way of spelling the word in a
|
2071
2244
|
# transcript. Use the parameter when you want the word to appear different from its
|
2072
2245
|
# usual representation or from its spelling in training data. For example, you might
|
2073
|
-
# indicate that the word `IBM
|
2246
|
+
# indicate that the word `IBM` is to be displayed as `IBM™`.
|
2247
|
+
# * The `sounds_like` field, _which can be used only with a custom model that is
|
2248
|
+
# based on a previous-generation model_, provides an array of one or more
|
2249
|
+
# pronunciations for the word. Use the parameter to specify how the word can be
|
2250
|
+
# pronounced by users. Use the parameter for words that are difficult to pronounce,
|
2251
|
+
# foreign words, acronyms, and so on. For example, you might specify that the word
|
2252
|
+
# `IEEE` can sound like `i triple e`. You can specify a maximum of five sounds-like
|
2253
|
+
# pronunciations for a word. If you omit the `sounds_like` field, the service
|
2254
|
+
# attempts to set the field to its pronunciation of the word. It cannot generate a
|
2255
|
+
# pronunciation for all words, so you must review the word's definition to ensure
|
2256
|
+
# that it is complete and valid.
|
2074
2257
|
#
|
2075
2258
|
# If you add a custom word that already exists in the words resource for the custom
|
2076
2259
|
# model, the new definition overwrites the existing data for the word. If the
|
2077
2260
|
# service encounters an error, it does not add the word to the words resource. Use
|
2078
|
-
# the
|
2261
|
+
# the [Get a custom word](#getword) method to review the word that you add.
|
2079
2262
|
#
|
2080
2263
|
# **See also:**
|
2081
2264
|
# * [Add words to the custom language
|
2082
2265
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageCreate#addWords)
|
2083
|
-
# * [Working with custom
|
2084
|
-
#
|
2085
|
-
# * [
|
2086
|
-
#
|
2266
|
+
# * [Working with custom words for previous-generation
|
2267
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#workingWords)
|
2268
|
+
# * [Working with custom words for next-generation
|
2269
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#workingWords-ng)
|
2270
|
+
# * [Validating a words resource for previous-generation
|
2271
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#validateModel)
|
2272
|
+
# * [Validating a words resource for next-generation
|
2273
|
+
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords-ng#validateModel-ng).
|
2087
2274
|
# @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
|
2088
2275
|
# the request. You must make the request with credentials for the instance of the
|
2089
2276
|
# service that owns the custom model.
|
@@ -2092,14 +2279,16 @@ module IBMWatson
|
|
2092
2279
|
# the tokens of compound words. URL-encode the word if it includes non-ASCII
|
2093
2280
|
# characters. For more information, see [Character
|
2094
2281
|
# encoding](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-corporaWords#charEncoding).
|
2095
|
-
# @param word [String] For the
|
2096
|
-
# be added to or updated in the custom model. Do not include spaces in
|
2097
|
-
# a `-` (dash) or `_` (underscore) to connect the tokens of compound
|
2098
|
-
#
|
2099
|
-
#
|
2100
|
-
#
|
2101
|
-
#
|
2102
|
-
#
|
2282
|
+
# @param word [String] For the [Add custom words](#addwords) method, you must specify the custom word
|
2283
|
+
# that is to be added to or updated in the custom model. Do not include spaces in
|
2284
|
+
# the word. Use a `-` (dash) or `_` (underscore) to connect the tokens of compound
|
2285
|
+
# words.
|
2286
|
+
#
|
2287
|
+
# Omit this parameter for the [Add a custom word](#addword) method.
|
2288
|
+
# @param sounds_like [Array[String]] _For a custom model that is based on a previous-generation model_, an array of
|
2289
|
+
# sounds-like pronunciations for the custom word. Specify how words that are
|
2290
|
+
# difficult to pronounce, foreign words, acronyms, and so on can be pronounced by
|
2291
|
+
# users.
|
2103
2292
|
# * For a word that is not in the service's base vocabulary, omit the parameter to
|
2104
2293
|
# have the service automatically generate a sounds-like pronunciation for the word.
|
2105
2294
|
# * For a word that is in the service's base vocabulary, use the parameter to
|
@@ -2109,6 +2298,10 @@ module IBMWatson
|
|
2109
2298
|
#
|
2110
2299
|
# A word can have at most five sounds-like pronunciations. A pronunciation can
|
2111
2300
|
# include at most 40 characters not including spaces.
|
2301
|
+
#
|
2302
|
+
# _For a custom model that is based on a next-generation model_, omit this field.
|
2303
|
+
# Custom models based on next-generation models do not support the `sounds_like`
|
2304
|
+
# field. The service ignores the field.
|
2112
2305
|
# @param display_as [String] An alternative spelling for the custom word when it appears in a transcript. Use
|
2113
2306
|
# the parameter when you want the word to have a spelling that is different from its
|
2114
2307
|
# usual representation or from its spelling in corpora training data.
|
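The constraints that the comments above place on a custom word (no spaces in the word, at most five sounds-like pronunciations, at most 40 characters per pronunciation not counting spaces, and no `sounds_like` for next-generation models) can be checked on the client before a request is made. The following is only a sketch under stated assumptions: `build_custom_word` and its `next_generation:` flag are hypothetical names introduced for illustration and are not part of the `ibm_watson` gem.

```ruby
# Hypothetical helper (not part of ibm_watson): builds the fields of a
# custom word while enforcing the limits described in the comments above.
MAX_SOUNDS_LIKE = 5            # "at most five sounds-like pronunciations"
MAX_PRONUNCIATION_CHARS = 40   # "at most 40 characters not including spaces"

def build_custom_word(word:, display_as: nil, sounds_like: nil, next_generation: false)
  # The documentation asks for `-` or `_` instead of spaces in compound words.
  raise ArgumentError, "word must not contain spaces" if word.match?(/\s/)

  fields = { "word" => word }
  fields["display_as"] = display_as unless display_as.nil?

  # Next-generation models do not support sounds_like, so the field is omitted.
  unless next_generation || sounds_like.nil?
    if sounds_like.length > MAX_SOUNDS_LIKE
      raise ArgumentError, "a word can have at most #{MAX_SOUNDS_LIKE} sounds-like pronunciations"
    end
    sounds_like.each do |pronunciation|
      if pronunciation.delete(" ").length > MAX_PRONUNCIATION_CHARS
        raise ArgumentError, "pronunciation is longer than #{MAX_PRONUNCIATION_CHARS} characters"
      end
    end
    fields["sounds_like"] = sounds_like
  end

  fields
end

ieee = build_custom_word(word: "IEEE", sounds_like: ["i triple e"])
```

The returned hash mirrors the `display_as` and `sounds_like` fields described above; how it is passed to the service depends on the method being called.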
@@ -2183,11 +2376,12 @@ module IBMWatson
|
|
2183
2376
|
# Delete a custom word.
|
2184
2377
|
# Deletes a custom word from a custom language model. You can remove any word that
|
2185
2378
|
# you added to the custom model's words resource via any means. However, if the word
|
2186
|
-
# also exists in the service's base vocabulary, the service removes
|
2187
|
-
#
|
2379
|
+
# also exists in the service's base vocabulary, the service removes the word only
|
2380
|
+
# from the words resource; the word remains in the base vocabulary. Removing a
|
2188
2381
|
# custom word does not affect the custom model until you train the model with the
|
2189
|
-
#
|
2190
|
-
# instance of the service that owns a model to delete its words.
|
2382
|
+
# [Train a custom language model](#trainlanguagemodel) method. You must use
|
2383
|
+
# credentials for the instance of the service that owns a model to delete its words.
|
2384
|
+
#
|
2191
2385
|
#
|
2192
2386
|
# **See also:** [Deleting a word from a custom language
|
2193
2387
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageWords#deleteWord).
|
@@ -2228,7 +2422,11 @@ module IBMWatson
|
|
2228
2422
|
# Lists information about all grammars from a custom language model. The information
|
2229
2423
|
# includes the total number of out-of-vocabulary (OOV) words, name, and status of
|
2230
2424
|
# each grammar. You must use credentials for the instance of the service that owns a
|
2231
|
-
# model to list its grammars.
|
2425
|
+
# model to list its grammars. Grammars are available for all languages and models
|
2426
|
+
# that support language customization.
|
2427
|
+
#
|
2428
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2429
|
+
# They are not supported for next-generation models.
|
2232
2430
|
#
|
2233
2431
|
# **See also:** [Listing grammars from a custom language
|
2234
2432
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#listGrammars).
|
@@ -2262,8 +2460,8 @@ module IBMWatson
|
|
2262
2460
|
# UTF-8 format that defines the grammar. Use multiple requests to submit multiple
|
2263
2461
|
# grammar files. You must use credentials for the instance of the service that owns
|
2264
2462
|
# a model to add a grammar to it. Adding a grammar does not affect the custom
|
2265
|
-
# language model until you train the model for the new data by using the
|
2266
|
-
# custom language model
|
2463
|
+
# language model until you train the model for the new data by using the [Train a
|
2464
|
+
# custom language model](#trainlanguagemodel) method.
|
2267
2465
|
#
|
2268
2466
|
# The call returns an HTTP 201 response code if the grammar is valid. The service
|
2269
2467
|
# then asynchronously processes the contents of the grammar and automatically
|
@@ -2271,27 +2469,33 @@ module IBMWatson
|
|
2271
2469
|
# to complete depending on the size and complexity of the grammar, as well as the
|
2272
2470
|
# current load on the service. You cannot submit requests to add additional
|
2273
2471
|
# resources to the custom model or to train the model until the service's analysis
|
2274
|
-
# of the grammar for the current request completes. Use the
|
2275
|
-
# to check the status of the analysis.
|
2472
|
+
# of the grammar for the current request completes. Use the [Get a
|
2473
|
+
# grammar](#getgrammar) method to check the status of the analysis.
|
2276
2474
|
#
|
2277
2475
|
# The service populates the model's words resource with any word that is recognized
|
2278
2476
|
# by the grammar that is not found in the model's base vocabulary. These are
|
2279
|
-
# referred to as out-of-vocabulary (OOV) words. You can use the
|
2280
|
-
# words
|
2281
|
-
# to eliminate typos and modify how words are pronounced as
|
2477
|
+
# referred to as out-of-vocabulary (OOV) words. You can use the [List custom
|
2478
|
+
# words](#listwords) method to examine the words resource and use other
|
2479
|
+
# words-related methods to eliminate typos and modify how words are pronounced as
|
2480
|
+
# needed.
|
2282
2481
|
#
|
2283
2482
|
# To add a grammar that has the same name as an existing grammar, set the
|
2284
2483
|
# `allow_overwrite` parameter to `true`; otherwise, the request fails. Overwriting
|
2285
2484
|
# an existing grammar causes the service to process the grammar file and extract OOV
|
2286
2485
|
# words anew. Before doing so, it removes any OOV words associated with the existing
|
2287
2486
|
# grammar from the model's words resource unless they were also added by another
|
2288
|
-
# resource or they have been modified in some way with the
|
2289
|
-
#
|
2487
|
+
# resource or they have been modified in some way with the [Add custom
|
2488
|
+
# words](#addwords) or [Add a custom word](#addword) method.
|
2290
2489
|
#
|
2291
2490
|
# The service limits the overall amount of data that you can add to a custom model
|
2292
2491
|
# to a maximum of 10 million total words from all sources combined. Also, you can
|
2293
2492
|
# add no more than 90 thousand OOV words to a model. This includes words that the
|
2294
2493
|
# service extracts from corpora and grammars and words that you add directly.
|
2494
|
+
# Grammars are available for all languages and models that support language
|
2495
|
+
# customization.
|
2496
|
+
#
|
2497
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2498
|
+
# They are not supported for next-generation models.
|
2295
2499
|
#
|
2296
2500
|
# **See also:**
|
2297
2501
|
# * [Understanding
|
@@ -2374,7 +2578,11 @@ module IBMWatson
|
|
2374
2578
|
# Gets information about a grammar from a custom language model. The information
|
2375
2579
|
# includes the total number of out-of-vocabulary (OOV) words, name, and status of
|
2376
2580
|
# the grammar. You must use credentials for the instance of the service that owns a
|
2377
|
-
# model to list its grammars.
|
2581
|
+
# model to list its grammars. Grammars are available for all languages and models
|
2582
|
+
# that support language customization.
|
2583
|
+
#
|
2584
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2585
|
+
# They are not supported for next-generation models.
|
2378
2586
|
#
|
2379
2587
|
# **See also:** [Listing grammars from a custom language
|
2380
2588
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#listGrammars).
|
@@ -2410,10 +2618,15 @@ module IBMWatson
|
|
2410
2618
|
# Deletes an existing grammar from a custom language model. The service removes any
|
2411
2619
|
# out-of-vocabulary (OOV) words associated with the grammar from the custom model's
|
2412
2620
|
# words resource unless they were also added by another resource or they were
|
2413
|
-
# modified in some way with the
|
2414
|
-
# method. Removing a grammar does not affect the custom model until
|
2415
|
-
# model with the
|
2416
|
-
# for the instance of the service that owns a model
|
2621
|
+
# modified in some way with the [Add custom words](#addwords) or [Add a custom
|
2622
|
+
# word](#addword) method. Removing a grammar does not affect the custom model until
|
2623
|
+
# you train the model with the [Train a custom language model](#trainlanguagemodel)
|
2624
|
+
# method. You must use credentials for the instance of the service that owns a model
|
2625
|
+
# to delete its grammar. Grammars are available for all languages and models that
|
2626
|
+
# support language customization.
|
2627
|
+
#
|
2628
|
+
# **Note:** Grammars are supported only for use with previous-generation models.
|
2629
|
+
# They are not supported for next-generation models.
|
2417
2630
|
#
|
2418
2631
|
# **See also:** [Deleting a grammar from a custom language
|
2419
2632
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageGrammars#deleteGrammar).
|
@@ -2459,6 +2672,9 @@ module IBMWatson
|
|
2459
2672
|
# do not lose any models, but you cannot create any more until your model count is
|
2460
2673
|
# below the limit.
|
2461
2674
|
#
|
2675
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2676
|
+
# previous-generation models. It is not supported for next-generation models.
|
2677
|
+
#
|
2462
2678
|
# **See also:** [Create a custom acoustic
|
2463
2679
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#createModel-acoustic).
|
2464
2680
|
# @param name [String] A user-defined name for the new custom acoustic model. Use a name that is unique
|
@@ -2468,11 +2684,12 @@ module IBMWatson
|
|
2468
2684
|
# custom model`.
|
2469
2685
|
# @param base_model_name [String] The name of the base language model that is to be customized by the new custom
|
2470
2686
|
# acoustic model. The new custom model can be used only with the base model that it
|
2471
|
-
# customizes.
|
2687
|
+
# customizes. (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
|
2688
|
+
# `ar-MS_BroadbandModel` instead.)
|
2472
2689
|
#
|
2473
2690
|
# To determine whether a base model supports acoustic model customization, refer to
|
2474
2691
|
# [Language support for
|
2475
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
2692
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
2476
2693
|
# @param description [String] A description of the new custom acoustic model. Use a localized description that
|
2477
2694
|
# matches the language of the custom model.
|
2478
2695
|
# @return [IBMCloudSdkCore::DetailedResponse] An `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
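The deprecation notes in this diff (`ar-AR_BroadbandModel` replaced by `ar-MS_BroadbandModel`, and the language identifier `ar-AR` replaced by `ar-MS`) suggest a small translation step for callers that still hold the old names. This is a minimal sketch; the constant and method names are hypothetical and do not come from the gem.

```ruby
# Hypothetical lookup table (not part of ibm_watson) covering only the two
# deprecations that the comments above mention.
DEPRECATED_IDENTIFIERS = {
  "ar-AR" => "ar-MS",
  "ar-AR_BroadbandModel" => "ar-MS_BroadbandModel"
}.freeze

# Returns the replacement for a deprecated identifier, or the input unchanged.
def current_identifier(name)
  DEPRECATED_IDENTIFIERS.fetch(name, name)
end

base_model_name = current_identifier("ar-AR_BroadbandModel")
```

The result could then be supplied as `base_model_name` or `language` in the methods documented here.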
@@ -2513,15 +2730,19 @@ module IBMWatson
|
|
2513
2730
|
# all languages. You must use credentials for the instance of the service that owns
|
2514
2731
|
# a model to list information about it.
|
2515
2732
|
#
|
2733
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2734
|
+
# previous-generation models. It is not supported for next-generation models.
|
2735
|
+
#
|
2516
2736
|
# **See also:** [Listing custom acoustic
|
2517
2737
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
|
2518
2738
|
# @param language [String] The identifier of the language for which custom language or custom acoustic models
|
2519
2739
|
# are to be returned. Omit the parameter to see all custom language or custom
|
2520
|
-
# acoustic models that are owned by the requesting credentials.
|
2740
|
+
# acoustic models that are owned by the requesting credentials. (**Note:** The
|
2741
|
+
# identifier `ar-AR` is deprecated; use `ar-MS` instead.)
|
2521
2742
|
#
|
2522
2743
|
# To determine the languages for which customization is available, see [Language
|
2523
2744
|
# support for
|
2524
|
-
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
2745
|
+
# customization](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-support#custom-language-support).
|
2525
2746
|
# @return [IBMCloudSdkCore::DetailedResponse] An `IBMCloudSdkCore::DetailedResponse` object representing the response.
|
2526
2747
|
def list_acoustic_models(language: nil)
|
2527
2748
|
headers = {
|
@@ -2551,6 +2772,9 @@ module IBMWatson
|
|
2551
2772
|
# Gets information about a specified custom acoustic model. You must use credentials
|
2552
2773
|
# for the instance of the service that owns a model to list information about it.
|
2553
2774
|
#
|
2775
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2776
|
+
# previous-generation models. It is not supported for next-generation models.
|
2777
|
+
#
|
2554
2778
|
# **See also:** [Listing custom acoustic
|
2555
2779
|
# models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
|
2556
2780
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2584,6 +2808,9 @@ module IBMWatson
|
|
2584
2808
|
# processed. You must use credentials for the instance of the service that owns a
|
2585
2809
|
# model to delete it.
|
2586
2810
|
#
|
2811
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2812
|
+
# previous-generation models. It is not supported for next-generation models.
|
2813
|
+
#
|
2587
2814
|
# **See also:** [Deleting a custom acoustic
|
2588
2815
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#deleteModel-acoustic).
|
2589
2816
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2628,14 +2855,14 @@ module IBMWatson
|
|
2628
2855
|
# audio. The method returns an HTTP 200 response code to indicate that the training
|
2629
2856
|
# process has begun.
|
2630
2857
|
#
|
2631
|
-
# You can monitor the status of the training by using the
|
2632
|
-
# model
|
2633
|
-
# minute. The method returns an `AcousticModel` object that
|
2634
|
-
# `progress` fields. A status of `available` indicates that
|
2635
|
-
# trained and ready to use. The service cannot train a model
|
2636
|
-
# another request for the model. The service cannot accept
|
2637
|
-
# requests, or requests to add new audio resources, until the
|
2638
|
-
# request completes.
|
2858
|
+
# You can monitor the status of the training by using the [Get a custom acoustic
|
2859
|
+
# model](#getacousticmodel) method to poll the model's status. Use a loop to check
|
2860
|
+
# the status once a minute. The method returns an `AcousticModel` object that
|
2861
|
+
# includes `status` and `progress` fields. A status of `available` indicates that
|
2862
|
+
# the custom model is trained and ready to use. The service cannot train a model
|
2863
|
+
# while it is handling another request for the model. The service cannot accept
|
2864
|
+
# subsequent training requests, or requests to add new audio resources, until the
|
2865
|
+
# existing training request completes.
|
2639
2866
|
#
|
2640
2867
|
# You can use the optional `custom_language_model_id` parameter to specify the GUID
|
2641
2868
|
# of a separately created custom language model that is to be used during training.
|
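The training comment above recommends polling the model's status in a loop, once a minute, until it becomes `available`. The sketch below shows one way such a loop might look. It is deliberately independent of the gem: `wait_for_status` is a hypothetical helper, and the block it receives stands in for whatever call fetches the current status (for instance, reading the `status` field from the response of the Get a custom acoustic model method).

```ruby
# Hypothetical polling helper (not part of ibm_watson). The block must return
# the current status string; the sleeper is injectable so the loop can be
# exercised without actually waiting.
def wait_for_status(target: "available", interval: 60, max_attempts: 30,
                    sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do |attempt|
    status = yield
    return status if status == target

    # Pause between checks, but not after the final attempt.
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  raise "status did not become '#{target}' after #{max_attempts} checks"
end

# Usage with a stubbed sequence of statuses instead of a live service call.
stub_statuses = %w[training training available]
final_status = wait_for_status(sleeper: ->(_seconds) {}) { stub_statuses.shift }
```

Capping the number of attempts is a design choice of this sketch, not something the service documentation prescribes; it keeps a stalled training request from blocking the caller indefinitely.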
@@ -2646,6 +2873,9 @@ module IBMWatson
|
|
2646
2873
|
# same version of the same base model, and the custom language model must be fully
|
2647
2874
|
# trained and available.
|
2648
2875
|
#
|
2876
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2877
|
+
# previous-generation models. It is not supported for next-generation models.
|
2878
|
+
#
|
2649
2879
|
# **See also:**
|
2650
2880
|
# * [Train the custom acoustic
|
2651
2881
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#trainModel-acoustic)
|
@@ -2717,6 +2947,9 @@ module IBMWatson
|
|
2717
2947
|
# request completes. You must use credentials for the instance of the service that
|
2718
2948
|
# owns a model to reset it.
|
2719
2949
|
#
|
2950
|
+
# **Note:** Acoustic model customization is supported only for use with
|
2951
|
+
# previous-generation models. It is not supported for next-generation models.
|
2952
|
+
#
|
2720
2953
|
# **See also:** [Resetting a custom acoustic
|
2721
2954
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#resetModel-acoustic).
|
2722
2955
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2755,14 +2988,14 @@ module IBMWatson
|
|
2755
2988
|
#
|
2756
2989
|
# The method returns an HTTP 200 response code to indicate that the upgrade process
|
2757
2990
|
# has begun successfully. You can monitor the status of the upgrade by using the
|
2758
|
-
#
|
2759
|
-
# returns an `AcousticModel` object that includes `status` and
|
2760
|
-
# Use a loop to check the status once a minute. While it is being
|
2761
|
-
# custom model has the status `upgrading`. When the upgrade is
|
2762
|
-
# resumes the status that it had prior to upgrade. The service
|
2763
|
-
# model while it is handling another request for the model. The
|
2764
|
-
# accept subsequent requests for the model until the existing upgrade
|
2765
|
-
# completes.
|
2991
|
+
# [Get a custom acoustic model](#getacousticmodel) method to poll the model's
|
2992
|
+
# status. The method returns an `AcousticModel` object that includes `status` and
|
2993
|
+
# `progress` fields. Use a loop to check the status once a minute. While it is being
|
2994
|
+
# upgraded, the custom model has the status `upgrading`. When the upgrade is
|
2995
|
+
# complete, the model resumes the status that it had prior to upgrade. The service
|
2996
|
+
# cannot upgrade a model while it is handling another request for the model. The
|
2997
|
+
# service cannot accept subsequent requests for the model until the existing upgrade
|
2998
|
+
# request completes.
|
2766
2999
|
#
|
2767
3000
|
# If the custom acoustic model was trained with a separately created custom language
|
2768
3001
|
# model, you must use the `custom_language_model_id` parameter to specify the GUID
|
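During an upgrade the model reports the status `upgrading` and afterwards resumes whatever status it had before, so a polling loop for upgrades waits for the status to leave `upgrading` rather than for one fixed value. A minimal sketch of that idea follows; `wait_until_upgraded` is a hypothetical name, and the block stands in for a call that returns the model's current status.

```ruby
# Hypothetical helper (not part of ibm_watson): polls until the status is no
# longer "upgrading" and returns the status the model resumed.
def wait_until_upgraded(interval: 60, max_attempts: 30,
                        sleeper: ->(seconds) { sleep(seconds) })
  max_attempts.times do |attempt|
    status = yield
    return status unless status == "upgrading"

    # Pause between checks, but not after the final attempt.
    sleeper.call(interval) if attempt < max_attempts - 1
  end
  raise "model is still upgrading after #{max_attempts} checks"
end

# Usage with stubbed statuses in place of a live service call.
stub_statuses = %w[upgrading upgrading ready]
resumed_status = wait_until_upgraded(sleeper: ->(_seconds) {}) { stub_statuses.shift }
```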
@@ -2770,8 +3003,11 @@ module IBMWatson
|
|
2770
3003
|
# the custom acoustic model can be upgraded. Omit the parameter if the custom
|
2771
3004
|
# acoustic model was not trained with a custom language model.
|
2772
3005
|
#
|
3006
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3007
|
+
# previous-generation models. It is not supported for next-generation models.
|
3008
|
+
#
|
2773
3009
|
# **See also:** [Upgrading a custom acoustic
|
2774
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
3010
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
|
2775
3011
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
2776
3012
|
# the request. You must make the request with credentials for the instance of the
|
2777
3013
|
# service that owns the custom model.
|
@@ -2785,7 +3021,7 @@ module IBMWatson
|
|
2785
3021
|
# upgrade of a custom acoustic model that is trained with a custom language model,
|
2786
3022
|
# and only if you receive a 400 response code and the message `No input data
|
2787
3023
|
# modified since last training`. See [Upgrading a custom acoustic
|
2788
|
-
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-
|
3024
|
+
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
|
2789
3025
|
# @return [nil]
|
2790
3026
|
def upgrade_acoustic_model(customization_id:, custom_language_model_id: nil, force: nil)
|
2791
3027
|
raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
|
@@ -2825,6 +3061,9 @@ module IBMWatson
|
|
2825
3061
|
# to a request to add it to the custom acoustic model. You must use credentials for
|
2826
3062
|
# the instance of the service that owns a model to list its audio resources.
|
2827
3063
|
#
|
3064
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3065
|
+
# previous-generation models. It is not supported for next-generation models.
|
3066
|
+
#
|
2828
3067
|
# **See also:** [Listing audio resources for a custom acoustic
|
2829
3068
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
|
2830
3069
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
@@ -2857,8 +3096,8 @@ module IBMWatson
|
|
2857
3096
|
# the acoustic characteristics of the audio that you plan to transcribe. You must
|
2858
3097
|
# use credentials for the instance of the service that owns a model to add an audio
|
2859
3098
|
# resource to it. Adding audio data does not affect the custom acoustic model until
|
2860
|
-
# you train the model for the new data by using the
|
2861
|
-
# model
|
3099
|
+
# you train the model for the new data by using the [Train a custom acoustic
|
3100
|
+
# model](#trainacousticmodel) method.
|
2862
3101
|
#
|
2863
3102
|
# You can add individual audio files or an archive file that contains multiple audio
|
2864
3103
|
# files. Adding multiple audio files via a single archive file is significantly more
|
@@ -2883,11 +3122,14 @@ module IBMWatson
|
|
2883
3122
|
# upgrade the model until the service's analysis of all audio resources for current
|
2884
3123
|
# requests completes.
|
2885
3124
|
#
|
2886
|
-
# To determine the status of the service's analysis of the audio, use the
|
2887
|
-
# audio resource
|
2888
|
-
# customization ID of the custom model and the name of the audio
|
2889
|
-
# returns the status of the resource. Use a loop to check the
|
2890
|
-
# every few seconds until it becomes `ok`.
|
3125
|
+
# To determine the status of the service's analysis of the audio, use the [Get an
|
3126
|
+
# audio resource](#getaudio) method to poll the status of the audio. The method
|
3127
|
+
# accepts the customization ID of the custom model and the name of the audio
|
3128
|
+
# resource, and it returns the status of the resource. Use a loop to check the
|
3129
|
+
# status of the audio every few seconds until it becomes `ok`.
|
3130
|
+
#
|
3131
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3132
|
+
# previous-generation models. It is not supported for next-generation models.
|
2891
3133
|
#
|
2892
3134
|
# **See also:** [Add audio to the custom acoustic
|
2893
3135
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acoustic#addAudio).
|
@@ -2923,8 +3165,8 @@ module IBMWatson
|
|
2923
3165
|
# If the sampling rate of the audio is lower than the minimum required rate, the
|
2924
3166
|
# service labels the audio file as `invalid`.
|
2925
3167
|
#
|
2926
|
-
# **See also:** [
|
2927
|
-
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats
|
3168
|
+
# **See also:** [Supported audio
|
3169
|
+
# formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
|
2928
3170
|
#
|
2929
3171
|
#
|
2930
3172
|
# ### Content types for archive-type resources
|
@@ -2982,7 +3224,7 @@ module IBMWatson
|
|
2982
3224
|
# For an archive-type resource, the media type of the archive file. For more
|
2983
3225
|
# information, see **Content types for archive-type resources** in the method
|
2984
3226
|
# description.
|
2985
|
-
# @param contained_content_type [String]
|
3227
|
+
# @param contained_content_type [String] _For an archive-type resource_, specify the format of the audio files that are
|
2986
3228
|
# contained in the archive file if they are of type `audio/alaw`, `audio/basic`,
|
2987
3229
|
# `audio/l16`, or `audio/mulaw`. Include the `rate`, `channels`, and `endianness`
|
2988
3230
|
# parameters where necessary. In this case, all audio files that are contained in
|
@@ -2996,7 +3238,7 @@ module IBMWatson
|
|
2996
3238
|
# speech recognition. For more information, see **Content types for audio-type
|
2997
3239
|
# resources** in the method description.
|
2998
3240
|
#
|
2999
|
-
#
|
3241
|
+
# _For an audio-type resource_, omit the header.
|
3000
3242
|
# @param allow_overwrite [Boolean] If `true`, the specified audio resource overwrites an existing audio resource with
|
3001
3243
|
# the same name. If `false`, the request fails if an audio resource with the same
|
3002
3244
|
# name already exists. The parameter has no effect if an audio resource with the
|
@@ -3041,9 +3283,9 @@ module IBMWatson
|
|
3041
3283
|
# Gets information about an audio resource from a custom acoustic model. The method
|
3042
3284
|
# returns an `AudioListing` object whose fields depend on the type of audio resource
|
3043
3285
|
# that you specify with the method's `audio_name` parameter:
|
3044
|
-
# *
|
3286
|
+
# * _For an audio-type resource_, the object's fields match those of an
|
3045
3287
|
# `AudioResource` object: `duration`, `name`, `details`, and `status`.
|
3046
|
-
# *
|
3288
|
+
# * _For an archive-type resource_, the object includes a `container` field whose
|
3047
3289
|
# fields match those of an `AudioResource` object. It also includes an `audio`
|
3048
3290
|
# field, which contains an array of `AudioResource` objects that provides
|
3049
3291
|
# information about the audio files that are contained in the archive.
|
@@ -3051,14 +3293,17 @@ module IBMWatson
|
|
3051
3293
|
# The information includes the status of the specified audio resource. The status is
|
3052
3294
|
# important for checking the service's analysis of a resource that you add to the
|
3053
3295
|
# custom model.
|
3054
|
-
# *
|
3055
|
-
# object.
|
3056
|
-
# *
|
3296
|
+
# * _For an audio-type resource_, the `status` field is located in the
|
3297
|
+
# `AudioListing` object.
|
3298
|
+
# * _For an archive-type resource_, the `status` field is located in the
|
3057
3299
|
# `AudioResource` object that is returned in the `container` field.
|
3058
3300
|
#
|
3059
3301
|
# You must use credentials for the instance of the service that owns a model to list
|
3060
3302
|
# its audio resources.
|
3061
3303
|
#
|
3304
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3305
|
+
# previous-generation models. It is not supported for next-generation models.
|
3306
|
+
#
|
3062
3307
|
# **See also:** [Listing audio resources for a custom acoustic
|
3063
3308
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#listAudio).
|
3064
3309
|
# @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
|
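Because the location of the `status` field differs between audio-type and archive-type resources, code that checks a resource's status has to look in two places. The sketch below assumes the parsed response body is a Ruby `Hash` with string keys shaped like the `AudioListing` object described above; that representation, and the helper name `audio_resource_status`, are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of ibm_watson): returns the status of an
# audio resource from a Hash shaped like the AudioListing described above.
def audio_resource_status(listing)
  container = listing["container"]
  # Archive-type resources carry the status inside the `container` field;
  # audio-type resources carry it at the top level of the listing.
  container ? container["status"] : listing["status"]
end

audio_listing = { "name" => "audio1", "duration" => 131, "status" => "ok" }
archive_listing = {
  "container" => { "name" => "archive1", "status" => "being_processed" },
  "audio" => [{ "name" => "clip1.wav", "status" => "ok" }]
}

audio_status = audio_resource_status(audio_listing)
archive_status = audio_resource_status(archive_listing)
```

The sample hashes are made-up data for the sketch, not output captured from the service.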
@@ -3095,10 +3340,14 @@ module IBMWatson
|
|
3095
3340
|
# not allow deletion of individual files from an archive resource.
|
3096
3341
|
#
|
3097
3342
|
# Removing an audio resource does not affect the custom model until you train the
|
3098
|
-
# model on its updated data by using the
|
3099
|
-
# You can delete an existing audio resource from
|
3100
|
-
# is being added to the model. You must use
|
3101
|
-
# service that owns a model to delete its audio
|
3343
|
+
# model on its updated data by using the [Train a custom acoustic
|
3344
|
+
# model](#trainacousticmodel) method. You can delete an existing audio resource from
|
3345
|
+
# a model while a different resource is being added to the model. You must use
|
3346
|
+
# credentials for the instance of the service that owns a model to delete its audio
|
3347
|
+
# resources.
|
3348
|
+
#
|
3349
|
+
# **Note:** Acoustic model customization is supported only for use with
|
3350
|
+
# previous-generation models. It is not supported for next-generation models.
|
3102
3351
|
#
|
3103
3352
|
# **See also:** [Deleting an audio resource from a custom acoustic
|
3104
3353
|
# model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAudio#deleteAudio).
|