ibm_watson 2.0.2 → 2.1.0
@@ -14,17 +14,20 @@
  # See the License for the specific language governing permissions and
  # limitations under the License.
  #
- # IBM OpenAPI SDK Code Generator Version: 3.19.0-be3b4618-20201113-200858
+ # IBM OpenAPI SDK Code Generator Version: 3.31.0-902c9336-20210504-161156
  #
- # IBM® will begin sunsetting IBM Watson™ Personality Insights on 1 December
- # 2020. For a period of one year from this date, you will still be able to use Watson
- # Personality Insights. However, as of 1 December 2021, the offering will no longer be
- # available.<br/><br/>As an alternative, we encourage you to consider migrating to IBM
- # Watson&trade; Natural Language Understanding, a service on IBM Cloud&reg; that uses deep
- # learning to extract data and insights from text such as keywords, categories, sentiment,
- # emotion, and syntax to provide insights for your business or industry. For more
- # information, see [About Natural Language
- # Understanding](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-about).
+ # IBM Watson&trade; Personality Insights is discontinued. Existing instances are
+ # supported until 1 December 2021, but as of 1 December 2020, you cannot create new
+ # instances. Any instance that exists on 1 December 2021 will be deleted.<br/><br/>No
+ # direct replacement exists for Personality Insights. However, you can consider using [IBM
+ # Watson&trade; Natural Language
+ # Understanding](https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-about)
+ # on IBM Cloud&reg; as part of a replacement analytic workflow for your Personality
+ # Insights use cases. You can use Natural Language Understanding to extract data and
+ # insights from text, such as keywords, categories, sentiment, emotion, and syntax. For
+ # more information about the personality models in Personality Insights, see [The science
+ # behind the
+ # service](https://cloud.ibm.com/docs/personality-insights?topic=personality-insights-science).
  # {: deprecated}
  #
  # The IBM Watson Personality Insights service enables applications to derive insights from
@@ -54,7 +57,6 @@ require "json"
  require "ibm_cloud_sdk_core"
  require_relative "./common.rb"

- # Module for the Watson APIs
  module IBMWatson
  ##
  # The Personality Insights V3 service.
@@ -14,14 +14,21 @@
  # See the License for the specific language governing permissions and
  # limitations under the License.
  #
- # IBM OpenAPI SDK Code Generator Version: 3.19.0-be3b4618-20201113-200858
+ # IBM OpenAPI SDK Code Generator Version: 3.31.0-902c9336-20210504-161156
  #
  # The IBM Watson&trade; Speech to Text service provides APIs that use IBM's
  # speech-recognition capabilities to produce transcripts of spoken audio. The service can
  # transcribe speech from various languages and audio formats. In addition to basic
  # transcription, the service can produce detailed information about many different aspects
- # of the audio. For most languages, the service supports two sampling rates, broadband and
- # narrowband. It returns all JSON response content in the UTF-8 character set.
+ # of the audio. It returns all JSON response content in the UTF-8 character set.
+ #
+ # The service supports two types of models: previous-generation models that include the
+ # terms `Broadband` and `Narrowband` in their names, and beta next-generation models that
+ # include the terms `Multimedia` and `Telephony` in their names. Broadband and multimedia
+ # models have minimum sampling rates of 16 kHz. Narrowband and telephony models have
+ # minimum sampling rates of 8 kHz. The beta next-generation models currently support fewer
+ # languages and features, but they offer high throughput and greater transcription
+ # accuracy.
  #
  # For speech recognition, the service supports synchronous and asynchronous HTTP
  # Representational State Transfer (REST) interfaces. It also supports a WebSocket
@@ -37,8 +44,9 @@
  # can recognize.
  #
  # Language model customization and acoustic model customization are generally available
- # for production use with all language models that are generally available. Grammars are
- # beta functionality for all language models that support language model customization.
+ # for production use with all previous-generation models that are generally available.
+ # Grammars are beta functionality for all previous-generation models that support language
+ # model customization. Next-generation models do not support customization at this time.

  require "concurrent"
  require "erb"
@@ -46,7 +54,6 @@ require "json"
  require "ibm_cloud_sdk_core"
  require_relative "./common.rb"

- # Module for the Watson APIs
  module IBMWatson
  ##
  # The Speech to Text V1 service.
@@ -89,8 +96,8 @@ module IBMWatson
  # among other things. The ordering of the list of models can change from call to
  # call; do not rely on an alphabetized or static list of models.
  #
- # **See also:** [Languages and
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models#models).
+ # **See also:** [Listing
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
  def list_models
  headers = {
@@ -116,10 +123,11 @@ module IBMWatson
  # with the service. The information includes the name of the model and its minimum
  # sampling rate in Hertz, among other things.
  #
- # **See also:** [Languages and
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models#models).
+ # **See also:** [Listing
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-list).
  # @param model_id [String] The identifier of the model in the form of its name from the output of the **Get a
- # model** method.
+ # model** method. (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
+ # `ar-MS_BroadbandModel` instead.)
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
  def get_model(model_id:)
  raise ArgumentError.new("model_id must be provided") if model_id.nil?
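The deprecation note above suggests resolving model names before calling `get_model`. Here is a minimal standalone sketch of that pre-flight check; the `DEPRECATED_MODELS` map and `resolve_model_id` helper are illustrative, not part of the gem:

```ruby
# Deprecated model IDs and their documented replacements (per the 2.1.0 docs).
DEPRECATED_MODELS = { "ar-AR_BroadbandModel" => "ar-MS_BroadbandModel" }.freeze

# Resolve a possibly-deprecated model ID before passing it to get_model,
# mirroring the SDK's own guard clause for a missing model_id.
def resolve_model_id(model_id)
  raise ArgumentError, "model_id must be provided" if model_id.nil?

  DEPRECATED_MODELS.fetch(model_id, model_id)
end

resolve_model_id("ar-AR_BroadbandModel") # => "ar-MS_BroadbandModel"
```

With the 2.1.0 gem you would then call something like `speech_to_text.get_model(model_id: resolve_model_id(name))`, assuming `speech_to_text` is a configured `IBMWatson::SpeechToTextV1` instance.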
@@ -144,7 +152,7 @@ module IBMWatson
  #########################

  ##
- # @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
+ # @!method recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
  # Recognize audio.
  # Sends audio and returns transcription results for a recognition request. You can
  # pass a maximum of 100 MB and a minimum of 100 bytes of audio with a request. The
@@ -211,8 +219,40 @@ module IBMWatson
  # sampling rate of the audio is lower than the minimum required rate, the request
  # fails.
  #
- # **See also:** [Audio
- # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats#audio-formats).
+ # **See also:** [Supported audio
+ # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
+ #
+ #
+ # ### Next-generation models
+ #
+ # **Note:** The next-generation language models are beta functionality. They
+ # support a limited number of languages and features at this time. The supported
+ # languages, models, and features will increase with future releases.
+ #
+ # The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz)
+ # models for many languages. Next-generation models have higher throughput than the
+ # service's previous generation of `Broadband` and `Narrowband` models. When you use
+ # next-generation models, the service can return transcriptions more quickly and
+ # also provide noticeably better transcription accuracy.
+ #
+ # You specify a next-generation model by using the `model` query parameter, as you
+ # do a previous-generation model. Next-generation models support the same request
+ # headers as previous-generation models, but they support only the following
+ # additional query parameters:
+ # * `background_audio_suppression`
+ # * `inactivity_timeout`
+ # * `profanity_filter`
+ # * `redaction`
+ # * `smart_formatting`
+ # * `speaker_labels`
+ # * `speech_detector_sensitivity`
+ # * `timestamps`
+ #
+ # Many next-generation models also support the beta `low_latency` parameter, which
+ # is not available with previous-generation models.
+ #
+ # **See also:** [Next-generation languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
  #
  #
  # ### Multipart speech recognition
@@ -235,15 +275,19 @@ module IBMWatson
  # @param audio [File] The audio to transcribe.
  # @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
  # audio format, see **Audio formats (content types)** in the method description.
- # @param model [String] The identifier of the model that is to be used for the recognition request. See
- # [Languages and
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models#models).
+ # @param model [String] The identifier of the model that is to be used for the recognition request.
+ # (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
+ # `ar-MS_BroadbandModel` instead.) See [Languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
+ # [Next-generation languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
  # @param language_customization_id [String] The customization ID (GUID) of a custom language model that is to be used with the
  # recognition request. The base model of the specified custom language model must
  # match the model specified with the `model` parameter. You must make the request
  # with credentials for the instance of the service that owns the custom model. By
- # default, no custom language model is used. See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # default, no custom language model is used. See [Using a custom language model for
+ # speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
  #
  #
  # **Note:** Use this parameter instead of the deprecated `customization_id`
@@ -252,14 +296,16 @@ module IBMWatson
  # recognition request. The base model of the specified custom acoustic model must
  # match the model specified with the `model` parameter. You must make the request
  # with credentials for the instance of the service that owns the custom model. By
- # default, no custom acoustic model is used. See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # default, no custom acoustic model is used. See [Using a custom acoustic model for
+ # speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
  # @param base_model_version [String] The version of the specified base model that is to be used with the recognition
  # request. Multiple versions of a base model can exist when a model is updated for
  # internal improvements. The parameter is intended primarily for use with custom
  # models that have been upgraded for a new base model. The default value depends on
- # whether the parameter is used with or without a custom model. See [Base model
- # version](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#version).
+ # whether the parameter is used with or without a custom model. See [Making speech
+ # recognition requests with upgraded custom
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
  # @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
  # recognition request, the customization weight tells the service how much weight to
  # give to words from the custom language model compared to those from the base model
@@ -276,8 +322,8 @@ module IBMWatson
  # custom model's domain, but it can negatively affect performance on non-domain
  # phrases.
  #
- # See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # See [Using customization
+ # weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
  # @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
  # streaming audio, the connection is closed with a 400 error. The parameter is
  # useful for stopping audio submission from a live microphone when a user simply
@@ -294,34 +340,34 @@ module IBMWatson
  # for double-byte languages might be shorter. Keywords are case-insensitive.
  #
  # See [Keyword
- # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#keyword_spotting).
+ # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
  # @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
  # considered to match a keyword if its confidence is greater than or equal to the
  # threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
  # you must also specify one or more keywords. The service performs no keyword
  # spotting if you omit either parameter. See [Keyword
- # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#keyword_spotting).
+ # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
  # @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
  # default, the service returns a single transcript. If you specify a value of `0`,
  # the service uses the default value, `1`. See [Maximum
- # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#max_alternatives).
+ # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
  # @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
  # possible word alternative (also known as "Confusion Networks"). An alternative
  # word is considered if its confidence is greater than or equal to the threshold.
  # Specify a probability between 0.0 and 1.0. By default, the service computes no
  # alternative words. See [Word
- # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_alternatives).
+ # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
  # @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
  # each word. By default, the service returns no word confidence scores. See [Word
- # confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_confidence).
+ # confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
  # @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
  # timestamps are returned. See [Word
- # timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_timestamps).
+ # timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
  # @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
  # results by replacing inappropriate words with a series of asterisks. Set the
  # parameter to `false` to return results with no censoring. Applies to US English
- # transcription only. See [Profanity
- # filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#profanity_filter).
+ # and Japanese transcription only. See [Profanity
+ # filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
  # @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
  # numbers, currency values, and internet addresses into more readable, conventional
  # representations in the final transcript of a recognition request. For US English,
@@ -331,19 +377,21 @@ module IBMWatson
  # **Note:** Applies to US English, Japanese, and Spanish transcription only.
  #
  # See [Smart
- # formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#smart_formatting).
+ # formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
  # @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
  # which participants in a multi-person exchange. By default, the service returns no
  # speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
  # parameter to be `true`, regardless of whether you specify `false` for the
  # parameter.
- #
- # **Note:** Applies to US English, Australian English, German, Japanese, Korean, and
- # Spanish (both broadband and narrowband models) and UK English (narrowband model)
- # transcription only.
- #
- # See [Speaker
- # labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#speaker_labels).
+ # * For previous-generation models, can be used for US English, Australian English,
+ # German, Japanese, Korean, and Spanish (both broadband and narrowband models) and
+ # UK English (narrowband model) transcription only.
+ # * For next-generation models, can be used for English (Australian, UK, and US),
+ # German, and Spanish transcription only.
+ #
+ # Restrictions and limitations apply to the use of speaker labels for both types of
+ # models. See [Speaker
+ # labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
  # @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
  # customization ID (GUID) of a custom language model that is to be used with the
  # recognition request. Do not specify both parameters with a request.
@@ -352,7 +400,8 @@ module IBMWatson
  # specify the name of the custom language model for which the grammar is defined.
  # The service recognizes only strings that are recognized by the specified grammar;
  # it does not recognize other custom words from the model's words resource. See
- # [Grammars](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#grammars-input).
+ # [Using a grammar for speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
  # @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
  # feature redacts any number that has three or more consecutive digits by replacing
  # each digit with an `X` character. It is intended to redact sensitive numeric data,
@@ -367,13 +416,13 @@ module IBMWatson
  # **Note:** Applies to US English, Japanese, and Korean transcription only.
  #
  # See [Numeric
- # redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#redaction).
+ # redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
  # @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
  # input audio. The service returns audio metrics with the final transcription
  # results. By default, the service returns no audio metrics.
  #
  # See [Audio
- # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio_metrics).
+ # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
  # @param end_of_phrase_silence_time [Float] If `true`, specifies the duration of the pause interval at which the service
  # splits a transcript into multiple final results. If the service detects pauses or
  # extended silence before it reaches the end of the audio stream, its response can
@@ -390,7 +439,7 @@ module IBMWatson
  # Chinese is 0.6 seconds.
  #
  # See [End of phrase silence
- # time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#silence_time).
+ # time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
  # @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
  # based on semantic features of the input, for example, at the conclusion of
  # meaningful phrases such as sentences. The service bases its understanding of
@@ -400,7 +449,7 @@ module IBMWatson
  # interval.
  #
  # See [Split transcript at phrase
- # end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#split_transcript).
+ # end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
  # @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
  # the parameter to suppress word insertions from music, coughing, and other
  # non-speech events. The service biases the audio it passes for speech recognition
@@ -412,8 +461,8 @@ module IBMWatson
  # * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
  # * 1.0 suppresses no audio (speech detection sensitivity is disabled).
  #
- # The values increase on a monotonic curve. See [Speech Activity
- # Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
+ # The values increase on a monotonic curve. See [Speech detector
+ # sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
  # @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
  # to prevent it from being transcribed as speech. Use the parameter to suppress side
  # conversations or background noise.
@@ -424,10 +473,27 @@ module IBMWatson
  # * 0.5 provides a reasonable level of audio suppression for general usage.
  # * 1.0 suppresses all audio (no audio is transcribed).
  #
- # The values increase on a monotonic curve. See [Speech Activity
- # Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
+ # The values increase on a monotonic curve. See [Background audio
+ # suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
+ # @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
+ # latency, directs the service to produce results even more quickly than it usually
+ # does. Next-generation models produce transcription results faster than
+ # previous-generation models. The `low_latency` parameter causes the models to
+ # produce results even more quickly, though the results might be less accurate when
+ # the parameter is used.
+ #
+ # **Note:** The parameter is beta functionality. It is not available for
+ # previous-generation `Broadband` and `Narrowband` models. It is available only for
+ # some next-generation models.
+ #
+ # * For a list of next-generation models that support low latency, see [Supported
+ # language
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported)
+ # for next-generation models.
+ # * For more information about the `low_latency` parameter, see [Low
+ # latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
- def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
+ def recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
  raise ArgumentError.new("audio must be provided") if audio.nil?

  headers = {
@@ -460,7 +526,8 @@ module IBMWatson
  "end_of_phrase_silence_time" => end_of_phrase_silence_time,
  "split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
  "speech_detector_sensitivity" => speech_detector_sensitivity,
- "background_audio_suppression" => background_audio_suppression
+ "background_audio_suppression" => background_audio_suppression,
+ "low_latency" => low_latency
  }

  data = audio
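The new `low_latency` entry joins the query-parameter hash and, like every other optional parameter, is dropped when it is `nil`, so the service only receives it when the caller sets it explicitly. A standalone sketch of that assembly pattern; `build_recognize_params` is an illustrative helper mirroring the SDK's nil-deletion step, not an SDK method:

```ruby
# Build the query params the way the SDK does: collect every keyword
# argument into a hash, then drop the entries that were never set.
def build_recognize_params(model: nil, low_latency: nil, speaker_labels: nil)
  params = {
    "model" => model,
    "low_latency" => low_latency,
    "speaker_labels" => speaker_labels
  }
  # Unset (nil) parameters are removed so they are not sent to the service.
  params.delete_if { |_, v| v.nil? }
end

# low_latency appears in the request only when explicitly requested.
build_recognize_params(model: "en-US_Telephony", low_latency: true)
```

Note that `low_latency` is only meaningful with next-generation models that support it; the service rejects it for previous-generation `Broadband` and `Narrowband` models.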
@@ -479,7 +546,7 @@ module IBMWatson
  end

  ##
- # @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
+ # @!method recognize_using_websocket(content_type: nil,recognize_callback:,audio: nil,chunk_data: false,model: nil,customization_id: nil,acoustic_customization_id: nil,customization_weight: nil,base_model_version: nil,inactivity_timeout: nil,interim_results: nil,keywords: nil,keywords_threshold: nil,max_alternatives: nil,word_alternatives_threshold: nil,word_confidence: nil,timestamps: nil,profanity_filter: nil,smart_formatting: nil,speaker_labels: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
  # Sends audio for speech recognition using web sockets.
  # @param content_type [String] The type of the input: audio/basic, audio/flac, audio/l16, audio/mp3, audio/mpeg, audio/mulaw, audio/ogg, audio/ogg;codecs=opus, audio/ogg;codecs=vorbis, audio/wav, audio/webm, audio/webm;codecs=opus, audio/webm;codecs=vorbis, or multipart/form-data.
  # @param recognize_callback [RecognizeCallback] The instance handling events returned from the service.
@@ -596,6 +663,23 @@ module IBMWatson
  #
  # The values increase on a monotonic curve. See [Speech Activity
  # Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
+ # @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
+ # latency, directs the service to produce results even more quickly than it usually
+ # does. Next-generation models produce transcription results faster than
+ # previous-generation models. The `low_latency` parameter causes the models to
+ # produce results even more quickly, though the results might be less accurate when
+ # the parameter is used.
+ #
+ # **Note:** The parameter is beta functionality. It is not available for
+ # previous-generation `Broadband` and `Narrowband` models. It is available only for
+ # some next-generation models.
+ #
+ # * For a list of next-generation models that support low latency, see [Supported
+ # language
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported)
+ # for next-generation models.
+ # * For more information about the `low_latency` parameter, see [Low
+ # latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
  def recognize_using_websocket(
  content_type: nil,
@@ -627,7 +711,8 @@ module IBMWatson
627
711
  end_of_phrase_silence_time: nil,
628
712
  split_transcript_at_phrase_end: nil,
629
713
  speech_detector_sensitivity: nil,
630
- background_audio_suppression: nil
714
+ background_audio_suppression: nil,
715
+ low_latency: nil
631
716
  )
632
717
  raise ArgumentError("Audio must be provided") if audio.nil? && !chunk_data
633
718
  raise ArgumentError("Recognize callback must be provided") if recognize_callback.nil?
@@ -669,7 +754,8 @@ module IBMWatson
669
754
  "end_of_phrase_silence_time" => end_of_phrase_silence_time,
670
755
  "split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
671
756
  "speech_detector_sensitivity" => speech_detector_sensitivity,
672
- "background_audio_suppression" => background_audio_suppression
757
+ "background_audio_suppression" => background_audio_suppression,
758
+ "low_latency" => low_latency
673
759
  }
674
760
  options.delete_if { |_, v| v.nil? }
675
761
  WebSocketClient.new(audio: audio, chunk_data: chunk_data, options: options, recognize_callback: recognize_callback, service_url: service_url, headers: headers, disable_ssl_verification: @disable_ssl_verification)
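The `options.delete_if` step above is what lets the new `low_latency` keyword default to `nil` without changing the request for existing callers. A minimal sketch of that pruning behavior in plain Ruby (no service connection or SDK objects assumed):

```ruby
# Sketch of the nil-pruning step shown above: keyword arguments left at
# their nil defaults (such as low_latency when unset) are removed from the
# options hash, so only explicitly set query parameters reach the service.
options = {
  "model" => "en-US_Telephony",
  "low_latency" => true,
  "background_audio_suppression" => nil,
  "speech_detector_sensitivity" => nil
}
options.delete_if { |_, v| v.nil? }
puts options.inspect
# only "model" and "low_latency" remain
```

Because only non-nil entries survive, a request that never mentions `low_latency` is byte-for-byte identical to one built with the previous SDK version.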
@@ -787,7 +873,7 @@ module IBMWatson
  end

  ##
- # @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
+ # @!method create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
  # Create a job.
  # Creates a job for a new asynchronous recognition request. The job is owned by the
  # instance of the service whose credentials are used to create it. How you learn the
@@ -883,14 +969,49 @@ module IBMWatson
  # sampling rate of the audio is lower than the minimum required rate, the request
  # fails.
  #
- # **See also:** [Audio
- # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats#audio-formats).
+ # **See also:** [Supported audio
+ # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
+ #
+ #
+ # ### Next-generation models
+ #
+ # **Note:** The next-generation language models are beta functionality. They
+ # support a limited number of languages and features at this time. The supported
+ # languages, models, and features will increase with future releases.
+ #
+ # The service supports next-generation `Multimedia` (16 kHz) and `Telephony` (8 kHz)
+ # models for many languages. Next-generation models have higher throughput than the
+ # service's previous generation of `Broadband` and `Narrowband` models. When you use
+ # next-generation models, the service can return transcriptions more quickly and
+ # also provide noticeably better transcription accuracy.
+ #
+ # You specify a next-generation model by using the `model` query parameter, as you
+ # do a previous-generation model. Next-generation models support the same request
+ # headers as previous-generation models, but they support only the following
+ # additional query parameters:
+ # * `background_audio_suppression`
+ # * `inactivity_timeout`
+ # * `profanity_filter`
+ # * `redaction`
+ # * `smart_formatting`
+ # * `speaker_labels`
+ # * `speech_detector_sensitivity`
+ # * `timestamps`
+ #
+ # Many next-generation models also support the beta `low_latency` parameter, which
+ # is not available with previous-generation models.
+ #
+ # **See also:** [Next-generation languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
  # @param audio [File] The audio to transcribe.
  # @param content_type [String] The format (MIME type) of the audio. For more information about specifying an
  # audio format, see **Audio formats (content types)** in the method description.
- # @param model [String] The identifier of the model that is to be used for the recognition request. See
- # [Languages and
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models#models).
+ # @param model [String] The identifier of the model that is to be used for the recognition request.
+ # (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
+ # `ar-MS_BroadbandModel` instead.) See [Languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models) and
+ # [Next-generation languages and
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng).
  # @param callback_url [String] A URL to which callback notifications are to be sent. The URL must already be
  # successfully allowlisted by using the **Register a callback** method. You can
  # include the same callback URL with any number of job creation requests. Omit the
@@ -929,8 +1050,9 @@ module IBMWatson
  # recognition request. The base model of the specified custom language model must
  # match the model specified with the `model` parameter. You must make the request
  # with credentials for the instance of the service that owns the custom model. By
- # default, no custom language model is used. See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # default, no custom language model is used. See [Using a custom language model for
+ # speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse).
  #
  #
  # **Note:** Use this parameter instead of the deprecated `customization_id`
@@ -939,14 +1061,16 @@ module IBMWatson
  # recognition request. The base model of the specified custom acoustic model must
  # match the model specified with the `model` parameter. You must make the request
  # with credentials for the instance of the service that owns the custom model. By
- # default, no custom acoustic model is used. See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # default, no custom acoustic model is used. See [Using a custom acoustic model for
+ # speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-acousticUse).
  # @param base_model_version [String] The version of the specified base model that is to be used with the recognition
  # request. Multiple versions of a base model can exist when a model is updated for
  # internal improvements. The parameter is intended primarily for use with custom
  # models that have been upgraded for a new base model. The default value depends on
- # whether the parameter is used with or without a custom model. See [Base model
- # version](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#version).
+ # whether the parameter is used with or without a custom model. See [Making speech
+ # recognition requests with upgraded custom
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade-use#custom-upgrade-use-recognition).
  # @param customization_weight [Float] If you specify the customization ID (GUID) of a custom language model with the
  # recognition request, the customization weight tells the service how much weight to
  # give to words from the custom language model compared to those from the base model
@@ -963,8 +1087,8 @@ module IBMWatson
  # custom model's domain, but it can negatively affect performance on non-domain
  # phrases.
  #
- # See [Custom
- # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#custom-input).
+ # See [Using customization
+ # weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
  # @param inactivity_timeout [Fixnum] The time in seconds after which, if only silence (no speech) is detected in
  # streaming audio, the connection is closed with a 400 error. The parameter is
  # useful for stopping audio submission from a live microphone when a user simply
@@ -981,34 +1105,34 @@ module IBMWatson
  # for double-byte languages might be shorter. Keywords are case-insensitive.
  #
  # See [Keyword
- # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#keyword_spotting).
+ # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
  # @param keywords_threshold [Float] A confidence value that is the lower bound for spotting a keyword. A word is
  # considered to match a keyword if its confidence is greater than or equal to the
  # threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold,
  # you must also specify one or more keywords. The service performs no keyword
  # spotting if you omit either parameter. See [Keyword
- # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#keyword_spotting).
+ # spotting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#keyword-spotting).
  # @param max_alternatives [Fixnum] The maximum number of alternative transcripts that the service is to return. By
  # default, the service returns a single transcript. If you specify a value of `0`,
  # the service uses the default value, `1`. See [Maximum
- # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#max_alternatives).
+ # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#max-alternatives).
  # @param word_alternatives_threshold [Float] A confidence value that is the lower bound for identifying a hypothesis as a
  # possible word alternative (also known as "Confusion Networks"). An alternative
  # word is considered if its confidence is greater than or equal to the threshold.
  # Specify a probability between 0.0 and 1.0. By default, the service computes no
  # alternative words. See [Word
- # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_alternatives).
+ # alternatives](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-spotting#word-alternatives).
  # @param word_confidence [Boolean] If `true`, the service returns a confidence measure in the range of 0.0 to 1.0 for
  # each word. By default, the service returns no word confidence scores. See [Word
- # confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_confidence).
+ # confidence](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-confidence).
  # @param timestamps [Boolean] If `true`, the service returns time alignment for each word. By default, no
  # timestamps are returned. See [Word
- # timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#word_timestamps).
+ # timestamps](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metadata#word-timestamps).
  # @param profanity_filter [Boolean] If `true`, the service filters profanity from all output except for keyword
  # results by replacing inappropriate words with a series of asterisks. Set the
  # parameter to `false` to return results with no censoring. Applies to US English
- # transcription only. See [Profanity
- # filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#profanity_filter).
+ # and Japanese transcription only. See [Profanity
+ # filtering](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#profanity-filtering).
  # @param smart_formatting [Boolean] If `true`, the service converts dates, times, series of digits and numbers, phone
  # numbers, currency values, and internet addresses into more readable, conventional
  # representations in the final transcript of a recognition request. For US English,
@@ -1018,19 +1142,21 @@ module IBMWatson
  # **Note:** Applies to US English, Japanese, and Spanish transcription only.
  #
  # See [Smart
- # formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#smart_formatting).
+ # formatting](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#smart-formatting).
  # @param speaker_labels [Boolean] If `true`, the response includes labels that identify which words were spoken by
  # which participants in a multi-person exchange. By default, the service returns no
  # speaker labels. Setting `speaker_labels` to `true` forces the `timestamps`
  # parameter to be `true`, regardless of whether you specify `false` for the
  # parameter.
- #
- # **Note:** Applies to US English, Australian English, German, Japanese, Korean, and
- # Spanish (both broadband and narrowband models) and UK English (narrowband model)
- # transcription only.
- #
- # See [Speaker
- # labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#speaker_labels).
+ # * For previous-generation models, can be used for US English, Australian English,
+ # German, Japanese, Korean, and Spanish (both broadband and narrowband models) and
+ # UK English (narrowband model) transcription only.
+ # * For next-generation models, can be used for English (Australian, UK, and US),
+ # German, and Spanish transcription only.
+ #
+ # Restrictions and limitations apply to the use of speaker labels for both types of
+ # models. See [Speaker
+ # labels](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-speaker-labels).
  # @param customization_id [String] **Deprecated.** Use the `language_customization_id` parameter to specify the
  # customization ID (GUID) of a custom language model that is to be used with the
  # recognition request. Do not specify both parameters with a request.
@@ -1039,7 +1165,8 @@ module IBMWatson
  # specify the name of the custom language model for which the grammar is defined.
  # The service recognizes only strings that are recognized by the specified grammar;
  # it does not recognize other custom words from the model's words resource. See
- # [Grammars](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#grammars-input).
+ # [Using a grammar for speech
+ # recognition](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-grammarUse).
  # @param redaction [Boolean] If `true`, the service redacts, or masks, numeric data from final transcripts. The
  # feature redacts any number that has three or more consecutive digits by replacing
  # each digit with an `X` character. It is intended to redact sensitive numeric data,
@@ -1054,7 +1181,7 @@ module IBMWatson
  # **Note:** Applies to US English, Japanese, and Korean transcription only.
  #
  # See [Numeric
- # redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#redaction).
+ # redaction](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-formatting#numeric-redaction).
  # @param processing_metrics [Boolean] If `true`, requests processing metrics about the service's transcription of the
  # input audio. The service returns processing metrics at the interval specified by
  # the `processing_metrics_interval` parameter. It also returns processing metrics
@@ -1062,7 +1189,7 @@ module IBMWatson
  # the service returns no processing metrics.
  #
  # See [Processing
- # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing_metrics).
+ # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
  # @param processing_metrics_interval [Float] Specifies the interval in real wall-clock seconds at which the service is to
  # return processing metrics. The parameter is ignored unless the
  # `processing_metrics` parameter is set to `true`.
@@ -1076,13 +1203,13 @@ module IBMWatson
  # the service returns processing metrics only for transcription events.
  #
  # See [Processing
- # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing_metrics).
+ # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#processing-metrics).
  # @param audio_metrics [Boolean] If `true`, requests detailed information about the signal characteristics of the
  # input audio. The service returns audio metrics with the final transcription
  # results. By default, the service returns no audio metrics.
  #
  # See [Audio
- # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio_metrics).
+ # metrics](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-metrics#audio-metrics).
  # @param end_of_phrase_silence_time [Float] Specifies the duration of the pause interval at which the service
  # splits a transcript into multiple final results. If the service detects pauses or
  # extended silence before it reaches the end of the audio stream, its response can
@@ -1099,7 +1226,7 @@ module IBMWatson
  # Chinese is 0.6 seconds.
  #
  # See [End of phrase silence
- # time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#silence_time).
+ # time](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#silence-time).
  # @param split_transcript_at_phrase_end [Boolean] If `true`, directs the service to split the transcript into multiple final results
  # based on semantic features of the input, for example, at the conclusion of
  # meaningful phrases such as sentences. The service bases its understanding of
@@ -1109,7 +1236,7 @@ module IBMWatson
  # interval.
  #
  # See [Split transcript at phrase
- # end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-output#split_transcript).
+ # end](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-parsing#split-transcript).
  # @param speech_detector_sensitivity [Float] The sensitivity of speech activity detection that the service is to perform. Use
  # the parameter to suppress word insertions from music, coughing, and other
  # non-speech events. The service biases the audio it passes for speech recognition
@@ -1121,8 +1248,8 @@ module IBMWatson
  # * 0.5 (the default) provides a reasonable compromise for the level of sensitivity.
  # * 1.0 suppresses no audio (speech detection sensitivity is disabled).
  #
- # The values increase on a monotonic curve. See [Speech Activity
- # Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
+ # The values increase on a monotonic curve. See [Speech detector
+ # sensitivity](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-sensitivity).
  # @param background_audio_suppression [Float] The level to which the service is to suppress background audio based on its volume
  # to prevent it from being transcribed as speech. Use the parameter to suppress side
  # conversations or background noise.
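Both `speech_detector_sensitivity` and `background_audio_suppression` take floats on the 0.0 to 1.0 scales described above. A hypothetical caller-side pre-flight check (not part of the SDK; the service performs its own validation) could look like:

```ruby
# Hypothetical helper, not SDK code: both speech activity detection
# parameters are either nil (omitted) or a float in the range 0.0..1.0.
def valid_detection_param?(value)
  value.nil? || (value.is_a?(Numeric) && value >= 0.0 && value <= 1.0)
end
```

Catching an out-of-range value like `1.5` before the request is sent avoids a round trip that the service would reject anyway.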
@@ -1133,10 +1260,27 @@ module IBMWatson
  # * 0.5 provides a reasonable level of audio suppression for general usage.
  # * 1.0 suppresses all audio (no audio is transcribed).
  #
- # The values increase on a monotonic curve. See [Speech Activity
- # Detection](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-input#detection).
+ # The values increase on a monotonic curve. See [Background audio
+ # suppression](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-detection#detection-parameters-suppression).
+ # @param low_latency [Boolean] If `true` for next-generation `Multimedia` and `Telephony` models that support low
+ # latency, directs the service to produce results even more quickly than it usually
+ # does. Next-generation models produce transcription results faster than
+ # previous-generation models. The `low_latency` parameter causes the models to
+ # produce results even more quickly, though the results might be less accurate when
+ # the parameter is used.
+ #
+ # **Note:** The parameter is beta functionality. It is not available for
+ # previous-generation `Broadband` and `Narrowband` models. It is available only for
+ # some next-generation models.
+ #
+ # * For a list of next-generation models that support low latency, see [Supported
+ # language
+ # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models-ng#models-ng-supported)
+ # for next-generation models.
+ # * For more information about the `low_latency` parameter, see [Low
+ # latency](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-interim#low-latency).
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
- def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil)
+ def create_job(audio:, content_type: nil, model: nil, callback_url: nil, events: nil, user_token: nil, results_ttl: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil, processing_metrics: nil, processing_metrics_interval: nil, audio_metrics: nil, end_of_phrase_silence_time: nil, split_transcript_at_phrase_end: nil, speech_detector_sensitivity: nil, background_audio_suppression: nil, low_latency: nil)
  raise ArgumentError.new("audio must be provided") if audio.nil?

  headers = {
@@ -1175,7 +1319,8 @@ module IBMWatson
  "end_of_phrase_silence_time" => end_of_phrase_silence_time,
  "split_transcript_at_phrase_end" => split_transcript_at_phrase_end,
  "speech_detector_sensitivity" => speech_detector_sensitivity,
- "background_audio_suppression" => background_audio_suppression
+ "background_audio_suppression" => background_audio_suppression,
+ "low_latency" => low_latency
  }

  data = audio
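The `model` parameter notes in this release deprecate `ar-AR_BroadbandModel` in favor of `ar-MS_BroadbandModel` (and `ar-AR` in favor of `ar-MS` for the customization `language` parameter). A hypothetical caller-side shim, not part of the SDK, that rewrites the deprecated identifiers before a request is built:

```ruby
# Hypothetical helper, not SDK code: maps the Arabic model identifiers
# deprecated in this release to their ar-MS replacements; any other
# identifier passes through unchanged.
DEPRECATED_MODELS = {
  "ar-AR_BroadbandModel" => "ar-MS_BroadbandModel",
  "ar-AR" => "ar-MS"
}.freeze

def resolve_model(model)
  DEPRECATED_MODELS.fetch(model, model)
end
```

Passing the resolved value as the `model` (or `language`) argument keeps existing call sites working after the deprecated identifiers are removed.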
@@ -1393,7 +1538,8 @@ module IBMWatson
  # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageLanguageModels#listModels-language).
  # @param language [String] The identifier of the language for which custom language or custom acoustic models
  # are to be returned. Omit the parameter to see all custom language or custom
- # acoustic models that are owned by the requesting credentials.
+ # acoustic models that are owned by the requesting credentials. (**Note:** The
+ # identifier `ar-AR` is deprecated; use `ar-MS` instead.)
  #
  # To determine the languages for which customization is available, see [Language
  # support for
@@ -1548,6 +1694,9 @@ module IBMWatson
  # The value that you assign is used for all recognition requests that use the model.
  # You can override it for any recognition request by specifying a customization
  # weight for that request.
+ #
+ # See [Using customization
+ # weight](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-languageUse#weight).
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
  def train_language_model(customization_id:, word_type_to_add: nil, customization_weight: nil)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
@@ -1629,7 +1778,7 @@ module IBMWatson
  # subsequent requests for the model until the upgrade completes.
  #
  # **See also:** [Upgrading a custom language
- # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-customUpgrade#upgradeLanguage).
+ # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-language).
  # @param customization_id [String] The customization ID (GUID) of the custom language model that is to be used for
  # the request. You must make the request with credentials for the instance of the
  # service that owns the custom model.
@@ -2468,7 +2617,8 @@ module IBMWatson
  # custom model`.
  # @param base_model_name [String] The name of the base language model that is to be customized by the new custom
  # acoustic model. The new custom model can be used only with the base model that it
- # customizes.
+ # customizes. (**Note:** The model `ar-AR_BroadbandModel` is deprecated; use
+ # `ar-MS_BroadbandModel` instead.)
  #
  # To determine whether a base model supports acoustic model customization, refer to
  # [Language support for
@@ -2517,7 +2667,8 @@ module IBMWatson
  # models](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-manageAcousticModels#listModels-acoustic).
  # @param language [String] The identifier of the language for which custom language or custom acoustic models
  # are to be returned. Omit the parameter to see all custom language or custom
- # acoustic models that are owned by the requesting credentials.
+ # acoustic models that are owned by the requesting credentials. (**Note:** The
+ # identifier `ar-AR` is deprecated; use `ar-MS` instead.)
  #
  # To determine the languages for which customization is available, see [Language
  # support for
@@ -2771,7 +2922,7 @@ module IBMWatson
  # acoustic model was not trained with a custom language model.
  #
  # **See also:** [Upgrading a custom acoustic
- # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-customUpgrade#upgradeAcoustic).
+ # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
  # @param customization_id [String] The customization ID (GUID) of the custom acoustic model that is to be used for
  # the request. You must make the request with credentials for the instance of the
  # service that owns the custom model.
@@ -2785,7 +2936,7 @@ module IBMWatson
  # upgrade of a custom acoustic model that is trained with a custom language model,
  # and only if you receive a 400 response code and the message `No input data
  # modified since last training`. See [Upgrading a custom acoustic
- # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-customUpgrade#upgradeAcoustic).
+ # model](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-custom-upgrade#custom-upgrade-acoustic).
  # @return [nil]
  def upgrade_acoustic_model(customization_id:, custom_language_model_id: nil, force: nil)
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
@@ -2923,8 +3074,8 @@ module IBMWatson
  # If the sampling rate of the audio is lower than the minimum required rate, the
  # service labels the audio file as `invalid`.
  #
- # **See also:** [Audio
- # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats#audio-formats).
+ # **See also:** [Supported audio
+ # formats](https://cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-audio-formats).
  #
  #
  # ### Content types for archive-type resources