ibm_watson 2.0.2 → 2.1.3

@@ -14,7 +14,7 @@
14
14
  # See the License for the specific language governing permissions and
15
15
  # limitations under the License.
16
16
  #
17
- # IBM OpenAPI SDK Code Generator Version: 3.19.0-be3b4618-20201113-200858
17
+ # IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
18
18
  #
19
19
  # The IBM Watson™ Text to Speech service provides APIs that use IBM's
20
20
  # speech-synthesis capabilities to synthesize text into natural-sounding speech in a
@@ -33,8 +33,15 @@
33
33
  # that, when combined, sound like the word. A phonetic translation is based on the SSML
34
34
  # phoneme format for representing a word. You can specify a phonetic translation in
35
35
  # standard International Phonetic Alphabet (IPA) representation or in the proprietary IBM
36
- # Symbolic Phonetic Representation (SPR). The Arabic, Chinese, Dutch, and Korean languages
37
- # support only IPA.
36
+ # Symbolic Phonetic Representation (SPR).
37
+ #
38
+ # The service also offers a Tune by Example feature that lets you define custom prompts.
39
+ # You can also define speaker models to improve the quality of your custom prompts. The
40
+ # service supports custom prompts only for US English custom models and voices.
41
+ #
42
+ # **IBM Cloud®.** The Arabic, Chinese, Dutch, Australian English, and Korean languages
43
+ # and voices are supported only for IBM Cloud. For phonetic translation, they support only
44
+ # IPA, not SPR.
38
45
 
39
46
  require "concurrent"
40
47
  require "erb"
@@ -42,7 +49,6 @@ require "json"
42
49
  require "ibm_cloud_sdk_core"
43
50
  require_relative "./common.rb"
44
51
 
45
- # Module for the Watson APIs
46
52
  module IBMWatson
47
53
  ##
48
54
  # The Text to Speech V1 service.
@@ -83,8 +89,8 @@ module IBMWatson
83
89
  # Lists all voices available for use with the service. The information includes the
84
90
  # name, language, gender, and other details about the voice. The ordering of the
85
91
  # list of voices can change from call to call; do not rely on an alphabetized or
86
- # static list of voices. To see information about a specific voice, use the **Get a
87
- # voice** method.
92
+ # static list of voices. To see information about a specific voice, use the [Get a
93
+ # voice](#getvoice).
88
94
  #
89
95
  # **See also:** [Listing all available
90
96
  # voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).
@@ -112,12 +118,42 @@ module IBMWatson
112
118
  # Gets information about the specified voice. The information includes the name,
113
119
  # language, gender, and other details about the voice. Specify a customization ID to
114
120
  # obtain information for a custom model that is defined for the language of the
115
- # specified voice. To list information about all available voices, use the **List
116
- # voices** method.
121
+ # specified voice. To list information about all available voices, use the [List
122
+ # voices](#listvoices) method.
117
123
  #
118
124
  # **See also:** [Listing a specific
119
125
  # voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoice).
120
- # @param voice [String] The voice for which information is to be returned.
126
+ #
127
+ #
128
+ # ### Important voice updates for IBM Cloud
129
+ #
130
+ # The service's voices underwent significant change on 2 December 2020.
131
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
132
+ # instead of concatenative.
133
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use `ar-MS_OmarVoice` voice instead.
134
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
135
+ # `ar-MS` identifier instead.
136
+ # * The standard concatenative voices for the following languages are now
137
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
138
+ # French, German, Italian, Japanese, and Spanish (all dialects).
139
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
140
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
141
+ # of the service's neural voices.
142
+ # * All of the service's voices are now customizable and generally available (GA)
143
+ # for production use.
144
+ #
145
+ # The deprecated voices and features will continue to function for at least one year
146
+ # but might be removed at a future date. You are encouraged to migrate to the
147
+ # equivalent neural voices at your earliest convenience. For more information about
148
+ # all voice updates, see the [2 December 2020 service
149
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
150
+ # in the release notes for IBM Cloud.
151
+ # @param voice [String] The voice for which information is to be returned. For more information about
152
+ # specifying a voice, see **Important voice updates for IBM Cloud** in the method
153
+ # description.
154
+ #
155
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
156
+ # languages and voices are supported only for IBM Cloud.
121
157
  # @param customization_id [String] The customization ID (GUID) of a custom model for which information is to be
122
158
  # returned. You must make the request with credentials for the instance of the
123
159
  # service that owns the custom model. Omit the parameter to see information about
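Review note: the voice updates added in this hunk name one deprecated voice (`ar-AR_OmarVoice`) and one language identifier that can no longer create custom models (`ar-AR`), each with a replacement. A minimal sketch of how calling code might normalize those two names before passing them to the SDK follows. The constants and the `normalize_voice` / `normalize_language` helpers are hypothetical and are not part of `ibm_watson`; only the two name pairs come from the diff.

```ruby
# Hypothetical lookup tables, built only from the replacements named in the
# updated documentation. They are not part of the ibm_watson gem.
VOICE_REPLACEMENTS = {
  "ar-AR_OmarVoice" => "ar-MS_OmarVoice"
}.freeze

LANGUAGE_REPLACEMENTS = {
  "ar-AR" => "ar-MS"
}.freeze

# Returns the replacement for a deprecated voice name, or the name unchanged.
def normalize_voice(voice)
  VOICE_REPLACEMENTS.fetch(voice, voice)
end

# Returns the replacement for a deprecated language identifier, or the
# identifier unchanged.
def normalize_language(language)
  LANGUAGE_REPLACEMENTS.fetch(language, language)
end

puts normalize_voice("ar-AR_OmarVoice")   # prints "ar-MS_OmarVoice"
puts normalize_language("en-US")          # prints "en-US"
```

Keeping the mapping in one place means that further deprecations from the release notes could be added without touching call sites.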
@@ -209,9 +245,33 @@ module IBMWatson
209
245
  # The default sampling rate is 22,050 Hz.
210
246
  #
211
247
  # For more information about specifying an audio format, including additional
212
- # details about some of the formats, see [Audio
213
- # formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audioFormats#audioFormats).
248
+ # details about some of the formats, see [Using audio
249
+ # formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audio-formats).
250
+ #
251
+ #
252
+ # ### Important voice updates for IBM Cloud
253
+ #
254
+ # The service's voices underwent significant change on 2 December 2020.
255
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
256
+ # instead of concatenative.
257
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use `ar-MS_OmarVoice` voice instead.
258
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
259
+ # `ar-MS` identifier instead.
260
+ # * The standard concatenative voices for the following languages are now
261
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
262
+ # French, German, Italian, Japanese, and Spanish (all dialects).
263
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
264
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
265
+ # of the service's neural voices.
266
+ # * All of the service's voices are now customizable and generally available (GA)
267
+ # for production use.
214
268
  #
269
+ # The deprecated voices and features will continue to function for at least one year
270
+ # but might be removed at a future date. You are encouraged to migrate to the
271
+ # equivalent neural voices at your earliest convenience. For more information about
272
+ # all voice updates, see the [2 December 2020 service
273
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
274
+ # in the release notes for IBM Cloud.
215
275
  #
216
276
  # ### Warning messages
217
277
  #
@@ -226,7 +286,14 @@ module IBMWatson
226
286
  # the `accept` parameter to specify the audio format. For more information about
227
287
  # specifying an audio format, see **Audio formats (accept types)** in the method
228
288
  # description.
229
- # @param voice [String] The voice to use for synthesis.
289
+ # @param voice [String] The voice to use for synthesis. For more information about specifying a voice, see
290
+ # **Important voice updates for IBM Cloud** in the method description.
291
+ #
292
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
293
+ # languages and voices are supported only for IBM Cloud.
294
+ #
295
+ # **See also:** [Using languages and
296
+ # voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices).
230
297
  # @param customization_id [String] The customization ID (GUID) of a custom model to use for the synthesis. If a
231
298
  # custom model is specified, it works only if it matches the language of the
232
299
  # indicated voice. You must make the request with credentials for the instance of
@@ -277,13 +344,42 @@ module IBMWatson
277
344
  #
278
345
  # **See also:** [Querying a word from a
279
346
  # language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).
347
+ #
348
+ #
349
+ # ### Important voice updates for IBM Cloud
350
+ #
351
+ # The service's voices underwent significant change on 2 December 2020.
352
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
353
+ # instead of concatenative.
354
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use `ar-MS_OmarVoice` voice instead.
355
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
356
+ # `ar-MS` identifier instead.
357
+ # * The standard concatenative voices for the following languages are now
358
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
359
+ # French, German, Italian, Japanese, and Spanish (all dialects).
360
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
361
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
362
+ # of the service's neural voices.
363
+ # * All of the service's voices are now customizable and generally available (GA)
364
+ # for production use.
365
+ #
366
+ # The deprecated voices and features will continue to function for at least one year
367
+ # but might be removed at a future date. You are encouraged to migrate to the
368
+ # equivalent neural voices at your earliest convenience. For more information about
369
+ # all voice updates, see the [2 December 2020 service
370
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
371
+ # in the release notes for IBM Cloud.
280
372
  # @param text [String] The word for which the pronunciation is requested.
281
373
  # @param voice [String] A voice that specifies the language in which the pronunciation is to be returned.
282
374
  # All voices for the same language (for example, `en-US`) return the same
283
- # translation.
375
+ # translation. For more information about specifying a voice, see **Important voice
376
+ # updates for IBM Cloud** in the method description.
377
+ #
378
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
379
+ # languages and voices are supported only for IBM Cloud.
284
380
  # @param format [String] The phoneme format in which to return the pronunciation. The Arabic, Chinese,
285
- # Dutch, and Korean languages support only IPA. Omit the parameter to obtain the
286
- # pronunciation in the default format.
381
+ # Dutch, Australian English, and Korean languages support only IPA. Omit the
382
+ # parameter to obtain the pronunciation in the default format.
287
383
  # @param customization_id [String] The customization ID (GUID) of a custom model for which the pronunciation is to be
288
384
  # returned. The language of a specified custom model must match the language of the
289
385
  # specified voice. If the word is not defined in the specified custom model, the
@@ -332,11 +428,40 @@ module IBMWatson
332
428
  #
333
429
  # **See also:** [Creating a custom
334
430
  # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).
431
+ #
432
+ #
433
+ # ### Important voice updates for IBM Cloud
434
+ #
435
+ # The service's voices underwent significant change on 2 December 2020.
436
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
437
+ # instead of concatenative.
438
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use `ar-MS_OmarVoice` voice instead.
439
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
440
+ # `ar-MS` identifier instead.
441
+ # * The standard concatenative voices for the following languages are now
442
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
443
+ # French, German, Italian, Japanese, and Spanish (all dialects).
444
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
445
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
446
+ # of the service's neural voices.
447
+ # * All of the service's voices are now customizable and generally available (GA)
448
+ # for production use.
449
+ #
450
+ # The deprecated voices and features will continue to function for at least one year
451
+ # but might be removed at a future date. You are encouraged to migrate to the
452
+ # equivalent neural voices at your earliest convenience. For more information about
453
+ # all voice updates, see the [2 December 2020 service
454
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
455
+ # in the release notes for IBM Cloud.
335
456
  # @param name [String] The name of the new custom model.
336
457
  # @param language [String] The language of the new custom model. You create a custom model for a specific
337
- # language, not for a specific voice. A custom model can be used with any voice,
338
- # standard or neural, for its specified language. Omit the parameter to use the the
339
- # default language, `en-US`.
458
+ # language, not for a specific voice. A custom model can be used with any voice for
459
+ # its specified language. Omit the parameter to use the default language,
460
+ # `en-US`. **Note:** The `ar-AR` language identifier cannot be used to create a
461
+ # custom model. Use the `ar-MS` identifier instead.
462
+ #
463
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
464
+ # languages and voices are supported only for IBM Cloud.
340
465
  # @param description [String] A description of the new custom model. Specifying a description is recommended.
341
466
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
342
467
  def create_custom_model(name:, language: nil, description: nil)
@@ -370,10 +495,10 @@ module IBMWatson
370
495
  # List custom models.
371
496
  # Lists metadata such as the name and description for all custom models that are
372
497
  # owned by an instance of the service. Specify a language to list the custom models
373
- # for that language only. To see the words in addition to the metadata for a
374
- # specific custom model, use the **List a custom model** method. You must use
375
- # credentials for the instance of the service that owns a model to list information
376
- # about it.
498
+ # for that language only. To see the words and prompts in addition to the metadata
499
+ # for a specific custom model, use the [Get a custom model](#getcustommodel) method.
500
+ # You must use credentials for the instance of the service that owns a model to list
501
+ # information about it.
377
502
  #
378
503
  # **See also:** [Querying all custom
379
504
  # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).
@@ -473,8 +598,9 @@ module IBMWatson
473
598
  # Get a custom model.
474
599
  # Gets all information about a specified custom model. In addition to metadata such
475
600
  # as the name and description of the custom model, the output includes the words and
476
- # their translations as defined in the model. To see just the metadata for a model,
477
- # use the **List custom models** method.
601
+ # their translations that are defined for the model, as well as any prompts that are
602
+ # defined for the model. To see just the metadata for a model, use the [List custom
603
+ # models](#listcustommodels) method.
478
604
  #
479
605
  # **See also:** [Querying a custom
480
606
  # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).
@@ -565,14 +691,14 @@ module IBMWatson
565
691
  # customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
566
692
  # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
567
693
  # credentials for the instance of the service that owns the custom model.
568
- # @param words [Array[Word]] The **Add custom words** method accepts an array of `Word` objects. Each object
569
- # provides a word that is to be added or updated for the custom model and the word's
570
- # translation.
571
- #
572
- # The **List custom words** method returns an array of `Word` objects. Each object
573
- # shows a word and its translation from the custom model. The words are listed in
574
- # alphabetical order, with uppercase letters listed before lowercase letters. The
575
- # array is empty if the custom model contains no words.
694
+ # @param words [Array[Word]] The [Add custom words](#addwords) method accepts an array of `Word` objects. Each
695
+ # object provides a word that is to be added or updated for the custom model and the
696
+ # word's translation.
697
+ #
698
+ # The [List custom words](#listwords) method returns an array of `Word` objects.
699
+ # Each object shows a word and its translation from the custom model. The words are
700
+ # listed in alphabetical order, with uppercase letters listed before lowercase
701
+ # letters. The array is empty if the custom model contains no words.
576
702
  # @return [nil]
577
703
  def add_words(customization_id:, words:)
578
704
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
@@ -666,9 +792,9 @@ module IBMWatson
666
792
  # @param word [String] The word that is to be added or updated for the custom model.
667
793
  # @param translation [String] The phonetic or sounds-like translation for the word. A phonetic translation is
668
794
  # based on the SSML format for representing the phonetic string of a word either as
669
- # an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch, and
670
- # Korean languages support only IPA. A sounds-like is one or more words that, when
671
- # combined, sound like the word.
795
+ # an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch,
796
+ # Australian English, and Korean languages support only IPA. A sounds-like is one or
797
+ # more words that, when combined, sound like the word.
672
798
  # @param part_of_speech [String] **Japanese only.** The part of speech for the word. The service uses the value to
673
799
  # produce the correct intonation for the word. You can create only a single entry,
674
800
  # with or without a single part of speech, for any word; you cannot create multiple
@@ -772,6 +898,462 @@ module IBMWatson
772
898
  nil
773
899
  end
774
900
  #########################
901
+ # Custom prompts
902
+ #########################
903
+
904
+ ##
905
+ # @!method list_custom_prompts(customization_id:)
906
+ # List custom prompts.
907
+ # Lists information about all custom prompts that are defined for a custom model.
908
+ # The information includes the prompt ID, prompt text, status, and optional speaker
909
+ # ID for each prompt of the custom model. You must use credentials for the instance
910
+ # of the service that owns the custom model. The same information about all of the
911
+ # prompts for a custom model is also provided by the [Get a custom
912
+ # model](#getcustommodel) method. That method provides complete details about a
913
+ # specified custom model, including its language, owner, custom words, and more.
914
+ # Custom prompts are supported only for use with US English custom models and
915
+ # voices.
916
+ #
917
+ # **See also:** [Listing custom
918
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
919
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
920
+ # credentials for the instance of the service that owns the custom model.
921
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
922
+ def list_custom_prompts(customization_id:)
923
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
924
+
925
+ headers = {
926
+ }
927
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_prompts")
928
+ headers.merge!(sdk_headers)
929
+
930
+ method_url = "/v1/customizations/%s/prompts" % [ERB::Util.url_encode(customization_id)]
931
+
932
+ response = request(
933
+ method: "GET",
934
+ url: method_url,
935
+ headers: headers,
936
+ accept_json: true
937
+ )
938
+ response
939
+ end
940
+
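Review note: `list_custom_prompts` builds its request path by URL-encoding the customization ID and substituting it into a format string. The standalone sketch below reproduces only that path construction so the encoding behavior can be checked without a service instance. The `prompts_path` helper and the sample IDs are hypothetical; the format string and the use of `ERB::Util.url_encode` are taken from the method body above.

```ruby
require "erb"

# Mirrors the path construction in list_custom_prompts. The helper name is
# hypothetical; the SDK inlines this expression in the method body.
def prompts_path(customization_id)
  "/v1/customizations/%s/prompts" % [ERB::Util.url_encode(customization_id)]
end

# A plain identifier passes through unchanged.
puts prompts_path("my-model-id")   # prints "/v1/customizations/my-model-id/prompts"

# Reserved characters are percent-encoded, so an ID cannot alter the path.
puts prompts_path("a/b c")         # prints "/v1/customizations/a%2Fb%20c/prompts"
```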
941
+ ##
942
+ # @!method add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
943
+ # Add a custom prompt.
944
+ # Adds a custom prompt to a custom model. A prompt is defined by the text that is to
945
+ # be spoken, the audio for that text, a unique user-specified ID for the prompt, and
946
+ # an optional speaker ID. The information is used to generate prosodic data that is
947
+ # not visible to the user. This data is used by the service to produce the
948
+ # synthesized audio upon request. You must use credentials for the instance of the
949
+ # service that owns a custom model to add a prompt to it. You can add a maximum of
950
+ # 1000 custom prompts to a single custom model.
951
+ #
952
+ # You are recommended to assign meaningful values for prompt IDs. For example, use
953
+ # `goodbye` to identify a prompt that speaks a farewell message. Prompt IDs must be
954
+ # unique within a given custom model. You cannot define two prompts with the same
955
+ # name for the same custom model. If you provide the ID of an existing prompt, the
956
+ # previously uploaded prompt is replaced by the new information. The existing prompt
957
+ # is reprocessed by using the new text and audio and, if provided, new speaker
958
+ # model, and the prosody data associated with the prompt is updated.
959
+ #
960
+ # The quality of a prompt is undefined if the language of a prompt does not match
961
+ # the language of its custom model. This is consistent with any text or SSML that is
962
+ # specified for a speech synthesis request. The service makes a best-effort attempt
963
+ # to render the specified text for the prompt; it does not validate that the
964
+ # language of the text matches the language of the model.
965
+ #
966
+ # Adding a prompt is an asynchronous operation. Although it accepts less audio than
967
+ # speaker enrollment, the service must align the audio with the provided text. The
968
+ # time that it takes to process a prompt depends on the prompt itself. The
969
+ # processing time for a reasonably sized prompt generally matches the length of the
970
+ # audio (for example, it takes 20 seconds to process a 20-second prompt).
971
+ #
972
+ # For shorter prompts, you can wait for a reasonable amount of time and then check
973
+ # the status of the prompt with the [Get a custom prompt](#getcustomprompt) method.
974
+ # For longer prompts, consider using that method to poll the service every few
975
+ # seconds to determine when the prompt becomes available. No prompt can be used for
976
+ # speech synthesis if it is in the `processing` or `failed` state. Only prompts that
977
+ # are in the `available` state can be used for speech synthesis.
978
+ #
979
+ # When it processes a request, the service attempts to align the text and the audio
980
+ # that are provided for the prompt. The text that is passed with a prompt must match
981
+ # the spoken audio as closely as possible. Optimally, the text and audio match
982
+ # exactly. The service does its best to align the specified text with the audio, and
983
+ # it can often compensate for mismatches between the two. But if the service cannot
984
+ # effectively align the text and the audio, possibly because the magnitude of
985
+ # mismatches between the two is too great, processing of the prompt fails.
986
+ #
987
+ # ### Evaluating a prompt
988
+ #
989
+ # Always listen to and evaluate a prompt to determine its quality before using it
990
+ # in production. To evaluate a prompt, include only the single prompt in a speech
991
+ # synthesis request by using the following SSML extension, in this case for a prompt
992
+ # whose ID is `goodbye`:
993
+ #
994
+ # `<ibm:prompt id="goodbye"/>`
995
+ #
996
+ # In some cases, you might need to rerecord and resubmit a prompt as many as five
997
+ # times to address the following possible problems:
998
+ # * The service might fail to detect a mismatch between the prompt's text and audio.
999
+ # The longer the prompt, the greater the chance for misalignment between its text
1000
+ # and audio. Therefore, multiple shorter prompts are preferable to a single long
1001
+ # prompt.
1002
+ # * The text of a prompt might include a word that the service does not recognize.
1003
+ # In this case, you can create a custom word and pronunciation pair to tell the
1004
+ # service how to pronounce the word. You must then re-create the prompt.
1005
+ # * The quality of the input audio might be insufficient or the service's processing
1006
+ # of the audio might fail to detect the intended prosody. Submitting new audio for
1007
+ # the prompt can correct these issues.
1008
+ #
1009
+ # If a prompt that is created without a speaker ID does not adequately reflect the
1010
+ # intended prosody, enrolling the speaker and providing a speaker ID for the prompt
1011
+ # is one recommended means of potentially improving the quality of the prompt. This
1012
+ # is especially important for shorter prompts such as "good-bye" or "thank you,"
1013
+ # where less audio data makes it more difficult to match the prosody of the speaker.
1014
+ # Custom prompts are supported only for use with US English custom models and
1015
+ # voices.
1016
+ #
1017
+ # **See also:**
1018
+ # * [Add a custom
1019
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-add-prompt)
1020
+ # * [Evaluate a custom
1021
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-evaluate-prompt)
1022
+ # * [Rules for creating custom
1023
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-prompts).
1024
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1025
+ # credentials for the instance of the service that owns the custom model.
1026
+ # @param prompt_id [String] The identifier of the prompt that is to be added to the custom model:
1027
+ # * Include a maximum of 49 characters in the ID.
1028
+ # * Include only alphanumeric characters and `_` (underscores) in the ID.
1029
+ # * Do not include XML sensitive characters (double quotes, single quotes,
1030
+ # ampersands, angle brackets, and slashes) in the ID.
1031
+ # * To add a new prompt, the ID must be unique for the specified custom model.
1032
+ # Otherwise, the new information for the prompt overwrites the existing prompt that
1033
+ # has that ID.
1034
+ # @param metadata [PromptMetadata] Information about the prompt that is to be added to a custom model. The following
1035
+ # example of a `PromptMetadata` object includes both the required prompt text and an
1036
+ # optional speaker model ID:
1037
+ #
1038
+ # `{ "prompt_text": "Thank you and good-bye!", "speaker_id":
1039
+ # "823068b2-ed4e-11ea-b6e0-7b6456aa95cc" }`.
1040
+ # @param file [File] An audio file that speaks the text of the prompt with intonation and prosody that
1041
+ # matches how you would like the prompt to be spoken.
1042
+ # * The prompt audio must be in WAV format and must have a minimum sampling rate of
1043
+ # 16 kHz. The service accepts audio with higher sampling rates. The service
1044
+ # transcodes all audio to 16 kHz before processing it.
1045
+ # * The length of the prompt audio is limited to 30 seconds.
1046
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1047
+ def add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
1048
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1049
+
1050
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1051
+
1052
+ raise ArgumentError.new("metadata must be provided") if metadata.nil?
1053
+
1054
+ raise ArgumentError.new("file must be provided") if file.nil?
1055
+
1056
+ headers = {
1057
+ }
1058
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_custom_prompt")
1059
+ headers.merge!(sdk_headers)
1060
+
1061
+ form_data = {}
1062
+
1063
+ form_data[:metadata] = HTTP::FormData::Part.new(metadata.to_s, content_type: "application/json")
1064
+
1065
+ unless file.instance_of?(StringIO) || file.instance_of?(File)
1066
+ file = file.respond_to?(:to_json) ? StringIO.new(file.to_json) : StringIO.new(file)
1067
+ end
1068
+ form_data[:file] = HTTP::FormData::File.new(file, content_type: "audio/wav", filename: file.respond_to?(:path) ? file.path : nil)
1069
+
1070
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1071
+
1072
+ response = request(
1073
+ method: "POST",
1074
+ url: method_url,
1075
+ headers: headers,
1076
+ form: form_data,
1077
+ accept_json: true
1078
+ )
1079
+ response
1080
+ end
1081
+
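Review note: two details of `add_custom_prompt` seem worth checking on the caller's side. First, the `prompt_id` rules in the documentation (at most 49 characters, only alphanumeric characters and underscores) are not enforced by the method body shown in the diff. Second, the body serializes `metadata` with `to_s`; if a Ruby `Hash` were passed, `Hash#to_s` would yield inspect output rather than JSON, so passing an already serialized JSON string looks like the safer choice. The sketch below illustrates both points. `valid_prompt_id?` and `prompt_metadata_json` are hypothetical helpers, and the minimum length of one character is an assumption.

```ruby
require "json"

# Hypothetical client-side check for the documented prompt ID rules:
# at most 49 characters, only alphanumeric characters and underscores.
# The lower bound of one character is an assumption.
PROMPT_ID_PATTERN = /\A[A-Za-z0-9_]{1,49}\z/

def valid_prompt_id?(prompt_id)
  prompt_id.is_a?(String) && PROMPT_ID_PATTERN.match?(prompt_id)
end

# Builds the metadata form part as a JSON string. The key names follow the
# PromptMetadata example in the method documentation.
def prompt_metadata_json(prompt_text:, speaker_id: nil)
  metadata = { "prompt_text" => prompt_text }
  metadata["speaker_id"] = speaker_id unless speaker_id.nil?
  JSON.generate(metadata)
end

puts valid_prompt_id?("goodbye")    # prints "true"
puts valid_prompt_id?("good-bye")   # prints "false" (hyphen is not allowed)
puts prompt_metadata_json(prompt_text: "Thank you and good-bye!")
```

With these helpers, a call could pass `metadata: prompt_metadata_json(prompt_text: "...")` after confirming `valid_prompt_id?(prompt_id)`, assuming the method accepts a string for `metadata` as the `to_s` call suggests.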
1082
+ ##
1083
+ # @!method get_custom_prompt(customization_id:, prompt_id:)
1084
+ # Get a custom prompt.
1085
+ # Gets information about a specified custom prompt for a specified custom model. The
1086
+ # information includes the prompt ID, prompt text, status, and optional speaker ID
1087
+ # for each prompt of the custom model. You must use credentials for the instance of
1088
+ # the service that owns the custom model. Custom prompts are supported only for use
1089
+ # with US English custom models and voices.
1090
+ #
1091
+ # **See also:** [Listing custom
1092
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
1093
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1094
+ # credentials for the instance of the service that owns the custom model.
1095
+ # @param prompt_id [String] The identifier (name) of the prompt.
1096
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1097
+ def get_custom_prompt(customization_id:, prompt_id:)
1098
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1099
+
1100
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1101
+
1102
+ headers = {
1103
+ }
1104
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_prompt")
1105
+ headers.merge!(sdk_headers)
1106
+
1107
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1108
+
1109
+ response = request(
1110
+ method: "GET",
1111
+ url: method_url,
1112
+ headers: headers,
1113
+ accept_json: true
1114
+ )
1115
+ response
1116
+ end
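The three prompt methods above all build the same path by URL-encoding each ID before interpolating it. A minimal sketch of that step on its own, using only the standard library; `custom_prompt_path` is a hypothetical helper name for illustration, not something the SDK defines:

```ruby
require "erb"

# Hypothetical helper (not part of the SDK): mirrors how the methods in this
# diff assemble the prompt path from two URL-encoded path segments.
def custom_prompt_path(customization_id, prompt_id)
  "/v1/customizations/%s/prompts/%s" % [
    ERB::Util.url_encode(customization_id),
    ERB::Util.url_encode(prompt_id)
  ]
end

# Characters such as spaces and slashes are percent-encoded, so an ID cannot
# introduce extra path segments.
puts custom_prompt_path("abc-123", "greeting prompt")
```

Encoding each segment separately is what keeps a prompt name containing `/` from being read as a nested path.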
1117
+
1118
+ ##
1119
+ # @!method delete_custom_prompt(customization_id:, prompt_id:)
1120
+ # Delete a custom prompt.
1121
+ # Deletes an existing custom prompt from a custom model. The service deletes the
1122
+ # prompt with the specified ID. You must use credentials for the instance of the
1123
+ # service that owns the custom model from which the prompt is to be deleted.
1124
+ #
1125
+ # **Caution:** Deleting a custom prompt elicits a 400 response code from synthesis
1126
+ # requests that attempt to use the prompt. Make sure that you do not attempt to use
1127
+ # a deleted prompt in a production application. Custom prompts are supported only
1128
+ # for use with US English custom models and voices.
1129
+ #
1130
+ # **See also:** [Deleting a custom
1131
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-delete).
1132
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1133
+ # credentials for the instance of the service that owns the custom model.
1134
+ # @param prompt_id [String] The identifier (name) of the prompt that is to be deleted.
1135
+ # @return [nil]
1136
+ def delete_custom_prompt(customization_id:, prompt_id:)
1137
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1138
+
1139
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1140
+
1141
+ headers = {
1142
+ }
1143
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_prompt")
1144
+ headers.merge!(sdk_headers)
1145
+
1146
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1147
+
1148
+ request(
1149
+ method: "DELETE",
1150
+ url: method_url,
1151
+ headers: headers,
1152
+ accept_json: false
1153
+ )
1154
+ nil
1155
+ end
1156
+ #########################
1157
+ # Speaker models
1158
+ #########################
1159
+
1160
+ ##
1161
+ # @!method list_speaker_models
1162
+ # List speaker models.
1163
+ # Lists information about all speaker models that are defined for a service
1164
+ # instance. The information includes the speaker ID and speaker name of each defined
1165
+ # speaker. You must use credentials for the instance of a service to list its
1166
+ # speakers. Speaker models and the custom prompts with which they are used are
1167
+ # supported only for use with US English custom models and voices.
1168
+ #
1169
+ # **See also:** [Listing speaker
1170
+ # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list).
1171
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1172
+ def list_speaker_models
1173
+ headers = {
1174
+ }
1175
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_speaker_models")
1176
+ headers.merge!(sdk_headers)
1177
+
1178
+ method_url = "/v1/speakers"
1179
+
1180
+ response = request(
1181
+ method: "GET",
1182
+ url: method_url,
1183
+ headers: headers,
1184
+ accept_json: true
1185
+ )
1186
+ response
1187
+ end
1188
+
1189
+ ##
1190
+ # @!method create_speaker_model(speaker_name:, audio:)
1191
+ # Create a speaker model.
1192
+ # Creates a new speaker model, which is an optional enrollment token for users who
1193
+ # are to add prompts to custom models. A speaker model contains information about a
1194
+ # user's voice. The service extracts this information from a WAV audio sample that
1195
+ # you pass as the body of the request. Associating a speaker model with a prompt is
1196
+ # optional, but the information that is extracted from the speaker model helps the
1197
+ # service learn about the speaker's voice.
1198
+ #
1199
+ # A speaker model can make an appreciable difference in the quality of prompts,
1200
+ # especially short prompts with relatively little audio, that are associated with
1201
+ # that speaker. A speaker model can help the service produce a prompt with more
1202
+ # confidence; the lack of a speaker model can potentially compromise the quality of
1203
+ # a prompt.
1204
+ #
1205
+ # The gender of the speaker who creates a speaker model does not need to match the
1206
+ # gender of a voice that is used with prompts that are associated with that speaker
1207
+ # model. For example, a speaker model that is created by a male speaker can be
1208
+ # associated with prompts that are spoken by female voices.
1209
+ #
1210
+ # You create a speaker model for a given instance of the service. The new speaker
1211
+ # model is owned by the service instance whose credentials are used to create it.
1212
+ # That same speaker can then be used to create prompts for all custom models within
1213
+ # that service instance. No language is associated with a speaker model, but each
1214
+ # custom model has a single specified language. You can add prompts only to US
1215
+ # English models.
1216
+ #
1217
+ # You specify a name for the speaker when you create it. The name must be unique
1218
+ # among all speaker names for the owning service instance. To re-create a speaker
1219
+ # model for an existing speaker name, you must first delete the existing speaker
1220
+ # model that has that name.
1221
+ #
1222
+ # Speaker enrollment is a synchronous operation. Although it accepts more audio data
1223
+ # than a prompt, the process of adding a speaker is very fast. The service simply
1224
+ # extracts information about the speaker's voice from the audio. Unlike prompts,
1225
+ # speaker models neither need nor accept a transcription of the audio. When the call
1226
+ # returns, the audio is fully processed and the speaker enrollment is complete.
1227
+ #
1228
+ # The service returns a speaker ID with the request. A speaker ID is a globally unique
1229
+ # identifier (GUID) that you use to identify the speaker in subsequent requests to
1230
+ # the service. Speaker models and the custom prompts with which they are used are
1231
+ # supported only for use with US English custom models and voices.
1232
+ #
1233
+ # **See also:**
1234
+ # * [Create a speaker
1235
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-speaker-model)
1236
+ # * [Rules for creating speaker
1237
+ # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-speakers).
1238
+ # @param speaker_name [String] The name of the speaker that is to be added to the service instance.
1239
+ # * Include a maximum of 49 characters in the name.
1240
+ # * Include only alphanumeric characters and `_` (underscores) in the name.
1241
+ # * Do not include XML-sensitive characters (double quotes, single quotes,
1242
+ # ampersands, angle brackets, and slashes) in the name.
1243
+ # * Do not use the name of an existing speaker that is already defined for the
1244
+ # service instance.
1245
+ # @param audio [File] An enrollment audio file that contains a sample of the speaker's voice.
1246
+ # * The enrollment audio must be in WAV format and must have a minimum sampling rate
1247
+ # of 16 kHz. The service accepts audio with higher sampling rates. It transcodes all
1248
+ # audio to 16 kHz before processing it.
1249
+ # * The length of the enrollment audio is limited to 1 minute. Speaking one or two
1250
+ # paragraphs of text that include five to ten sentences is recommended.
1251
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1252
+ def create_speaker_model(speaker_name:, audio:)
1253
+ raise ArgumentError.new("speaker_name must be provided") if speaker_name.nil?
1254
+
1255
+ raise ArgumentError.new("audio must be provided") if audio.nil?
1256
+
1257
+ headers = {
1258
+ }
1259
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_speaker_model")
1260
+ headers.merge!(sdk_headers)
1261
+
1262
+ params = {
1263
+ "speaker_name" => speaker_name
1264
+ }
1265
+
1266
+ data = audio
1267
+ headers["Content-Type"] = "audio/wav"
1268
+
1269
+ method_url = "/v1/speakers"
1270
+
1271
+ response = request(
1272
+ method: "POST",
1273
+ url: method_url,
1274
+ headers: headers,
1275
+ params: params,
1276
+ data: data,
1277
+ accept_json: true
1278
+ )
1279
+ response
1280
+ end
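`create_speaker_model` as shown only rejects `nil` arguments; the naming and audio rules in its documentation are left to the service. The sketch below shows one way a caller might pre-check them locally. Both helpers are hypothetical, not SDK functions. It assumes that "alphanumeric" means ASCII letters and digits, that an empty name is invalid, and that the file starts with a canonical RIFF/WAVE header whose `fmt ` chunk comes first (sample rate at byte offset 24).

```ruby
# Hypothetical client-side checks; none of these names exist in the SDK.

# Documented rule: at most 49 characters, only alphanumerics and underscores.
# Assumption: ASCII only, and at least one character.
SPEAKER_NAME_PATTERN = /\A[A-Za-z0-9_]{1,49}\z/

def valid_speaker_name?(name)
  name.is_a?(String) && SPEAKER_NAME_PATTERN.match?(name)
end

# Reads the sample rate from a canonical WAV header, or returns nil when the
# bytes do not match the assumed layout.
def wav_sample_rate(header_bytes)
  return nil unless header_bytes.bytesize >= 28
  return nil unless header_bytes.byteslice(0, 4) == "RIFF"
  return nil unless header_bytes.byteslice(8, 4) == "WAVE"
  return nil unless header_bytes.byteslice(12, 4) == "fmt "

  # 32-bit unsigned little-endian integer at offset 24.
  header_bytes.byteslice(24, 4).unpack1("V")
end

# Documented rule: minimum sampling rate of 16 kHz.
def acceptable_enrollment_rate?(header_bytes)
  rate = wav_sample_rate(header_bytes)
  !rate.nil? && rate >= 16_000
end
```

A caller could read the first 28 bytes of the enrollment file in binary mode and pass them to `acceptable_enrollment_rate?` before uploading. WAV files with extra chunks ahead of `fmt ` would need a real chunk parser, and the 1-minute length limit is not checked here.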
1281
+
1282
+ ##
1283
+ # @!method get_speaker_model(speaker_id:)
1284
+ # Get a speaker model.
1285
+ # Gets information about all prompts that are defined by a specified speaker for all
1286
+ # custom models that are owned by a service instance. The information is grouped by
1287
+ # the customization IDs of the custom models. For each custom model, the information
1288
+ # lists information about each prompt that is defined for that custom model by the
1289
+ # speaker. You must use credentials for the instance of the service that owns a
1290
+ # speaker model to list its prompts. Speaker models and the custom prompts with
1291
+ # which they are used are supported only for use with US English custom models and
1292
+ # voices.
1293
+ #
1294
+ # **See also:** [Listing the custom prompts for a speaker
1295
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list-prompts).
1296
+ # @param speaker_id [String] The speaker ID (GUID) of the speaker model. You must make the request with service
1297
+ # credentials for the instance of the service that owns the speaker model.
1298
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1299
+ def get_speaker_model(speaker_id:)
1300
+ raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?
1301
+
1302
+ headers = {
1303
+ }
1304
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_speaker_model")
1305
+ headers.merge!(sdk_headers)
1306
+
1307
+ method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]
1308
+
1309
+ response = request(
1310
+ method: "GET",
1311
+ url: method_url,
1312
+ headers: headers,
1313
+ accept_json: true
1314
+ )
1315
+ response
1316
+ end
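Each method that takes an ID raises `ArgumentError` when the ID is `nil` before it builds the URL. One reason the guard matters is that `ERB::Util.url_encode` converts its argument with `to_s`, so a `nil` ID would silently encode to an empty segment. The sketch below illustrates that behaviour in isolation; `speaker_path` is a hypothetical stand-in, not an SDK method.

```ruby
require "erb"

# Hypothetical stand-in for the guard-then-build pattern used in the diff.
def speaker_path(speaker_id:)
  raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?

  "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]
end

# Without the guard, nil encodes to an empty string and the path collapses
# to the collection path with a trailing slash.
unguarded = "/v1/speakers/%s" % [ERB::Util.url_encode(nil)]
puts unguarded

begin
  speaker_path(speaker_id: nil)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Note that the guard only catches `nil`; an empty string still passes through, so callers may want their own check for blank IDs.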
1317
+
1318
+ ##
1319
+ # @!method delete_speaker_model(speaker_id:)
1320
+ # Delete a speaker model.
1321
+ # Deletes an existing speaker model from the service instance. The service deletes
1322
+ # the enrolled speaker with the specified speaker ID. You must use credentials for
1323
+ # the instance of the service that owns a speaker model to delete the speaker.
1324
+ #
1325
+ # Any prompts that are associated with the deleted speaker are not affected by the
1326
+ # speaker's deletion. The prosodic data that defines the quality of a prompt is
1327
+ # established when the prompt is created. A prompt is static and remains unaffected
1328
+ # by deletion of its associated speaker. However, the prompt cannot be resubmitted
1329
+ # or updated with its original speaker once that speaker is deleted. Speaker models
1330
+ # and the custom prompts with which they are used are supported only for use with US
1331
+ # English custom models and voices.
1332
+ #
1333
+ # **See also:** [Deleting a speaker
1334
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-delete).
1335
+ # @param speaker_id [String] The speaker ID (GUID) of the speaker model. You must make the request with service
1336
+ # credentials for the instance of the service that owns the speaker model.
1337
+ # @return [nil]
1338
+ def delete_speaker_model(speaker_id:)
1339
+ raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?
1340
+
1341
+ headers = {
1342
+ }
1343
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_speaker_model")
1344
+ headers.merge!(sdk_headers)
1345
+
1346
+ method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]
1347
+
1348
+ request(
1349
+ method: "DELETE",
1350
+ url: method_url,
1351
+ headers: headers,
1352
+ accept_json: false
1353
+ )
1354
+ nil
1355
+ end
1356
+ #########################
775
1357
  # User data
776
1358
  #########################
777
1359