ibm_watson 2.0.2 → 2.1.3

@@ -14,7 +14,7 @@
14
14
  # See the License for the specific language governing permissions and
15
15
  # limitations under the License.
16
16
  #
17
- # IBM OpenAPI SDK Code Generator Version: 3.19.0-be3b4618-20201113-200858
17
+ # IBM OpenAPI SDK Code Generator Version: 3.38.0-07189efd-20210827-205025
18
18
  #
19
19
  # The IBM Watson™ Text to Speech service provides APIs that use IBM's
20
20
  # speech-synthesis capabilities to synthesize text into natural-sounding speech in a
@@ -33,8 +33,15 @@
33
33
  # that, when combined, sound like the word. A phonetic translation is based on the SSML
34
34
  # phoneme format for representing a word. You can specify a phonetic translation in
35
35
  # standard International Phonetic Alphabet (IPA) representation or in the proprietary IBM
36
- # Symbolic Phonetic Representation (SPR). The Arabic, Chinese, Dutch, and Korean languages
37
- # support only IPA.
36
+ # Symbolic Phonetic Representation (SPR).
37
+ #
38
+ # The service also offers a Tune by Example feature that lets you define custom prompts.
39
+ # You can also define speaker models to improve the quality of your custom prompts. The
40
+ # service supports custom prompts only for US English custom models and voices.
41
+ #
42
+ # **IBM Cloud®.** The Arabic, Chinese, Dutch, Australian English, and Korean languages
43
+ # and voices are supported only for IBM Cloud. For phonetic translation, they support only
44
+ # IPA, not SPR.
38
45
 
39
46
  require "concurrent"
40
47
  require "erb"
@@ -42,7 +49,6 @@ require "json"
42
49
  require "ibm_cloud_sdk_core"
43
50
  require_relative "./common.rb"
44
51
 
45
- # Module for the Watson APIs
46
52
  module IBMWatson
47
53
  ##
48
54
  # The Text to Speech V1 service.
@@ -83,8 +89,8 @@ module IBMWatson
83
89
  # Lists all voices available for use with the service. The information includes the
84
90
  # name, language, gender, and other details about the voice. The ordering of the
85
91
  # list of voices can change from call to call; do not rely on an alphabetized or
86
- # static list of voices. To see information about a specific voice, use the **Get a
87
- # voice** method.
92
+ # static list of voices. To see information about a specific voice, use the [Get a
93
+ # voice](#getvoice) method.
88
94
  #
89
95
  # **See also:** [Listing all available
90
96
  # voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoices).
@@ -112,12 +118,42 @@ module IBMWatson
112
118
  # Gets information about the specified voice. The information includes the name,
113
119
  # language, gender, and other details about the voice. Specify a customization ID to
114
120
  # obtain information for a custom model that is defined for the language of the
115
- # specified voice. To list information about all available voices, use the **List
116
- # voices** method.
121
+ # specified voice. To list information about all available voices, use the [List
122
+ # voices](#listvoices) method.
117
123
  #
118
124
  # **See also:** [Listing a specific
119
125
  # voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices#listVoice).
120
- # @param voice [String] The voice for which information is to be returned.
126
+ #
127
+ #
128
+ # ### Important voice updates for IBM Cloud
129
+ #
130
+ # The service's voices underwent significant change on 2 December 2020.
131
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
132
+ # instead of concatenative.
133
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use the `ar-MS_OmarVoice` voice instead.
134
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
135
+ # `ar-MS` identifier instead.
136
+ # * The standard concatenative voices for the following languages are now
137
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
138
+ # French, German, Italian, Japanese, and Spanish (all dialects).
139
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
140
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
141
+ # of the service's neural voices.
142
+ # * All of the service's voices are now customizable and generally available (GA)
143
+ # for production use.
144
+ #
145
+ # The deprecated voices and features will continue to function for at least one year
146
+ # but might be removed at a future date. You are encouraged to migrate to the
147
+ # equivalent neural voices at your earliest convenience. For more information about
148
+ # all voice updates, see the [2 December 2020 service
149
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
150
+ # in the release notes for IBM Cloud.
151
+ # @param voice [String] The voice for which information is to be returned. For more information about
152
+ # specifying a voice, see **Important voice updates for IBM Cloud** in the method
153
+ # description.
154
+ #
155
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
156
+ # languages and voices are supported only for IBM Cloud.
121
157
  # @param customization_id [String] The customization ID (GUID) of a custom model for which information is to be
122
158
  # returned. You must make the request with credentials for the instance of the
123
159
  # service that owns the custom model. Omit the parameter to see information about
@@ -209,9 +245,33 @@ module IBMWatson
209
245
  # The default sampling rate is 22,050 Hz.
210
246
  #
211
247
  # For more information about specifying an audio format, including additional
212
- # details about some of the formats, see [Audio
213
- # formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audioFormats#audioFormats).
248
+ # details about some of the formats, see [Using audio
249
+ # formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audio-formats).
250
+ #
251
+ #
252
+ # ### Important voice updates for IBM Cloud
253
+ #
254
+ # The service's voices underwent significant change on 2 December 2020.
255
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
256
+ # instead of concatenative.
257
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use the `ar-MS_OmarVoice` voice instead.
258
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
259
+ # `ar-MS` identifier instead.
260
+ # * The standard concatenative voices for the following languages are now
261
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
262
+ # French, German, Italian, Japanese, and Spanish (all dialects).
263
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
264
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
265
+ # of the service's neural voices.
266
+ # * All of the service's voices are now customizable and generally available (GA)
267
+ # for production use.
214
268
  #
269
+ # The deprecated voices and features will continue to function for at least one year
270
+ # but might be removed at a future date. You are encouraged to migrate to the
271
+ # equivalent neural voices at your earliest convenience. For more information about
272
+ # all voice updates, see the [2 December 2020 service
273
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
274
+ # in the release notes for IBM Cloud.
215
275
  #
216
276
  # ### Warning messages
217
277
  #
@@ -226,7 +286,14 @@ module IBMWatson
226
286
  # the `accept` parameter to specify the audio format. For more information about
227
287
  # specifying an audio format, see **Audio formats (accept types)** in the method
228
288
  # description.
229
- # @param voice [String] The voice to use for synthesis.
289
+ # @param voice [String] The voice to use for synthesis. For more information about specifying a voice, see
290
+ # **Important voice updates for IBM Cloud** in the method description.
291
+ #
292
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
293
+ # languages and voices are supported only for IBM Cloud.
294
+ #
295
+ # **See also:** [Using languages and
296
+ # voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices).
230
297
  # @param customization_id [String] The customization ID (GUID) of a custom model to use for the synthesis. If a
231
298
  # custom model is specified, it works only if it matches the language of the
232
299
  # indicated voice. You must make the request with credentials for the instance of
@@ -277,13 +344,42 @@ module IBMWatson
277
344
  #
278
345
  # **See also:** [Querying a word from a
279
346
  # language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).
347
+ #
348
+ #
349
+ # ### Important voice updates for IBM Cloud
350
+ #
351
+ # The service's voices underwent significant change on 2 December 2020.
352
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
353
+ # instead of concatenative.
354
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use the `ar-MS_OmarVoice` voice instead.
355
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
356
+ # `ar-MS` identifier instead.
357
+ # * The standard concatenative voices for the following languages are now
358
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
359
+ # French, German, Italian, Japanese, and Spanish (all dialects).
360
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
361
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
362
+ # of the service's neural voices.
363
+ # * All of the service's voices are now customizable and generally available (GA)
364
+ # for production use.
365
+ #
366
+ # The deprecated voices and features will continue to function for at least one year
367
+ # but might be removed at a future date. You are encouraged to migrate to the
368
+ # equivalent neural voices at your earliest convenience. For more information about
369
+ # all voice updates, see the [2 December 2020 service
370
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
371
+ # in the release notes for IBM Cloud.
280
372
  # @param text [String] The word for which the pronunciation is requested.
281
373
  # @param voice [String] A voice that specifies the language in which the pronunciation is to be returned.
282
374
  # All voices for the same language (for example, `en-US`) return the same
283
- # translation.
375
+ # translation. For more information about specifying a voice, see **Important voice
376
+ # updates for IBM Cloud** in the method description.
377
+ #
378
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
379
+ # languages and voices are supported only for IBM Cloud.
284
380
  # @param format [String] The phoneme format in which to return the pronunciation. The Arabic, Chinese,
285
- # Dutch, and Korean languages support only IPA. Omit the parameter to obtain the
286
- # pronunciation in the default format.
381
+ # Dutch, Australian English, and Korean languages support only IPA. Omit the
382
+ # parameter to obtain the pronunciation in the default format.
287
383
  # @param customization_id [String] The customization ID (GUID) of a custom model for which the pronunciation is to be
288
384
  # returned. The language of a specified custom model must match the language of the
289
385
  # specified voice. If the word is not defined in the specified custom model, the
@@ -332,11 +428,40 @@ module IBMWatson
332
428
  #
333
429
  # **See also:** [Creating a custom
334
430
  # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).
431
+ #
432
+ #
433
+ # ### Important voice updates for IBM Cloud
434
+ #
435
+ # The service's voices underwent significant change on 2 December 2020.
436
+ # * The Arabic, Chinese, Dutch, Australian English, and Korean voices are now neural
437
+ # instead of concatenative.
438
+ # * The `ar-AR_OmarVoice` voice is deprecated. Use the `ar-MS_OmarVoice` voice instead.
439
+ # * The `ar-AR` language identifier cannot be used to create a custom model. Use the
440
+ # `ar-MS` identifier instead.
441
+ # * The standard concatenative voices for the following languages are now
442
+ # deprecated: Brazilian Portuguese, United Kingdom and United States English,
443
+ # French, German, Italian, Japanese, and Spanish (all dialects).
444
+ # * The features expressive SSML, voice transformation SSML, and use of the `volume`
445
+ # attribute of the `<prosody>` element are deprecated and are not supported with any
446
+ # of the service's neural voices.
447
+ # * All of the service's voices are now customizable and generally available (GA)
448
+ # for production use.
449
+ #
450
+ # The deprecated voices and features will continue to function for at least one year
451
+ # but might be removed at a future date. You are encouraged to migrate to the
452
+ # equivalent neural voices at your earliest convenience. For more information about
453
+ # all voice updates, see the [2 December 2020 service
454
+ # update](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-release-notes#December2020)
455
+ # in the release notes for IBM Cloud.
335
456
  # @param name [String] The name of the new custom model.
336
457
  # @param language [String] The language of the new custom model. You create a custom model for a specific
337
- # language, not for a specific voice. A custom model can be used with any voice,
338
- # standard or neural, for its specified language. Omit the parameter to use the
339
- # default language, `en-US`.
458
+ # language, not for a specific voice. A custom model can be used with any voice for
459
+ # its specified language. Omit the parameter to use the default language,
460
+ # `en-US`. **Note:** The `ar-AR` language identifier cannot be used to create a
461
+ # custom model. Use the `ar-MS` identifier instead.
462
+ #
463
+ # **IBM Cloud:** The Arabic, Chinese, Dutch, Australian English, and Korean
464
+ # languages and voices are supported only for IBM Cloud.
340
465
  # @param description [String] A description of the new custom model. Specifying a description is recommended.
341
466
  # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
342
467
  def create_custom_model(name:, language: nil, description: nil)
@@ -370,10 +495,10 @@ module IBMWatson
370
495
  # List custom models.
371
496
  # Lists metadata such as the name and description for all custom models that are
372
497
  # owned by an instance of the service. Specify a language to list the custom models
373
- # for that language only. To see the words in addition to the metadata for a
374
- # specific custom model, use the **List a custom model** method. You must use
375
- # credentials for the instance of the service that owns a model to list information
376
- # about it.
498
+ # for that language only. To see the words and prompts in addition to the metadata
499
+ # for a specific custom model, use the [Get a custom model](#getcustommodel) method.
500
+ # You must use credentials for the instance of the service that owns a model to list
501
+ # information about it.
377
502
  #
378
503
  # **See also:** [Querying all custom
379
504
  # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).
@@ -473,8 +598,9 @@ module IBMWatson
473
598
  # Get a custom model.
474
599
  # Gets all information about a specified custom model. In addition to metadata such
475
600
  # as the name and description of the custom model, the output includes the words and
476
- # their translations as defined in the model. To see just the metadata for a model,
477
- # use the **List custom models** method.
601
+ # their translations that are defined for the model, as well as any prompts that are
602
+ # defined for the model. To see just the metadata for a model, use the [List custom
603
+ # models](#listcustommodels) method.
478
604
  #
479
605
  # **See also:** [Querying a custom
480
606
  # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).
@@ -565,14 +691,14 @@ module IBMWatson
565
691
  # customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
566
692
  # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
567
693
  # credentials for the instance of the service that owns the custom model.
568
- # @param words [Array[Word]] The **Add custom words** method accepts an array of `Word` objects. Each object
569
- # provides a word that is to be added or updated for the custom model and the word's
570
- # translation.
571
- #
572
- # The **List custom words** method returns an array of `Word` objects. Each object
573
- # shows a word and its translation from the custom model. The words are listed in
574
- # alphabetical order, with uppercase letters listed before lowercase letters. The
575
- # array is empty if the custom model contains no words.
694
+ # @param words [Array[Word]] The [Add custom words](#addwords) method accepts an array of `Word` objects. Each
695
+ # object provides a word that is to be added or updated for the custom model and the
696
+ # word's translation.
697
+ #
698
+ # The [List custom words](#listwords) method returns an array of `Word` objects.
699
+ # Each object shows a word and its translation from the custom model. The words are
700
+ # listed in alphabetical order, with uppercase letters listed before lowercase
701
+ # letters. The array is empty if the custom model contains no words.
576
702
  # @return [nil]
577
703
  def add_words(customization_id:, words:)
578
704
  raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
@@ -666,9 +792,9 @@ module IBMWatson
666
792
  # @param word [String] The word that is to be added or updated for the custom model.
667
793
  # @param translation [String] The phonetic or sounds-like translation for the word. A phonetic translation is
668
794
  # based on the SSML format for representing the phonetic string of a word either as
669
- # an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch, and
670
- # Korean languages support only IPA. A sounds-like is one or more words that, when
671
- # combined, sound like the word.
795
+ # an IPA translation or as an IBM SPR translation. The Arabic, Chinese, Dutch,
796
+ # Australian English, and Korean languages support only IPA. A sounds-like is one or
797
+ # more words that, when combined, sound like the word.
672
798
  # @param part_of_speech [String] **Japanese only.** The part of speech for the word. The service uses the value to
673
799
  # produce the correct intonation for the word. You can create only a single entry,
674
800
  # with or without a single part of speech, for any word; you cannot create multiple
@@ -772,6 +898,462 @@ module IBMWatson
772
898
  nil
773
899
  end
774
900
  #########################
901
+ # Custom prompts
902
+ #########################
903
+
904
+ ##
905
+ # @!method list_custom_prompts(customization_id:)
906
+ # List custom prompts.
907
+ # Lists information about all custom prompts that are defined for a custom model.
908
+ # The information includes the prompt ID, prompt text, status, and optional speaker
909
+ # ID for each prompt of the custom model. You must use credentials for the instance
910
+ # of the service that owns the custom model. The same information about all of the
911
+ # prompts for a custom model is also provided by the [Get a custom
912
+ # model](#getcustommodel) method. That method provides complete details about a
913
+ # specified custom model, including its language, owner, custom words, and more.
914
+ # Custom prompts are supported only for use with US English custom models and
915
+ # voices.
916
+ #
917
+ # **See also:** [Listing custom
918
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
919
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
920
+ # credentials for the instance of the service that owns the custom model.
921
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
922
+ def list_custom_prompts(customization_id:)
923
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
924
+
925
+ headers = {
926
+ }
927
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_custom_prompts")
928
+ headers.merge!(sdk_headers)
929
+
930
+ method_url = "/v1/customizations/%s/prompts" % [ERB::Util.url_encode(customization_id)]
931
+
932
+ response = request(
933
+ method: "GET",
934
+ url: method_url,
935
+ headers: headers,
936
+ accept_json: true
937
+ )
938
+ response
939
+ end
940
+
941
+ ##
942
+ # @!method add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
943
+ # Add a custom prompt.
944
+ # Adds a custom prompt to a custom model. A prompt is defined by the text that is to
945
+ # be spoken, the audio for that text, a unique user-specified ID for the prompt, and
946
+ # an optional speaker ID. The information is used to generate prosodic data that is
947
+ # not visible to the user. This data is used by the service to produce the
948
+ # synthesized audio upon request. You must use credentials for the instance of the
949
+ # service that owns a custom model to add a prompt to it. You can add a maximum of
950
+ # 1000 custom prompts to a single custom model.
951
+ #
952
+ # Assign meaningful values to prompt IDs. For example, use
953
+ # `goodbye` to identify a prompt that speaks a farewell message. Prompt IDs must be
954
+ # unique within a given custom model. You cannot define two prompts with the same
955
+ # name for the same custom model. If you provide the ID of an existing prompt, the
956
+ # previously uploaded prompt is replaced by the new information. The existing prompt
957
+ # is reprocessed by using the new text and audio and, if provided, new speaker
958
+ # model, and the prosody data associated with the prompt is updated.
959
+ #
960
+ # The quality of a prompt is undefined if the language of a prompt does not match
961
+ # the language of its custom model. This is consistent with any text or SSML that is
962
+ # specified for a speech synthesis request. The service makes a best-effort attempt
963
+ # to render the specified text for the prompt; it does not validate that the
964
+ # language of the text matches the language of the model.
965
+ #
966
+ # Adding a prompt is an asynchronous operation. Although it accepts less audio than
967
+ # speaker enrollment, the service must align the audio with the provided text. The
968
+ # time that it takes to process a prompt depends on the prompt itself. The
969
+ # processing time for a reasonably sized prompt generally matches the length of the
970
+ # audio (for example, it takes 20 seconds to process a 20-second prompt).
971
+ #
972
+ # For shorter prompts, you can wait for a reasonable amount of time and then check
973
+ # the status of the prompt with the [Get a custom prompt](#getcustomprompt) method.
974
+ # For longer prompts, consider using that method to poll the service every few
975
+ # seconds to determine when the prompt becomes available. No prompt can be used for
976
+ # speech synthesis if it is in the `processing` or `failed` state. Only prompts that
977
+ # are in the `available` state can be used for speech synthesis.
978
+ #
979
+ # When it processes a request, the service attempts to align the text and the audio
980
+ # that are provided for the prompt. The text that is passed with a prompt must match
981
+ # the spoken audio as closely as possible. Optimally, the text and audio match
982
+ # exactly. The service does its best to align the specified text with the audio, and
983
+ # it can often compensate for mismatches between the two. But if the service cannot
984
+ # effectively align the text and the audio, possibly because the magnitude of
985
+ # mismatches between the two is too great, processing of the prompt fails.
986
+ #
987
+ # ### Evaluating a prompt
988
+ #
989
+ # Always listen to and evaluate a prompt to determine its quality before using it
990
+ # in production. To evaluate a prompt, include only the single prompt in a speech
991
+ # synthesis request by using the following SSML extension, in this case for a prompt
992
+ # whose ID is `goodbye`:
993
+ #
994
+ # `<ibm:prompt id="goodbye"/>`
995
+ #
996
+ # In some cases, you might need to rerecord and resubmit a prompt as many as five
997
+ # times to address the following possible problems:
998
+ # * The service might fail to detect a mismatch between the prompt's text and audio.
999
+ # The longer the prompt, the greater the chance for misalignment between its text
1000
+ # and audio. Therefore, multiple shorter prompts are preferable to a single long
1001
+ # prompt.
1002
+ # * The text of a prompt might include a word that the service does not recognize.
1003
+ # In this case, you can create a custom word and pronunciation pair to tell the
1004
+ # service how to pronounce the word. You must then re-create the prompt.
1005
+ # * The quality of the input audio might be insufficient or the service's processing
1006
+ # of the audio might fail to detect the intended prosody. Submitting new audio for
1007
+ # the prompt can correct these issues.
1008
+ #
1009
+ # If a prompt that is created without a speaker ID does not adequately reflect the
1010
+ # intended prosody, enrolling the speaker and providing a speaker ID for the prompt
1011
+ # is one recommended means of potentially improving the quality of the prompt. This
1012
+ # is especially important for shorter prompts such as "good-bye" or "thank you,"
1013
+ # where less audio data makes it more difficult to match the prosody of the speaker.
1014
+ # Custom prompts are supported only for use with US English custom models and
1015
+ # voices.
1016
+ #
1017
+ # **See also:**
1018
+ # * [Add a custom
1019
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-add-prompt)
1020
+ # * [Evaluate a custom
1021
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-evaluate-prompt)
1022
+ # * [Rules for creating custom
1023
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-prompts).
1024
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1025
+ # credentials for the instance of the service that owns the custom model.
1026
+ # @param prompt_id [String] The identifier of the prompt that is to be added to the custom model:
1027
+ # * Include a maximum of 49 characters in the ID.
1028
+ # * Include only alphanumeric characters and `_` (underscores) in the ID.
1029
+ # * Do not include XML sensitive characters (double quotes, single quotes,
1030
+ # ampersands, angle brackets, and slashes) in the ID.
1031
+ # * To add a new prompt, the ID must be unique for the specified custom model.
1032
+ # Otherwise, the new information for the prompt overwrites the existing prompt that
1033
+ # has that ID.
1034
+ # @param metadata [PromptMetadata] Information about the prompt that is to be added to a custom model. The following
1035
+ # example of a `PromptMetadata` object includes both the required prompt text and an
1036
+ # optional speaker model ID:
1037
+ #
1038
+ # `{ "prompt_text": "Thank you and good-bye!", "speaker_id":
1039
+ # "823068b2-ed4e-11ea-b6e0-7b6456aa95cc" }`.
1040
+ # @param file [File] An audio file that speaks the text of the prompt with intonation and prosody that
1041
+ # matches how you would like the prompt to be spoken.
1042
+ # * The prompt audio must be in WAV format and must have a minimum sampling rate of
1043
+ # 16 kHz. The service accepts audio with higher sampling rates. The service
1044
+ # transcodes all audio to 16 kHz before processing it.
1045
+ # * The length of the prompt audio is limited to 30 seconds.
1046
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1047
+ def add_custom_prompt(customization_id:, prompt_id:, metadata:, file:)
1048
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1049
+
1050
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1051
+
1052
+ raise ArgumentError.new("metadata must be provided") if metadata.nil?
1053
+
1054
+ raise ArgumentError.new("file must be provided") if file.nil?
1055
+
1056
+ headers = {
1057
+ }
1058
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "add_custom_prompt")
1059
+ headers.merge!(sdk_headers)
1060
+
1061
+ form_data = {}
1062
+
1063
+ form_data[:metadata] = HTTP::FormData::Part.new(metadata.to_s, content_type: "application/json")
1064
+
1065
+ unless file.instance_of?(StringIO) || file.instance_of?(File)
1066
+ file = file.respond_to?(:to_json) ? StringIO.new(file.to_json) : StringIO.new(file)
1067
+ end
1068
+ form_data[:file] = HTTP::FormData::File.new(file, content_type: "audio/wav", filename: file.respond_to?(:path) ? file.path : nil)
1069
+
1070
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1071
+
1072
+ response = request(
1073
+ method: "POST",
1074
+ url: method_url,
1075
+ headers: headers,
1076
+ form: form_data,
1077
+ accept_json: true
1078
+ )
1079
+ response
1080
+ end
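The `prompt_id` rules, the `PromptMetadata` example, and the evaluation SSML described in the doc comment above can be sketched as a few small helpers. This is a minimal illustration, not part of the gem: the names `valid_prompt_id?`, `build_prompt_metadata`, and `prompt_ssml` are hypothetical. The metadata is serialized to a JSON string up front because the method body shown above calls `metadata.to_s`, and `Hash#to_s` does not produce JSON.

```ruby
require "json"

# Hypothetical helpers for illustration only; none of them ship with ibm_watson.

# Checks the documented prompt ID rules: at most 49 characters,
# alphanumeric characters and underscores only.
def valid_prompt_id?(prompt_id)
  prompt_id.is_a?(String) && /\A[A-Za-z0-9_]{1,49}\z/.match?(prompt_id)
end

# Serializes prompt metadata to a JSON string. The prompt text is required;
# the speaker ID is optional and is omitted when not given.
def build_prompt_metadata(prompt_text:, speaker_id: nil)
  metadata = { "prompt_text" => prompt_text }
  metadata["speaker_id"] = speaker_id unless speaker_id.nil?
  JSON.generate(metadata)
end

# Builds the SSML extension that synthesizes a single prompt for evaluation.
def prompt_ssml(prompt_id)
  raise ArgumentError, "invalid prompt_id: #{prompt_id.inspect}" unless valid_prompt_id?(prompt_id)

  %(<ibm:prompt id="#{prompt_id}"/>)
end
```

Under these assumptions, the string returned by `build_prompt_metadata(prompt_text: "Thank you and good-bye!")` could be passed as the `metadata:` argument of `add_custom_prompt`, and the string returned by `prompt_ssml("goodbye")` could be used as the text of a synthesis request once the prompt is in the `available` state.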
1081
+
1082
+ ##
1083
+ # @!method get_custom_prompt(customization_id:, prompt_id:)
1084
+ # Get a custom prompt.
1085
+ # Gets information about a specified custom prompt for a specified custom model. The
1086
+ # information includes the prompt ID, prompt text, status, and optional speaker ID
1087
+ # for the specified prompt. You must use credentials for the instance of
1088
+ # the service that owns the custom model. Custom prompts are supported only for use
1089
+ # with US English custom models and voices.
1090
+ #
1091
+ # **See also:** [Listing custom
1092
+ # prompts](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-list).
1093
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1094
+ # credentials for the instance of the service that owns the custom model.
1095
+ # @param prompt_id [String] The identifier (name) of the prompt.
1096
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1097
+ def get_custom_prompt(customization_id:, prompt_id:)
1098
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1099
+
1100
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1101
+
1102
+ headers = {
1103
+ }
1104
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_custom_prompt")
1105
+ headers.merge!(sdk_headers)
1106
+
1107
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1108
+
1109
+ response = request(
1110
+ method: "GET",
1111
+ url: method_url,
1112
+ headers: headers,
1113
+ accept_json: true
1114
+ )
1115
+ response
1116
+ end
1117
+
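The prompt methods above all build the same request path by percent-encoding both identifiers with `ERB::Util.url_encode`. The sketch below isolates that step so the encoding is easy to see. The helper name `custom_prompt_path` and the sample IDs are hypothetical; they are not part of the SDK.

```ruby
require "erb"

# Hypothetical helper (not part of the SDK): mirrors how the generated
# methods format the path for a single custom prompt.
def custom_prompt_path(customization_id, prompt_id)
  "/v1/customizations/%s/prompts/%s" % [
    ERB::Util.url_encode(customization_id),
    ERB::Util.url_encode(prompt_id)
  ]
end

# A prompt ID containing a space and a slash is escaped so that it
# cannot change the structure of the path.
puts custom_prompt_path("abc-123", "good bye/now")
```

Encoding each segment separately, rather than the finished path, keeps a `/` inside an identifier from being read as a path separator.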
1118
+ ##
1119
+ # @!method delete_custom_prompt(customization_id:, prompt_id:)
1120
+ # Delete a custom prompt.
1121
+ # Deletes an existing custom prompt from a custom model. The service deletes the
1122
+ # prompt with the specified ID. You must use credentials for the instance of the
1123
+ # service that owns the custom model from which the prompt is to be deleted.
1124
+ #
1125
+ # **Caution:** Deleting a custom prompt elicits a 400 response code from synthesis
1126
+ # requests that attempt to use the prompt. Make sure that you do not attempt to use
1127
+ # a deleted prompt in a production application. Custom prompts are supported only
1128
+ # for use with US English custom models and voices.
1129
+ #
1130
+ # **See also:** [Deleting a custom
1131
+ # prompt](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-custom-prompts#tbe-custom-prompts-delete).
1132
+ # @param customization_id [String] The customization ID (GUID) of the custom model. You must make the request with
1133
+ # credentials for the instance of the service that owns the custom model.
1134
+ # @param prompt_id [String] The identifier (name) of the prompt that is to be deleted.
1135
+ # @return [nil]
1136
+ def delete_custom_prompt(customization_id:, prompt_id:)
1137
+ raise ArgumentError.new("customization_id must be provided") if customization_id.nil?
1138
+
1139
+ raise ArgumentError.new("prompt_id must be provided") if prompt_id.nil?
1140
+
1141
+ headers = {
1142
+ }
1143
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_custom_prompt")
1144
+ headers.merge!(sdk_headers)
1145
+
1146
+ method_url = "/v1/customizations/%s/prompts/%s" % [ERB::Util.url_encode(customization_id), ERB::Util.url_encode(prompt_id)]
1147
+
1148
+ request(
1149
+ method: "DELETE",
1150
+ url: method_url,
1151
+ headers: headers,
1152
+ accept_json: false
1153
+ )
1154
+ nil
1155
+ end
1156
+ #########################
1157
+ # Speaker models
1158
+ #########################
1159
+
1160
+ ##
1161
+ # @!method list_speaker_models
1162
+ # List speaker models.
1163
+ # Lists information about all speaker models that are defined for a service
1164
+ # instance. The information includes the speaker ID and speaker name of each defined
1165
+ # speaker. You must use credentials for the instance of a service to list its
1166
+ # speakers. Speaker models and the custom prompts with which they are used are
1167
+ # supported only for use with US English custom models and voices.
1168
+ #
1169
+ # **See also:** [Listing speaker
1170
+ # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list).
1171
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1172
+ def list_speaker_models
1173
+ headers = {
1174
+ }
1175
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "list_speaker_models")
1176
+ headers.merge!(sdk_headers)
1177
+
1178
+ method_url = "/v1/speakers"
1179
+
1180
+ response = request(
1181
+ method: "GET",
1182
+ url: method_url,
1183
+ headers: headers,
1184
+ accept_json: true
1185
+ )
1186
+ response
1187
+ end
1188
+
1189
+ ##
1190
+ # @!method create_speaker_model(speaker_name:, audio:)
1191
+ # Create a speaker model.
1192
+ # Creates a new speaker model, which is an optional enrollment token for users who
1193
+ # are to add prompts to custom models. A speaker model contains information about a
1194
+ # user's voice. The service extracts this information from a WAV audio sample that
1195
+ # you pass as the body of the request. Associating a speaker model with a prompt is
1196
+ # optional, but the information that is extracted from the speaker model helps the
1197
+ # service learn about the speaker's voice.
1198
+ #
1199
+ # A speaker model can make an appreciable difference in the quality of prompts,
1200
+ # especially short prompts with relatively little audio, that are associated with
1201
+ # that speaker. A speaker model can help the service produce a prompt with more
1202
+ # confidence; the lack of a speaker model can potentially compromise the quality of
1203
+ # a prompt.
1204
+ #
1205
+ # The gender of the speaker who creates a speaker model does not need to match the
1206
+ # gender of a voice that is used with prompts that are associated with that speaker
1207
+ # model. For example, a speaker model that is created by a male speaker can be
1208
+ # associated with prompts that are spoken by female voices.
1209
+ #
1210
+ # You create a speaker model for a given instance of the service. The new speaker
1211
+ # model is owned by the service instance whose credentials are used to create it.
1212
+ # That same speaker can then be used to create prompts for all custom models within
1213
+ # that service instance. No language is associated with a speaker model, but each
1214
+ # custom model has a single specified language. You can add prompts only to US
1215
+ # English models.
1216
+ #
1217
+ # You specify a name for the speaker when you create it. The name must be unique
1218
+ # among all speaker names for the owning service instance. To re-create a speaker
1219
+ # model for an existing speaker name, you must first delete the existing speaker
1220
+ # model that has that name.
1221
+ #
1222
+ # Speaker enrollment is a synchronous operation. Although it accepts more audio data
1223
+ # than a prompt, the process of adding a speaker is very fast. The service simply
1224
+ # extracts information about the speaker's voice from the audio. Unlike prompts,
1225
+ # speaker models neither need nor accept a transcription of the audio. When the call
1226
+ # returns, the audio is fully processed and the speaker enrollment is complete.
1227
+ #
1228
+ # The service returns a speaker ID in the response. A speaker ID is a globally unique
1229
+ # identifier (GUID) that you use to identify the speaker in subsequent requests to
1230
+ # the service. Speaker models and the custom prompts with which they are used are
1231
+ # supported only for use with US English custom models and voices.
1232
+ #
1233
+ # **See also:**
1234
+ # * [Create a speaker
1235
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-create#tbe-create-speaker-model)
1236
+ # * [Rules for creating speaker
1237
+ # models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-rules#tbe-rules-speakers).
1238
+ # @param speaker_name [String] The name of the speaker that is to be added to the service instance.
1239
+ # * Include a maximum of 49 characters in the name.
1240
+ # * Include only alphanumeric characters and `_` (underscores) in the name.
1241
+ # * Do not include XML-sensitive characters (double quotes, single quotes,
1242
+ # ampersands, angle brackets, and slashes) in the name.
1243
+ # * Do not use the name of an existing speaker that is already defined for the
1244
+ # service instance.
1245
+ # @param audio [File] An enrollment audio file that contains a sample of the speaker's voice.
1246
+ # * The enrollment audio must be in WAV format and must have a minimum sampling rate
1247
+ # of 16 kHz. The service accepts audio with higher sampling rates. It transcodes all
1248
+ # audio to 16 kHz before processing it.
1249
+ # * The length of the enrollment audio is limited to 1 minute. Speaking one or two
1250
+ # paragraphs of text that include five to ten sentences is recommended.
1251
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1252
+ def create_speaker_model(speaker_name:, audio:)
1253
+ raise ArgumentError.new("speaker_name must be provided") if speaker_name.nil?
1254
+
1255
+ raise ArgumentError.new("audio must be provided") if audio.nil?
1256
+
1257
+ headers = {
1258
+ }
1259
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "create_speaker_model")
1260
+ headers.merge!(sdk_headers)
1261
+
1262
+ params = {
1263
+ "speaker_name" => speaker_name
1264
+ }
1265
+
1266
+ data = audio
1267
+ headers["Content-Type"] = "audio/wav"
1268
+
1269
+ method_url = "/v1/speakers"
1270
+
1271
+ response = request(
1272
+ method: "POST",
1273
+ url: method_url,
1274
+ headers: headers,
1275
+ params: params,
1276
+ data: data,
1277
+ accept_json: true
1278
+ )
1279
+ response
1280
+ end
1281
+
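`create_speaker_model` only checks its arguments for `nil`; the naming and audio rules in its documentation are enforced by the service. A caller might want to check them locally first. The following is a minimal sketch of such checks, not SDK code: `valid_speaker_name?` and `wav_sample_rate` are hypothetical helpers, the name pattern assumes "alphanumeric" means ASCII letters and digits, and the header parsing assumes a canonical WAV layout in which the `fmt ` chunk immediately follows the RIFF header.

```ruby
# Hypothetical pre-flight checks for create_speaker_model (not part of the SDK).

# Assumption: "alphanumeric" means ASCII letters and digits.
# The documented limit is 49 characters, with underscores allowed.
SPEAKER_NAME_PATTERN = /\A[A-Za-z0-9_]{1,49}\z/

def valid_speaker_name?(name)
  name.is_a?(String) && SPEAKER_NAME_PATTERN.match?(name)
end

# Returns the sample rate in Hz from the start of a WAV file, or nil when the
# bytes do not look like a canonical header ("RIFF", size, "WAVE", "fmt ").
# In that layout the sample rate is a little-endian uint32 at byte offset 24.
def wav_sample_rate(bytes)
  return nil if bytes.nil? || bytes.bytesize < 28
  return nil unless bytes.byteslice(0, 4) == "RIFF"
  return nil unless bytes.byteslice(8, 4) == "WAVE"
  return nil unless bytes.byteslice(12, 4) == "fmt "

  bytes.byteslice(24, 4).unpack1("V")
end

# Demo with a synthetic 28-byte header prefix: PCM, mono, 16 kHz.
header = ["RIFF", 36, "WAVE", "fmt ", 16, 1, 1, 16_000].pack("a4Va4a4VvvV")
rate = wav_sample_rate(header)

puts valid_speaker_name?("speaker_one")
puts !rate.nil? && rate >= 16_000
```

With a real file, the first 28 bytes could be read with `File.binread(path, 28)`. WAV files that place other chunks before `fmt ` would need a proper chunk walker, which this sketch deliberately omits.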
1282
+ ##
1283
+ # @!method get_speaker_model(speaker_id:)
1284
+ # Get a speaker model.
1285
+ # Gets information about all prompts that are defined by a specified speaker for all
1286
+ # custom models that are owned by a service instance. The information is grouped by
1287
+ # the customization IDs of the custom models. For each custom model, the information
1288
+ # lists information about each prompt that is defined for that custom model by the
1289
+ # speaker. You must use credentials for the instance of the service that owns a
1290
+ # speaker model to list its prompts. Speaker models and the custom prompts with
1291
+ # which they are used are supported only for use with US English custom models and
1292
+ # voices.
1293
+ #
1294
+ # **See also:** [Listing the custom prompts for a speaker
1295
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-list-prompts).
1296
+ # @param speaker_id [String] The speaker ID (GUID) of the speaker model. You must make the request with service
1297
+ # credentials for the instance of the service that owns the speaker model.
1298
+ # @return [IBMCloudSdkCore::DetailedResponse] A `IBMCloudSdkCore::DetailedResponse` object representing the response.
1299
+ def get_speaker_model(speaker_id:)
1300
+ raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?
1301
+
1302
+ headers = {
1303
+ }
1304
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "get_speaker_model")
1305
+ headers.merge!(sdk_headers)
1306
+
1307
+ method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]
1308
+
1309
+ response = request(
1310
+ method: "GET",
1311
+ url: method_url,
1312
+ headers: headers,
1313
+ accept_json: true
1314
+ )
1315
+ response
1316
+ end
1317
+
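The documentation for `get_speaker_model` says the result is grouped by customization ID, with the speaker's prompts listed under each custom model. The sketch below walks a body of that kind. The JSON shape (a `customizations` array whose entries carry `customization_id` and `prompts`) and all sample values are assumptions made for illustration; check them against the service's API reference. In the SDK the parsed body would normally come from the returned `DetailedResponse`, here it is parsed from a literal string so the example runs on its own.

```ruby
require "json"

# Assumed response body shape, with made-up sample values.
sample_body = JSON.parse(<<~BODY)
  {
    "customizations": [
      {
        "customization_id": "model-1",
        "prompts": [
          { "prompt_id": "greeting", "prompt": "Hello and welcome!", "status": "available" },
          { "prompt_id": "goodbye", "prompt": "Thank you and goodbye!", "status": "processing" }
        ]
      },
      { "customization_id": "model-2", "prompts": [] }
    ]
  }
BODY

# Hypothetical helper: maps each customization ID to the IDs of the prompts
# that the speaker defined for that custom model.
def prompt_ids_by_customization(body)
  body.fetch("customizations", []).each_with_object({}) do |model, acc|
    acc[model["customization_id"]] = model.fetch("prompts", []).map { |p| p["prompt_id"] }
  end
end

prompt_ids_by_customization(sample_body).each do |customization_id, prompt_ids|
  puts "#{customization_id}: #{prompt_ids.join(', ')}"
end
```

Using `fetch` with a default keeps the walk from raising when a key is absent, which is a reasonable precaution given that the shape here is assumed rather than confirmed.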
1318
+ ##
1319
+ # @!method delete_speaker_model(speaker_id:)
1320
+ # Delete a speaker model.
1321
+ # Deletes an existing speaker model from the service instance. The service deletes
1322
+ # the enrolled speaker with the specified speaker ID. You must use credentials for
1323
+ # the instance of the service that owns a speaker model to delete the speaker.
1324
+ #
1325
+ # Any prompts that are associated with the deleted speaker are not affected by the
1326
+ # speaker's deletion. The prosodic data that defines the quality of a prompt is
1327
+ # established when the prompt is created. A prompt is static and remains unaffected
1328
+ # by deletion of its associated speaker. However, the prompt cannot be resubmitted
1329
+ # or updated with its original speaker once that speaker is deleted. Speaker models
1330
+ # and the custom prompts with which they are used are supported only for use with US
1331
+ # English custom models and voices.
1332
+ #
1333
+ # **See also:** [Deleting a speaker
1334
+ # model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-tbe-speaker-models#tbe-speaker-models-delete).
1335
+ # @param speaker_id [String] The speaker ID (GUID) of the speaker model. You must make the request with service
1336
+ # credentials for the instance of the service that owns the speaker model.
1337
+ # @return [nil]
1338
+ def delete_speaker_model(speaker_id:)
1339
+ raise ArgumentError.new("speaker_id must be provided") if speaker_id.nil?
1340
+
1341
+ headers = {
1342
+ }
1343
+ sdk_headers = Common.new.get_sdk_headers("text_to_speech", "V1", "delete_speaker_model")
1344
+ headers.merge!(sdk_headers)
1345
+
1346
+ method_url = "/v1/speakers/%s" % [ERB::Util.url_encode(speaker_id)]
1347
+
1348
+ request(
1349
+ method: "DELETE",
1350
+ url: method_url,
1351
+ headers: headers,
1352
+ accept_json: false
1353
+ )
1354
+ nil
1355
+ end
1356
+ #########################
775
1357
  # User data
776
1358
  #########################
777
1359