google-cloud-speech 0.31.0 → 0.31.1
- checksums.yaml +4 -4
- data/README.md +5 -5
- data/lib/google/cloud/speech.rb +4 -4
- data/lib/google/cloud/speech/v1.rb +4 -4
- data/lib/google/cloud/speech/v1/doc/google/cloud/speech/v1/cloud_speech.rb +97 -97
- data/lib/google/cloud/speech/v1/doc/google/longrunning/operations.rb +9 -9
- data/lib/google/cloud/speech/v1/doc/google/protobuf/any.rb +8 -8
- data/lib/google/cloud/speech/v1/doc/google/protobuf/duration.rb +3 -3
- data/lib/google/cloud/speech/v1/doc/google/rpc/status.rb +11 -11
- data/lib/google/cloud/speech/v1/speech_client.rb +2 -2
- data/lib/google/cloud/speech/v1p1beta1.rb +4 -4
- data/lib/google/cloud/speech/v1p1beta1/doc/google/cloud/speech/v1p1beta1/cloud_speech.rb +107 -107
- data/lib/google/cloud/speech/v1p1beta1/doc/google/longrunning/operations.rb +9 -9
- data/lib/google/cloud/speech/v1p1beta1/doc/google/protobuf/any.rb +8 -8
- data/lib/google/cloud/speech/v1p1beta1/doc/google/protobuf/duration.rb +3 -3
- data/lib/google/cloud/speech/v1p1beta1/doc/google/rpc/status.rb +11 -11
- data/lib/google/cloud/speech/v1p1beta1/speech_client.rb +2 -2
- metadata +2 -4
- data/lib/google/cloud/speech/v1/doc/overview.rb +0 -99
- data/lib/google/cloud/speech/v1p1beta1/doc/overview.rb +0 -99
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 6c020aae8d167db11676cfec941720c95d808833a976c7dff4cc5ed70adb76bb
+data.tar.gz: ba7bd5abf5fbfe047a62e2208d14df69a295782edbe8a5d96ecdd161ebe07bde
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: cc0c3b397c928b5c9f5b12c113aa03e9547aee071e5f5ff7e6591d2dd1b24566210b6c1581c0652a039851faab139175282a9a8735d213f0a3831db41701013e
+data.tar.gz: 394fe29602211aa0a6a847631ebe7d26cd0996f9ba9d4a1c31a5144394b1726bc9ecb289e0e6eaf6e4a23eed5235b4cbdc08b1e701ac7f7fa0a4027ffa748791
data/README.md CHANGED
@@ -1,4 +1,4 @@
-# Ruby Client for Cloud Speech API ([Alpha](https://github.com/
+# Ruby Client for Cloud Speech API ([Alpha](https://github.com/googleapis/google-cloud-ruby#versioning))
 
 [Cloud Speech API][Product Documentation]:
 Converts audio to text by applying powerful neural network models.
@@ -12,7 +12,7 @@ steps:
 1. [Select or create a Cloud Platform project.](https://console.cloud.google.com/project)
 2. [Enable billing for your project.](https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project)
 3. [Enable the Cloud Speech API.](https://console.cloud.google.com/apis/library/speech.googleapis.com)
-4. [Setup Authentication.](https://
+4. [Setup Authentication.](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud/master/guides/authentication)
 
 ### Installation
 ```
@@ -50,17 +50,17 @@ response = speech_client.recognize(config, audio)
 to see other available methods on the client.
 - Read the [Cloud Speech API Product documentation][Product Documentation]
 to learn more about the product and see How-to Guides.
-- View this [repository's main README](https://github.com/
+- View this [repository's main README](https://github.com/googleapis/google-cloud-ruby/blob/master/README.md)
 to see the full list of Cloud APIs that we cover.
 
-[Client Library Documentation]: https://
+[Client Library Documentation]: https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud-speech/latest/google/cloud/speech/v1
 [Product Documentation]: https://cloud.google.com/speech
 
 ## Enabling Logging
 
 To enable logging for this library, set the logger for the underlying [gRPC](https://github.com/grpc/grpc/tree/master/src/ruby) library.
 The logger that you set may be a Ruby stdlib [`Logger`](https://ruby-doc.org/stdlib-2.5.0/libdoc/logger/rdoc/Logger.html) as shown below,
-or a [`Google::Cloud::Logging::Logger`](https://
+or a [`Google::Cloud::Logging::Logger`](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud-logging/latest/google/cloud/logging/logger)
 that will write logs to [Stackdriver Logging](https://cloud.google.com/logging/). See [grpc/logconfig.rb](https://github.com/grpc/grpc/blob/master/src/ruby/lib/grpc/logconfig.rb)
 and the gRPC [spec_helper.rb](https://github.com/grpc/grpc/blob/master/src/ruby/spec/spec_helper.rb) for additional information.
 
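The README's logging section above describes routing gRPC's internal logging to a Ruby stdlib `Logger`. A minimal sketch of that override, using only the stdlib (the `MyLogger` module name is illustrative; the real gRPC gem looks up `GRPC.logger`, so extending `GRPC` with a `logger` method replaces its default):

```ruby
require "logger"

# A module whose `logger` method returns a stdlib Logger instance.
module MyLogger
  LOGGER = Logger.new $stderr
  LOGGER.level = Logger::WARN

  def logger
    LOGGER
  end
end

# gRPC calls GRPC.logger for its log output; extending the GRPC module
# with MyLogger makes that call return our stdlib Logger instead.
module GRPC
  extend MyLogger
end
```

With the real gem loaded, the same two module definitions apply unchanged; here they merely define the modules so the pattern can be exercised standalone.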
data/lib/google/cloud/speech.rb CHANGED
@@ -21,7 +21,7 @@ module Google
 # rubocop:disable LineLength
 
 ##
-# # Ruby Client for Cloud Speech API ([Alpha](https://github.com/
+# # Ruby Client for Cloud Speech API ([Alpha](https://github.com/googleapis/google-cloud-ruby#versioning))
 #
 # [Cloud Speech API][Product Documentation]:
 # Converts audio to text by applying powerful neural network models.
@@ -34,7 +34,7 @@ module Google
 # 1. [Select or create a Cloud Platform project.](https://console.cloud.google.com/project)
 # 2. [Enable billing for your project.](https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project)
 # 3. [Enable the Cloud Speech API.](https://console.cloud.google.com/apis/library/speech.googleapis.com)
-# 4. [Setup Authentication.](https://
+# 4. [Setup Authentication.](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud/master/guides/authentication)
 #
 # ### Installation
 # ```
@@ -70,7 +70,7 @@ module Google
 # ### Next Steps
 # - Read the [Cloud Speech API Product documentation][Product Documentation]
 # to learn more about the product and see How-to Guides.
-# - View this [repository's main README](https://github.com/
+# - View this [repository's main README](https://github.com/googleapis/google-cloud-ruby/blob/master/README.md)
 # to see the full list of Cloud APIs that we cover.
 #
 # [Product Documentation]: https://cloud.google.com/speech
@@ -79,7 +79,7 @@ module Google
 #
 # To enable logging for this library, set the logger for the underlying [gRPC](https://github.com/grpc/grpc/tree/master/src/ruby) library.
 # The logger that you set may be a Ruby stdlib [`Logger`](https://ruby-doc.org/stdlib-2.5.0/libdoc/logger/rdoc/Logger.html) as shown below,
-# or a [`Google::Cloud::Logging::Logger`](https://
+# or a [`Google::Cloud::Logging::Logger`](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud-logging/latest/google/cloud/logging/logger)
 # that will write logs to [Stackdriver Logging](https://cloud.google.com/logging/). See [grpc/logconfig.rb](https://github.com/grpc/grpc/blob/master/src/ruby/lib/grpc/logconfig.rb)
 # and the gRPC [spec_helper.rb](https://github.com/grpc/grpc/blob/master/src/ruby/spec/spec_helper.rb) for additional information.
 #
data/lib/google/cloud/speech/v1.rb CHANGED
@@ -22,7 +22,7 @@ module Google
 # rubocop:disable LineLength
 
 ##
-# # Ruby Client for Cloud Speech API ([Alpha](https://github.com/
+# # Ruby Client for Cloud Speech API ([Alpha](https://github.com/googleapis/google-cloud-ruby#versioning))
 #
 # [Cloud Speech API][Product Documentation]:
 # Converts audio to text by applying powerful neural network models.
@@ -35,7 +35,7 @@ module Google
 # 1. [Select or create a Cloud Platform project.](https://console.cloud.google.com/project)
 # 2. [Enable billing for your project.](https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project)
 # 3. [Enable the Cloud Speech API.](https://console.cloud.google.com/apis/library/speech.googleapis.com)
-# 4. [Setup Authentication.](https://
+# 4. [Setup Authentication.](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud/master/guides/authentication)
 #
 # ### Installation
 # ```
@@ -64,7 +64,7 @@ module Google
 # ### Next Steps
 # - Read the [Cloud Speech API Product documentation][Product Documentation]
 # to learn more about the product and see How-to Guides.
-# - View this [repository's main README](https://github.com/
+# - View this [repository's main README](https://github.com/googleapis/google-cloud-ruby/blob/master/README.md)
 # to see the full list of Cloud APIs that we cover.
 #
 # [Product Documentation]: https://cloud.google.com/speech
@@ -73,7 +73,7 @@ module Google
 #
 # To enable logging for this library, set the logger for the underlying [gRPC](https://github.com/grpc/grpc/tree/master/src/ruby) library.
 # The logger that you set may be a Ruby stdlib [`Logger`](https://ruby-doc.org/stdlib-2.5.0/libdoc/logger/rdoc/Logger.html) as shown below,
-# or a [`Google::Cloud::Logging::Logger`](https://
+# or a [`Google::Cloud::Logging::Logger`](https://googleapis.github.io/google-cloud-ruby/#/docs/google-cloud-logging/latest/google/cloud/logging/logger)
 # that will write logs to [Stackdriver Logging](https://cloud.google.com/logging/). See [grpc/logconfig.rb](https://github.com/grpc/grpc/blob/master/src/ruby/lib/grpc/logconfig.rb)
 # and the gRPC [spec_helper.rb](https://github.com/grpc/grpc/blob/master/src/ruby/spec/spec_helper.rb) for additional information.
 #
data/lib/google/cloud/speech/v1/doc/google/cloud/speech/v1/cloud_speech.rb CHANGED
@@ -17,7 +17,7 @@ module Google
 module Cloud
 module Speech
 module V1
-# The top-level message sent by the client for the
+# The top-level message sent by the client for the `Recognize` method.
 # @!attribute [rw] config
 # @return [Google::Cloud::Speech::V1::RecognitionConfig]
 # *Required* Provides information to the recognizer that specifies how to
@@ -27,7 +27,7 @@ module Google
 # *Required* The audio data to be recognized.
 class RecognizeRequest; end
 
-# The top-level message sent by the client for the
+# The top-level message sent by the client for the `LongRunningRecognize`
 # method.
 # @!attribute [rw] config
 # @return [Google::Cloud::Speech::V1::RecognitionConfig]
@@ -38,24 +38,24 @@ module Google
 # *Required* The audio data to be recognized.
 class LongRunningRecognizeRequest; end
 
-# The top-level message sent by the client for the
-# Multiple
-# must contain a
-# All subsequent messages must contain
-#
+# The top-level message sent by the client for the `StreamingRecognize` method.
+# Multiple `StreamingRecognizeRequest` messages are sent. The first message
+# must contain a `streaming_config` message and must not contain `audio` data.
+# All subsequent messages must contain `audio` data and must not contain a
+# `streaming_config` message.
 # @!attribute [rw] streaming_config
 # @return [Google::Cloud::Speech::V1::StreamingRecognitionConfig]
 # Provides information to the recognizer that specifies how to process the
-# request. The first
-#
+# request. The first `StreamingRecognizeRequest` message must contain a
+# `streaming_config` message.
 # @!attribute [rw] audio_content
 # @return [String]
 # The audio data to be recognized. Sequential chunks of audio data are sent
-# in sequential
-#
-# and all subsequent
-#
-#
+# in sequential `StreamingRecognizeRequest` messages. The first
+# `StreamingRecognizeRequest` message must not contain `audio_content` data
+# and all subsequent `StreamingRecognizeRequest` messages must contain
+# `audio_content` data. The audio bytes must be encoded as specified in
+# `RecognitionConfig`. Note: as with all bytes fields, protobuffers use a
 # pure binary representation (not base64). See
 # [content limits](https://cloud.google.com/speech-to-text/quotas#content).
 class StreamingRecognizeRequest; end
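The ordering rule in the `StreamingRecognizeRequest` docs above (first message carries only `streaming_config`, every later message carries only `audio_content`) can be sketched with plain hashes, the shape the Ruby client accepts for request objects. The `build_streaming_requests` helper is hypothetical, not part of the gem:

```ruby
# Assemble the message sequence for a streaming call: one leading request
# with only :streaming_config, then one request per audio chunk with only
# :audio_content, mirroring the field names documented above.
def build_streaming_requests(streaming_config, audio_chunks)
  first = { streaming_config: streaming_config }
  rest  = audio_chunks.map { |chunk| { audio_content: chunk } }
  [first] + rest
end

requests = build_streaming_requests(
  { config: { encoding: :LINEAR16, sample_rate_hertz: 16_000, language_code: "en-US" } },
  ["\x00\x01", "\x02\x03"]  # stand-in audio byte chunks
)
```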
@@ -68,40 +68,40 @@ module Google
 # process the request.
 # @!attribute [rw] single_utterance
 # @return [true, false]
-# *Optional* If
+# *Optional* If `false` or omitted, the recognizer will perform continuous
 # recognition (continuing to wait for and process audio even if the user
 # pauses speaking) until the client closes the input stream (gRPC API) or
 # until the maximum time limit has been reached. May return multiple
-#
+# `StreamingRecognitionResult`s with the `is_final` flag set to `true`.
 #
-# If
+# If `true`, the recognizer will detect a single spoken utterance. When it
 # detects that the user has paused or stopped speaking, it will return an
-#
-# more than one
-#
+# `END_OF_SINGLE_UTTERANCE` event and cease recognition. It will return no
+# more than one `StreamingRecognitionResult` with the `is_final` flag set to
+# `true`.
 # @!attribute [rw] interim_results
 # @return [true, false]
-# *Optional* If
+# *Optional* If `true`, interim results (tentative hypotheses) may be
 # returned as they become available (these interim results are indicated with
-# the
-# If
+# the `is_final=false` flag).
+# If `false` or omitted, only `is_final=true` result(s) are returned.
 class StreamingRecognitionConfig; end
 
 # Provides information to the recognizer that specifies how to process the
 # request.
 # @!attribute [rw] encoding
 # @return [Google::Cloud::Speech::V1::RecognitionConfig::AudioEncoding]
-# Encoding of audio data sent in all
-# This field is optional for
+# Encoding of audio data sent in all `RecognitionAudio` messages.
+# This field is optional for `FLAC` and `WAV` audio files and required
 # for all other audio formats. For details, see {Google::Cloud::Speech::V1::RecognitionConfig::AudioEncoding AudioEncoding}.
 # @!attribute [rw] sample_rate_hertz
 # @return [Integer]
 # Sample rate in Hertz of the audio data sent in all
-#
+# `RecognitionAudio` messages. Valid values are: 8000-48000.
 # 16000 is optimal. For best results, set the sampling rate of the audio
 # source to 16000 Hz. If that's not possible, use the native sample rate of
 # the audio source (instead of re-sampling).
-# This field is optional for
+# This field is optional for `FLAC` and `WAV` audio files and required
 # for all other audio formats. For details, see {Google::Cloud::Speech::V1::RecognitionConfig::AudioEncoding AudioEncoding}.
 # @!attribute [rw] language_code
 # @return [String]
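A `StreamingRecognitionConfig` in the hash form the Ruby client accepts, with the two flags documented above; the field names mirror the attributes, and the values are illustrative:

```ruby
# Hash-form streaming config; :config mirrors RecognitionConfig, and
# :single_utterance / :interim_results are the flags described above.
streaming_config = {
  config: {
    encoding:          :LINEAR16,
    sample_rate_hertz: 16_000,   # docs above: valid values are 8000-48000
    language_code:     "en-US"
  },
  single_utterance: false,  # keep recognizing across pauses in speech
  interim_results:  true    # stream is_final=false hypotheses as they form
}

# The documented sample-rate range, checked against this config.
valid_rate = (8_000..48_000).cover?(streaming_config[:config][:sample_rate_hertz])
```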
@@ -113,16 +113,16 @@ module Google
 # @!attribute [rw] max_alternatives
 # @return [Integer]
 # *Optional* Maximum number of recognition hypotheses to be returned.
-# Specifically, the maximum number of
-# within each
-# The server may return fewer than
-# Valid values are
+# Specifically, the maximum number of `SpeechRecognitionAlternative` messages
+# within each `SpeechRecognitionResult`.
+# The server may return fewer than `max_alternatives`.
+# Valid values are `0`-`30`. A value of `0` or `1` will return a maximum of
 # one. If omitted, will return a maximum of one.
 # @!attribute [rw] profanity_filter
 # @return [true, false]
-# *Optional* If set to
+# *Optional* If set to `true`, the server will attempt to filter out
 # profanities, replacing all but the initial character in each filtered word
-# with asterisks, e.g. "f***". If set to
+# with asterisks, e.g. "f***". If set to `false` or omitted, profanities
 # won't be filtered out.
 # @!attribute [rw] speech_contexts
 # @return [Array<Google::Cloud::Speech::V1::SpeechContext>]
@@ -131,10 +131,10 @@ module Google
 # information, see [Phrase Hints](https://cloud.google.com/speech-to-text/docs/basics#phrase-hints).
 # @!attribute [rw] enable_word_time_offsets
 # @return [true, false]
-# *Optional* If
+# *Optional* If `true`, the top result includes a list of words and
 # the start and end time offsets (timestamps) for those words. If
-#
-#
+# `false`, no word-level time offset information is returned. The default is
+# `false`.
 # @!attribute [rw] enable_automatic_punctuation
 # @return [true, false]
 # *Optional* If 'true', adds punctuation to recognition result hypotheses.
@@ -181,15 +181,15 @@ module Google
 # @!attribute [rw] use_enhanced
 # @return [true, false]
 # *Optional* Set to true to use an enhanced model for speech recognition.
-# You must also set the
-#
-#
+# You must also set the `model` field to a valid, enhanced model. If
+# `use_enhanced` is set to true and the `model` field is not set, then
+# `use_enhanced` is ignored. If `use_enhanced` is true and an enhanced
 # version of the specified model does not exist, then the speech is
 # recognized using the standard version of the specified model.
 #
 # Enhanced speech models require that you opt-in to data logging using
 # instructions in the [documentation](https://cloud.google.com/speech-to-text/enable-data-logging).
-# If you set
+# If you set `use_enhanced` to true and you have not enabled audio logging,
 # then you will receive an error.
 class RecognitionConfig
 # The encoding of the audio data sent in the request.
@@ -197,18 +197,18 @@ module Google
 # All encodings support only 1 channel (mono) audio.
 #
 # For best results, the audio source should be captured and transmitted using
-# a lossless encoding (
+# a lossless encoding (`FLAC` or `LINEAR16`). The accuracy of the speech
 # recognition can be reduced if lossy codecs are used to capture or transmit
 # audio, particularly if background noise is present. Lossy codecs include
-#
+# `MULAW`, `AMR`, `AMR_WB`, `OGG_OPUS`, and `SPEEX_WITH_HEADER_BYTE`.
 #
-# The
-# included audio content. You can request recognition for
-# contain either
-# If you send
-# your request, you do not need to specify an
+# The `FLAC` and `WAV` audio file formats include a header that describes the
+# included audio content. You can request recognition for `WAV` files that
+# contain either `LINEAR16` or `MULAW` encoded audio.
+# If you send `FLAC` or `WAV` audio file format in
+# your request, you do not need to specify an `AudioEncoding`; the audio
 # encoding format is determined from the file header. If you specify
-# an
+# an `AudioEncoding` when you send send `FLAC` or `WAV` audio, the
 # encoding configuration must match the encoding described in the audio
 # header; otherwise the request returns an
 # {Google::Rpc::Code::INVALID_ARGUMENT} error code.
@@ -219,33 +219,33 @@ module Google
 # Uncompressed 16-bit signed little-endian samples (Linear PCM).
 LINEAR16 = 1
 
-#
+# `FLAC` (Free Lossless Audio
 # Codec) is the recommended encoding because it is
 # lossless--therefore recognition is not compromised--and
-# requires only about half the bandwidth of
+# requires only about half the bandwidth of `LINEAR16`. `FLAC` stream
 # encoding supports 16-bit and 24-bit samples, however, not all fields in
-#
+# `STREAMINFO` are supported.
 FLAC = 2
 
 # 8-bit samples that compand 14-bit audio samples using G.711 PCMU/mu-law.
 MULAW = 3
 
-# Adaptive Multi-Rate Narrowband codec.
+# Adaptive Multi-Rate Narrowband codec. `sample_rate_hertz` must be 8000.
 AMR = 4
 
-# Adaptive Multi-Rate Wideband codec.
+# Adaptive Multi-Rate Wideband codec. `sample_rate_hertz` must be 16000.
 AMR_WB = 5
 
 # Opus encoded audio frames in Ogg container
 # ([OggOpus](https://wiki.xiph.org/OggOpus)).
-#
+# `sample_rate_hertz` must be one of 8000, 12000, 16000, 24000, or 48000.
 OGG_OPUS = 6
 
 # Although the use of lossy encodings is not recommended, if a very low
-# bitrate encoding is required,
+# bitrate encoding is required, `OGG_OPUS` is highly preferred over
 # Speex encoding. The [Speex](https://speex.org/) encoding supported by
 # Cloud Speech API has a header byte in each block, as in MIME type
-#
+# `audio/x-speex-with-header-byte`.
 # It is a variant of the RTP Speex encoding defined in
 # [RFC 5574](https://tools.ietf.org/html/rfc5574).
 # The stream is a sequence of blocks, one block per RTP packet. Each block
@@ -253,7 +253,7 @@ module Google
 # by one or more frames of Speex data, padded to an integral number of
 # bytes (octets) as specified in RFC 5574. In other words, each RTP header
 # is replaced with a single byte containing the block length. Only Speex
-# wideband is supported.
+# wideband is supported. `sample_rate_hertz` must be 16000.
 SPEEX_WITH_HEADER_BYTE = 7
 end
 end
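The per-encoding `sample_rate_hertz` constraints stated in the `AudioEncoding` comments above can be collected into a small lookup. This is a sketch for checking a config locally, not part of the gem:

```ruby
# Sample-rate constraints from the AudioEncoding docs; encodings absent
# from the table (e.g. LINEAR16, FLAC, MULAW) have no single fixed rate.
REQUIRED_RATES = {
  AMR:                    [8_000],
  AMR_WB:                 [16_000],
  OGG_OPUS:               [8_000, 12_000, 16_000, 24_000, 48_000],
  SPEEX_WITH_HEADER_BYTE: [16_000]
}.freeze

# True when the given rate is acceptable for the given encoding symbol.
def rate_ok?(encoding, rate)
  allowed = REQUIRED_RATES[encoding]
  allowed.nil? || allowed.include?(rate)
end
```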
@@ -270,28 +270,28 @@ module Google
 # [usage limits](https://cloud.google.com/speech-to-text/quotas#content).
 class SpeechContext; end
 
-# Contains audio data in the encoding specified in the
-# Either
+# Contains audio data in the encoding specified in the `RecognitionConfig`.
+# Either `content` or `uri` must be supplied. Supplying both or neither
 # returns {Google::Rpc::Code::INVALID_ARGUMENT}. See
 # [content limits](https://cloud.google.com/speech-to-text/quotas#content).
 # @!attribute [rw] content
 # @return [String]
 # The audio data bytes encoded as specified in
-#
+# `RecognitionConfig`. Note: as with all bytes fields, protobuffers use a
 # pure binary representation, whereas JSON representations use base64.
 # @!attribute [rw] uri
 # @return [String]
 # URI that points to a file that contains audio data bytes as specified in
-#
+# `RecognitionConfig`. The file must not be compressed (for example, gzip).
 # Currently, only Google Cloud Storage URIs are
 # supported, which must be specified in the following format:
-#
+# `gs://bucket_name/object_name` (other URI formats return
 # {Google::Rpc::Code::INVALID_ARGUMENT}). For more information, see
 # [Request URIs](https://cloud.google.com/storage/docs/reference-uris).
 class RecognitionAudio; end
 
-# The only message returned to the client by the
-# contains the result as zero or more sequential
+# The only message returned to the client by the `Recognize` method. It
+# contains the result as zero or more sequential `SpeechRecognitionResult`
 # messages.
 # @!attribute [rw] results
 # @return [Array<Google::Cloud::Speech::V1::SpeechRecognitionResult>]
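The `RecognitionAudio` contract above (exactly one of `content` or `uri`; supplying both or neither yields `INVALID_ARGUMENT`) amounts to an exclusive-or check. A sketch with a hypothetical validator over the hash form of the message:

```ruby
# True only when exactly one of :content and :uri is present, matching
# the RecognitionAudio requirement documented above.
def recognition_audio_valid?(audio)
  audio.key?(:content) ^ audio.key?(:uri)
end
```

For example, `{ uri: "gs://bucket_name/object_name" }` passes, while an empty hash or one carrying both keys fails.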
@@ -299,10 +299,10 @@ module Google
 # sequential portions of audio.
 class RecognizeResponse; end
 
-# The only message returned to the client by the
-# It contains the result as zero or more sequential
-# messages. It is included in the
-# returned by the
+# The only message returned to the client by the `LongRunningRecognize` method.
+# It contains the result as zero or more sequential `SpeechRecognitionResult`
+# messages. It is included in the `result.response` field of the `Operation`
+# returned by the `GetOperation` call of the `google::longrunning::Operations`
 # service.
 # @!attribute [rw] results
 # @return [Array<Google::Cloud::Speech::V1::SpeechRecognitionResult>]
@@ -310,9 +310,9 @@ module Google
 # sequential portions of audio.
 class LongRunningRecognizeResponse; end
 
-# Describes the progress of a long-running
-# included in the
-#
+# Describes the progress of a long-running `LongRunningRecognize` call. It is
+# included in the `metadata` field of the `Operation` returned by the
+# `GetOperation` call of the `google::longrunning::Operations` service.
 # @!attribute [rw] progress_percent
 # @return [Integer]
 # Approximate percentage of audio processed thus far. Guaranteed to be 100
@@ -325,13 +325,13 @@ module Google
 # Time of the most recent processing update.
 class LongRunningRecognizeMetadata; end
 
-#
-#
+# `StreamingRecognizeResponse` is the only message returned to the client by
+# `StreamingRecognize`. A series of zero or more `StreamingRecognizeResponse`
 # messages are streamed back to the client. If there is no recognizable
-# audio, and
+# audio, and `single_utterance` is set to false, then no messages are streamed
 # back to the client.
 #
-# Here's an example of a series of ten
+# Here's an example of a series of ten `StreamingRecognizeResponse`s that might
 # be returned while processing audio:
 #
 # 1. results { alternatives { transcript: "tube" } stability: 0.01 }
@@ -359,21 +359,21 @@ module Google
 # Notes:
 #
 # * Only two of the above responses #4 and #7 contain final results; they are
-# indicated by
+# indicated by `is_final: true`. Concatenating these together generates the
 # full transcript: "to be or not to be that is the question".
 #
-# * The others contain interim
-#
+# * The others contain interim `results`. #3 and #6 contain two interim
+# `results`: the first portion has a high stability and is less likely to
 # change; the second portion has a low stability and is very likely to
-# change. A UI designer might choose to show only high stability
+# change. A UI designer might choose to show only high stability `results`.
 #
-# * The specific
+# * The specific `stability` and `confidence` values shown above are only for
 # illustrative purposes. Actual values may vary.
 #
 # * In each response, only one of these fields will be set:
-#
-#
-# one or more (repeated)
+# `error`,
+# `speech_event_type`, or
+# one or more (repeated) `results`.
 # @!attribute [rw] error
 # @return [Google::Rpc::Status]
 # Output only. If set, returns a {Google::Rpc::Status} message that
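Per the notes above, the full transcript comes from concatenating the transcripts of the final results. A sketch over simplified hash-shaped responses (for brevity, `is_final` is hoisted onto each entry; in the real messages it lives on each `StreamingRecognitionResult` inside the response):

```ruby
# Join the top-alternative transcripts of all final results, in order,
# as the StreamingRecognizeResponse notes describe.
def final_transcript(responses)
  responses
    .select { |r| r[:is_final] }
    .map    { |r| r[:alternatives].first[:transcript] }
    .join
end

# Condensed version of the ten-response example above: two interim
# entries and the two final ones (#4 and #7).
responses = [
  { is_final: false, alternatives: [{ transcript: "to be or not to" }] },
  { is_final: true,  alternatives: [{ transcript: "to be or not to be " }] },
  { is_final: false, alternatives: [{ transcript: "that is" }] },
  { is_final: true,  alternatives: [{ transcript: "that is the question" }] }
]
```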
@@ -382,8 +382,8 @@ module Google
 # @return [Array<Google::Cloud::Speech::V1::StreamingRecognitionResult>]
 # Output only. This repeated list contains zero or more results that
 # correspond to consecutive portions of the audio currently being processed.
-# It contains zero or one
-# followed by zero or more
+# It contains zero or one `is_final=true` result (the newly settled portion),
+# followed by zero or more `is_final=false` results (the interim results).
 # @!attribute [rw] speech_event_type
 # @return [Google::Cloud::Speech::V1::StreamingRecognizeResponse::SpeechEventType]
 # Output only. Indicates the type of speech event.
@@ -399,7 +399,7 @@ module Google
 # additional results). The client should stop sending additional audio
 # data, half-close the gRPC connection, and wait for any additional results
 # until the server closes the gRPC connection. This event is only sent if
-#
+# `single_utterance` was set to `true`, and is not used otherwise.
 END_OF_SINGLE_UTTERANCE = 1
 end
 end
@@ -409,14 +409,14 @@ module Google
 # @!attribute [rw] alternatives
 # @return [Array<Google::Cloud::Speech::V1::SpeechRecognitionAlternative>]
 # Output only. May contain one or more recognition hypotheses (up to the
-# maximum specified in
+# maximum specified in `max_alternatives`).
 # These alternatives are ordered in terms of accuracy, with the top (first)
 # alternative being the most probable, as ranked by the recognizer.
 # @!attribute [rw] is_final
 # @return [true, false]
-# Output only. If
-# interim result that may change. If
-# speech service will return this particular
+# Output only. If `false`, this `StreamingRecognitionResult` represents an
+# interim result that may change. If `true`, this is the final time the
+# speech service will return this particular `StreamingRecognitionResult`,
 # the recognizer will not return any further hypotheses for this portion of
 # the transcript and corresponding audio.
 # @!attribute [rw] stability
@@ -424,15 +424,15 @@ module Google
 # Output only. An estimate of the likelihood that the recognizer will not
 # change its guess about this interim result. Values range from 0.0
 # (completely unstable) to 1.0 (completely stable).
-# This field is only provided for interim results (
-# The default of 0.0 is a sentinel value indicating
+# This field is only provided for interim results (`is_final=false`).
+# The default of 0.0 is a sentinel value indicating `stability` was not set.
 class StreamingRecognitionResult; end
 
 # A speech recognition result corresponding to a portion of the audio.
 # @!attribute [rw] alternatives
 # @return [Array<Google::Cloud::Speech::V1::SpeechRecognitionAlternative>]
 # Output only. May contain one or more recognition hypotheses (up to the
-# maximum specified in
+# maximum specified in `max_alternatives`).
 # These alternatives are ordered in terms of accuracy, with the top (first)
 # alternative being the most probable, as ranked by the recognizer.
 class SpeechRecognitionResult; end
@@ -446,10 +446,10 @@ module Google
 # Output only. The confidence estimate between 0.0 and 1.0. A higher number
 # indicates an estimated greater likelihood that the recognized words are
 # correct. This field is set only for the top alternative of a non-streaming
-# result or, of a streaming result where
+# result or, of a streaming result where `is_final=true`.
 # This field is not guaranteed to be accurate and users should not rely on it
 # to be always provided.
-# The default of 0.0 is a sentinel value indicating
+# The default of 0.0 is a sentinel value indicating `confidence` was not set.
 # @!attribute [rw] words
 # @return [Array<Google::Cloud::Speech::V1::WordInfo>]
 # Output only. A list of word-specific information for each recognized word.
@@ -460,7 +460,7 @@ module Google
 # @return [Google::Protobuf::Duration]
 # Output only. Time offset relative to the beginning of the audio,
 # and corresponding to the start of the spoken word.
-# This field is only set if
+# This field is only set if `enable_word_time_offsets=true` and only
 # in the top hypothesis.
 # This is an experimental feature and the accuracy of the time offset can
 # vary.
@@ -468,7 +468,7 @@ module Google
 # @return [Google::Protobuf::Duration]
 # Output only. Time offset relative to the beginning of the audio,
 # and corresponding to the end of the spoken word.
-# This field is only set if
+# This field is only set if `enable_word_time_offsets=true` and only
 # in the top hypothesis.
 # This is an experimental feature and the accuracy of the time offset can
 # vary.