ask_chatgpt 0.3.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 86f36d546bd015947fcac7a2bb9366530ba0ff8a9012f21f2151ae0484ecf647
4
- data.tar.gz: 136c817bc54317c77325ca0145857262d48680b8b25072e3364dcd0fe526e501
3
+ metadata.gz: 6a5ccd11d62b51d4422562be281edcf67254e9829f551edf2f91acd970fd091b
4
+ data.tar.gz: 1328f4ffedb742cce8c330de2fedf7bca4632610a5e95e0215a6dabc6dc3d50d
5
5
  SHA512:
6
- metadata.gz: 7b3112557fcb6e3fa11856693e26dc37df80dedafde058f6f181d303a336f75cc4ab21f9c1daf89dd61d253eea150b05d5f664408ab5299c07210f37d8f9d87c
7
- data.tar.gz: 4c5a22ca8157813e15fa5e2f639eefd89967ba9bbd300d6c0929bfe18cd3b2ff4e600197d59411520da05626c77ec9ca1786e2b6f3bfd7b51a2d5403f4528098
6
+ metadata.gz: 869bb402cb40bbaa8d0e60a4c0c9e3247a155a18b60d805c3943baa2b0c90d1ccf8471de4359df5ca63a6e5197231cca1cbae5398716e49ed0ecd0c14b2e45fe
7
+ data.tar.gz: 83c04d9f695e3772ec35351b4de6e019d2f866023421045ec8f6dc2decdaded75f3feb9c70947caf1837b3942333ab69c23b4e02ca7e9efe87c77c7b248fb8af
data/README.md CHANGED
@@ -31,6 +31,10 @@ Go to Rails console and run:
31
31
  [first_name, last_name].join
32
32
  end
33
33
  }
34
+ #
35
+ # --- NEW ---
36
+ #
37
+ gpt.speak # or with alias gpt.s
34
38
  ```
35
39
 
36
40
  OR with CLI tool:
@@ -46,6 +50,8 @@ aGVsbG8gd29ybGQ=
46
50
  hello world
47
51
  ```
48
52
 
53
+ >ask_chatgpt -s 1 # start voice input with CLI
54
+
49
55
  See some examples below. You can also create your own prompts with just a few lines of code [here](#options--configurations).
50
56
 
51
57
  Also you can use a CLI tool, [how to use it](#cli-tool).
@@ -113,6 +119,17 @@ And you can edit:
113
119
  # config.max_tokens = 3000 # or nil by default
114
120
  # config.included_prompts = []
115
121
 
122
+ # enable voice input with `gpt.speak` or `gpt.s`. Note that you also need to configure `audio_device_id`
123
+ # config.voice_enabled = true
124
+
125
+ # to get audio device ID (index in the input devices)
126
+ # install ffmpeg, and execute from the console
127
+ # `ffmpeg -f avfoundation -list_devices true -i ""`
128
+ # config.audio_device_id = 1
129
+
130
+ # after "voice_max_duration" seconds the recording stops and the audio is sent to OpenAI
131
+ # config.voice_max_duration = 10 # 10 seconds
132
+
116
133
  # Examples of custom prompts:
117
134
  # you can use them `gpt.extract_email("some string")`
118
135
 
@@ -182,8 +199,55 @@ end
182
199
 
183
200
  or directly in console `gpt.debug!` (and finish `gpt.debug!(:off)`)
184
201
 
202
+ ## Voice Input
203
+
204
+ Demo: https://youtu.be/uBR0wnQvKao
205
+
206
+ For now I consider this an experimental and fun feature. I look forward to your feedback.
207
+
208
+ Start it with the command `gpt.speak` or its alias `gpt.s`.
209
+
210
+ The command starts recording right away and stops after `voice_max_duration` seconds or as soon as you press any key.
211
+
212
+ To exit recording mode, press `Q` or `Esc`.
213
+
214
+ Voice input uses the `ffmpeg` tool, so you need to install it first. Instructions such as these will work: https://www.hostinger.com/tutorials/how-to-install-ffmpeg.
215
+
216
+ You also need to configure `audio_device_id`. On macOS, run `ffmpeg -f avfoundation -list_devices true -i ""`
217
+
218
+ It prints a list of all devices, like this:
219
+
220
+ ```s
221
+ ffmpeg -f avfoundation -list_devices true -i ""
222
+ ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
223
+ built with Apple clang version 14.0.0 (clang-1400.0.29.202)
224
+ configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox
225
+ libavutil 58. 2.100 / 58. 2.100
226
+ libavcodec 60. 3.100 / 60. 3.100
227
+ libavformat 60. 3.100 / 60. 3.100
228
+ libavdevice 60. 1.100 / 60. 1.100
229
+ libavfilter 9. 3.100 / 9. 3.100
230
+ libswscale 7. 1.100 / 7. 1.100
231
+ libswresample 4. 10.100 / 4. 10.100
232
+ libpostproc 57. 1.100 / 57. 1.100
233
+ [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation video devices:
234
+ [AVFoundation indev @ 0x7f7fd1a04380] [0] FaceTime HD Camera
235
+ [AVFoundation indev @ 0x7f7fd1a04380] [1] USB Camera VID:1133 PID:2085
236
+ [AVFoundation indev @ 0x7f7fd1a04380] [2] Capture screen 0
237
+ [AVFoundation indev @ 0x7f7fd1a04380] [3] Capture screen 1
238
+ [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation audio devices:
239
+ [AVFoundation indev @ 0x7f7fd1a04380] [0] Microsoft Teams Audio
240
+ [AVFoundation indev @ 0x7f7fd1a04380] [1] Built-in Microphone
241
+ [AVFoundation indev @ 0x7f7fd1a04380] [2] Unknown USB Audio Device
242
+ : Input/output error
243
+ ```
244
+
245
+ In my case I used `1`, because that is the index of `Built-in Microphone` in the audio devices list.
246
+
185
247
  ## CLI Tool
186
248
 
249
+ You can ask questions from the CLI or even start voice input.
250
+
187
251
  Example 1:
188
252
  ![AskChatGPT](docs/unzip.gif)
189
253
 
@@ -232,6 +296,10 @@ end
232
296
  - print tokens usage? `.with_usage`
233
297
  - support org_id? in the configs
234
298
  - use `gpt` in the code of the main app (e.g. model/controller)
299
+ - when voice is used, add support for payloads, e.g. `gpt.with_payload(json).speak` (so the payload is sent along with the spoken question)
300
+ - refactor voice input code :) for a first version it's fine
301
+ - can we discover audio device ID?
302
+ - use tempfile for audio, instead of output.wav
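One way to approach the "can we discover audio device ID?" item is to parse the device listing shown in the Voice Input section. The following is a minimal sketch, assuming macOS `avfoundation` output in the format printed in the README. `parse_audio_devices` and `SAMPLE_OUTPUT` are hypothetical names for illustration and are not part of the gem.

```ruby
# Hypothetical helper (not part of the gem): extract the audio devices from
# the text printed by `ffmpeg -f avfoundation -list_devices true -i ""`.
# Returns a Hash of device index => device name.
def parse_audio_devices(ffmpeg_output)
  devices = {}
  in_audio_section = false

  ffmpeg_output.each_line do |line|
    if line.include?("AVFoundation audio devices:")
      in_audio_section = true
    elsif line.include?("AVFoundation video devices:")
      in_audio_section = false
    elsif in_audio_section && (match = line.match(/\]\s+\[(\d+)\]\s+(.+?)\s*$/))
      # match[1] is the index in brackets, match[2] is the device name
      devices[match[1].to_i] = match[2]
    end
  end

  devices
end

# Shortened copy of the listing from the README, used as sample input.
SAMPLE_OUTPUT = <<~TEXT
  [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation video devices:
  [AVFoundation indev @ 0x7f7fd1a04380] [0] FaceTime HD Camera
  [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation audio devices:
  [AVFoundation indev @ 0x7f7fd1a04380] [0] Microsoft Teams Audio
  [AVFoundation indev @ 0x7f7fd1a04380] [1] Built-in Microphone
  [AVFoundation indev @ 0x7f7fd1a04380] [2] Unknown USB Audio Device
  : Input/output error
TEXT

devices = parse_audio_devices(SAMPLE_OUTPUT)
devices.each { |index, name| puts "#{index}: #{name}" }
```

In a real setup the text would come from running ffmpeg itself (for example via `Open3.capture2e`), and the user would still have to pick which listed device to use.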
235
303
 
236
304
  ## Contributing
237
305
 
data/bin/ask_chatgpt CHANGED
@@ -24,6 +24,7 @@ parser = OptionParser.new do |opts|
24
24
  ask_chatgpt -f app/models/user.rb -q "find a bug in this Rails model"
25
25
  ask_chatgpt -f app/models/user.rb -q "create RSpec spec for this model"
26
26
  ask_chatgpt -f test/dummy/Gemfile -q "sort Ruby gems alphabetically"
27
+ ask_chatgpt -s 1
27
28
 
28
29
  Version: #{AskChatGPT::VERSION}
29
30
 
@@ -33,6 +34,10 @@ parser = OptionParser.new do |opts|
33
34
  options[:prompt] = prompt
34
35
  end
35
36
 
37
+ opts.on("-s", "--speak AudioDeviceID", String, "Specify audio device ID") do |audio_device_id|
38
+ options[:audio_device_id] = audio_device_id
39
+ end
40
+
36
41
  opts.on("-f", "--file FILE", String, "Specify file with prompt") do |file|
37
42
  options[:file_path] = file
38
43
  end
@@ -53,14 +58,22 @@ AskChatGPT.debug = !!options[:debug]
53
58
 
54
59
  options[:prompt] = ARGV.join(" ") if options[:prompt].blank?
55
60
 
56
- if options[:prompt].blank?
61
+ if options[:prompt].blank? && options[:audio_device_id].blank?
57
62
  puts parser
58
63
  exit
59
64
  end
60
65
 
61
66
  include AskChatGPT::Console
62
67
 
63
- instance = gpt.ask(options[:prompt])
64
- instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
68
+ if options[:audio_device_id].present?
69
+ AskChatGPT.voice_enabled = true
70
+ AskChatGPT.voice_max_duration = 20
71
+ AskChatGPT.audio_device_id = options[:audio_device_id]
65
72
 
66
- puts instance.inspect
73
+ instance = gpt.speak
74
+ else
75
+ instance = gpt.ask(options[:prompt])
76
+ instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
77
+
78
+ puts instance.inspect
79
+ end
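The branching added to the CLI script can be summarized as: a device ID given with `-s` selects voice mode, otherwise a prompt selects a normal question, otherwise the help text is printed. Below is a rough standalone sketch of that decision, written without ActiveSupport (so `blank?`/`present?` are replaced by plain string checks). `parse_cli` is a hypothetical function, and the `-q`/`--question` option name is an assumption, since that part of the script is not shown in the diff.

```ruby
require "optparse"

# Hypothetical standalone function mirroring the option handling in the
# script above. Returns [mode, options], where mode is :voice, :ask or :help.
def parse_cli(argv)
  options = {}

  parser = OptionParser.new do |opts|
    # Assumed option name; the real definition is outside the shown hunks.
    opts.on("-q", "--question PROMPT", String, "Question to ask") do |prompt|
      options[:prompt] = prompt
    end

    opts.on("-s", "--speak AudioDeviceID", String, "Specify audio device ID") do |id|
      options[:audio_device_id] = id
    end
  end

  # `parse` (without a bang) leaves argv untouched and returns the leftovers.
  rest = parser.parse(argv)
  options[:prompt] = rest.join(" ") if options[:prompt].to_s.empty?

  mode =
    if !options[:audio_device_id].to_s.empty?
      :voice
    elsif !options[:prompt].to_s.empty?
      :ask
    else
      :help
    end

  [mode, options]
end

mode, options = parse_cli(%w[-s 1])
puts "mode=#{mode} device=#{options[:audio_device_id]}"
```

Note that the device ID arrives as a `String` here, exactly as in the script, because the option is declared with the `String` type.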
@@ -2,6 +2,7 @@ require_relative "sugar"
2
2
  require_relative "prompts/base"
3
3
  require_relative "prompts/improve"
4
4
  require_relative "default_behavior"
5
+ require_relative "voice"
5
6
 
6
7
  Dir[File.join(__dir__, "prompts", "*.rb")].each do |file|
7
8
  require file
@@ -23,6 +24,14 @@ module AskChatgpt
23
24
  AskChatgpt::Executor.new(client)
24
25
  end
25
26
 
27
+ def speak
28
+ puts "Voice input is not enabled (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.voice_enabled
29
+ puts "Audio device ID is not configured (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.audio_device_id
30
+
31
+ AskChatgpt::VoiceFlow::Voice.new.run
32
+ end
33
+ alias_method :s, :speak
34
+
26
35
  def initialize(client)
27
36
  @scope = AskChatGPT.included_prompts.dup
28
37
  @client = client
@@ -1,3 +1,3 @@
1
1
  module AskChatgpt
2
- VERSION = "0.3.1"
2
+ VERSION = "0.4.0"
3
3
  end
@@ -0,0 +1,176 @@
1
+ module StringExt
2
+ refine String do
3
+ def black; "\e[30m#{self}\e[0m" end
4
+ def red; "\e[31m#{self}\e[0m" end
5
+ def green; "\e[32m#{self}\e[0m" end
6
+ def brown; "\e[33m#{self}\e[0m" end
7
+ def blue; "\e[34m#{self}\e[0m" end
8
+ def magenta; "\e[35m#{self}\e[0m" end
9
+ def cyan; "\e[36m#{self}\e[0m" end
10
+ def gray; "\e[37m#{self}\e[0m" end
11
+ end
12
+ end
13
+
14
+ using StringExt
15
+
16
+ module AskChatgpt
17
+ module VoiceFlow
18
+ require 'io/console'
19
+ require 'fileutils'
20
+ require 'timeout'
21
+ require 'open3'
22
+
23
+ class AudioRecorder
24
+ OUTPUT_FILE = "output.wav"
25
+
26
+ def initialize(duration)
27
+ @duration = duration
28
+ end
29
+
30
+ # ffmpeg -f avfoundation -list_devices true -i ""
31
+ def audio_device_id
32
+ AskChatGPT.audio_device_id
33
+ end
34
+
35
+ def start
36
+ delete_audio_file
37
+ ffmpeg_command = build_ffmpeg_command
38
+ @stdin, @stdout_and_stderr, @wait_thread = Open3.popen2e(ffmpeg_command)
39
+ end
40
+
41
+ def stop
42
+ @stdin.puts 'q'
43
+ @stdin.close
44
+ @stdout_and_stderr.close
45
+ sleep(0.2)
46
+ rescue Errno::EPIPE, IOError
47
+ end
48
+
49
+ def delete_audio_file
50
+ FileUtils.rm(OUTPUT_FILE) if File.exist?(OUTPUT_FILE)
51
+ end
52
+
53
+ private
54
+
55
+ def build_ffmpeg_command
56
+ case RUBY_PLATFORM
57
+ when /darwin/
58
+ input_device = "-f avfoundation -i \":#{audio_device_id}\""
59
+ when /linux/
60
+ input_device = '-f alsa -i default'
61
+ when /mingw|mswin/
62
+ input_device = '-f dshow -i audio="Microphone (Realtek High Definition Audio)"'
63
+ else
64
+ raise "Unsupported platform: #{RUBY_PLATFORM}"
65
+ end
66
+
67
+ "ffmpeg -loglevel quiet #{input_device} -t #{@duration} #{OUTPUT_FILE}"
68
+ end
69
+ end
70
+
71
+ class Voice
72
+ def initialize
73
+ @messages = []
74
+ @wanna_quit = false
75
+ @duration = (AskChatGPT.voice_max_duration.presence || 10).to_i
76
+ @ffmpeg_wait_duration = 0.5
77
+ @executing = true
78
+ @spinner = nil
79
+ end
80
+
81
+ def run
82
+ while @executing
83
+ # Start the parallel process
84
+ audio_recorder = AudioRecorder.new(@duration)
85
+ audio_recorder.start
86
+
87
+ begin
88
+ Timeout.timeout(@duration + @ffmpeg_wait_duration) do
89
+ @spinner = TTY::Spinner.new("[Recording]".red + " / Press any key to stop recording or \"Esc\" / \"q\" to quit ... ".blue + ":spinner".red, format: :spin)
90
+ @spinner.auto_spin
91
+ sleep(@ffmpeg_wait_duration) # give some time for ffmpeg to start
92
+ # Listen for user input in the main process
93
+ begin
94
+ char = $stdin.getch
95
+ if char.ord == 27 || char.upcase == "Q"
96
+ audio_recorder.stop
97
+ @executing = false
98
+ @spinner.stop
99
+ puts "Bye...".brown
100
+ end
101
+ break
102
+ end while char.nil?
103
+ end
104
+ rescue Timeout::Error
105
+ ensure
106
+ audio_recorder.stop
107
+ @spinner.stop
108
+ break unless @executing
109
+ end
110
+
111
+ if !File.exist?("output.wav")
112
+ puts "No audio file found, please try again.".brown
113
+ sleep(0.5)
114
+ next
115
+ end
116
+
117
+ @spinner = TTY::Spinner.new("Thinking :spinner".cyan, format: :dots)
118
+ @spinner.auto_spin
119
+ response = client.transcribe(parameters: { model: "whisper-1", file: File.open("output.wav", "rb") })
120
+ @spinner.stop
121
+
122
+ if response["error"]
123
+ puts response["error"].inspect.brown
124
+ @executing = false
125
+ break
126
+ end
127
+
128
+ user_input = response["text"].to_s
129
+ puts "USER> ".green + user_input
130
+ @messages << { role: "user", content: user_input }
131
+ print "ASSISTANT> ".magenta
132
+
133
+ stop_stream = false
134
+ reply = []
135
+
136
+ keypresser = Thread.new do
137
+ loop { stop_stream = true if $stdin.getch }
138
+ end
139
+
140
+ begin
141
+ client.chat(
142
+ parameters: {
143
+ model: "gpt-3.5-turbo",
144
+ messages: @messages,
145
+ temperature: 0.7,
146
+ stream: proc do |chunk, _bytesize|
147
+ break if stop_stream
148
+ message = chunk.dig("choices", 0, "delta", "content")
149
+ next if message.to_s.empty?
150
+
151
+ message = message.gsub("\n", "\r\n")
152
+
153
+ print message
154
+ reply += [message]
155
+ end
156
+ })
157
+ rescue LocalJumpError
158
+ puts
159
+ ensure
160
+ Thread.kill(keypresser)
161
+ stop_stream = false
162
+ @messages << { role: "assistant", content: reply.join }
163
+ puts
164
+ end
165
+
166
+ audio_recorder.delete_audio_file
167
+ end
168
+ end
169
+
170
+ def client
171
+ @client ||= OpenAI::Client.new(access_token: AskChatGPT.access_token)
172
+ end
173
+ end
174
+
175
+ end
176
+ end
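The platform switch in `build_ffmpeg_command` is easier to exercise when the platform, device and output path are passed in rather than read from `RUBY_PLATFORM` and constants. The sketch below is a hypothetical variant, not the gem's code: `ffmpeg_record_command` is an invented name, it returns an argument array instead of a shell string, and it deliberately omits the Windows branch because the original hard-codes one specific device name there.

```ruby
require "shellwords"

# Hypothetical variant of build_ffmpeg_command. Returns the command as an
# array of arguments so it can be passed to Open3.popen2e(*command)
# without going through a shell.
def ffmpeg_record_command(platform:, device_id:, duration:, output_file:)
  input_args =
    case platform
    when /darwin/
      # avfoundation takes "video:audio"; ":<id>" selects audio only
      ["-f", "avfoundation", "-i", ":#{device_id}"]
    when /linux/
      # as in the original, Linux ignores device_id and uses the ALSA default
      ["-f", "alsa", "-i", "default"]
    else
      raise ArgumentError, "Unsupported platform: #{platform}"
    end

  ["ffmpeg", "-loglevel", "quiet", *input_args, "-t", duration.to_s, output_file]
end

command = ffmpeg_record_command(
  platform: RUBY_PLATFORM.match?(/darwin|linux/) ? RUBY_PLATFORM : "linux",
  device_id: 1,
  duration: 10,
  output_file: "output.wav"
)
puts Shellwords.join(command)
```

Passing the arguments as an array avoids quoting problems with device names or paths that contain spaces, which the interpolated string form has to handle by hand.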
data/lib/ask_chatgpt.rb CHANGED
@@ -49,6 +49,19 @@ module AskChatgpt
49
49
  mattr_accessor :included_prompts
50
50
  @@included_prompts = [AskChatGPT::Prompts::App.new]
51
51
 
52
+ # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
53
+ mattr_accessor :voice_enabled
54
+ @@voice_enabled = false
55
+
56
+ # to get audio device ID (index in the input devices)
57
+ # ffmpeg -f avfoundation -list_devices true -i ""
58
+ mattr_accessor :audio_device_id
59
+ @@audio_device_id = nil
60
+
61
+ # max duration of audio to record
62
+ mattr_accessor :voice_max_duration
63
+ @@voice_max_duration = 10 # 10 seconds
64
+
52
65
  def self.setup
53
66
  yield(self)
54
67
  end
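ActiveSupport's `mattr_accessor` reads and writes a class variable named after the attribute, so each default in this hunk is assigned with the `@@` prefix. The sketch below imitates that behaviour in plain Ruby to show how the defaults and the `setup` block interact. `VoiceConfigSketch` is a hypothetical stand-in module, not the gem's `AskChatGPT` module, and the hand-rolled accessors only approximate what `mattr_accessor` generates.

```ruby
# Hypothetical stand-in for the gem's configuration module, written without
# ActiveSupport so that it runs on its own.
module VoiceConfigSketch
  # Defaults, stored in class variables like mattr_accessor expects.
  @@voice_enabled = false
  @@audio_device_id = nil
  @@voice_max_duration = 10 # seconds

  # Simplified replacement for `mattr_accessor`: module-level reader and
  # writer methods backed by the class variable of the same name.
  %i[voice_enabled audio_device_id voice_max_duration].each do |name|
    define_singleton_method(name) { class_variable_get("@@#{name}") }
    define_singleton_method("#{name}=") { |value| class_variable_set("@@#{name}", value) }
  end

  def self.setup
    yield(self)
  end
end

# Same shape as the initializer template: override only what you need.
VoiceConfigSketch.setup do |config|
  config.voice_enabled = true
  config.audio_device_id = 1
end

puts "enabled=#{VoiceConfigSketch.voice_enabled}"
puts "device=#{VoiceConfigSketch.audio_device_id}"
puts "max_duration=#{VoiceConfigSketch.voice_max_duration}"
```

Settings that the `setup` block does not touch, such as `voice_max_duration` here, keep their defaults.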
@@ -10,6 +10,11 @@ AskChatGPT.setup do |config|
10
10
  # config.temperature = 0.1
11
11
  # config.included_prompts = []
12
12
 
13
+ # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
14
+ # config.voice_enabled = true
15
+ # config.audio_device_id = 1
16
+ # config.voice_max_duration = 10 # 10 seconds
17
+
13
18
  # Examples of custom prompts:
14
19
  # you can use them `gpt.ask(:extract_email, "some string")`
15
20
 
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: ask_chatgpt
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.1
4
+ version: 0.4.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Igor Kasyanchuk
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2023-05-01 00:00:00.000000000 Z
12
+ date: 2023-05-09 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: rails
@@ -95,6 +95,20 @@ dependencies:
95
95
  - - ">="
96
96
  - !ruby/object:Gem::Version
97
97
  version: 1.4.3
98
+ - !ruby/object:Gem::Dependency
99
+ name: io-console
100
+ requirement: !ruby/object:Gem::Requirement
101
+ requirements:
102
+ - - ">="
103
+ - !ruby/object:Gem::Version
104
+ version: '0'
105
+ type: :runtime
106
+ prerelease: false
107
+ version_requirements: !ruby/object:Gem::Requirement
108
+ requirements:
109
+ - - ">="
110
+ - !ruby/object:Gem::Version
111
+ version: '0'
98
112
  - !ruby/object:Gem::Dependency
99
113
  name: wrapped_print
100
114
  requirement: !ruby/object:Gem::Requirement
@@ -156,6 +170,7 @@ files:
156
170
  - lib/ask_chatgpt/railtie.rb
157
171
  - lib/ask_chatgpt/sugar.rb
158
172
  - lib/ask_chatgpt/version.rb
173
+ - lib/ask_chatgpt/voice.rb
159
174
  - lib/generators/ask_chatgpt/USAGE
160
175
  - lib/generators/ask_chatgpt/ask_chatgpt_generator.rb
161
176
  - lib/generators/ask_chatgpt/templates/template.rb