ask_chatgpt 0.3.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 86f36d546bd015947fcac7a2bb9366530ba0ff8a9012f21f2151ae0484ecf647
- data.tar.gz: 136c817bc54317c77325ca0145857262d48680b8b25072e3364dcd0fe526e501
+ metadata.gz: 6a5ccd11d62b51d4422562be281edcf67254e9829f551edf2f91acd970fd091b
+ data.tar.gz: 1328f4ffedb742cce8c330de2fedf7bca4632610a5e95e0215a6dabc6dc3d50d
  SHA512:
- metadata.gz: 7b3112557fcb6e3fa11856693e26dc37df80dedafde058f6f181d303a336f75cc4ab21f9c1daf89dd61d253eea150b05d5f664408ab5299c07210f37d8f9d87c
- data.tar.gz: 4c5a22ca8157813e15fa5e2f639eefd89967ba9bbd300d6c0929bfe18cd3b2ff4e600197d59411520da05626c77ec9ca1786e2b6f3bfd7b51a2d5403f4528098
+ metadata.gz: 869bb402cb40bbaa8d0e60a4c0c9e3247a155a18b60d805c3943baa2b0c90d1ccf8471de4359df5ca63a6e5197231cca1cbae5398716e49ed0ecd0c14b2e45fe
+ data.tar.gz: 83c04d9f695e3772ec35351b4de6e019d2f866023421045ec8f6dc2decdaded75f3feb9c70947caf1837b3942333ab69c23b4e02ca7e9efe87c77c7b248fb8af
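The digests above can be checked against the archive members of a downloaded gem. The following is a minimal sketch of that comparison, not the RubyGems implementation; `checksum_mismatches` is a hypothetical helper, and it assumes the members have already been read into memory as raw bytes.

```ruby
require "digest"
require "yaml"

# Hypothetical helper (not part of RubyGems or ask_chatgpt): compares the
# SHA256 digest of each archive member with the value in checksums.yaml.
# `checksums_yaml` is the YAML text; `members` maps member names to raw bytes.
# Returns the names of members whose digest does not match.
def checksum_mismatches(checksums_yaml, members)
  expected = YAML.safe_load(checksums_yaml).fetch("SHA256")
  mismatched = members.reject do |name, bytes|
    Digest::SHA256.hexdigest(bytes) == expected[name]
  end
  mismatched.keys
end
```

An empty result means every supplied member matches its recorded SHA256 value; the same approach would work for the SHA512 section with `Digest::SHA512`.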
data/README.md CHANGED
@@ -31,6 +31,10 @@ Go to Rails console and run:
  [first_name, last_name].join
  end
  }
+ #
+ # --- NEW ---
+ #
+ gpt.speak # or with alias gpt.s
  ```
 
  OR with CLI tool:
@@ -46,6 +50,8 @@ aGVsbG8gd29ybGQ=
  hello world
  ```
 
+ >ask_chatgpt -s 1 # start voice input with CLI
+
  See some examples below. You can also create your own prompts with just few lines of code [here](#options--configurations).
 
  Also you can use a CLI tool, [how to use it](#cli-tool).
@@ -113,6 +119,17 @@ And you can edit:
  # config.max_tokens = 3000 # or nil by default
  # config.included_prompts = []
 
+ # enable voice input with `gpt.speak` or `gpt.s`. Note, you also need to configure `audio_device_id`
+ # config.voice_enabled = true
+
+ # to get audio device ID (index in the input devices)
+ # install ffmpeg, and execute from the console
+ # `ffmpeg -f avfoundation -list_devices true -i ""`
+ # config.audio_device_id = 1
+
+ # after "voice_max_duration" seconds it will send audio to Open AI
+ # config.voice_max_duration = 10 # 10 seconds
+
  # Examples of custom prompts:
  # you can use them `gpt.extract_email("some string")`
 
@@ -182,8 +199,55 @@ end
 
  or directly in console `gpt.debug!` (and finish `gpt.debug!(:off)`)
 
+ ## Voice Input
+
+ Demo: https://youtu.be/uBR0wnQvKao
+
+ For now I consider this as an experimental and fun feature. Look forward seeing your feedback.
+
+ Works with command: `gpt.speak` or `gpt.s` (alias).
+
+ This command starts recording right away and it will stop after `voice_max_duration` seconds or if you press any key.
+
+ To exit recording mode press `Q`.
+
+ Voice is using `ffmpeg` tool, so you need to install it. Some instruction like this will work: https://www.hostinger.com/tutorials/how-to-install-ffmpeg.
+
+ Also, you need to configure `audio_device_id`. Run `ffmpeg -f avfoundation -list_devices true -i ""`
+
+ It will give you list of all devices, like this:
+
+ ```s
+ ffmpeg -f avfoundation -list_devices true -i ""
+ ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
+ built with Apple clang version 14.0.0 (clang-1400.0.29.202)
+ configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox
+ libavutil 58. 2.100 / 58. 2.100
+ libavcodec 60. 3.100 / 60. 3.100
+ libavformat 60. 3.100 / 60. 3.100
+ libavdevice 60. 1.100 / 60. 1.100
+ libavfilter 9. 3.100 / 9. 3.100
+ libswscale 7. 1.100 / 7. 1.100
+ libswresample 4. 10.100 / 4. 10.100
+ libpostproc 57. 1.100 / 57. 1.100
+ [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation video devices:
+ [AVFoundation indev @ 0x7f7fd1a04380] [0] FaceTime HD Camera
+ [AVFoundation indev @ 0x7f7fd1a04380] [1] USB Camera VID:1133 PID:2085
+ [AVFoundation indev @ 0x7f7fd1a04380] [2] Capture screen 0
+ [AVFoundation indev @ 0x7f7fd1a04380] [3] Capture screen 1
+ [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation audio devices:
+ [AVFoundation indev @ 0x7f7fd1a04380] [0] Microsoft Teams Audio
+ [AVFoundation indev @ 0x7f7fd1a04380] [1] Built-in Microphone
+ [AVFoundation indev @ 0x7f7fd1a04380] [2] Unknown USB Audio Device
+ : Input/output error
+ ```
+
+ In my case I used "1", because it's `Built-in Microphone`.
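The README's TODO list asks whether the audio device ID could be discovered automatically. One possible approach, sketched here under the assumption that the listing always has the shape shown in the README sample, is to parse the `AVFoundation audio devices:` section of the ffmpeg output. `parse_audio_devices` is a hypothetical helper and is not part of the gem.

```ruby
# Hypothetical helper (not part of ask_chatgpt): extracts the audio input
# devices from the text printed by
#   ffmpeg -f avfoundation -list_devices true -i ""
# Returns a Hash of { device index => device name }.
def parse_audio_devices(listing)
  in_audio_section = false
  listing.each_line.with_object({}) do |line, devices|
    if line.include?("AVFoundation audio devices:")
      in_audio_section = true
    elsif line.include?("AVFoundation video devices:")
      in_audio_section = false
    elsif in_audio_section && (match = line.match(/\]\s+\[(\d+)\]\s+(.+)$/))
      # e.g. "[AVFoundation indev @ 0x...] [1] Built-in Microphone"
      devices[match[1].to_i] = match[2].strip
    end
  end
end
```

The listing text could be captured with `Open3.capture2e`, which merges stdout and stderr into one string. The command only applies on macOS, where the `avfoundation` input format is available.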
+
  ## CLI Tool
 
+ You can ask questions from cli or even start voice input.
+
  Example 1:
  ![AskChatGPT](docs/unzip.gif)
 
@@ -232,6 +296,10 @@ end
  - print tokens usage? `.with_usage`
  - support org_id? in the configs
  - use `gpt` in the code of the main app (e.g. model/controller)
+ - when voice is used add support for payloads, e.g. `gpt.with_payload(json).speak` (and it will send payload with my question)
+ - refactor voice input code :) as first version it's fine
+ - can we discover audio device ID?
+ - use tempfile for audio, instead of output.wav
 
  ## Contributing
 
data/bin/ask_chatgpt CHANGED
@@ -24,6 +24,7 @@ parser = OptionParser.new do |opts|
  ask_chatgpt -f app/models/user.rb -q "find a bug in this Rails model"
  ask_chatgpt -f app/models/user.rb -q "create RSpec spec for this model"
  ask_chatgpt -f test/dummy/Gemfile -q "sort Ruby gems alphabetically"
+ ask_chatgpt -s 1"
 
  Version: #{AskChatGPT::VERSION}
 
@@ -33,6 +34,10 @@ parser = OptionParser.new do |opts|
    options[:prompt] = prompt
  end
 
+ opts.on("-s", "--speak AudioDeviceID", String, "Specify audio device ID") do |audio_device_id|
+   options[:audio_device_id] = audio_device_id
+ end
+
  opts.on("-f", "--file FILE", String, "Specify file with prompt") do |file|
    options[:file_path] = file
  end
@@ -53,14 +58,22 @@ AskChatGPT.debug = !!options[:debug]
 
  options[:prompt] = ARGV.join(" ") if options[:prompt].blank?
 
- if options[:prompt].blank?
+ if options[:prompt].blank? && options[:audio_device_id].blank?
    puts parser
    exit
  end
 
  include AskChatGPT::Console
 
- instance = gpt.ask(options[:prompt])
- instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
+ if options[:audio_device_id].present?
+   AskChatGPT.voice_enabled = true
+   AskChatGPT.voice_max_duration = 20
+   AskChatGPT.audio_device_id = options[:audio_device_id]
 
- puts instance.inspect
+   instance = gpt.speak
+ else
+   instance = gpt.ask(options[:prompt])
+   instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
+
+   puts instance.inspect
+ end
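The control flow this hunk introduces (voice mode when `-s` is given, a normal question otherwise, help text when neither is present) can be illustrated in isolation. This is a trimmed-down sketch using only the standard library. `parse_cli` and `mode_for` are hypothetical names, and the long option name `--question` is an assumption, since the diff only shows the short `-q` form.

```ruby
require "optparse"

# Hypothetical, trimmed-down version of the CLI parsing: only -q and -s.
# The long name "--question" is an assumption; the diff shows only "-q".
def parse_cli(argv)
  options = {}
  parser = OptionParser.new do |opts|
    opts.on("-q", "--question PROMPT", String) { |value| options[:prompt] = value }
    opts.on("-s", "--speak AudioDeviceID", String) { |value| options[:audio_device_id] = value }
  end
  parser.parse!(argv) # removes recognized options from argv
  # Fall back to the remaining words as the prompt, as the script does with ARGV.
  options[:prompt] = argv.join(" ") if options[:prompt].to_s.empty?
  options
end

# Mirrors the branch order in the hunk: voice input wins over a text prompt.
def mode_for(options)
  return :voice unless options[:audio_device_id].to_s.empty?
  return :ask unless options[:prompt].to_s.empty?

  :help
end
```

In this sketch the device ID stays a `String`, as it does in the hunk, because it is only interpolated into the ffmpeg command later.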
@@ -2,6 +2,7 @@ require_relative "sugar"
  require_relative "prompts/base"
  require_relative "prompts/improve"
  require_relative "default_behavior"
+ require_relative "voice"
 
  Dir[File.join(__dir__, "prompts", "*.rb")].each do |file|
    require file
@@ -23,6 +24,14 @@ module AskChatgpt
    AskChatgpt::Executor.new(client)
  end
 
+ def speak
+   puts "Voice input is not enabled (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.voice_enabled
+   puts "Audio device ID is not configured (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.audio_device_id
+
+   AskChatgpt::VoiceFlow::Voice.new.run
+ end
+ alias_method :s, :speak
+
  def initialize(client)
    @scope = AskChatGPT.included_prompts.dup
    @client = client
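As written in this hunk, `speak` prints a warning when voice input is disabled or the device ID is missing, and then still calls `Voice.new.run`. A possible alternative, sketched here with hypothetical names (`VoiceConfig`, `voice_config_errors`, and a block standing in for the recording loop), collects the problems first and returns early.

```ruby
# Hypothetical stand-in for the gem's module-level configuration.
VoiceConfig = Struct.new(:voice_enabled, :audio_device_id)

# Returns a list of configuration problems; empty when voice input can start.
def voice_config_errors(config)
  errors = []
  errors << "Voice input is not enabled" unless config.voice_enabled
  errors << "Audio device ID is not configured" if config.audio_device_id.nil?
  errors
end

# Runs the given block (a placeholder for the recording loop) only when the
# configuration is complete. Returns true if the block ran, false otherwise.
def speak(config)
  errors = voice_config_errors(config)
  if errors.any?
    errors.each { |message| warn message }
    return false
  end

  yield
  true
end
```

Whether an early return is preferable to the warn-and-continue behavior in the release is a design decision for the gem's author; the sketch only shows the shape of the guard.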
@@ -1,3 +1,3 @@
  module AskChatgpt
-   VERSION = "0.3.1"
+   VERSION = "0.4.0"
  end
@@ -0,0 +1,176 @@
+ module StringExt
+   refine String do
+     def black; "\e[30m#{self}\e[0m" end
+     def red; "\e[31m#{self}\e[0m" end
+     def green; "\e[32m#{self}\e[0m" end
+     def brown; "\e[33m#{self}\e[0m" end
+     def blue; "\e[34m#{self}\e[0m" end
+     def magenta; "\e[35m#{self}\e[0m" end
+     def cyan; "\e[36m#{self}\e[0m" end
+     def gray; "\e[37m#{self}\e[0m" end
+   end
+ end
+
+ using StringExt
+
+ module AskChatgpt
+   module VoiceFlow
+     require 'io/console'
+     require 'fileutils'
+     require 'timeout'
+     require 'open3'
+
+     class AudioRecorder
+       OUTPUT_FILE = "output.wav"
+
+       def initialize(duration)
+         @duration = duration
+       end
+
+       # ffmpeg -f avfoundation -list_devices true -i ""
+       def audio_device_id
+         AskChatGPT.audio_device_id
+       end
+
+       def start
+         delete_audio_file
+         ffmpeg_command = build_ffmpeg_command
+         @stdin, @stdout_and_stderr, @wait_thread = Open3.popen2e(ffmpeg_command)
+       end
+
+       def stop
+         @stdin.puts 'q'
+         @stdin.close
+         @stdout_and_stderr.close
+         sleep(0.2)
+       rescue Errno::EPIPE, IOError
+       end
+
+       def delete_audio_file
+         FileUtils.rm(OUTPUT_FILE) if File.exist?(OUTPUT_FILE)
+       end
+
+       private
+
+       def build_ffmpeg_command
+         case RUBY_PLATFORM
+         when /darwin/
+           input_device = "-f avfoundation -i \":#{audio_device_id}\""
+         when /linux/
+           input_device = '-f alsa -i default'
+         when /mingw|mswin/
+           input_device = '-f dshow -i audio="Microphone (Realtek High Definition Audio)"'
+         else
+           raise "Unsupported platform: #{RUBY_PLATFORM}"
+         end
+
+         "ffmpeg -loglevel quiet #{input_device} -t #{@duration} #{OUTPUT_FILE}"
+       end
+     end
+
+     class Voice
+       def initialize
+         @messages = []
+         @wanna_quit = false
+         @duration = (AskChatGPT.voice_max_duration.presence || 10).to_i
+         @ffmpeg_wait_duration = 0.5
+         @executing = true
+         @spinner = nil
+       end
+
+       def run
+         while @executing
+           # Start the parallel process
+           audio_recorder = AudioRecorder.new(@duration)
+           audio_recorder.start
+
+           begin
+             Timeout.timeout(@duration + @ffmpeg_wait_duration) do
+               @spinner = TTY::Spinner.new("[Recording]".red + " / Press any key to stop recording or \"Esc\" / \"q\" to quit ... ".blue + ":spinner".red, format: :spin)
+               @spinner.auto_spin
+               sleep(@ffmpeg_wait_duration) # five some time for ffmpeg to start
+               # Listen for user input in the main process
+               begin
+                 char = $stdin.getch
+                 if char.ord == 27 || char.upcase == "Q"
+                   audio_recorder.stop
+                   @executing = false
+                   @spinner.stop
+                   puts "Bye...".brown
+                 end
+                 break
+               end while char.nil?
+             end
+           rescue Timeout::Error
+           ensure
+             audio_recorder.stop
+             @spinner.stop
+             break unless @executing
+           end
+
+           if !File.exist?("output.wav")
+             puts "No audio file found, please try again.".brown
+             sleep(0.5)
+             next
+           end
+
+           @spinner = TTY::Spinner.new("Thinking :spinner".cyan, format: :dots)
+           @spinner.auto_spin
+           response = client.transcribe(parameters: { model: "whisper-1", file: File.open("output.wav", "rb") })
+           @spinner.stop
+
+           if response["error"]
+             puts response["error"].inspect.brown
+             @executing = false
+             break
+           end
+
+           user_input = response["text"].to_s
+           puts "USER> ".green + user_input
+           @messages << { role: "user", content:user_input }
+           print "ASSISTANT> ".magenta
+
+           stop_stream = false
+           reply = []
+
+           keypresser = Thread.new do
+             loop { stop_stream = true if $stdin.getch }
+           end
+
+           begin
+             client.chat(
+               parameters: {
+                 model: "gpt-3.5-turbo",
+                 messages: @messages,
+                 temperature: 0.7,
+                 stream: proc do |chunk, _bytesize|
+                   break if stop_stream
+                   message = chunk.dig("choices", 0, "delta", "content")
+                   next if message.to_s.empty?
+
+                   message = message.gsub("\n", "\r\n")
+
+                   print message
+                   reply += [message]
+                 end
+               })
+           rescue LocalJumpError
+             puts
+           ensure
+             Thread.kill(keypresser)
+             stop_stream = false
+             @messages << { role: "assistant", content: reply.join }
+             puts
+           end
+
+           audio_recorder.delete_audio_file
+         end
+       end
+
+       def client
+         @client ||= OpenAI::Client.new(access_token: AskChatGPT.access_token)
+       end
+     end
+
+   end
+ end
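`build_ffmpeg_command` above assembles a single shell string and always records into `output.wav` in the current directory, which the README TODO list already flags ("use tempfile for audio, instead of output.wav"). The sketch below shows one way that could look: an argument array, which `Open3.popen2e` accepts and which bypasses shell parsing, plus a caller-supplied output path. `ffmpeg_args` is a hypothetical helper; the Windows branch is omitted, and the `-y` flag (overwrite the output file without asking) is added on the assumption that the temp file already exists when ffmpeg starts.

```ruby
require "tempfile"

# Hypothetical variant of build_ffmpeg_command (not the gem's code): returns
# an argument array instead of a shell string and takes the output path as a
# parameter. Only the macOS and Linux inputs from the diff are covered.
def ffmpeg_args(platform:, device_id:, duration:, output_path:)
  input =
    case platform
    when /darwin/ then ["-f", "avfoundation", "-i", ":#{device_id}"]
    when /linux/  then ["-f", "alsa", "-i", "default"]
    else raise ArgumentError, "Unsupported platform: #{platform}"
    end

  # -y overwrites output_path without prompting (the temp file already exists).
  ["ffmpeg", "-loglevel", "quiet", "-y", *input, "-t", duration.to_s, output_path]
end
```

A caller could wrap the recording in `Tempfile.create(["ask_chatgpt", ".wav"]) { |file| ... }`, pass `file.path` as `output_path`, and start the process with `Open3.popen2e(*args)`; the file is then removed when the block exits.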
data/lib/ask_chatgpt.rb CHANGED
@@ -49,6 +49,19 @@ module AskChatgpt
  mattr_accessor :included_prompts
  @@included_prompts = [AskChatGPT::Prompts::App.new]
 
+ # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
+ mattr_accessor :voice_enabled
+ @voice_enabled = false
+
+ # to get audio device ID (index in the input devices)
+ # ffmpeg -f avfoundation -list_devices true -i ""
+ mattr_accessor :audio_device_id
+ @@audio_device_id = nil
+
+ # max duration of audio to record
+ mattr_accessor :voice_max_duration
+ @@voice_max_duration = 10 # 10 seconds
+
  def self.setup
    yield(self)
  end
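One detail in this hunk: the default for `voice_enabled` is assigned to `@voice_enabled` (a module-level instance variable), while the other two settings use `@@` class variables. ActiveSupport documents `mattr_accessor` as backed by a class variable that is set to `nil` when not previously defined, so the `false` assigned here most likely never reaches the accessor. The plain-Ruby sketch below imitates that situation without ActiveSupport; the `Settings` module and its hand-written readers are hypothetical.

```ruby
# Hypothetical module imitating the hunk without ActiveSupport.
module Settings
  # Class variable: the storage a mattr_accessor-style reader looks at.
  @@audio_device_id = nil
  # Module-level instance variable: a separate storage slot.
  @voice_enabled = false

  def self.audio_device_id
    @@audio_device_id
  end

  # Imitates a reader backed by @@voice_enabled, which is never assigned here.
  def self.voice_enabled
    class_variable_defined?(:@@voice_enabled) ? class_variable_get(:@@voice_enabled) : nil
  end
end
```

In this sketch `Settings.voice_enabled` is `nil` rather than `false`. Both values are falsy, so a check such as `unless AskChatGPT.voice_enabled` would behave the same either way.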
@@ -10,6 +10,11 @@ AskChatGPT.setup do |config|
  # config.temperature = 0.1
  # config.included_prompts = []
 
+ # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
+ # config.voice_enabled = true
+ # config.audio_device_id = 1
+ # config.voice_max_duration = 10 # 10 seconds
+
  # Examples of custom prompts:
  # you can use them `gpt.ask(:extract_email, "some string")`
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: ask_chatgpt
  version: !ruby/object:Gem::Version
-   version: 0.3.1
+   version: 0.4.0
  platform: ruby
  authors:
  - Igor Kasyanchuk
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-05-01 00:00:00.000000000 Z
+ date: 2023-05-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails
@@ -95,6 +95,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 1.4.3
+ - !ruby/object:Gem::Dependency
+   name: io-console
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: wrapped_print
    requirement: !ruby/object:Gem::Requirement
@@ -156,6 +170,7 @@ files:
  - lib/ask_chatgpt/railtie.rb
  - lib/ask_chatgpt/sugar.rb
  - lib/ask_chatgpt/version.rb
+ - lib/ask_chatgpt/voice.rb
  - lib/generators/ask_chatgpt/USAGE
  - lib/generators/ask_chatgpt/ask_chatgpt_generator.rb
  - lib/generators/ask_chatgpt/templates/template.rb