RubyGems - ask_chatgpt - Versions diffs - 0.3.1 → 0.4.0 - Mend

ask_chatgpt 0.3.1 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

checksums.yaml +4 -4
data/README.md +68 -0
data/bin/ask_chatgpt +17 -4
data/lib/ask_chatgpt/executor.rb +9 -0
data/lib/ask_chatgpt/version.rb +1 -1
data/lib/ask_chatgpt/voice.rb +176 -0
data/lib/ask_chatgpt.rb +13 -0
data/lib/generators/ask_chatgpt/templates/template.rb +5 -0
metadata +17 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 86f36d546bd015947fcac7a2bb9366530ba0ff8a9012f21f2151ae0484ecf647
-  data.tar.gz: 136c817bc54317c77325ca0145857262d48680b8b25072e3364dcd0fe526e501
+  metadata.gz: 6a5ccd11d62b51d4422562be281edcf67254e9829f551edf2f91acd970fd091b
+  data.tar.gz: 1328f4ffedb742cce8c330de2fedf7bca4632610a5e95e0215a6dabc6dc3d50d
 SHA512:
-  metadata.gz: 7b3112557fcb6e3fa11856693e26dc37df80dedafde058f6f181d303a336f75cc4ab21f9c1daf89dd61d253eea150b05d5f664408ab5299c07210f37d8f9d87c
-  data.tar.gz: 4c5a22ca8157813e15fa5e2f639eefd89967ba9bbd300d6c0929bfe18cd3b2ff4e600197d59411520da05626c77ec9ca1786e2b6f3bfd7b51a2d5403f4528098
+  metadata.gz: 869bb402cb40bbaa8d0e60a4c0c9e3247a155a18b60d805c3943baa2b0c90d1ccf8471de4359df5ca63a6e5197231cca1cbae5398716e49ed0ecd0c14b2e45fe
+  data.tar.gz: 83c04d9f695e3772ec35351b4de6e019d2f866023421045ec8f6dc2decdaded75f3feb9c70947caf1837b3942333ab69c23b4e02ca7e9efe87c77c7b248fb8af

data/README.md CHANGED

@@ -31,6 +31,10 @@ Go to Rails console and run:
       [first_name, last_name].join
     end
   }
+  #
+  # --- NEW ---
+  #
+  gpt.speak # or with alias gpt.s
 ```
 OR with CLI tool:
@@ -46,6 +50,8 @@ aGVsbG8gd29ybGQ=
 hello world
 ```
+>ask_chatgpt -s 1 # start voice input with CLI
 See some examples below. You can also create your own prompts with just few lines of code [here](#options--configurations).
 Also you can use a CLI tool, [how to use it](#cli-tool).
@@ -113,6 +119,17 @@ And you can edit:
     # config.max_tokens       = 3000 # or nil by default
     # config.included_prompts = []
+    # enable voice input with `gpt.speak` or `gpt.s`. Note, you also need to configure `audio_device_id`
+    # config.voice_enabled = true
+    # to get audio device ID (index in the input devices)
+    # install ffmpeg, and execute from the console
+    # `ffmpeg -f avfoundation -list_devices true -i ""`
+    # config.audio_device_id = 1
+    # after "voice_max_duration" seconds it will send audio to Open AI
+    # config.voice_max_duration = 10 # 10 seconds
     # Examples of custom prompts:
     # you can use them `gpt.extract_email("some string")`
@@ -182,8 +199,55 @@ end
 or directly in console `gpt.debug!` (and finish `gpt.debug!(:off)`)
+## Voice Input
+Demo: https://youtu.be/uBR0wnQvKao
+For now I consider this an experimental and fun feature. I look forward to seeing your feedback.
+Works with command: `gpt.speak` or `gpt.s` (alias).
+This command starts recording right away and it will stop after `voice_max_duration` seconds or if you press any key.
+To exit recording mode press `Q`.
+Voice input uses the `ffmpeg` tool, so you need to install it. Instructions like these will work: https://www.hostinger.com/tutorials/how-to-install-ffmpeg.
+Also, you need to configure `audio_device_id`. Run `ffmpeg -f avfoundation -list_devices true -i ""`.
+It will give you a list of all devices, like this:
+```s
+ffmpeg -f avfoundation -list_devices true -i ""
+ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
+  built with Apple clang version 14.0.0 (clang-1400.0.29.202)
+  configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox
+  libavutil      58.  2.100 / 58.  2.100
+  libavcodec     60.  3.100 / 60.  3.100
+  libavformat    60.  3.100 / 60.  3.100
+  libavdevice    60.  1.100 / 60.  1.100
+  libavfilter     9.  3.100 /  9.  3.100
+  libswscale      7.  1.100 /  7.  1.100
+  libswresample   4. 10.100 /  4. 10.100
+  libpostproc    57.  1.100 / 57.  1.100
+[AVFoundation indev @ 0x7f7fd1a04380] AVFoundation video devices:
+[AVFoundation indev @ 0x7f7fd1a04380] [0] FaceTime HD Camera
+[AVFoundation indev @ 0x7f7fd1a04380] [1] USB Camera VID:1133 PID:2085
+[AVFoundation indev @ 0x7f7fd1a04380] [2] Capture screen 0
+[AVFoundation indev @ 0x7f7fd1a04380] [3] Capture screen 1
+[AVFoundation indev @ 0x7f7fd1a04380] AVFoundation audio devices:
+[AVFoundation indev @ 0x7f7fd1a04380] [0] Microsoft Teams Audio
+[AVFoundation indev @ 0x7f7fd1a04380] [1] Built-in Microphone
+[AVFoundation indev @ 0x7f7fd1a04380] [2] Unknown USB Audio Device
+: Input/output error
+```
+In my case I used "1", because it's `Built-in Microphone`.
 ## CLI Tool
+You can ask questions from the CLI or even start voice input.
 Example 1:
 ![AskChatGPT](docs/unzip.gif)
@@ -232,6 +296,10 @@ end
 - print tokens usage? `.with_usage`
 - support org_id? in the configs
 - use `gpt` in the code of the main app (e.g. model/controller)
+- when voice is used add support for payloads, e.g. `gpt.with_payload(json).speak` (and it will send payload with my question)
+- refactor voice input code :) as first version it's fine
+- can we discover audio device ID?
+- use tempfile for audio, instead of output.wav
 ## Contributing
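
The README's TODO list asks whether the audio device ID could be discovered. A minimal sketch of one possible approach, assuming the `-list_devices` listing has already been captured as a string in the format shown in the README sample. `parse_audio_devices` and `SAMPLE_LISTING` are hypothetical names and are not part of the gem.

```ruby
# Sketch only: extract the audio device table from an AVFoundation device listing.
# `parse_audio_devices` is a hypothetical helper, not part of ask_chatgpt.
def parse_audio_devices(listing)
  devices = {}
  in_audio_section = false
  listing.each_line do |line|
    if line.include?("AVFoundation audio devices:")
      in_audio_section = true
    elsif line.include?("AVFoundation video devices:")
      in_audio_section = false
    elsif in_audio_section && (match = line.match(/\]\s+\[(\d+)\]\s+(.+)$/))
      # match[1] is the index inside "[n]", match[2] is the device name
      devices[match[1].to_i] = match[2].strip
    end
  end
  devices
end

# Shortened copy of the sample output from the README.
SAMPLE_LISTING = <<~LOG
  [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation video devices:
  [AVFoundation indev @ 0x7f7fd1a04380] [0] FaceTime HD Camera
  [AVFoundation indev @ 0x7f7fd1a04380] AVFoundation audio devices:
  [AVFoundation indev @ 0x7f7fd1a04380] [0] Microsoft Teams Audio
  [AVFoundation indev @ 0x7f7fd1a04380] [1] Built-in Microphone
  [AVFoundation indev @ 0x7f7fd1a04380] [2] Unknown USB Audio Device
LOG

audio_devices = parse_audio_devices(SAMPLE_LISTING)
```

With a table like this, the value for `audio_device_id` could be picked by name rather than read off the listing by hand.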

data/bin/ask_chatgpt CHANGED

@@ -24,6 +24,7 @@ parser = OptionParser.new do |opts|
       ask_chatgpt -f app/models/user.rb -q "find a bug in this Rails model"
       ask_chatgpt -f app/models/user.rb -q "create RSpec spec for this model"
       ask_chatgpt -f test/dummy/Gemfile -q "sort Ruby gems alphabetically"
+      ask_chatgpt -s 1
     Version: #{AskChatGPT::VERSION}
@@ -33,6 +34,10 @@ parser = OptionParser.new do |opts|
     options[:prompt] = prompt
   end
+  opts.on("-s", "--speak AudioDeviceID", String, "Specify audio device ID") do |audio_device_id|
+    options[:audio_device_id] = audio_device_id
+  end
   opts.on("-f", "--file FILE", String, "Specify file with prompt") do |file|
     options[:file_path] = file
   end
@@ -53,14 +58,22 @@ AskChatGPT.debug = !!options[:debug]
 options[:prompt] = ARGV.join(" ") if options[:prompt].blank?
-if options[:prompt].blank?
+if options[:prompt].blank? && options[:audio_device_id].blank?
   puts parser
   exit
 end
 include AskChatGPT::Console
-instance = gpt.ask(options[:prompt])
-instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
+if options[:audio_device_id].present?
+  AskChatGPT.voice_enabled = true
+  AskChatGPT.voice_max_duration = 20
+  AskChatGPT.audio_device_id = options[:audio_device_id]
-puts instance.inspect
+  instance = gpt.speak
+else
+  instance = gpt.ask(options[:prompt])
+  instance = instance.payload(File.read(options[:file_path])) if options[:file_path].present?
+  puts instance.inspect
+end
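
The CLI change above adds a `-s` flag and branches between voice mode and the existing ask mode. A rough, self-contained sketch of that dispatch using only the standard library is shown here. `parse_cli`, the `:mode` key, and the `--question` long flag are assumptions made for illustration (the diff only shows the short `-q` form), and plain `empty?` checks stand in for ActiveSupport's `blank?`.

```ruby
require "optparse"

# Hypothetical helper mirroring the flags visible in the diff; not the gem's code.
def parse_cli(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # "--question" is an assumed long name; the diff only shows "-q" in examples.
    opts.on("-q", "--question PROMPT", String) { |prompt| options[:prompt] = prompt }
    opts.on("-s", "--speak AudioDeviceID", String) { |id| options[:audio_device_id] = id }
    opts.on("-f", "--file FILE", String) { |file| options[:file_path] = file }
  end
  rest = parser.parse(argv) # non-destructive: returns the leftover arguments
  options[:prompt] = rest.join(" ") if options[:prompt].to_s.empty?
  # Voice mode wins whenever a device ID was given, as in the diff's if/else.
  options[:mode] = options[:audio_device_id].to_s.empty? ? :ask : :voice
  options
end

voice_options = parse_cli(%w[-s 1])
ask_options = parse_cli(["-q", "find a bug in this Rails model"])
```

Note that the device ID arrives as a `String` here, just as it does in the diff's `opts.on("-s", ...)` declaration.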

data/lib/ask_chatgpt/executor.rb CHANGED

@@ -2,6 +2,7 @@ require_relative "sugar"
 require_relative "prompts/base"
 require_relative "prompts/improve"
 require_relative "default_behavior"
+require_relative "voice"
 Dir[File.join(__dir__, "prompts", "*.rb")].each do |file|
   require file
@@ -23,6 +24,14 @@ module AskChatgpt
       AskChatgpt::Executor.new(client)
     end
+    def speak
+      puts "Voice input is not enabled (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.voice_enabled
+      puts "Audio device ID is not configured (docs: https://github.com/railsjazz/ask_chatgpt)" unless AskChatGPT.audio_device_id
+      AskChatgpt::VoiceFlow::Voice.new.run
+    end
+    alias_method :s, :speak
     def initialize(client)
       @scope   = AskChatGPT.included_prompts.dup
       @client  = client
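
As written in the hunk above, `speak` prints its two warnings but then still calls `Voice.new.run`, because neither `puts ... unless` line returns. One possible alternative, sketched under assumptions, is to collect the problems first and return early. `VoiceSettings`, `voice_setup_problems`, and the `:started` / `:not_started` return values are hypothetical and only stand in for the gem's configuration and voice flow.

```ruby
# Hypothetical stand-in for the gem's module-level configuration.
VoiceSettings = Struct.new(:voice_enabled, :audio_device_id)

# Returns a list of problems; an empty list means recording may start.
def voice_setup_problems(settings)
  problems = []
  problems << "Voice input is not enabled" unless settings.voice_enabled
  problems << "Audio device ID is not configured" if settings.audio_device_id.nil?
  problems
end

def speak(settings)
  problems = voice_setup_problems(settings)
  if problems.any?
    problems.each { |problem| warn problem } # report on stderr, then bail out
    return :not_started
  end
  :started # the real method would start the voice flow here
end

speak(VoiceSettings.new(true, 1))
```

Whether an early return is the intended behavior is a design decision for the gem's author; the sketch only shows the shape of the guard.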

data/lib/ask_chatgpt/version.rb CHANGED

@@ -1,3 +1,3 @@
 module AskChatgpt
-  VERSION = "0.3.1"
+  VERSION = "0.4.0"
 end

data/lib/ask_chatgpt/voice.rb ADDED

@@ -0,0 +1,176 @@
+module StringExt
+  refine String do
+    def black;          "\e[30m#{self}\e[0m" end
+    def red;            "\e[31m#{self}\e[0m" end
+    def green;          "\e[32m#{self}\e[0m" end
+    def brown;          "\e[33m#{self}\e[0m" end
+    def blue;           "\e[34m#{self}\e[0m" end
+    def magenta;        "\e[35m#{self}\e[0m" end
+    def cyan;           "\e[36m#{self}\e[0m" end
+    def gray;           "\e[37m#{self}\e[0m" end
+  end
+end
+using StringExt
+module AskChatgpt
+  module VoiceFlow
+    require 'io/console'
+    require 'fileutils'
+    require 'timeout'
+    require 'open3'
+    class AudioRecorder
+      OUTPUT_FILE = "output.wav"
+      def initialize(duration)
+        @duration = duration
+      end
+      # ffmpeg -f avfoundation -list_devices true -i ""
+      def audio_device_id
+        AskChatGPT.audio_device_id
+      end
+      def start
+        delete_audio_file
+        ffmpeg_command = build_ffmpeg_command
+        @stdin, @stdout_and_stderr, @wait_thread = Open3.popen2e(ffmpeg_command)
+      end
+      def stop
+        @stdin.puts 'q'
+        @stdin.close
+        @stdout_and_stderr.close
+        sleep(0.2)
+      rescue Errno::EPIPE, IOError
+      end
+      def delete_audio_file
+        FileUtils.rm(OUTPUT_FILE) if File.exist?(OUTPUT_FILE)
+      end
+      private
+      def build_ffmpeg_command
+        case RUBY_PLATFORM
+        when /darwin/
+          input_device = "-f avfoundation -i \":#{audio_device_id}\""
+        when /linux/
+          input_device = '-f alsa -i default'
+        when /mingw|mswin/
+          input_device = '-f dshow -i audio="Microphone (Realtek High Definition Audio)"'
+        else
+          raise "Unsupported platform: #{RUBY_PLATFORM}"
+        end
+        "ffmpeg -loglevel quiet #{input_device} -t #{@duration} #{OUTPUT_FILE}"
+      end
+    end
+    class Voice
+      def initialize
+        @messages = []
+        @wanna_quit = false
+        @duration = (AskChatGPT.voice_max_duration.presence || 10).to_i
+        @ffmpeg_wait_duration = 0.5
+        @executing = true
+        @spinner = nil
+      end
+      def run
+        while @executing
+          # Start the parallel process
+          audio_recorder = AudioRecorder.new(@duration)
+          audio_recorder.start
+          begin
+            Timeout.timeout(@duration + @ffmpeg_wait_duration) do
+              @spinner = TTY::Spinner.new("[Recording]".red + " / Press any key to stop recording or \"Esc\" / \"q\" to quit ... ".blue + ":spinner".red, format: :spin)
+              @spinner.auto_spin
+              sleep(@ffmpeg_wait_duration) # give some time for ffmpeg to start
+              # Listen for user input in the main process
+              begin
+                char = $stdin.getch
+                if char.ord == 27 || char.upcase == "Q"
+                  audio_recorder.stop
+                  @executing = false
+                  @spinner.stop
+                  puts "Bye...".brown
+                end
+                break
+              end while char.nil?
+            end
+          rescue Timeout::Error
+          ensure
+            audio_recorder.stop
+            @spinner.stop
+            break unless @executing
+          end
+          if !File.exist?("output.wav")
+            puts "No audio file found, please try again.".brown
+            sleep(0.5)
+            next
+          end
+          @spinner = TTY::Spinner.new("Thinking :spinner".cyan, format: :dots)
+          @spinner.auto_spin
+          response = client.transcribe(parameters: { model: "whisper-1", file: File.open("output.wav", "rb") })
+          @spinner.stop
+          if response["error"]
+            puts response["error"].inspect.brown
+            @executing = false
+            break
+          end
+          user_input = response["text"].to_s
+          puts "USER> ".green + user_input
+          @messages << { role: "user", content: user_input }
+          print "ASSISTANT> ".magenta
+          stop_stream = false
+          reply = []
+          keypresser = Thread.new do
+            loop { stop_stream = true if $stdin.getch }
+          end
+          begin
+            client.chat(
+              parameters: {
+                model: "gpt-3.5-turbo",
+                messages: @messages,
+                temperature: 0.7,
+                stream: proc do |chunk, _bytesize|
+                  break if stop_stream
+                  message = chunk.dig("choices", 0, "delta", "content")
+                  next if message.to_s.empty?
+                  message = message.gsub("\n", "\r\n")
+                  print message
+                  reply += [message]
+                end
+              })
+          rescue LocalJumpError
+            puts
+          ensure
+            Thread.kill(keypresser)
+            stop_stream = false
+            @messages << { role: "assistant", content: reply.join }
+            puts
+          end
+          audio_recorder.delete_audio_file
+        end
+      end
+      def client
+        @client ||= OpenAI::Client.new(access_token: AskChatGPT.access_token)
+      end
+    end
+  end
+end
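
`AudioRecorder` builds its ffmpeg command as a single shell string and records into a fixed `output.wav`, and the README TODO mentions switching to a tempfile. A minimal sketch of that idea follows, assuming an argument list is acceptable in place of the shell string. `ffmpeg_record_args` is a hypothetical helper; the Windows branch is omitted because the original hard-codes a device name, and `-y` is added on the assumption that ffmpeg should overwrite the already-created temp file.

```ruby
require "tempfile"

# Hypothetical helper: builds an argument list for ffmpeg instead of a shell string,
# so the device ID and output path need no manual quoting.
def ffmpeg_record_args(platform:, device_id:, duration:, output_path:)
  input =
    case platform
    when /darwin/ then ["-f", "avfoundation", "-i", ":#{device_id}"]
    when /linux/  then ["-f", "alsa", "-i", "default"]
    else raise ArgumentError, "Unsupported platform: #{platform}"
    end
  ["ffmpeg", "-loglevel", "quiet", "-y", *input, "-t", duration.to_s, output_path]
end

# The Tempfile supplies a unique path with a .wav suffix.
audio_file = Tempfile.new(["ask_chatgpt_voice", ".wav"])
args = ffmpeg_record_args(
  platform: "x86_64-darwin22", # RUBY_PLATFORM in real use
  device_id: 1,
  duration: 10,
  output_path: audio_file.path
)
# Open3.popen2e(*args) would start the recording; it is left out so the
# sketch runs on machines without ffmpeg.
audio_file.close! # close and delete the temp file
```

Passing the arguments as separate strings to `Open3.popen2e` avoids going through a shell, which is the main reason for the array form.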

data/lib/ask_chatgpt.rb CHANGED

@@ -49,6 +49,19 @@ module AskChatgpt
   mattr_accessor :included_prompts
   @@included_prompts = [AskChatGPT::Prompts::App.new]
+  # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
+  mattr_accessor :voice_enabled
+  @@voice_enabled = false
+  # to get audio device ID (index in the input devices)
+  # ffmpeg -f avfoundation -list_devices true -i ""
+  mattr_accessor :audio_device_id
+  @@audio_device_id = nil
+  # max duration of audio to record
+  mattr_accessor :voice_max_duration
+  @@voice_max_duration = 10 # 10 seconds
   def self.setup
     yield(self)
   end
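
The three new settings follow the module-level accessor pattern that ActiveSupport's `mattr_accessor` provides, with values stored in class variables (`@@name`) and assigned through a `setup` block. A plain-Ruby sketch of that pattern, with hand-written accessors standing in for `mattr_accessor`, might look like this. `DemoConfig` is a hypothetical module used only for illustration.

```ruby
# Plain-Ruby sketch of the setup pattern; `DemoConfig` is a hypothetical module.
module DemoConfig
  # Defaults live in class variables, the storage mattr_accessor reads and writes.
  @@voice_enabled = false
  @@audio_device_id = nil
  @@voice_max_duration = 10 # seconds

  # Hand-written stand-ins for the reader/writer pairs mattr_accessor generates.
  %i[voice_enabled audio_device_id voice_max_duration].each do |name|
    define_singleton_method(name) { class_variable_get("@@#{name}") }
    define_singleton_method("#{name}=") { |value| class_variable_set("@@#{name}", value) }
  end

  def self.setup
    yield(self)
  end
end

DemoConfig.setup do |config|
  config.voice_enabled = true
  config.audio_device_id = 1
end
```

Settings left untouched in the block, such as `voice_max_duration` here, keep their defaults.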

data/lib/generators/ask_chatgpt/templates/template.rb CHANGED

@@ -10,6 +10,11 @@ AskChatGPT.setup do |config|
   # config.temperature      = 0.1
   # config.included_prompts = []
+  # enable voice input, requires ffmpeg to be installed and also you need to configure audio_device_id
+  # config.voice_enabled      = true
+  # config.audio_device_id    = 1
+  # config.voice_max_duration = 10 # 10 seconds
   # Examples of custom prompts:
   # you can use them `gpt.ask(:extract_email, "some string")`

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ask_chatgpt
 version: !ruby/object:Gem::Version
-  version: 0.3.1
+  version: 0.4.0
 platform: ruby
 authors:
 - Igor Kasyanchuk
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-05-01 00:00:00.000000000 Z
+date: 2023-05-09 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rails
@@ -95,6 +95,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 1.4.3
+- !ruby/object:Gem::Dependency
+  name: io-console
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 - !ruby/object:Gem::Dependency
   name: wrapped_print
   requirement: !ruby/object:Gem::Requirement
@@ -156,6 +170,7 @@ files:
 - lib/ask_chatgpt/railtie.rb
 - lib/ask_chatgpt/sugar.rb
 - lib/ask_chatgpt/version.rb
+- lib/ask_chatgpt/voice.rb
 - lib/generators/ask_chatgpt/USAGE
 - lib/generators/ask_chatgpt/ask_chatgpt_generator.rb
 - lib/generators/ask_chatgpt/templates/template.rb