aia 0.5.15 → 0.5.16

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2d027907d70cb497a761f25ad65c2e9429f312cc9694466bf9a8db612a1ead0a
- data.tar.gz: b76efb7bd589a685e9380d969db85f3df88c36fe888fd03e44819d26e339a353
+ metadata.gz: 0e8b6a3c91dad9236a014bbe130c6f359ba3121a8e731398304f6d6305158138
+ data.tar.gz: ce833e093f76d57388296361371484e1ce75c48442f4199895135f24ab3a8289
  SHA512:
- metadata.gz: b83635891018c810bf7c794a66bb7c0842e28d9152ceb444f5105468928b419575ce7f3f88e70528ef8dc87b46ab2246525e43726f9480a2f2ce8be00a850270
- data.tar.gz: 979137d859737b3dec4f264d1e6c6611131b5fbbcb4682ec1dd5843a69c147754b0937414f9c891cc1e8b0a09d95129af83ba1432bc9b01e8465ff7562245ca9
+ metadata.gz: bfd04950aeb63e7d35f1063d8264fcdbd5e66d09ddf0ca9776caacfe978c4340a00b3e26ccadd728f0b83d9278eedc391b72c942e357a95acd6e39f6aab93f4d
+ data.tar.gz: 7e4f2a9698906c61d84eb4d19f9d059867579e922da77e1a7ddb55fe1dc50a6784b1f7e831ea6c3daf68a3e5078c21ec61514dafee29b135bc7ebf9c55a585c8
data/.semver CHANGED
@@ -1,6 +1,6 @@
  ---
  :major: 0
  :minor: 5
- :patch: 15
+ :patch: 16
  :special: ''
  :metadata: ''
data/CHANGELOG.md CHANGED
@@ -1,5 +1,15 @@
  ## [Unreleased]

+ ## [0.5.16] 2024-04-02
+ - Fixed prompt pipelines
+ - Added //next and //pipeline directives as shortcuts to //config [next,pipeline]
+ - Added new backend "client" as an internal OpenAI client
+ - Added --sm, --speech_model default: tts-1
+ - Added --tm, --transcription_model default: whisper-1
+ - Added --voice default: alloy (if "siri" on a Mac, uses the CLI tool "say")
+ - Added --image_size and --image_quality (--is --iq)
+
+
  ## [0.5.15] 2024-03-30
  - Added the ability to accept piped-in text to be appended to the end of the prompt text: curl $URL | aia ad_hoc
  - Fixed bugs with entering directives as follow-up prompts during a chat session
data/README.md CHANGED
@@ -6,15 +6,16 @@ It leverages the `prompt_manager` gem to manage prompts for the `mods` and `sgpt

  **Most Recent Change**: Refer to the [Changelog](CHANGELOG.md)

+
+ > v0.5.16
+ > - Fixed bugs with the prompt pipeline
+ > - Added new backend "client", an internal `aia` client for the OpenAI API that supports both text-to-speech and speech-to-text
+ > - Added --image_size and --image_quality to support image generation with the dall-e-2 and dall-e-3 models using the new internal `aia` OpenAI client.
+ >
  > v0.5.15
  > - Support piped content by appending to end of prompt text
  > - Fixed bugs with directives entered as follow-up while in chat mode
  >
- > v0.5.14
- > - Directly access OpenAI to do text to speech when using the `--speak` option
- > - Added --voice to specify which voice to use
- > - Added --speech_model to specify which TTS model to use
- >


  <!-- Tocer[start]: Auto-generated, don't remove. -->
@@ -43,6 +44,7 @@ It leverages the `prompt_manager` gem to manage prompts for the `mods` and `sgpt
  - [--next](#--next)
  - [--pipeline](#--pipeline)
  - [Best Practices ??](#best-practices-)
+ - [Example pipeline](#example-pipeline)
  - [All About ROLES](#all-about-roles)
  - [The --roles_dir (AIA_ROLES_DIR)](#the---roles_dir-aia_roles_dir)
  - [The --role Option](#the---role-option)
@@ -344,6 +346,8 @@ three.txt contains //config next four
  ```
  BUT if you have more than two prompts in your sequence then consider using the --pipeline option.

+ **The directive //next is short for //config next**
+
  ### --pipeline

  `aia one --pipeline two,three,four`
@@ -352,6 +356,8 @@ or inside of the `one.txt` prompt file use this directive:

  `//config pipeline two,three,four`

+ **The directive //pipeline is short for //config pipeline**
+
  ### Best Practices ??

  Since the response of one prompt is fed into the next prompt in the sequence, instead of having all prompts write their responses to the same out file, use these directives inside the associated prompt files:
@@ -366,6 +372,47 @@ Since the response of one prompt is fed into the next prompt within the sequence

  This way you can see the response that was generated for each prompt in the sequence.

+ ### Example pipeline
+
+ TODO: the audio-to-text is still under development.
+
+ Suppose you have an audio file of a meeting. You want to get a transcription of what was said in that meeting. Sometimes raw transcriptions hide the real value of the recording, so you have crafted a prompt that takes the raw transcription and produces a technical summary with a list of action items.
+
+ Create two prompts named `transcribe.txt` and `tech_summary.txt`
+
+ ```
+ # transcribe.txt
+ # Desc: takes one audio file
+ # note that there is no "prompt" text, only the directives
+
+ //config backend client
+ //config model whisper-1
+ //next tech_summary
+ ```
+ and
+
+ ```
+ # tech_summary.txt
+
+ //config model gpt-4-turbo
+ //config out_file meeting_summary.md
+
+ Review the raw transcript of a technical meeting,
+ summarize the discussion and
+ note any action items that were generated.
+
+ Format your response in markdown.
+ ```
+
+ Now you can do this:
+
+ ```
+ aia transcribe my_tech_meeting.m4a
+ ```
+
+ Your summary of the meeting is in the file `meeting_summary.md`
+
+
  ## All About ROLES

  ### The --roles_dir (AIA_ROLES_DIR)
data/lib/aia/cli.rb CHANGED
@@ -155,9 +155,11 @@ class AIA::Cli
  extra: [''], #
  #
  model: ["gpt-4-1106-preview", "--llm --model"],
- speech_model: ["tts-1", "--sm --spech_model"],
+ speech_model: ["tts-1", "--sm --speech_model"],
  voice: ["alloy", "--voice"],
  #
+ transcription_model: ["whisper-1", "--tm --transcription_model"],
+ #
  dump_file: [nil, "--dump"],
  completion: [nil, "--completion"],
  #
@@ -186,6 +188,11 @@ class AIA::Cli
  log_file: ["~/.prompts/_prompts.log", "-l --log_file --no-log_file"],
  #
  backend: ['mods', "-b --be --backend --no-backend"],
+ #
+ # text2image related ...
+ #
+ image_size: ['', '--is --image_size'],
+ image_quality: ['', '--iq --image_quality'],
  }

  AIA.config = AIA::Config.new(@options.transform_values { |values| values.first })
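Each entry in this options table maps a config key to `[default, "cli flags"]`, and the `transform_values` call above keeps only the defaults. A minimal standalone sketch of that idea (a plain Hash, not the gem's actual `AIA::Config` class):

```ruby
# Each option maps a key to [default_value, "cli flags"].
options = {
  speech_model:  ["tts-1", "--sm --speech_model"],
  voice:         ["alloy", "--voice"],
  image_quality: ["",      "--iq --image_quality"],
}

# Defaults are the first element of each pair, as in the line above.
defaults = options.transform_values { |values| values.first }
p defaults # => {:speech_model=>"tts-1", :voice=>"alloy", :image_quality=>""}
```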
@@ -70,6 +70,8 @@ class AIA::Directives
  Pathname.new(value) :
  Pathname.pwd + value
  end
+ elsif %w[next pipeline].include? item.downcase
+ pipeline(value)
  else
  AIA.config[item] = value
  end
@@ -79,6 +81,33 @@ class AIA::Directives
  end


+ # TODO: we need a way to submit CLI arguments into
+ # the next prompt(s) from the main prompt.
+ # currently the config for subsequent prompts
+ # is expected to be set within those prompts.
+ # Maybe something like:
+ # //next prompt_id CLI args
+ # This would mean that the pipeline would be:
+ # //pipeline id1 cli args, id2 cli args, id3 cli args
+ #
+
+ # TODO: Change the AIA.config.pipeline Array to be an Array of arrays
+ # where each entry is:
+ # [prompt_id, cli_args]
+ # This means that:
+ # entry = AIA.config.pipeline.shift
+ # entry.is_a?(String) ? 'old format' : 'new format'
+ #
+
+ # //next id
+ # //pipeline id1,id2, id3 , id4
+ def pipeline(what)
+ return if what.empty?
+ AIA.config.pipeline << what.split(',').map(&:strip)
+ AIA.config.pipeline.flatten!
+ end
+ alias_method :next, :pipeline
+
  # when path_to_file is relative it will be
  # relative to the PWD.
  #
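A standalone sketch of what the new `pipeline` method accumulates from `//next` and `//pipeline` directive arguments (a plain Array stands in for `AIA.config.pipeline`; illustrative only, not the gem's code):

```ruby
# 'pipeline' stands in for AIA.config.pipeline, which starts out empty.
pipeline = []

# Mirrors the directive handler: split on commas, strip whitespace, append.
add = ->(what) do
  return if what.empty?
  pipeline.concat(what.split(',').map(&:strip))
end

add.call('two')                # //next two
add.call('three, four , five') # //pipeline three, four , five
p pipeline                     # => ["two", "three", "four", "five"]
```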
data/lib/aia/main.rb CHANGED
@@ -33,6 +33,8 @@ class AIA::Main
  @directive_output = ""
  AIA::Tools.load_tools

+ AIA.client = AIA::Client.new
+
  AIA::Cli.new(args)

  if AIA.config.debug?
@@ -115,6 +117,8 @@ class AIA::Main

  result = get_and_display_result(the_prompt)

+ AIA.speak(result) if AIA.config.speak?
+
  logger.prompt_result(@prompt, result)

  if AIA.config.chat?
@@ -125,14 +129,34 @@ class AIA::Main

  return if AIA.config.next.empty? && AIA.config.pipeline.empty?

- # Reset some config items to defaults
+ keep_going(result) unless AIA.config.pipeline.empty?
+ end
+
+
+ # The AIA.config.pipeline is NOT empty, so feed this result
+ # into the next prompt within the pipeline.
+ #
+ def keep_going(result)
+ temp_file = Tempfile.new('aia_pipeline')
+ temp_file.write(result)
+ temp_file.close
+
  AIA.config.directives = []
- AIA.config.next = AIA.config.pipeline.shift
- AIA.config.arguments = [AIA.config.next, AIA.config.out_file.to_s]
+ AIA.config.model = ""
+ AIA.config.arguments = [
+ AIA.config.pipeline.shift,
+ temp_file.path,
+ # TODO: additional arguments from the pipeline
+ ]
  AIA.config.next = ""

+ AIA.config.files = [temp_file.path]
+
  @prompt = AIA::Prompt.new.prompt
- call # Recurse!
+ call # Recurse! until the AIA.config.pipeline is empty
+ puts
+ ensure
+ temp_file.unlink
  end


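A standalone sketch of the hand-off pattern `keep_going` implements: each result is written to a temp file whose path becomes the input of the next prompt in the pipeline (`run_stage` is a hypothetical stand-in; the real code recurses through `call`):

```ruby
require 'tempfile'

# Hypothetical stand-in for running one prompt against a backend.
def run_stage(prompt_id, input_path)
  "response from #{prompt_id} (#{File.read(input_path).bytesize} bytes in)"
end

result = 'initial response'
%w[two three four].each do |prompt_id|
  temp_file = Tempfile.new('aia_pipeline') # previous result goes to disk...
  begin
    temp_file.write(result)
    temp_file.close
    result = run_stage(prompt_id, temp_file.path) # ...and its path feeds the next stage
  ensure
    temp_file.unlink
  end
end
puts result
```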
@@ -0,0 +1,197 @@
+ # lib/aia/tools/client.rb
+
+ require_relative 'backend_common'
+
+ OpenAI.configure do |config|
+ config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
+ end
+
+ class AIA::Client < AIA::Tools
+ include AIA::BackendCommon
+
+ meta(
+ name: 'client',
+ role: :backend,
+ desc: 'Ruby implementation of the OpenAI API',
+ url: 'https://github.com/alexrudall/ruby-openai',
+ install: 'gem install ruby-openai',
+ )
+
+ attr_reader :client, :raw_response
+
+ DEFAULT_PARAMETERS = ''
+ DIRECTIVES = []
+
+ def initialize(text: "", files: [])
+ super
+
+ @client = OpenAI::Client.new
+ end
+
+ def build_command
+ # No-Op
+ end
+
+
+ def run
+ handle_model(AIA.config.model)
+ rescue => e
+ puts "Error handling model #{AIA.config.model}: #{e.message}"
+ end
+
+ def speak(what = @text)
+ print "Speaking ... " if AIA.verbose?
+ text2audio(what)
+ puts "Done." if AIA.verbose?
+ end
+
+
+ ###########################################################
+ private
+
+ # Handling different models more abstractly
+ def handle_model(model_name)
+ case model_name
+ when /vision/
+ image2text
+
+ when /^gpt.*$/, /^babbage.*$/, /^davinci.*$/
+ text2text
+
+ when /^dall-e.*$/
+ text2image
+
+ when /^tts.*$/
+ text2audio
+
+ when /^whisper.*$/
+ audio2text
+
+ else
+ raise "Unsupported model: #{model_name}"
+ end
+ end
+
+
+ def image2text
+ # TODO: Implement
+ end
+
+
+ def text2text
+ @raw_response = client.chat(
+ parameters: {
+ model: AIA.config.model, # Required.
+ messages: [{ role: "user", content: text }], # Required.
+ temperature: AIA.config.temp,
+ }
+ )
+
+ response = raw_response.dig('choices', 0, 'message', 'content')
+
+ response
+ end
+
+
+ def text2image
+ parameters = {
+ model: AIA.config.model,
+ prompt: text
+ }
+
+ parameters[:size] = AIA.config.image_size unless AIA.config.image_size.empty?
+ parameters[:quality] = AIA.config.image_quality unless AIA.config.image_quality.empty?
+
+ raw_response = client.images.generate(parameters:)
+
+ response = raw_response.dig("data", 0, "url")
+
+ response
+ end
+
+
+ def text2audio(what = @text, save: false, play: true)
+ raise "OpenAI's text to speech capability is not available" unless client
+
+ player = select_audio_player
+
+ response = client.audio.speech(
+ parameters: {
+ model: AIA.config.speech_model,
+ input: what,
+ voice: AIA.config.voice
+ }
+ )
+
+ handle_audio_response(response, player, save, play)
+ end
+
+
+ def audio2text(path_to_audio_file = @files.first)
+ response = client.audio.transcribe(
+ parameters: {
+ model: AIA.config.model,
+ file: File.open(path_to_audio_file, "rb")
+ }
+ )
+
+ response["text"]
+ rescue => e
+ "An error occurred: #{e.message}"
+ end
+
+
+ # Helper methods
+ def select_audio_player
+ case OS.host_os
+ when /mac|darwin/
+ 'afplay'
+ when /linux/
+ 'mpg123'
+ when /mswin|mingw|cygwin/
+ 'cmdmp3'
+ else
+ raise "No MP3 player available"
+ end
+ end
+
+
+ def handle_audio_response(response, player, save, play)
+ Tempfile.create(['speech', '.mp3']) do |f|
+ f.binmode
+ f.write(response)
+ f.close
+ `cp #{f.path} #{Pathname.pwd + "speech.mp3"}` if save
+ `#{player} #{f.path}` if play
+ end
+ end
+
+
+ ###########################################################
+ public
+
+ class << self
+
+ def list_models
+ new.client.models.list
+ end
+
+
+ def speak(what)
+ save_model = AIA.config.model
+ AIA.config.model = AIA.config.speech_model
+
+ new(text: what).speak
+
+ AIA.config.model = save_model
+ end
+
+ end
+
+ end
+
+
+ __END__
+
+
+ ##########################################################
data/lib/aia.rb CHANGED
@@ -49,12 +49,6 @@ module AIA
  attr_accessor :client

  def run(args=ARGV)
- begin
- @client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
- rescue OpenAI::ConfigurationError
- @client = nil
- end
-
  args = args.split(' ') if args.is_a?(String)

  # TODO: Currently this is a one and done architecture.
@@ -72,43 +66,13 @@ module AIA
  if OS.osx? && 'siri' == config.voice.downcase
  system "say #{Shellwords.escape(what)}"
  else
- use_openai_tts(what)
+ Client.speak(what)
  end
  end


- def use_openai_tts(what)
- if client.nil?
- puts "\nWARNING: OpenAI's text to speech capability is not available at this time."
- return
- end
-
- player = if OS.osx?
- 'afplay'
- elsif OS.linux?
- 'mpg123'
- elsif OS.windows?
- 'cmdmp3'
- else
- puts "\nWARNING: There is no MP3 player available"
- return
- end
-
- response = client.audio.speech(
- parameters: {
- model: config.speech_model,
- input: what,
- voice: config.voice
- }
- )
-
- Tempfile.create(['speech', '.mp3']) do |f|
- f.binmode
- f.write(response)
- f.close
- `#{player} #{f.path}`
- end
- end
+ def verbose? = AIA.config.verbose?
+ def debug? = AIA.config.debug?
  end
  end

data/man/aia.1 CHANGED
@@ -1,6 +1,6 @@
  .\" Generated by kramdown-man 1.0.1
  .\" https://github.com/postmodern/kramdown-man#readme
- .TH aia 1 "v0.5.14" AIA "User Manuals"
+ .TH aia 1 "v0.5.16" AIA "User Manuals"
  .SH NAME
  .PP
  aia \- command\-line interface for an AI assistant
@@ -39,6 +39,12 @@ This option tells \fBaia\fR to replace references to system environment variable
  \fB\-\-erb\fR
  If dynamic prompt content using \[Do](\.\.\.) wasn\[cq]t enough here is ERB\. Embedded Ruby\. <%\[eq] ruby code %> within a prompt will have its ruby code executed and the results of that execution will be inserted into the prompt\. I\[cq]m sure we will find a way to truly misuse this capability\. Remember, some say that the simple prompt is the best prompt\.
  .TP
+ \fB\-\-iq\fR, \fB\-\-image\[ru]quality\fR \fIVALUE\fP
+ (Used with backend \[oq]client\[cq] only) See the OpenAI docs for valid values (depends on model) \- default: \[oq]\[cq]
+ .TP
+ \fB\-\-is\fR, \fB\-\-image\[ru]size\fR \fIVALUE\fP
+ (Used with backend \[oq]client\[cq] only) See the OpenAI docs for valid values (depends on model) \- default: \[oq]\[cq]
+ .TP
  \fB\-\-model\fR \fINAME\fP
  Name of the LLM model to use \- default is gpt\-4\-1106\-preview
  .TP
@@ -48,9 +54,18 @@ Render markdown to the terminal using the external tool \[lq]glow\[rq] \- defaul
  \fB\-\-speak\fR
  Simple implementation\. Uses the \[lq]say\[rq] command to speak the response\. Fun with \-\-chat
  .TP
+ \fB\-\-sm\fR, \fB\-\-speech\[ru]model\fR \fIMODEL NAME\fP
+ Which OpenAI LLM to use for text\-to\-speech (TTS) \- default: tts\-1
+ .TP
+ \fB\-\-voice\fR \fIVOICE NAME\fP
+ Which voice to use when speaking text\. If it\[cq]s \[lq]siri\[rq] and the platform is a Mac, then the CLI utility \[lq]say\[rq] is used\. Any other name will be used with OpenAI \- default: alloy
+ .TP
  \fB\-\-terse\fR
  Add a clause to the prompt text that instructs the backend to be terse in its response\.
  .TP
+ \fB\-\-tm\fR, \fB\-\-transcription\[ru]model\fR \fIMODEL NAME\fP
+ Which OpenAI LLM to use for audio\-to\-text \- default: whisper\-1
+ .TP
  \fB\-\-version\fR
  Print Version \- default is false
  .TP
@@ -175,6 +190,28 @@ or just
  \fB\[sl]\[sl]config next three\fR
  .PP
  if you want to specify them one at a time\.
+ .PP
+ You can also use the shortcuts \fB\[sl]\[sl]next\fR and \fB\[sl]\[sl]pipeline\fR
+ .PP
+ .PP
+ .RS 4
+ .EX
+ \[sl]\[sl]next two
+ \[sl]\[sl]next three
+ \[sl]\[sl]next four
+ \[sl]\[sl]next five
+ .EE
+ .RE
+ .PP
+ Is the same thing as
+ .PP
+ .PP
+ .RS 4
+ .EX
+ \[sl]\[sl]pipeline two,three,four
+ \[sl]\[sl]next five
+ .EE
+ .RE
  .SH SEE ALSO
  .RS
  .IP \(bu 2
@@ -221,6 +258,13 @@ glow
  .UE
  Render markdown on the CLI
  .RE
+ .SH Image Generation
+ .PP
+ The \-\-backend \[lq]client\[rq] is the only backend that supports image generation, using the \fBdall\-e\-2\fR and \fBdall\-e\-3\fR models through OpenAI\. The result of your prompt will be a URL that points to the OpenAI storage space where your image is placed\.
+ .PP
+ Use \-\-image\[ru]size and \-\-image\[ru]quality to specify the desired size and quality of the generated image\. The valid values are available at the OpenAI website\.
+ .PP
+ https:\[sl]\[sl]platform\.openai\.com\[sl]docs\[sl]guides\[sl]images\[sl]usage?context\[eq]node
  .SH AUTHOR
  .PP
  Dewayne VanHoozer
data/man/aia.1.md CHANGED
@@ -1,4 +1,4 @@
- # aia 1 "v0.5.14" AIA "User Manuals"
+ # aia 1 "v0.5.16" AIA "User Manuals"

  ## NAME

@@ -43,6 +43,12 @@ The aia command-line tool is an interface for interacting with an AI model backe
  `--erb`
  : If dynamic prompt content using $(...) wasn't enough here is ERB. Embedded Ruby. <%= ruby code %> within a prompt will have its ruby code executed and the results of that execution will be inserted into the prompt. I'm sure we will find a way to truly misuse this capability. Remember, some say that the simple prompt is the best prompt.

+ `--iq`, `--image_quality` *VALUE*
+ : (Used with backend 'client' only) See the OpenAI docs for valid values (depends on model) - default: ''
+
+ `--is`, `--image_size` *VALUE*
+ : (Used with backend 'client' only) See the OpenAI docs for valid values (depends on model) - default: ''
+
  `--model` *NAME*
  : Name of the LLM model to use - default is gpt-4-1106-preview

@@ -52,9 +58,18 @@ The aia command-line tool is an interface for interacting with an AI model backe
  `--speak`
  : Simple implementation. Uses the "say" command to speak the response. Fun with --chat

+ `--sm`, `--speech_model` *MODEL NAME*
+ : Which OpenAI LLM to use for text-to-speech (TTS) - default: tts-1
+
+ `--voice` *VOICE NAME*
+ : Which voice to use when speaking text. If it's "siri" and the platform is a Mac, then the CLI utility "say" is used. Any other name will be used with OpenAI - default: alloy
+
  `--terse`
  : Add a clause to the prompt text that instructs the backend to be terse in its response.

+ `--tm`, `--transcription_model` *MODEL NAME*
+ : Which OpenAI LLM to use for audio-to-text - default: whisper-1
+
  `--version`
  : Print Version - default is false

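As a quick illustration of the new speech options working together (a hypothetical invocation: `my_prompt` is a placeholder prompt id, and tts-1-hd and shimmer are OpenAI-documented values rather than aia defaults):

```
aia my_prompt --speak --sm tts-1-hd --voice shimmer
```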
@@ -176,6 +191,21 @@ or just

  if you want to specify them one at a time.

+ You can also use the shortcuts `//next` and `//pipeline`
+
+ ```
+ //next two
+ //next three
+ //next four
+ //next five
+ ```
+
+ Is the same thing as
+
+ ```
+ //pipeline two,three,four
+ //next five
+ ```

  ## SEE ALSO

@@ -193,6 +223,13 @@ if you want to specify them one at a time.

  - [glow](https://github.com/charmbracelet/glow) Render markdown on the CLI

+ ## Image Generation
+
+ The --backend "client" is the only backend that supports image generation, using the `dall-e-2` and `dall-e-3` models through OpenAI. The result of your prompt will be a URL that points to the OpenAI storage space where your image is placed.
+
+ Use --image_size and --image_quality to specify the desired size and quality of the generated image. The valid values are available at the OpenAI website.
+
+ https://platform.openai.com/docs/guides/images/usage?context=node
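For example, a hypothetical image-generation run (`my_image_prompt` is a placeholder prompt id; 1024x1024 and hd are size/quality values documented by OpenAI for dall-e-3):

```
aia my_image_prompt -b client --model dall-e-3 --is 1024x1024 --iq hd
```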

  ## AUTHOR

metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: aia
  version: !ruby/object:Gem::Version
- version: 0.5.15
+ version: 0.5.16
  platform: ruby
  authors:
  - Dewayne VanHoozer
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-03-30 00:00:00.000000000 Z
+ date: 2024-04-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: hashie
@@ -278,6 +278,7 @@ files:
  - lib/aia/prompt.rb
  - lib/aia/tools.rb
  - lib/aia/tools/backend_common.rb
+ - lib/aia/tools/client.rb
  - lib/aia/tools/editor.rb
  - lib/aia/tools/fzf.rb
  - lib/aia/tools/glow.rb