ollama-ruby 0.7.0 → 0.9.0

data/README.md CHANGED
@@ -26,161 +26,23 @@ gem 'ollama-ruby'
 
 to your Gemfile and run `bundle install` in your terminal.
 
- ## Executables
-
- ### ollama\_chat
-
- This is a chat client that can be used to connect to an ollama server and
- enter a chat conversation with an LLM. It can be called with the following
- arguments:
-
- ```
- Usage: ollama_chat [OPTIONS]
-
-   -f CONFIG      config file to read
-   -u URL         the ollama base url, OLLAMA_URL
-   -m MODEL       the ollama model to chat with, OLLAMA_CHAT_MODEL
-   -s SYSTEM      the system prompt to use as a file, OLLAMA_CHAT_SYSTEM
-   -c CHAT        a saved chat conversation to load
-   -C COLLECTION  name of the collection used in this conversation
-   -D DOCUMENT    load document and add to embeddings collection (multiple)
-   -M             use (empty) MemoryCache for this chat session
-   -E             disable embeddings for this chat session
-   -V             display the current version number and quit
-   -h             this help
- ```
-
- The base URL can either be set via the environment variable `OLLAMA_URL` or
- derived from the environment variable `OLLAMA_HOST`. The default model to
- connect to can be configured via the environment variable `OLLAMA_MODEL`.
-
- The YAML config file in `$XDG_CONFIG_HOME/ollama_chat/config.yml`, which you
- can use for more complex settings, looks like this:
-
- ```
- ---
- url: <%= ENV['OLLAMA_URL'] || 'http://%s' % ENV.fetch('OLLAMA_HOST') %>
- model:
-   name: <%= ENV.fetch('OLLAMA_CHAT_MODEL', 'llama3.1') %>
-   options:
-     num_ctx: 8192
- system: <%= ENV.fetch('OLLAMA_CHAT_SYSTEM', 'null') %>
- voice: Samantha
- markdown: true
- embedding:
-   enabled: true
-   model:
-     name: mxbai-embed-large
-     options: {}
-   collection: <%= ENV.fetch('OLLAMA_CHAT_COLLECTION', 'ollama_chat') %>
-   found_texts_size: 4096
-   splitter:
-     name: RecursiveCharacter
-     chunk_size: 1024
- cache: Ollama::Documents::RedisCache
- redis:
-   url: <%= ENV.fetch('REDIS_URL', 'null') %>
- debug: <%= ENV['OLLAMA_CHAT_DEBUG'].to_i == 1 ? true : false %>
- ```
-
- If you want to store embeddings persistently, set an environment variable
- `REDIS_URL` or update the `redis.url` setting in your `config.yml` file to
- connect to a Redis server. Without this setup, embeddings will only be stored
- in process memory, which is less durable.
-
- Some settings can be passed as arguments as well, e.g. if you want to choose
- a specific system prompt:
-
- ```
- $ ollama_chat -s sherlock.txt
- Model with architecture llama found.
- Connecting to llama3.1@http://ollama.local.net:11434 now…
- Configured system prompt is:
- You are Sherlock Holmes and the user is your new client, Dr. Watson is also in
- the room. You will talk and act in the typical manner of Sherlock Holmes do and
- try to solve the user's case using logic and deduction.
-
- Type /help to display the chat help.
- 📨 user:
- Good morning.
- 📨 assistant:
- Ah, good morning, my dear fellow! It is a pleasure to make your acquaintance. I
- am Sherlock Holmes, the renowned detective, and this is my trusty sidekick, Dr.
- Watson. Please, have a seat and tell us about the nature of your visit. What
- seems to be the problem that has brought you to our humble abode at 221B Baker
- Street?
-
- (Watson nods in encouragement as he takes notes)
-
- Now, pray tell, what is it that puzzles you, my dear client? A missing item,
- perhaps? Or a mysterious occurrence that requires clarification? The game, as
- they say, is afoot!
- ```
-
- This example shows how an image like this can be sent to a vision model for
- analysis:
-
- ![cat](spec/assets/kitten.jpg)
-
- ```
- $ ollama_chat -m llava-llama3
- Model with architecture llama found.
- Connecting to llava-llama3@http://localhost:11434 now…
- Type /help to display the chat help.
- 📸 user> What's on this image? ./spec/assets/kitten.jpg
- 📨 assistant:
- The image captures a moment of tranquility featuring a young cat. The cat,
- adorned with gray and white fur marked by black stripes on its face and legs,
- is the central figure in this scene. Its eyes, a striking shade of blue, are
- wide open and directed towards the camera, giving an impression of curiosity or
- alertness.
-
- The cat is comfortably nestled on a red blanket, which contrasts vividly with
- its fur. The blanket, soft and inviting, provides a sense of warmth to the
- image. In the background, partially obscured by the cat's head, is another
- blanket of similar red hue. The repetition of the color adds a sense of harmony
- to the composition.
-
- The cat's position on the right side of the photo creates an interesting
- asymmetry with the camera lens, which occupies the left side of the frame. This
- visual balance enhances the overall composition of the image.
+ ## Usage
 
- There are no discernible texts or other objects in the image. The focus is
- solely on the cat and its immediate surroundings. The image does not provide
- any information about the location or setting beyond what has been described.
- The simplicity of the scene allows the viewer to concentrate on the main
- subject - the young, blue-eyed cat.
- ```
+ In your own software the library can be used as shown in this example:
 
- The following commands can be given inside the chat if prefixed by a `/`:
+ ```ruby
+ require "ollama"
+ include Ollama
 
- ```
- /copy                           to copy last response to clipboard
- /paste                          to paste content
- /markdown                       toggle markdown output
- /stream                         toggle stream output
- /location                       toggle location submission
- /voice( change)                 toggle voice output or change the voice
- /list [n]                       list the last n / all conversation exchanges
- /clear                          clear the whole conversation
- /clobber                        clear the conversation and collection
- /pop [n]                        pop the last n exchanges, defaults to 1
- /model                          change the model
- /system                         change system prompt (clears conversation)
- /regenerate                     the last answer message
- /collection( clear|change)      change (default) collection or clear
- /info                           show information for current session
- /import source                  import the source's content
- /summarize [n] source           summarize the source's content in n words
- /embedding                      toggle embedding paused or not
- /embed source                   embed the source's content
- /web [n] query                  query web search & return n or 1 results
- /save filename                  store conversation messages
- /load filename                  load conversation messages
- /quit                           to quit
- /help                           to view this help
+ ollama = Client.new(base_url: 'http://localhost:11434')
+ messages = Message.new(role: 'user', content: 'Why is the sky blue?')
+ ollama.chat(model: 'llama3.1', stream: true, messages:, &Print) # or
+ print ollama.chat(model: 'llama3.1', stream: true, messages:).lazy.map { |response|
+   response.message.content
+ }
 ```
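
The streaming call above hands each partial response to the `Print` handler as it arrives. As a minimal non-streaming sketch (an assumption: that the default handler returns the final response object; check the API section of this README for the handlers the gem actually ships):

```ruby
# Non-streaming sketch — assumes the default handler returns the final
# response object, whose message carries the complete content.
require "ollama"
include Ollama

ollama   = Client.new(base_url: 'http://localhost:11434')
messages = Message.new(role: 'user', content: 'Why is the sky blue?')
response = ollama.chat(model: 'llama3.1', messages:, stream: false)
puts response.message.content
```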
 
- ### ollama\_console
+ ## Try out things in ollama\_console
 
 This is an interactive console that can be used to try the different commands
 provided by an `Ollama::Client` instance. For example, this command generates a
@@ -197,21 +59,6 @@ Commands: chat,copy,create,delete,embeddings,generate,help,ps,pull,push,show,tag
 > In a small village nestled between two great palm trees 🌳, there lived a
 > brave adventurer named Alex 👦. […]
 
- ## Usage
-
- In your own software the library can be used as shown in this example:
-
- ```ruby
- require "ollama"
- include Ollama
-
- ollama = Client.new(base_url: 'http://localhost:11434')
- messages = Message.new(role: 'user', content: 'Why is the sky blue?')
- ollama.chat(model: 'llama3.1', stream: true, messages:, &Print) # or
- print ollama.chat(model: 'llama3.1', stream: true, messages:).lazy.map { |response|
-   response.message.content
- }
- ```
 
 ## API
 
@@ -463,11 +310,166 @@ If `Ollama::Errors::TimeoutError` is raised, it might help to increase the
 
 For more generic errors an `Ollama::Errors::Error` is raised.
 
+ ## Other executables
+
+ ### ollama\_chat
+
+ This is a chat client that can be used to connect to an ollama server and
+ enter a chat conversation with an LLM. It can be called with the following
+ arguments:
+
+ ```
+ Usage: ollama_chat [OPTIONS]
+
+   -f CONFIG      config file to read
+   -u URL         the ollama base url, OLLAMA_URL
+   -m MODEL       the ollama model to chat with, OLLAMA_CHAT_MODEL
+   -s SYSTEM      the system prompt to use as a file, OLLAMA_CHAT_SYSTEM
+   -c CHAT        a saved chat conversation to load
+   -C COLLECTION  name of the collection used in this conversation
+   -D DOCUMENT    load document and add to embeddings collection (multiple)
+   -M             use (empty) MemoryCache for this chat session
+   -E             disable embeddings for this chat session
+   -V             display the current version number and quit
+   -h             this help
+ ```
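
For example, to point the client at a specific server and model with a named embeddings collection (hypothetical host, model, and collection values), one might invoke it like this:

```
$ ollama_chat -u http://localhost:11434 -m llama3.1 -C my_collection
```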
+
+ The base URL can either be set via the environment variable `OLLAMA_URL` or
+ derived from the environment variable `OLLAMA_HOST`. The default model to
+ connect to can be configured via the environment variable `OLLAMA_MODEL`.
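
The same fallback logic can be written directly in Ruby, mirroring the ERB `url:` line of the config file shown below (a sketch, not part of the executable itself):

```ruby
# Prefer OLLAMA_URL; otherwise build the base URL from OLLAMA_HOST.
require "ollama"
include Ollama

base_url = ENV['OLLAMA_URL'] || 'http://%s' % ENV.fetch('OLLAMA_HOST')
ollama   = Client.new(base_url:)
```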
+
+ The YAML config file in `$XDG_CONFIG_HOME/ollama_chat/config.yml`, which you
+ can use for more complex settings, looks like this:
+
+ ```
+ ---
+ url: <%= ENV['OLLAMA_URL'] || 'http://%s' % ENV.fetch('OLLAMA_HOST') %>
+ model:
+   name: <%= ENV.fetch('OLLAMA_CHAT_MODEL', 'llama3.1') %>
+   options:
+     num_ctx: 8192
+ system: <%= ENV.fetch('OLLAMA_CHAT_SYSTEM', 'null') %>
+ voice: Samantha
+ markdown: true
+ embedding:
+   enabled: true
+   model:
+     name: mxbai-embed-large
+     options: {}
+   collection: <%= ENV.fetch('OLLAMA_CHAT_COLLECTION', 'ollama_chat') %>
+   found_texts_size: 4096
+   splitter:
+     name: RecursiveCharacter
+     chunk_size: 1024
+ cache: Ollama::Documents::RedisCache
+ redis:
+   url: <%= ENV.fetch('REDIS_URL', 'null') %>
+ debug: <%= ENV['OLLAMA_CHAT_DEBUG'].to_i == 1 ? true : false %>
+ ```
+
+ If you want to store embeddings persistently, set an environment variable
+ `REDIS_URL` or update the `redis.url` setting in your `config.yml` file to
+ connect to a Redis server. Without this setup, embeddings will only be stored
+ in process memory, which is less durable.
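
For example, a hypothetical local Redis instance would be configured via the `redis.url` key from the config above:

```
redis:
  url: redis://localhost:6379/0
```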
+
+ Some settings can be passed as arguments as well, e.g. if you want to choose
+ a specific system prompt:
+
+ ```
+ $ ollama_chat -s sherlock.txt
+ Model with architecture llama found.
+ Connecting to llama3.1@http://ollama.local.net:11434 now…
+ Configured system prompt is:
+ You are Sherlock Holmes and the user is your new client, Dr. Watson is also in
+ the room. You will talk and act in the typical manner of Sherlock Holmes do and
+ try to solve the user's case using logic and deduction.
+
+ Type /help to display the chat help.
+ 📨 user:
+ Good morning.
+ 📨 assistant:
+ Ah, good morning, my dear fellow! It is a pleasure to make your acquaintance. I
+ am Sherlock Holmes, the renowned detective, and this is my trusty sidekick, Dr.
+ Watson. Please, have a seat and tell us about the nature of your visit. What
+ seems to be the problem that has brought you to our humble abode at 221B Baker
+ Street?
+
+ (Watson nods in encouragement as he takes notes)
+
+ Now, pray tell, what is it that puzzles you, my dear client? A missing item,
+ perhaps? Or a mysterious occurrence that requires clarification? The game, as
+ they say, is afoot!
+ ```
+
+ This example shows how an image like this can be sent to a vision model for
+ analysis:
+
+ ![cat](spec/assets/kitten.jpg)
+
+ ```
+ $ ollama_chat -m llava-llama3
+ Model with architecture llama found.
+ Connecting to llava-llama3@http://localhost:11434 now…
+ Type /help to display the chat help.
+ 📸 user> What's on this image? ./spec/assets/kitten.jpg
+ 📨 assistant:
+ The image captures a moment of tranquility featuring a young cat. The cat,
+ adorned with gray and white fur marked by black stripes on its face and legs,
+ is the central figure in this scene. Its eyes, a striking shade of blue, are
+ wide open and directed towards the camera, giving an impression of curiosity or
+ alertness.
+
+ The cat is comfortably nestled on a red blanket, which contrasts vividly with
+ its fur. The blanket, soft and inviting, provides a sense of warmth to the
+ image. In the background, partially obscured by the cat's head, is another
+ blanket of similar red hue. The repetition of the color adds a sense of harmony
+ to the composition.
+
+ The cat's position on the right side of the photo creates an interesting
+ asymmetry with the camera lens, which occupies the left side of the frame. This
+ visual balance enhances the overall composition of the image.
+
+ There are no discernible texts or other objects in the image. The focus is
+ solely on the cat and its immediate surroundings. The image does not provide
+ any information about the location or setting beyond what has been described.
+ The simplicity of the scene allows the viewer to concentrate on the main
+ subject - the young, blue-eyed cat.
+ ```
+
+ The following commands can be given inside the chat if prefixed by a `/`:
+
+ ```
+ /copy                           to copy last response to clipboard
+ /paste                          to paste content
+ /markdown                       toggle markdown output
+ /stream                         toggle stream output
+ /location                       toggle location submission
+ /voice( change)                 toggle voice output or change the voice
+ /list [n]                       list the last n / all conversation exchanges
+ /clear                          clear the whole conversation
+ /clobber                        clear the conversation and collection
+ /pop [n]                        pop the last n exchanges, defaults to 1
+ /model                          change the model
+ /system                         change system prompt (clears conversation)
+ /regenerate                     the last answer message
+ /collection( clear|change)      change (default) collection or clear
+ /info                           show information for current session
+ /document_policy                pick a scan policy for document references
+ /import source                  import the source's content
+ /summarize [n] source           summarize the source's content in n words
+ /embedding                      toggle embedding paused or not
+ /embed source                   embed the source's content
+ /web [n] query                  query web search & return n or 1 results
+ /save filename                  store conversation messages
+ /load filename                  load conversation messages
+ /quit                           to quit
+ /help                           to view this help
+ ```
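
For example, a hypothetical exchange using one of these commands could look like this:

```
📨 user:
/summarize 100 https://example.com/article.html
```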
+
 
 ## Download
 
 The homepage of this library is located at
 
- * https://github.com/flori/ollama
+ * https://github.com/flori/ollama-ruby
 
 ## Author
 
data/bin/ollama_chat CHANGED
@@ -49,9 +49,10 @@ class OllamaChatConfig
   voice:
     enabled: false
     default: Samantha
-    list: <%= `say -v ?`.lines.map { _1[/^(.+?)\s+[a-z]{2}_[a-zA-Z0-9]{2,}/, 1] }.uniq.sort.to_s.force_encoding('ASCII-8BIT') %>
+    list: <%= `say -v ? 2>/dev/null`.lines.map { _1[/^(.+?)\s+[a-z]{2}_[a-zA-Z0-9]{2,}/, 1] }.uniq.sort.to_s.force_encoding('ASCII-8BIT') %>
   markdown: true
   stream: true
+  document_policy: importing
   embedding:
     enabled: true
     model:
@@ -59,6 +60,7 @@ class OllamaChatConfig
       options: {}
     # Retrieval prompt template:
     prompt: 'Represent this sentence for searching relevant passages: %s'
+    batch_size: 10
     collection: <%= ENV['OLLAMA_CHAT_COLLECTION'] %>
     found_texts_size: 4096
     found_texts_count: null
@@ -478,54 +480,6 @@ def parse_source(source_io)
   end
 end
 
-def embed_source(source_io, source)
-  $embedding.on? or return parse_source(source_io)
-  puts "Embedding #{italic { source_io&.content_type }} document #{source.to_s.inspect}."
-  text = parse_source(source_io) or return
-  text.downcase!
-  splitter_config = $config.embedding.splitter
-  inputs = nil
-  case splitter_config.name
-  when 'Character'
-    splitter = Ollama::Documents::Splitters::Character.new(
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(text)
-  when 'RecursiveCharacter'
-    splitter = Ollama::Documents::Splitters::RecursiveCharacter.new(
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(text)
-  when 'Semantic'
-    splitter = Ollama::Documents::Splitters::Semantic.new(
-      ollama:, model: $config.embedding.model.name,
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(
-      text,
-      breakpoint: splitter_config.breakpoint.to_sym,
-      percentage: splitter_config.percentage?,
-      percentile: splitter_config.percentile?,
-    )
-    inputs = splitter.split(text)
-  end
-  inputs or return
-  source = source.to_s
-  if source.start_with?(?!)
-    source = Ollama::Utils::Width.truncate(
-      source[1..-1].gsub(/\W+/, ?_),
-      length: 10
-    )
-  end
-  $documents.add(inputs, source:)
-end
-
-def add_image(images, source_io, source)
-  STDERR.puts "Adding #{source_io&.content_type} image #{source.to_s.inspect}."
-  image = Image.for_io(source_io, path: source.to_s)
-  (images << image).uniq!
-end
-
 def http_options(url)
   options = {}
   if ssl_no_verify = $config.ssl_no_verify?
@@ -554,7 +508,7 @@ def fetch_source(source, &block)
     ) do |tmp|
       block.(tmp)
     end
-  when %r(\Afile://(?:(?:[.-]|[[:alnum:]])*)(/\S*)|([~.]?/\S*))
+  when %r(\Afile://(/\S*)|\A((?:\.\.|[~.]?)/\S*))
     filename = $~.captures.compact.first
     filename = File.expand_path(filename)
     Utils::Fetcher.read(filename) do |tmp|
@@ -567,30 +521,90 @@ rescue => e
   STDERR.puts "Cannot fetch source #{source.to_s.inspect}: #{e}\n#{e.backtrace * ?\n}"
 end
 
+def add_image(images, source_io, source)
+  STDERR.puts "Adding #{source_io&.content_type} image #{source.to_s.inspect}."
+  image = Image.for_io(source_io, path: source.to_s)
+  (images << image).uniq!
+end
+
+def import_source(source_io, source)
+  source = source.to_s
+  puts "Importing #{italic { source_io&.content_type }} document #{source.inspect} now."
+  "Imported #{source.inspect}:\n%s\n\n" % parse_source(source_io)
+end
+
 def import(source)
-  puts "Now importing #{source.to_s.inspect}."
   fetch_source(source) do |source_io|
-    content = parse_source(source_io)
-    content.present? or return
+    content = import_source(source_io, source) or return
     source_io.rewind
     content
   end
 end
 
-def summarize(source, words: nil)
+def summarize_source(source_io, source, words: nil)
+  puts "Summarizing #{italic { source_io&.content_type }} document #{source.inspect} now."
   words = words.to_i
   words < 1 and words = 100
-  puts "Now summarizing #{source.to_s.inspect}."
-  source_content =
-    fetch_source(source) do |source_io|
-      content = parse_source(source_io)
-      content.present? or return
-      source_io.rewind
-      content
-    end
+  source_content = parse_source(source_io)
+  source_content.present? or return
   $config.prompts.summarize % { source_content:, words: }
 end
 
+def summarize(source, words: nil)
+  fetch_source(source) do |source_io|
+    content = summarize_source(source_io, source, words:) or return
+    source_io.rewind
+    content
+  end
+end
+
+def embed_source(source_io, source, count: nil)
+  $embedding.on? or return parse_source(source_io)
+  m = "Embedding #{italic { source_io&.content_type }} document #{source.to_s.inspect}."
+  if count
+    puts '%u. %s' % [ count, m ]
+  else
+    puts m
+  end
+  text = parse_source(source_io) or return
+  text.downcase!
+  splitter_config = $config.embedding.splitter
+  inputs = nil
+  case splitter_config.name
+  when 'Character'
+    splitter = Ollama::Documents::Splitters::Character.new(
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(text)
+  when 'RecursiveCharacter'
+    splitter = Ollama::Documents::Splitters::RecursiveCharacter.new(
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(text)
+  when 'Semantic'
+    splitter = Ollama::Documents::Splitters::Semantic.new(
+      ollama:, model: $config.embedding.model.name,
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(
+      text,
+      breakpoint: splitter_config.breakpoint.to_sym,
+      percentage: splitter_config.percentage?,
+      percentile: splitter_config.percentile?,
+    )
+    inputs = splitter.split(text)
+  end
+  inputs or return
+  source = source.to_s
+  if source.start_with?(?!)
+    source = Ollama::Utils::Width.truncate(
+      source[1..-1].gsub(/\W+/, ?_),
+      length: 10
+    )
+  end
+  $documents.add(inputs, source:, batch_size: $config.embedding.batch_size?)
+end
+
 def embed(source)
   if $embedding.on?
     puts "Now embedding #{source.to_s.inspect}."
@@ -612,7 +626,8 @@ def parse_content(content, images)
   images.clear
   tags = Utils::Tags.new
 
-  content.scan(%r([.~]?/\S+|https?://\S+|#\S+)).each do |source|
+  contents = [ content ]
+  content.scan(%r((?:\.\.|[.~])?/\S+|https?://\S+|#\S+)).each do |source|
     case source
     when /\A#(\S+)/
       tags.add($1, source:)
@@ -622,8 +637,15 @@ def parse_content(content, images)
       case source_io&.content_type&.media_type
       when 'image'
         add_image(images, source_io, source)
-      when 'text', 'application'
-        embed_source(source_io, source)
+      when 'text', 'application', nil
+        case $document_policy
+        when 'importing'
+          contents << import_source(source_io, source)
+        when 'embedding'
+          embed_source(source_io, source)
+        when 'summarizing'
+          contents << summarize_source(source_io, source)
+        end
       else
         STDERR.puts(
           "Cannot fetch #{source.to_s.inspect} with content type "\
@@ -633,8 +655,8 @@ def parse_content(content, images)
       end
     end
   end
-
-  return content, (tags unless tags.empty?)
+  new_content = contents.select(&:present?).compact * "\n\n"
+  return new_content, (tags unless tags.empty?)
 end
 
 def choose_model(cli_model, current_model)
@@ -668,16 +690,44 @@ def choose_collection(current_collection)
   end
 ensure
   puts "Using collection #{bold{$documents.collection}}."
-  collection_stats
+  info
+end
+
+def choose_document_policy
+  policies = %w[ importing embedding summarizing ].sort
+  current = if policies.index($document_policy)
+    $document_policy
+  elsif policies.index($config.document_policy)
+    $config.document_policy
+  else
+    policies.first
+  end
+  policies.unshift('[EXIT]')
+  policy = Ollama::Utils::Chooser.choose(policies)
+  case policy
+  when nil, '[EXIT]'
+    puts "Exiting chooser."
+    policy = current
+  end
+  $document_policy = policy
+ensure
+  puts "Using document policy #{bold{$document_policy}}."
+  info
 end
 
 def collection_stats
+  list = $documents.collections.sort.map { |c|
+    ' ' + ($documents.collection == c ? bold { c } : c).to_s
+  }.join(?\n)
   puts <<~EOT
-    Collection
+    Current Collection
       Name: #{bold{$documents.collection}}
      Embedding model: #{bold{$embedding_model}}
       #Embeddings: #{$documents.size}
+      #Tags: #{$documents.tags.size}
       Tags: #{$documents.tags}
+    List:
+    #{list}
   EOT
 end
 
@@ -744,6 +794,7 @@ def info
   $markdown.show
   $stream.show
   $location.show
+  puts "Document policy for references in user text: #{bold{$document_policy}}"
   if $voice.on?
     puts "Using voice #{bold{$current_voice}} to speak."
   end
@@ -787,6 +838,7 @@ def display_chat_help
     /regenerate                     the last answer message
     /collection( clear|change)      change (default) collection or clear
     /info                           show information for current session
+    /document_policy                pick a scan policy for document references
     /import source                  import the source's content
     /summarize [n] source           summarize the source's content in n words
     /embedding                      toggle embedding paused or not
@@ -841,10 +893,11 @@ $opts[?V] and version
 base_url = $opts[?u] || $config.url
 $ollama = Client.new(base_url:, debug: $config.debug)
 
-$model        = choose_model($opts[?m], $config.model.name)
-options       = Options[$config.model.options]
-model_system  = pull_model_unless_present($model, options)
-messages      = []
+$document_policy = $config.document_policy
+$model           = choose_model($opts[?m], $config.model.name)
+options          = Options[$config.model.options]
+model_system     = pull_model_unless_present($model, options)
+messages         = []
 $embedding_enabled.set($config.embedding.enabled && !$opts[?E])
 
 if $opts[?c]
@@ -889,11 +942,13 @@ if $embedding.on?
     end
   end
   puts "Collection #{bold{collection}}: Adding #{document_list.size} documents…"
+  count = 1
   document_list.each_slice(25) do |docs|
     docs.each do |doc|
       fetch_source(doc) do |doc_io|
-        embed_source(doc_io, doc)
+        embed_source(doc_io, doc, count:)
       end
+      count += 1
     end
   end
 end
@@ -955,7 +1010,7 @@ loop do
     puts "Cleared messages."
     next
   when %r(^/clobber$)
-    if ask?(prompt: 'Are you sure? (y/n) ') =~ /\Ay/i
+    if ask?(prompt: 'Are you sure to clear messages and collection? (y/n) ') =~ /\Ay/i
       clear_messages(messages)
       $documents.clear
       puts "Cleared messages and collection #{bold{$documents.collection}}."
@@ -1020,9 +1075,12 @@ loop do
       choose_collection($documents.collection)
     end
     next
-  when %r(/info)
+  when %r(^/info$)
     info
     next
+  when %r(^/document_policy$)
+    choose_document_policy
+    next
   when %r(^/import\s+(.+))
     parse_content = false
     content = import($1) or next