ollama-ruby 0.7.0 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/README.md CHANGED
@@ -26,161 +26,23 @@ gem 'ollama-ruby'
 
 to your Gemfile and run `bundle install` in your terminal.
 
- ## Executables
-
- ### ollama\_chat
-
- This is a chat client that can be used to connect to an ollama server and enter
- a chat conversation with an LLM. It can be called with the following arguments:
-
- ```
- Usage: ollama_chat [OPTIONS]
-
-   -f CONFIG      config file to read
-   -u URL         the ollama base url, OLLAMA_URL
-   -m MODEL       the ollama model to chat with, OLLAMA_CHAT_MODEL
-   -s SYSTEM      the system prompt to use as a file, OLLAMA_CHAT_SYSTEM
-   -c CHAT        a saved chat conversation to load
-   -C COLLECTION  name of the collection used in this conversation
-   -D DOCUMENT    load document and add to embeddings collection (multiple)
-   -M             use (empty) MemoryCache for this chat session
-   -E             disable embeddings for this chat session
-   -V             display the current version number and quit
-   -h             this help
- ```
-
- The base URL can either be set via the environment variable `OLLAMA_URL` or
- derived from the environment variable `OLLAMA_HOST`. The default model to
- connect to can be configured via the environment variable `OLLAMA_MODEL`.
-
- The YAML config file in `$XDG_CONFIG_HOME/ollama_chat/config.yml`, which you
- can use for more complex settings, looks like this:
-
- ```
- ---
- url: <%= ENV['OLLAMA_URL'] || 'http://%s' % ENV.fetch('OLLAMA_HOST') %>
- model:
-   name: <%= ENV.fetch('OLLAMA_CHAT_MODEL', 'llama3.1') %>
-   options:
-     num_ctx: 8192
- system: <%= ENV.fetch('OLLAMA_CHAT_SYSTEM', 'null') %>
- voice: Samantha
- markdown: true
- embedding:
-   enabled: true
-   model:
-     name: mxbai-embed-large
-     options: {}
-   collection: <%= ENV.fetch('OLLAMA_CHAT_COLLECTION', 'ollama_chat') %>
-   found_texts_size: 4096
-   splitter:
-     name: RecursiveCharacter
-     chunk_size: 1024
- cache: Ollama::Documents::RedisCache
- redis:
-   url: <%= ENV.fetch('REDIS_URL', 'null') %>
- debug: <%= ENV['OLLAMA_CHAT_DEBUG'].to_i == 1 ? true : false %>
- ```
-
- If you want to store embeddings persistently, set an environment variable
- `REDIS_URL` or update the `redis.url` setting in your `config.yml` file to
- connect to a Redis server. Without this setup, embeddings will only be stored
- in process memory, which is less durable.
-
- Some settings can also be passed as arguments, e.g. if you want to choose a
- specific system prompt:
-
- ```
- $ ollama_chat -s sherlock.txt
- Model with architecture llama found.
- Connecting to llama3.1@http://ollama.local.net:11434 now…
- Configured system prompt is:
- You are Sherlock Holmes and the user is your new client, Dr. Watson is also in
- the room. You will talk and act in the typical manner of Sherlock Holmes do and
- try to solve the user's case using logic and deduction.
-
- Type /help to display the chat help.
- 📨 user:
- Good morning.
- 📨 assistant:
- Ah, good morning, my dear fellow! It is a pleasure to make your acquaintance. I
- am Sherlock Holmes, the renowned detective, and this is my trusty sidekick, Dr.
- Watson. Please, have a seat and tell us about the nature of your visit. What
- seems to be the problem that has brought you to our humble abode at 221B Baker
- Street?
-
- (Watson nods in encouragement as he takes notes)
-
- Now, pray tell, what is it that puzzles you, my dear client? A missing item,
- perhaps? Or a mysterious occurrence that requires clarification? The game, as
- they say, is afoot!
- ```
-
- This example shows how an image like this can be sent to a vision model for
- analysis:
-
- ![cat](spec/assets/kitten.jpg)
-
- ```
- $ ollama_chat -m llava-llama3
- Model with architecture llama found.
- Connecting to llava-llama3@http://localhost:11434 now…
- Type /help to display the chat help.
- 📸 user> What's on this image? ./spec/assets/kitten.jpg
- 📨 assistant:
- The image captures a moment of tranquility featuring a young cat. The cat,
- adorned with gray and white fur marked by black stripes on its face and legs,
- is the central figure in this scene. Its eyes, a striking shade of blue, are
- wide open and directed towards the camera, giving an impression of curiosity or
- alertness.
-
- The cat is comfortably nestled on a red blanket, which contrasts vividly with
- its fur. The blanket, soft and inviting, provides a sense of warmth to the
- image. In the background, partially obscured by the cat's head, is another
- blanket of similar red hue. The repetition of the color adds a sense of harmony
- to the composition.
-
- The cat's position on the right side of the photo creates an interesting
- asymmetry with the camera lens, which occupies the left side of the frame. This
- visual balance enhances the overall composition of the image.
+ ## Usage
 
- There are no discernible texts or other objects in the image. The focus is
- solely on the cat and its immediate surroundings. The image does not provide
- any information about the location or setting beyond what has been described.
- The simplicity of the scene allows the viewer to concentrate on the main
- subject - the young, blue-eyed cat.
- ```
+ In your own software the library can be used as shown in this example:
 
- The following commands can be given inside the chat, if prefixed by a `/`:
+ ```ruby
+ require "ollama"
+ include Ollama
 
- ```
- /copy                            to copy last response to clipboard
- /paste                           to paste content
- /markdown                        toggle markdown output
- /stream                          toggle stream output
- /location                        toggle location submission
- /voice( change)                  toggle voice output or change the voice
- /list [n]                        list the last n / all conversation exchanges
- /clear                           clear the whole conversation
- /clobber                         clear the conversation and collection
- /pop [n]                         pop the last n exchanges, defaults to 1
- /model                           change the model
- /system                          change system prompt (clears conversation)
- /regenerate                      the last answer message
- /collection( clear|change)       change (default) collection or clear
- /info                            show information for current session
- /import source                   import the source's content
- /summarize [n] source            summarize the source's content in n words
- /embedding                       toggle embedding paused or not
- /embed source                    embed the source's content
- /web [n] query                   query web search & return n or 1 results
- /save filename                   store conversation messages
- /load filename                   load conversation messages
- /quit                            to quit
- /help                            to view this help
+ ollama = Client.new(base_url: 'http://localhost:11434')
+ messages = Message.new(role: 'user', content: 'Why is the sky blue?')
+ ollama.chat(model: 'llama3.1', stream: true, messages:, &Print) # or
+ print ollama.chat(model: 'llama3.1', stream: true, messages:).lazy.map { |response|
+   response.message.content
+ }
 ```
 
- ### ollama\_console
+ ## Try out things in ollama\_console
 
 This is an interactive console that can be used to try the different commands
 provided by an `Ollama::Client` instance. For example, this command generates a
@@ -197,21 +59,6 @@ Commands: chat,copy,create,delete,embeddings,generate,help,ps,pull,push,show,tag
 > In a small village nestled between two great palm trees 🌳, there lived a
 > brave adventurer named Alex 👦. […]
 
- ## Usage
-
- In your own software the library can be used as shown in this example:
-
- ```ruby
- require "ollama"
- include Ollama
-
- ollama = Client.new(base_url: 'http://localhost:11434')
- messages = Message.new(role: 'user', content: 'Why is the sky blue?')
- ollama.chat(model: 'llama3.1', stream: true, messages:, &Print) # or
- print ollama.chat(model: 'llama3.1', stream: true, messages:).lazy.map { |response|
-   response.message.content
- }
- ```
 
 ## API
 
@@ -463,11 +310,166 @@ If `Ollama::Errors::TimeoutError` is raised, it might help to increase the
 
 For more generic errors an `Ollama::Errors::Error` is raised.
 
+ ## Other executables
+
+ ### ollama\_chat
+
+ This is a chat client that can be used to connect to an ollama server and enter
+ a chat conversation with an LLM. It can be called with the following arguments:
+
+ ```
+ Usage: ollama_chat [OPTIONS]
+
+   -f CONFIG      config file to read
+   -u URL         the ollama base url, OLLAMA_URL
+   -m MODEL       the ollama model to chat with, OLLAMA_CHAT_MODEL
+   -s SYSTEM      the system prompt to use as a file, OLLAMA_CHAT_SYSTEM
+   -c CHAT        a saved chat conversation to load
+   -C COLLECTION  name of the collection used in this conversation
+   -D DOCUMENT    load document and add to embeddings collection (multiple)
+   -M             use (empty) MemoryCache for this chat session
+   -E             disable embeddings for this chat session
+   -V             display the current version number and quit
+   -h             this help
+ ```
+
+ The base URL can either be set via the environment variable `OLLAMA_URL` or
+ derived from the environment variable `OLLAMA_HOST`. The default model to
+ connect to can be configured via the environment variable `OLLAMA_MODEL`.
+
+ The YAML config file in `$XDG_CONFIG_HOME/ollama_chat/config.yml`, which you
+ can use for more complex settings, looks like this:
+
+ ```
+ ---
+ url: <%= ENV['OLLAMA_URL'] || 'http://%s' % ENV.fetch('OLLAMA_HOST') %>
+ model:
+   name: <%= ENV.fetch('OLLAMA_CHAT_MODEL', 'llama3.1') %>
+   options:
+     num_ctx: 8192
+ system: <%= ENV.fetch('OLLAMA_CHAT_SYSTEM', 'null') %>
+ voice: Samantha
+ markdown: true
+ embedding:
+   enabled: true
+   model:
+     name: mxbai-embed-large
+     options: {}
+   collection: <%= ENV.fetch('OLLAMA_CHAT_COLLECTION', 'ollama_chat') %>
+   found_texts_size: 4096
+   splitter:
+     name: RecursiveCharacter
+     chunk_size: 1024
+ cache: Ollama::Documents::RedisCache
+ redis:
+   url: <%= ENV.fetch('REDIS_URL', 'null') %>
+ debug: <%= ENV['OLLAMA_CHAT_DEBUG'].to_i == 1 ? true : false %>
+ ```
+
+ If you want to store embeddings persistently, set an environment variable
+ `REDIS_URL` or update the `redis.url` setting in your `config.yml` file to
+ connect to a Redis server. Without this setup, embeddings will only be stored
+ in process memory, which is less durable.
+
+ Some settings can also be passed as arguments, e.g. if you want to choose a
+ specific system prompt:
+
+ ```
+ $ ollama_chat -s sherlock.txt
+ Model with architecture llama found.
+ Connecting to llama3.1@http://ollama.local.net:11434 now…
+ Configured system prompt is:
+ You are Sherlock Holmes and the user is your new client, Dr. Watson is also in
+ the room. You will talk and act in the typical manner of Sherlock Holmes do and
+ try to solve the user's case using logic and deduction.
+
+ Type /help to display the chat help.
+ 📨 user:
+ Good morning.
+ 📨 assistant:
+ Ah, good morning, my dear fellow! It is a pleasure to make your acquaintance. I
+ am Sherlock Holmes, the renowned detective, and this is my trusty sidekick, Dr.
+ Watson. Please, have a seat and tell us about the nature of your visit. What
+ seems to be the problem that has brought you to our humble abode at 221B Baker
+ Street?
+
+ (Watson nods in encouragement as he takes notes)
+
+ Now, pray tell, what is it that puzzles you, my dear client? A missing item,
+ perhaps? Or a mysterious occurrence that requires clarification? The game, as
+ they say, is afoot!
+ ```
+
+ This example shows how an image like this can be sent to a vision model for
+ analysis:
+
+ ![cat](spec/assets/kitten.jpg)
+
+ ```
+ $ ollama_chat -m llava-llama3
+ Model with architecture llama found.
+ Connecting to llava-llama3@http://localhost:11434 now…
+ Type /help to display the chat help.
+ 📸 user> What's on this image? ./spec/assets/kitten.jpg
+ 📨 assistant:
+ The image captures a moment of tranquility featuring a young cat. The cat,
+ adorned with gray and white fur marked by black stripes on its face and legs,
+ is the central figure in this scene. Its eyes, a striking shade of blue, are
+ wide open and directed towards the camera, giving an impression of curiosity or
+ alertness.
+
+ The cat is comfortably nestled on a red blanket, which contrasts vividly with
+ its fur. The blanket, soft and inviting, provides a sense of warmth to the
+ image. In the background, partially obscured by the cat's head, is another
+ blanket of similar red hue. The repetition of the color adds a sense of harmony
+ to the composition.
+
+ The cat's position on the right side of the photo creates an interesting
+ asymmetry with the camera lens, which occupies the left side of the frame. This
+ visual balance enhances the overall composition of the image.
+
+ There are no discernible texts or other objects in the image. The focus is
+ solely on the cat and its immediate surroundings. The image does not provide
+ any information about the location or setting beyond what has been described.
+ The simplicity of the scene allows the viewer to concentrate on the main
+ subject - the young, blue-eyed cat.
+ ```
+
+ The following commands can be given inside the chat, if prefixed by a `/`:
+
+ ```
+ /copy                            to copy last response to clipboard
+ /paste                           to paste content
+ /markdown                        toggle markdown output
+ /stream                          toggle stream output
+ /location                        toggle location submission
+ /voice( change)                  toggle voice output or change the voice
+ /list [n]                        list the last n / all conversation exchanges
+ /clear                           clear the whole conversation
+ /clobber                         clear the conversation and collection
+ /pop [n]                         pop the last n exchanges, defaults to 1
+ /model                           change the model
+ /system                          change system prompt (clears conversation)
+ /regenerate                      the last answer message
+ /collection( clear|change)       change (default) collection or clear
+ /info                            show information for current session
+ /document_policy                 pick a scan policy for document references
+ /import source                   import the source's content
+ /summarize [n] source            summarize the source's content in n words
+ /embedding                       toggle embedding paused or not
+ /embed source                    embed the source's content
+ /web [n] query                   query web search & return n or 1 results
+ /save filename                   store conversation messages
+ /load filename                   load conversation messages
+ /quit                            to quit
+ /help                            to view this help
+ ```
+
 ## Download
 
 The homepage of this library is located at
 
- * https://github.com/flori/ollama
+ * https://github.com/flori/ollama-ruby
 
 ## Author
 
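The Usage snippet promoted to the top of the README streams the response through the `Print` handler. For comparison, a minimal non-streaming variant of the same call might look like the following sketch (base URL and model name are taken from the README example; that the blocking call returns a single response object with a `message.content` field, as in the lazy-enumerator variant, is an assumption):

```ruby
require "ollama"
include Ollama

# Client and message setup as in the README's Usage example.
ollama   = Client.new(base_url: 'http://localhost:11434')
messages = Message.new(role: 'user', content: 'Why is the sky blue?')

# Without stream: true, a single response is expected back; its
# message.content mirrors the field used in the streaming example.
response = ollama.chat(model: 'llama3.1', messages:)
puts response.message.content
```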
data/bin/ollama_chat CHANGED
@@ -49,9 +49,10 @@ class OllamaChatConfig
 voice:
   enabled: false
   default: Samantha
-  list: <%= `say -v ?`.lines.map { _1[/^(.+?)\s+[a-z]{2}_[a-zA-Z0-9]{2,}/, 1] }.uniq.sort.to_s.force_encoding('ASCII-8BIT') %>
+  list: <%= `say -v ? 2>/dev/null`.lines.map { _1[/^(.+?)\s+[a-z]{2}_[a-zA-Z0-9]{2,}/, 1] }.uniq.sort.to_s.force_encoding('ASCII-8BIT') %>
 markdown: true
 stream: true
+document_policy: importing
 embedding:
   enabled: true
   model:
@@ -59,6 +60,7 @@ class OllamaChatConfig
     options: {}
   # Retrieval prompt template:
   prompt: 'Represent this sentence for searching relevant passages: %s'
+  batch_size: 10
   collection: <%= ENV['OLLAMA_CHAT_COLLECTION'] %>
   found_texts_size: 4096
   found_texts_count: null
@@ -478,54 +480,6 @@ def parse_source(source_io)
   end
 end
 
-def embed_source(source_io, source)
-  $embedding.on? or return parse_source(source_io)
-  puts "Embedding #{italic { source_io&.content_type }} document #{source.to_s.inspect}."
-  text = parse_source(source_io) or return
-  text.downcase!
-  splitter_config = $config.embedding.splitter
-  inputs = nil
-  case splitter_config.name
-  when 'Character'
-    splitter = Ollama::Documents::Splitters::Character.new(
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(text)
-  when 'RecursiveCharacter'
-    splitter = Ollama::Documents::Splitters::RecursiveCharacter.new(
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(text)
-  when 'Semantic'
-    splitter = Ollama::Documents::Splitters::Semantic.new(
-      ollama:, model: $config.embedding.model.name,
-      chunk_size: splitter_config.chunk_size,
-    )
-    inputs = splitter.split(
-      text,
-      breakpoint: splitter_config.breakpoint.to_sym,
-      percentage: splitter_config.percentage?,
-      percentile: splitter_config.percentile?,
-    )
-    inputs = splitter.split(text)
-  end
-  inputs or return
-  source = source.to_s
-  if source.start_with?(?!)
-    source = Ollama::Utils::Width.truncate(
-      source[1..-1].gsub(/\W+/, ?_),
-      length: 10
-    )
-  end
-  $documents.add(inputs, source:)
-end
-
-def add_image(images, source_io, source)
-  STDERR.puts "Adding #{source_io&.content_type} image #{source.to_s.inspect}."
-  image = Image.for_io(source_io, path: source.to_s)
-  (images << image).uniq!
-end
-
 def http_options(url)
   options = {}
   if ssl_no_verify = $config.ssl_no_verify?
@@ -554,7 +508,7 @@ def fetch_source(source, &block)
     ) do |tmp|
       block.(tmp)
     end
-  when %r(\Afile://(?:(?:[.-]|[[:alnum:]])*)(/\S*)|([~.]?/\S*))
+  when %r(\Afile://(/\S*)|\A((?:\.\.|[~.]?)/\S*))
    filename = $~.captures.compact.first
    filename = File.expand_path(filename)
    Utils::Fetcher.read(filename) do |tmp|
@@ -567,30 +521,90 @@ rescue => e
   STDERR.puts "Cannot fetch source #{source.to_s.inspect}: #{e}\n#{e.backtrace * ?\n}"
 end
 
+def add_image(images, source_io, source)
+  STDERR.puts "Adding #{source_io&.content_type} image #{source.to_s.inspect}."
+  image = Image.for_io(source_io, path: source.to_s)
+  (images << image).uniq!
+end
+
+def import_source(source_io, source)
+  source = source.to_s
+  puts "Importing #{italic { source_io&.content_type }} document #{source.inspect} now."
+  "Imported #{source.inspect}:\n%s\n\n" % parse_source(source_io)
+end
+
 def import(source)
-  puts "Now importing #{source.to_s.inspect}."
   fetch_source(source) do |source_io|
-    content = parse_source(source_io)
-    content.present? or return
+    content = import_source(source_io, source) or return
     source_io.rewind
     content
   end
 end
 
-def summarize(source, words: nil)
+def summarize_source(source_io, source, words: nil)
+  puts "Summarizing #{italic { source_io&.content_type }} document #{source.inspect} now."
   words = words.to_i
   words < 1 and words = 100
-  puts "Now summarizing #{source.to_s.inspect}."
-  source_content =
-    fetch_source(source) do |source_io|
-      content = parse_source(source_io)
-      content.present? or return
-      source_io.rewind
-      content
-    end
+  source_content = parse_source(source_io)
+  source_content.present? or return
   $config.prompts.summarize % { source_content:, words: }
 end
 
+def summarize(source, words: nil)
+  fetch_source(source) do |source_io|
+    content = summarize_source(source_io, source, words:) or return
+    source_io.rewind
+    content
+  end
+end
+
+def embed_source(source_io, source, count: nil)
+  $embedding.on? or return parse_source(source_io)
+  m = "Embedding #{italic { source_io&.content_type }} document #{source.to_s.inspect}."
+  if count
+    puts '%u. %s' % [ count, m ]
+  else
+    puts m
+  end
+  text = parse_source(source_io) or return
+  text.downcase!
+  splitter_config = $config.embedding.splitter
+  inputs = nil
+  case splitter_config.name
+  when 'Character'
+    splitter = Ollama::Documents::Splitters::Character.new(
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(text)
+  when 'RecursiveCharacter'
+    splitter = Ollama::Documents::Splitters::RecursiveCharacter.new(
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(text)
+  when 'Semantic'
+    splitter = Ollama::Documents::Splitters::Semantic.new(
+      ollama:, model: $config.embedding.model.name,
+      chunk_size: splitter_config.chunk_size,
+    )
+    inputs = splitter.split(
+      text,
+      breakpoint: splitter_config.breakpoint.to_sym,
+      percentage: splitter_config.percentage?,
+      percentile: splitter_config.percentile?,
+    )
+    inputs = splitter.split(text)
+  end
+  inputs or return
+  source = source.to_s
+  if source.start_with?(?!)
+    source = Ollama::Utils::Width.truncate(
+      source[1..-1].gsub(/\W+/, ?_),
+      length: 10
+    )
+  end
+  $documents.add(inputs, source:, batch_size: $config.embedding.batch_size?)
+end
+
 def embed(source)
   if $embedding.on?
     puts "Now embedding #{source.to_s.inspect}."
@@ -612,7 +626,8 @@ def parse_content(content, images)
   images.clear
   tags = Utils::Tags.new
 
-  content.scan(%r([.~]?/\S+|https?://\S+|#\S+)).each do |source|
+  contents = [ content ]
+  content.scan(%r((?:\.\.|[.~])?/\S+|https?://\S+|#\S+)).each do |source|
     case source
     when /\A#(\S+)/
       tags.add($1, source:)
@@ -622,8 +637,15 @@ def parse_content(content, images)
       case source_io&.content_type&.media_type
       when 'image'
         add_image(images, source_io, source)
-      when 'text', 'application'
-        embed_source(source_io, source)
+      when 'text', 'application', nil
+        case $document_policy
+        when 'importing'
+          contents << import_source(source_io, source)
+        when 'embedding'
+          embed_source(source_io, source)
+        when 'summarizing'
+          contents << summarize_source(source_io, source)
+        end
       else
         STDERR.puts(
           "Cannot fetch #{source.to_s.inspect} with content type "\
@@ -633,8 +655,8 @@ def parse_content(content, images)
       end
     end
   end
-
-  return content, (tags unless tags.empty?)
+  new_content = contents.select(&:present?).compact * "\n\n"
+  return new_content, (tags unless tags.empty?)
 end
 
 def choose_model(cli_model, current_model)
@@ -668,16 +690,44 @@ def choose_collection(current_collection)
   end
 ensure
   puts "Using collection #{bold{$documents.collection}}."
-  collection_stats
+  info
+end
+
+def choose_document_policy
+  policies = %w[ importing embedding summarizing ].sort
+  current = if policies.index($document_policy)
+    $document_policy
+  elsif policies.index($config.document_policy)
+    $config.document_policy
+  else
+    policies.first
+  end
+  policies.unshift('[EXIT]')
+  policy = Ollama::Utils::Chooser.choose(policies)
+  case policy
+  when nil, '[EXIT]'
+    puts "Exiting chooser."
+    policy = current
+  end
+  $document_policy = policy
+ensure
+  puts "Using document policy #{bold{$document_policy}}."
+  info
 end
 
 def collection_stats
+  list = $documents.collections.sort.map { |c|
+    ' ' + ($documents.collection == c ? bold { c } : c).to_s
+  }.join(?\n)
   puts <<~EOT
-    Collection
+    Current Collection
       Name: #{bold{$documents.collection}}
       Embedding model: #{bold{$embedding_model}}
       #Embeddings: #{$documents.size}
+      #Tags: #{$documents.tags.size}
       Tags: #{$documents.tags}
+      List:
+    #{list}
   EOT
 end
 
@@ -744,6 +794,7 @@ def info
   $markdown.show
   $stream.show
   $location.show
+  puts "Document policy for references in user text: #{bold{$document_policy}}"
   if $voice.on?
     puts "Using voice #{bold{$current_voice}} to speak."
   end
@@ -787,6 +838,7 @@ def display_chat_help
   /regenerate                      the last answer message
   /collection( clear|change)       change (default) collection or clear
   /info                            show information for current session
+  /document_policy                 pick a scan policy for document references
   /import source                   import the source's content
   /summarize [n] source            summarize the source's content in n words
   /embedding                       toggle embedding paused or not
@@ -841,10 +893,11 @@ $opts[?V] and version
 base_url = $opts[?u] || $config.url
 $ollama = Client.new(base_url:, debug: $config.debug)
 
-$model        = choose_model($opts[?m], $config.model.name)
-options       = Options[$config.model.options]
-model_system  = pull_model_unless_present($model, options)
-messages      = []
+$document_policy = $config.document_policy
+$model           = choose_model($opts[?m], $config.model.name)
+options          = Options[$config.model.options]
+model_system     = pull_model_unless_present($model, options)
+messages         = []
 $embedding_enabled.set($config.embedding.enabled && !$opts[?E])
 
 if $opts[?c]
@@ -889,11 +942,13 @@ if $embedding.on?
     end
   end
   puts "Collection #{bold{collection}}: Adding #{document_list.size} documents…"
+  count = 1
   document_list.each_slice(25) do |docs|
     docs.each do |doc|
       fetch_source(doc) do |doc_io|
-        embed_source(doc_io, doc)
+        embed_source(doc_io, doc, count:)
       end
+      count += 1
     end
   end
 end
@@ -955,7 +1010,7 @@ loop do
     puts "Cleared messages."
     next
   when %r(^/clobber$)
-    if ask?(prompt: 'Are you sure? (y/n) ') =~ /\Ay/i
+    if ask?(prompt: 'Are you sure to clear messages and collection? (y/n) ') =~ /\Ay/i
       clear_messages(messages)
       $documents.clear
       puts "Cleared messages and collection #{bold{$documents.collection}}."
@@ -1020,9 +1075,12 @@ loop do
       choose_collection($documents.collection)
     end
     next
-  when %r(/info)
+  when %r(^/info$)
    info
    next
+  when %r(^/document_policy$)
+    choose_document_policy
+    next
  when %r(^/import\s+(.+))
    parse_content = false
    content = import($1) or next
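Taken together, these ollama_chat changes introduce a document policy: the config gains `document_policy: importing`, the new `/document_policy` command chooses between importing, embedding, and summarizing, and `parse_content` dispatches non-image sources accordingly. A condensed sketch of that dispatch (identifiers are taken from the diff above; the wrapper method itself is illustrative only):

```ruby
# Illustrative wrapper around the dispatch shown in parse_content above.
def handle_document(source_io, source, contents)
  case $document_policy
  when 'importing'   # inline the parsed document into the chat content
    contents << import_source(source_io, source)
  when 'embedding'   # chunk the text and add it to the embeddings collection
    embed_source(source_io, source)
  when 'summarizing' # replace the document with a summarization prompt
    contents << summarize_source(source_io, source)
  end
end
```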