ollama-ruby 0.11.0 → 0.12.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 6611c14b779a919b256552774080305825b4e81cd5e43ae0303026aaaed7c13e
- data.tar.gz: a34e71ef5b07cdd8a0cdf5dce9de7f7e9c3faca1829e4837c54ea243458b8f05
+ metadata.gz: 417e37f903a49bd6cea779cbf15674cc89ebee9c9ea9bf8930fc2b1fb8007794
+ data.tar.gz: f171af3e86996daf776df3f2b28dbf39c71c23139118a879b36e9cbabae5f376
  SHA512:
- metadata.gz: aa38db4bdd42ebffe2ef9b5abf7f644e38821acc4fb9650b7e8646a9fab3350640224c4ba59e13d7f182becd2a84ac881fd0c5d136ca88782e1a7febec31c2fe
- data.tar.gz: b14f2626d9f2c8e6e8bc6c3b3d8947d78b4400cf0dd8888cf9f818261ba3605fc9971c2c9015f976ec48f2004cb4b0ff5d922e2efe001b5b443f03728f14ce51
+ metadata.gz: 7f236cf84e27feb750836fe695cf5812f4ac4af3fb63c6694e71ce6a077383abc14030aed184d668de59e9d082ba8ab890eddfc2052d817b27b708b69af540b9
+ data.tar.gz: 585818aceb2af7d0cad69566d6bd62c8b443650b03f844650831a8e6fcdfcd370b93f8127b85726548ff2fe08a817d5c4f1333c8f8af81af5d8777e8cc57610b
data/CHANGES.md CHANGED
@@ -1,5 +1,43 @@
  # Changes

+ ## 2024-11-26 v0.12.0
+
+ * **Upgrade display/clear links used in chat**:
+ * Created `$links` set to store used links.
+ * Added `/links` command to display used links as a enumerated list.
+ * Implemented `/links (clear)` feature to remove all or specific used links.
+ * **Update semantic splitter to handle embeddings size < 2**:
+ + Added condition to return sentences directly when embeddings size is less
+ than 2.
+ * **Removed collection list from chat info output**
+ * **Add SQLiteCache spec for convert_to_vector method**:
+ - Test creates a vector with two elements and checks if
+ `cache.convert_to_vector(vector)` returns the same vector (which for this
+ cache is just a Ruby array).
+ * **Add tests for retrieving tags from cache**:
+ * Test if tags are returned as an instance of `Ollama::Utils::Tags`
+ * Test also checks if the order of the tags is correct
+ * **Added test case for clearing tags from `Ollama::Documents::SQLiteCache`**
+ - Updated spec for new `clear_for_tags` method
+ * **Migrate SQLite cache to use new clear_for_tags method**:
+ + Added `clear_for_tags` method to SQLiteCache class in `sqlite_cache.rb`
+ + Modified `clear` method in `records.rb` to call `clear_for_tags` if
+ available
+ + Created `find_records_for_tags` method in `sqlite_cache.rb` to find records
+ by tags
+ + Updated `find_records` method in `sqlite_cache.rb` to use new
+ `find_records_for_tags` method
+ * **Use Ollama::Utils::Tags for consistently handling tags**
+ * **Upgrade SQLite cache to use correct prefix for full_each**:
+ * Use `?%` as the default prefix in `SQLiteCache#full_each`
+ * Add specs for setting keys with different prefixes in `SQLiteCache`
+ * Add specs for setting keys with different prefixes in `MemoryCache`
+ * **Refactor SQLite cache query explanation**
+ + Use new variable `e` to store sanitized query for debugging purposes
+ + Pass sanitized query `e` to `@database.execute` for `EXPLAIN` instead of
+ original query `a[0]`
+ * **Add test for unique tags with leading # characters**
+
  ## 2024-11-20 v0.11.0

  * Added `voice` and `interactive` reader attributes to the Say handler class.
@@ -25,7 +63,7 @@
  + Added support for `file://` protocol to content scans.
  + Updated regex pattern to match local files starting with `~`, `.`, or `/`.
  + Remove # anchors for file URLs (and files)
- * Improved parsing of content in ollama_chat:
+ * Improved parsing of content in `ollama_chat`:
  + Use `content.scan(%r((https?://\S+)|(#\S+)|(\S+\/\S+)))` to match URLs, tags and files.
  + For foo/bar file pathes prepend `./`foo/bar, for foo you have to enter ./foo still.
  + Added a check for file existence before fetching its content
data/README.md CHANGED
@@ -44,8 +44,8 @@ print ollama.chat(model: 'llama3.1', stream: true, messages:).lazy.map { |respon

  ## Try out things in ollama\_console

- This is an interactive console, that can be used to try the different commands
- provided by an `Ollama::Client` instance. For example this command generate a
+ This is an interactive console where you can try out the different commands
+ provided by an `Ollama::Client` instance. For example, this command generates a
  response and displays it on the screen using the Markdown handler:

  ```
@@ -459,6 +459,7 @@ The following commands can be given inside the chat, if prefixed by a `/`:
  /embedding toggle embedding paused or not
  /embed source embed the source's content
  /web [n] query query web search & return n or 1 results
+ /links( clear) display (or clear) links used in the chat
  /save filename store conversation messages
  /load filename load conversation messages
  /quit to quit
data/Rakefile CHANGED
@@ -42,7 +42,9 @@ GemHadar do
  dependency 'tins', '~> 1.34'
  dependency 'kramdown-ansi', '~> 0.0', '>= 0.0.1'
  dependency 'ostruct', '~> 0.0'
- development_dependency 'all_images', '~> 0.4'
+ dependency 'sqlite-vec', '~> 0.0'
+ dependency 'sqlite3', '~> 2.0', '>= 2.0.1'
+ development_dependency 'all_images', '~> 0.6'
  development_dependency 'rspec', '~> 3.2'
  development_dependency 'webmock'
  development_dependency 'debug'
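The new `sqlite3` dependency combines a pessimistic constraint with a lower bound. As a small sketch of what that pair accepts, the standard RubyGems classes can evaluate it directly; the version numbers tried below are arbitrary examples, not releases named by this diff.

```ruby
require 'rubygems'

# Same constraint pair as the new sqlite3 dependency line in the Rakefile.
requirement = Gem::Requirement.new('~> 2.0', '>= 2.0.1')

# Helper for this sketch only: true if the version string satisfies the
# requirement.
def satisfied?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

# Example versions chosen purely for illustration.
%w[2.0.0 2.0.1 2.4.1 3.0.0].each do |version|
  puts "#{version}: #{satisfied?(requirement, version)}"
end
```

Here `~> 2.0` keeps the dependency below 3.0, while `>= 2.0.1` excludes 2.0.0 itself.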
data/bin/ollama_chat CHANGED
@@ -57,13 +57,15 @@ class OllamaChatConfig
  enabled: true
  model:
  name: mxbai-embed-large
+ embedding_length: 1024
  options: {}
  # Retrieval prompt template:
  prompt: 'Represent this sentence for searching relevant passages: %s'
  batch_size: 10
+ database_filename: null # ':memory:'
  collection: <%= ENV['OLLAMA_CHAT_COLLECTION'] %>
  found_texts_size: 4096
- found_texts_count: null
+ found_texts_count: 10
  splitter:
  name: RecursiveCharacter
  chunk_size: 1024
@@ -81,12 +83,15 @@ class OllamaChatConfig

  def initialize(filename = nil)
  @filename = filename || default_path
+ unless File.directory?(cache_dir_path)
+ mkdir_p cache_dir_path.to_s
+ end
  @config = Provider.config(@filename, '⚙️')
  retried = false
  rescue ConfigurationFileMissing
  if @filename == default_path && !retried
  retried = true
- mkdir_p File.dirname(default_path)
+ mkdir_p config_dir_path.to_s
  File.secure_write(default_path, DEFAULT_CONFIG)
  retry
  else
@@ -105,6 +110,14 @@ class OllamaChatConfig
  def config_dir_path
  XDG.new.config_home + 'ollama_chat'
  end
+
+ def cache_dir_path
+ XDG.new.cache_home + 'ollama_chat'
+ end
+
+ def database_path
+ cache_dir_path + 'documents.db'
+ end
  end

  class FollowChat
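The `cache_dir_path` and `database_path` methods added above build the default database location by joining path objects. A minimal sketch of the same composition using only the standard library follows; `cache_home` is a hypothetical stand-in for `XDG.new.cache_home` that applies the XDG Base Directory default of `$HOME/.cache`, and the `env` parameter exists only to make the sketch testable.

```ruby
require 'pathname'

# Hypothetical stand-in for XDG.new.cache_home: use XDG_CACHE_HOME when it is
# set and non-empty, otherwise fall back to ~/.cache.
def cache_home(env = ENV)
  dir = env['XDG_CACHE_HOME'].to_s
  dir.empty? ? Pathname.new(Dir.home) + '.cache' : Pathname.new(dir)
end

# Mirrors the shape of the methods added in the diff.
def cache_dir_path(env = ENV)
  cache_home(env) + 'ollama_chat'
end

def database_path(env = ENV)
  cache_dir_path(env) + 'documents.db'
end

puts database_path({ 'XDG_CACHE_HOME' => '/tmp/cache' })
```

With `database_filename: null` in the config, the `Documents.new` call later in this diff falls back to this path.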
@@ -316,11 +329,12 @@ def search_web(query, n = nil)
  doc.css('.results_links').each do |link|
  if n > 0
  url = link.css('.result__a').first&.[]('href')
- url.sub!(%r(\A/l/\?uddg=), '')
+ url.sub!(%r(\A(//duckduckgo\.com)?/l/\?uddg=), '')
  url.sub!(%r(&rut=.*), '')
  url = URI.decode_uri_component(url)
  url = URI.parse(url)
  url.host =~ /duckduckgo\.com/ and next
+ $links.add(url.to_s)
  result << url
  n -= 1
  else
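The widened pattern in this hunk accepts redirect links with or without a leading `//duckduckgo.com` host. The sketch below isolates that unwrapping step. `unwrap_redirect` is a hypothetical helper, the sample hrefs are invented, and `URI.decode_www_form_component` is swapped in for the `URI.decode_uri_component` call used in the diff so that the sketch also runs on older Rubies (the two differ in how they treat `+`).

```ruby
require 'uri'

# Hypothetical helper: strips the redirect wrapper and the trailing &rut=
# parameter, then percent-decodes the target URL.
def unwrap_redirect(href)
  url = href.sub(%r(\A(//duckduckgo\.com)?/l/\?uddg=), '')
  url = url.sub(%r(&rut=.*), '')
  URI.decode_www_form_component(url)
end

# Invented sample hrefs in both shapes the pattern is meant to accept.
puts unwrap_redirect('/l/?uddg=https%3A%2F%2Fexample.com%2Fpage&rut=abc')
puts unwrap_redirect('//duckduckgo.com/l/?uddg=https%3A%2F%2Fexample.com%2Fpage&rut=abc')
```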
@@ -387,7 +401,7 @@ module SourceParsing
  end
  source_io.rewind
  source_io.read
- when 'text/csv' # TODO
+ when 'text/csv'
  parse_csv(source_io)
  when 'application/rss+xml'
  parse_rss(source_io)
@@ -510,6 +524,7 @@ def fetch_source(source, &block)
  block.(tmp)
  end
  when %r(\Ahttps?://\S+)
+ $links.add(source.to_s)
  Utils::Fetcher.get(
  source,
  cache: $cache,
@@ -603,7 +618,6 @@ def embed_source(source_io, source, count: nil)
  percentage: splitter_config.percentage?,
  percentile: splitter_config.percentile?,
  )
- inputs = splitter.split(text)
  end
  inputs or return
  source = source.to_s
@@ -624,7 +638,6 @@ def embed(source)
  content.present? or return
  source_io.rewind
  embed_source(source_io, source)
- content
  end
  $config.prompts.embed % { source: }
  else
@@ -649,6 +662,7 @@ def parse_content(content, images)
  File.exist?(file) or next
  source = file
  when url
+ $links.add(url.to_s)
  source = url
  end
  fetch_source(source) do |source_io|
@@ -674,7 +688,7 @@ def parse_content(content, images)
  end
  end
  end
- new_content = contents.select(&:present?).compact * "\n\n"
+ new_content = contents.select { _1.present? rescue nil }.compact * "\n\n"
  return new_content, (tags unless tags.empty?)
  end

@@ -735,17 +749,12 @@ ensure
  end

  def collection_stats
- list = $documents.collections.sort.map { |c|
- ' ' + ($documents.collection == c ? bold { c } : c).to_s
- }.join(?\n)
  puts <<~EOT
  Current Collection
  Name: #{bold{$documents.collection}}
  #Embeddings: #{$documents.size}
  #Tags: #{$documents.tags.size}
  Tags: #{$documents.tags}
- List:
- #{list}
  EOT
  end

@@ -785,10 +794,11 @@ def set_system_prompt(messages, system)
  messages << Message.new(role: 'system', content: system)
  end

- def change_system_prompt(messages, default)
- prompts = $config.system_prompts.attribute_names.compact
- chosen = Utils::Chooser.choose(prompts)
- system = if chosen
+ def change_system_prompt(messages, default, system: nil)
+ selector = Regexp.new(system.to_s[1..-1].to_s)
+ prompts = $config.system_prompts.attribute_names.compact.grep(selector)
+ chosen = Utils::Chooser.choose(prompts, return_immediately: true)
+ system = if chosen
  $config.system_prompts.send(chosen)
  else
  default
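The reworked `change_system_prompt` treats everything after the leading `?` of the `-s` argument as a regular expression and filters the prompt names with it. The filtering step alone can be sketched as below; `select_prompts` and the prompt names are hypothetical and only illustrate the `Regexp.new(...)` plus `grep` combination from the diff.

```ruby
# Hypothetical helper isolating the selector logic: drop the first character
# (the "?") and use the rest as a pattern. nil and "?" both yield an empty
# pattern, which matches every name.
def select_prompts(names, system)
  selector = Regexp.new(system.to_s[1..-1].to_s)
  names.grep(selector)
end

names = %w[default coder translator] # invented prompt names

p select_prompts(names, '?cod') # only names matching /cod/
p select_prompts(names, '?')    # empty pattern, all names
p select_prompts(names, nil)    # no argument, all names
```

Because the remainder is compiled as a regular expression, characters such as `.` or `*` in the argument act as pattern syntax rather than literals.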
@@ -869,6 +879,7 @@ def display_chat_help
  /embedding toggle embedding paused or not
  /embed source embed the source's content
  /web [n] query query web search & return n or 1 results
+ /links( clear) display (or clear) links used in the chat
  /save filename store conversation messages
  /load filename load conversation messages
  /quit to quit
@@ -907,16 +918,17 @@ end

  $opts = go 'f:u:m:s:c:C:D:MEVh'

- config = OllamaChatConfig.new($opts[?f])
- $config = config.config
+ $ollama_chat_config = OllamaChatConfig.new($opts[?f])
+ $config = $ollama_chat_config.config

  setup_switches

  $opts[?h] and usage
  $opts[?V] and version

- base_url = $opts[?u] || $config.url
- $ollama = Client.new(base_url:, debug: $config.debug)
+ base_url = $opts[?u] || $config.url
+ user_agent = [ File.basename($0), Ollama::VERSION ] * ?/
+ $ollama = Client.new(base_url:, debug: $config.debug, user_agent:)

  $document_policy = $config.document_policy
  $model = choose_model($opts[?m], $config.model.name)
@@ -929,8 +941,8 @@ if $opts[?c]
  messages.concat load_conversation($opts[?c])
  else
  default = $config.system_prompts.default? || model_system
- if $opts[?s] == ??
- change_system_prompt(messages, default)
+ if $opts[?s] =~ /\A\?/
+ change_system_prompt(messages, default, system: $opts[?s])
  else
  system = Utils::FileArgument.get_file_argument($opts[?s], default:)
  system.present? and set_system_prompt(messages, system)
@@ -944,12 +956,13 @@ if $embedding.on?
  collection = $opts[?C] || $config.embedding.collection
  $documents = Documents.new(
  ollama:,
- model: $embedding_model,
- model_options: $config.embedding.model.options,
- collection:,
- cache: configure_cache,
- redis_url: $config.redis.documents.url?,
- debug: ENV['DEBUG'].to_i == 1,
+ model: $embedding_model,
+ model_options: $config.embedding.model.options,
+ database_filename: $config.embedding.database_filename || $ollama_chat_config.database_path,
+ collection: ,
+ cache: configure_cache,
+ redis_url: $config.redis.documents.url?,
+ debug: $config.debug
  )

  document_list = $opts[?D].to_a
@@ -991,10 +1004,11 @@ end

  $current_voice = $config.voice.default

- puts "Configuration read from #{config.filename.inspect} is:", $config
+ puts "Configuration read from #{$ollama_chat_config.filename.inspect} is:", $config
  info
  puts "\nType /help to display the chat help."

+ $links = Set.new
  images = []
  loop do
  parse_content = true
@@ -1134,6 +1148,37 @@ loop do
  save_conversation($1, messages)
  puts "Saved conversation to #$1."
  next
+ when %r(^/links(?:\s+(clear))?$)
+ case $1
+ when 'clear'
+ loop do
+ links = $links.dup.add('[EXIT]').add('[ALL]')
+ link = Utils::Chooser.choose(links, prompt: 'Clear? %s')
+ case link
+ when nil, '[EXIT]'
+ puts "Exiting chooser."
+ break
+ when '[ALL]'
+ if ask?(prompt: 'Are you sure? (y/n) ') =~ /\Ay/i
+ $links.clear
+ puts "Cleared all links in list."
+ break
+ else
+ puts 'Cancelled.'
+ sleep 3
+ end
+ when /./
+ $links.delete(link)
+ puts "Cleared link from links in list."
+ sleep 3
+ end
+ end
+ when nil
+ format = "% #{Math.log10($links.size).ceil}s. %s"
+ connect = -> link { hyperlink(link) { link } }
+ puts $links.each_with_index.map { |x, i| format % [ i + 1, connect.(x) ] }
+ end
+ next
  when %r(^/load\s+(.+)$)
  messages = load_conversation($1)
  puts "Loaded conversation from #$1."
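The `/links` branch above prints the collected links as a numbered list whose index column width is derived from `Math.log10($links.size).ceil`. In plain Ruby, `Math.log10(0)` is `-Infinity` and calling `.ceil` on it raises `FloatDomainError`, so an empty set may need separate handling. The following is a sketch of one alternative, not the gem's code: `enumerate_links` is a hypothetical helper that sizes the column by counting digits and returns an empty list for an empty set.

```ruby
require 'set'

# Hypothetical helper, not part of ollama-ruby: formats links as a
# right-aligned numbered list. The width is the digit count of the largest
# index, so 10 links get a two-character index column.
def enumerate_links(links)
  return [] if links.empty?
  width = links.size.to_s.size
  links.each_with_index.map { |link, i| "%#{width}d. %s" % [ i + 1, link ] }
end

links = Set['https://example.com/a', 'https://example.com/b']
puts enumerate_links(links)
```

A `Set` iterates in insertion order, so the numbering follows the order in which links were first used.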
data/lib/ollama/client.rb CHANGED
@@ -10,7 +10,7 @@ class Ollama::Client

  annotate :doc

- def initialize(base_url: nil, output: $stdout, connect_timeout: nil, read_timeout: nil, write_timeout: nil, debug: nil)
+ def initialize(base_url: nil, output: $stdout, connect_timeout: nil, read_timeout: nil, write_timeout: nil, debug: nil, user_agent: nil)
  base_url.nil? and base_url = ENV.fetch('OLLAMA_URL') do
  raise ArgumentError,
  'missing :base_url parameter or OLLAMA_URL environment variable'
@@ -21,8 +21,8 @@ class Ollama::Client
  @ssl_verify_peer = base_url.query.to_s.split(?&).inject({}) { |h, l|
  h.merge Hash[*l.split(?=)]
  }['ssl_verify_peer'] != 'false'
- @base_url, @output, @connect_timeout, @read_timeout, @write_timeout, @debug =
- base_url, output, connect_timeout, read_timeout, write_timeout, debug
+ @base_url, @output, @connect_timeout, @read_timeout, @write_timeout, @debug, @user_agent =
+ base_url, output, connect_timeout, read_timeout, write_timeout, debug, user_agent
  end

  attr_accessor :output
@@ -111,13 +111,13 @@ class Ollama::Client

  def headers
  {
- 'User-Agent' => self.class.user_agent,
+ 'User-Agent' => @user_agent || self.class.user_agent,
  'Content-Type' => 'application/json; charset=utf-8',
  }
  end

  def self.user_agent
- '%s/%s' % [ self.class, Ollama::VERSION ]
+ '%s/%s' % [ self, Ollama::VERSION ]
  end

  def excon(url)
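The one-word change in `self.user_agent` matters because inside a class method `self` is the class itself, so `self.class` evaluates to `Class`. A small self-contained sketch of the before and after behaviour, together with the new per-instance override, is given below; `DemoClient` and its `VERSION` are placeholders, not the real `Ollama::Client`.

```ruby
# Placeholder class for illustration only.
class DemoClient
  VERSION = '0.0.0' # placeholder version string

  # Old behaviour: in a class method, self.class is Class.
  def self.old_user_agent
    '%s/%s' % [ self.class, VERSION ]
  end

  # New behaviour: self is the class, so its name is interpolated.
  def self.user_agent
    '%s/%s' % [ self, VERSION ]
  end

  def initialize(user_agent: nil)
    @user_agent = user_agent
  end

  # An explicitly passed user agent wins over the class-level default.
  def headers
    { 'User-Agent' => @user_agent || self.class.user_agent }
  end
end

puts DemoClient.old_user_agent
puts DemoClient.user_agent
puts DemoClient.new(user_agent: 'my_tool/1.0').headers['User-Agent']
```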
@@ -1,7 +1,11 @@
  module Ollama::Documents::Cache::Common
  include Ollama::Utils::Math

- attr_writer :prefix # current prefix defined for the cache
+ def initialize(prefix:)
+ self.prefix = prefix
+ end
+
+ attr_accessor :prefix # current prefix defined for the cache

  # Returns an array of collection names that match the given prefix.
  #
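Moving the prefix assignment into an `initialize` on the shared module means each including cache class delegates with `super(prefix:)`, as the other cache hunks in this diff do. The pattern is sketched here with stand-in names; `CommonDemo` and `MemoryCacheDemo` are hypothetical, and the keyword is written out as `prefix: prefix` so the sketch also runs on Rubies without the shorthand syntax.

```ruby
# Stand-in for the shared module: it owns the prefix attribute and its setup.
module CommonDemo
  def initialize(prefix:)
    self.prefix = prefix
  end

  attr_accessor :prefix
end

# Stand-in for an including cache class: it forwards the prefix to the
# module's initialize via super and then sets up its own state.
class MemoryCacheDemo
  include CommonDemo

  def initialize(prefix:)
    super(prefix: prefix)
    @data = {}
  end
end

cache = MemoryCacheDemo.new(prefix: 'demo-')
puts cache.prefix
```

`super` reaches the module's method because an included module sits directly above the class in its ancestor chain.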
@@ -4,7 +4,7 @@ class Ollama::Documents::MemoryCache
  include Ollama::Documents::Cache::Common

  def initialize(prefix:)
- @prefix = prefix
+ super(prefix:)
  @data = {}
  end

@@ -0,0 +1,87 @@
+ module Ollama::Documents::Cache::Records
+ class Record < JSON::GenericObject
+ def initialize(*a)
+ super
+ self.text ||= ''
+ self.norm ||= 0.0
+ end
+
+ def to_s
+ my_tags = tags_set
+ my_tags.empty? or my_tags = " #{my_tags}"
+ "#<#{self.class} #{text.inspect}#{my_tags} #{similarity || 'n/a'}>"
+ end
+
+ def tags_set
+ Ollama::Utils::Tags.new(tags, source:)
+ end
+
+ def ==(other)
+ text == other.text
+ end
+
+ alias inspect to_s
+ end
+
+ module RedisFullEach
+ def full_each(&block)
+ redis.scan_each(match: [ Ollama::Documents, ?* ] * ?-) do |key|
+ value = redis.get(key) or next
+ value = JSON(value, object_class: Ollama::Documents::Record)
+ block.(key, value)
+ end
+ end
+ end
+
+ module FindRecords
+ def find_records(needle, tags: nil, max_records: nil)
+ tags = Ollama::Utils::Tags.new(Array(tags)).to_a
+ records = self
+ if tags.present?
+ records = records.select { |_key, record| (tags & record.tags).size >= 1 }
+ end
+ needle_norm = norm(needle)
+ records = records.sort_by { |key, record|
+ record.key = key
+ record.similarity = cosine_similarity(
+ a: needle,
+ b: record.embedding,
+ a_norm: needle_norm,
+ b_norm: record.norm,
+ )
+ }
+ records.transpose.last&.reverse.to_a
+ end
+ end
+
+ module Tags
+ def clear(tags: nil)
+ tags = Ollama::Utils::Tags.new(tags).to_a
+ if tags.present?
+ if respond_to?(:clear_for_tags)
+ clear_for_tags(tags)
+ else
+ each do |key, record|
+ if (tags & record.tags.to_a).size >= 1
+ delete(unpre(key))
+ end
+ end
+ end
+ else
+ super()
+ end
+ end
+
+ def tags
+ if defined? super
+ super
+ else
+ each_with_object(Ollama::Utils::Tags.new) do |(_, record), t|
+ record.tags.each do |tag|
+ t.add(tag, source: record.source)
+ end
+ end
+ end
+ end
+ end
+ end
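`FindRecords#find_records` ranks records by cosine similarity to the query vector, sorting ascending and then reversing so the best match comes first. A self-contained sketch of that ranking on plain hashes follows. The `norm` and `cosine_similarity` helpers are assumptions written for this example (the gem takes them from `Ollama::Utils::Math`, whose implementation is not part of this diff), and the record data is invented.

```ruby
# Assumed helper: Euclidean length of a vector.
def norm(vector)
  Math.sqrt(vector.sum { |x| x * x })
end

# Assumed helper: dot product divided by the product of the two norms.
def cosine_similarity(a:, b:, a_norm: norm(a), b_norm: norm(b))
  a.zip(b).sum { |x, y| x * y } / (a_norm * b_norm)
end

# Sorts ascending by similarity, then reverses so the closest record is
# first, following the shape of find_records in the diff.
def rank(records, needle)
  needle_norm = norm(needle)
  records.sort_by { |_key, record|
    cosine_similarity(
      a: needle, b: record[:embedding],
      a_norm: needle_norm, b_norm: norm(record[:embedding])
    )
  }.map(&:first).reverse
end

records = { # invented sample data
  'same'       => { embedding: [ 1.0, 0.0 ] },
  'orthogonal' => { embedding: [ 0.0, 1.0 ] },
  'diagonal'   => { embedding: [ 1.0, 1.0 ] },
}

p rank(records, [ 1.0, 0.0 ])
```

The `Record` class in the diff carries a `norm` attribute, so `find_records` reuses that stored value instead of recomputing it per query as this sketch does.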
@@ -5,8 +5,9 @@ class Ollama::Documents
  def initialize(prefix:, url: ENV['REDIS_URL'], object_class: nil)
  super(prefix:)
  url or raise ArgumentError, 'require redis url'
- @prefix, @url, @object_class = prefix, url, object_class
+ @url, @object_class = url, object_class
  @redis_cache = Ollama::Documents::RedisCache.new(prefix:, url:, object_class:)
+ @redis_cache.extend(Ollama::Documents::Cache::Records::RedisFullEach)
  @redis_cache.full_each do |key, value|
  @data[key] = value
  end
@@ -5,8 +5,9 @@ class Ollama::Documents::RedisCache
  include Ollama::Documents::Cache::Common

  def initialize(prefix:, url: ENV['REDIS_URL'], object_class: nil, ex: nil)
+ super(prefix:)
  url or raise ArgumentError, 'require redis url'
- @prefix, @url, @object_class, @ex = prefix, url, object_class, ex
+ @url, @object_class, @ex = url, object_class, ex
  end

  attr_reader :object_class
@@ -18,7 +19,7 @@ class Ollama::Documents::RedisCache
  def [](key)
  value = redis.get(pre(key))
  unless value.nil?
- JSON(value, object_class:)
+ object_class ? JSON(value, object_class:) : JSON(value)
  end
  end

@@ -64,12 +65,4 @@ class Ollama::Documents::RedisCache
  self
  end
  include Enumerable
-
- def full_each(&block)
- redis.scan_each(match: [ Ollama::Documents, ?* ] * ?-) do |key|
- value = redis.get(key) or next
- value = JSON(value, object_class: Ollama::Documents::Record)
- block.(key, value)
- end
- end
  end