scout-ai 1.0.0 → 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (62)
  1. checksums.yaml +4 -4
  2. data/.vimproject +80 -15
  3. data/README.md +296 -0
  4. data/Rakefile +2 -0
  5. data/VERSION +1 -1
  6. data/doc/Agent.md +279 -0
  7. data/doc/Chat.md +258 -0
  8. data/doc/LLM.md +446 -0
  9. data/doc/Model.md +513 -0
  10. data/doc/RAG.md +129 -0
  11. data/lib/scout/llm/agent/chat.rb +51 -1
  12. data/lib/scout/llm/agent/delegate.rb +39 -0
  13. data/lib/scout/llm/agent/iterate.rb +44 -0
  14. data/lib/scout/llm/agent.rb +42 -21
  15. data/lib/scout/llm/ask.rb +38 -6
  16. data/lib/scout/llm/backends/anthropic.rb +147 -0
  17. data/lib/scout/llm/backends/bedrock.rb +1 -1
  18. data/lib/scout/llm/backends/ollama.rb +23 -29
  19. data/lib/scout/llm/backends/openai.rb +34 -40
  20. data/lib/scout/llm/backends/responses.rb +158 -110
  21. data/lib/scout/llm/chat.rb +250 -94
  22. data/lib/scout/llm/embed.rb +4 -4
  23. data/lib/scout/llm/mcp.rb +28 -0
  24. data/lib/scout/llm/parse.rb +1 -0
  25. data/lib/scout/llm/rag.rb +9 -0
  26. data/lib/scout/llm/tools/call.rb +66 -0
  27. data/lib/scout/llm/tools/knowledge_base.rb +158 -0
  28. data/lib/scout/llm/tools/mcp.rb +59 -0
  29. data/lib/scout/llm/tools/workflow.rb +69 -0
  30. data/lib/scout/llm/tools.rb +58 -143
  31. data/lib/scout-ai.rb +1 -0
  32. data/scout-ai.gemspec +31 -18
  33. data/scout_commands/agent/ask +28 -71
  34. data/scout_commands/documenter +148 -0
  35. data/scout_commands/llm/ask +2 -2
  36. data/scout_commands/llm/server +319 -0
  37. data/share/server/chat.html +138 -0
  38. data/share/server/chat.js +468 -0
  39. data/test/scout/llm/backends/test_anthropic.rb +134 -0
  40. data/test/scout/llm/backends/test_openai.rb +45 -6
  41. data/test/scout/llm/backends/test_responses.rb +124 -0
  42. data/test/scout/llm/test_agent.rb +0 -70
  43. data/test/scout/llm/test_ask.rb +3 -1
  44. data/test/scout/llm/test_chat.rb +43 -1
  45. data/test/scout/llm/test_mcp.rb +29 -0
  46. data/test/scout/llm/tools/test_knowledge_base.rb +22 -0
  47. data/test/scout/llm/tools/test_mcp.rb +11 -0
  48. data/test/scout/llm/tools/test_workflow.rb +39 -0
  49. metadata +56 -17
  50. data/README.rdoc +0 -18
  51. data/python/scout_ai/__pycache__/__init__.cpython-310.pyc +0 -0
  52. data/python/scout_ai/__pycache__/__init__.cpython-311.pyc +0 -0
  53. data/python/scout_ai/__pycache__/huggingface.cpython-310.pyc +0 -0
  54. data/python/scout_ai/__pycache__/huggingface.cpython-311.pyc +0 -0
  55. data/python/scout_ai/__pycache__/util.cpython-310.pyc +0 -0
  56. data/python/scout_ai/__pycache__/util.cpython-311.pyc +0 -0
  57. data/python/scout_ai/atcold/plot_lib.py +0 -141
  58. data/python/scout_ai/atcold/spiral.py +0 -27
  59. data/python/scout_ai/huggingface/train/__pycache__/__init__.cpython-310.pyc +0 -0
  60. data/python/scout_ai/huggingface/train/__pycache__/next_token.cpython-310.pyc +0 -0
  61. data/python/scout_ai/language_model.py +0 -70
  62. /data/{python/scout_ai/atcold/__init__.py → test/scout/llm/tools/test_call.rb} +0 -0
data/doc/Model.md ADDED
@@ -0,0 +1,513 @@
# Model

The Model subsystem in scout-ai provides a small, composable framework to wrap machine‑learning models (pure Ruby, Python/PyTorch, and Hugging Face Transformers) with a consistent API for evaluation, training, feature extraction, post‑processing, and persistence.

It consists of a base class (ScoutModel) and higher-level implementations:
- PythonModel — instantiate and drive Python classes via ScoutPython.
- TorchModel — drive arbitrary PyTorch modules with simple training/eval loops, tensor helpers, and state save/load.
- HuggingfaceModel — convenience wrapper for Transformers models and tokenizers, with specializations:
  - SequenceClassificationModel — text classification.
  - CausalModel — chat/causal generation.
  - NextTokenModel — next-token fine-tuning pipeline.

This document covers the common API, how to customize models with feature extraction and post-processing, saving/loading models and their behavior, and several concrete examples (including how ExTRI2 uses a Hugging Face model inside a Workflow).

---

## Core concepts and base API (ScoutModel)

ScoutModel is the foundation. You create a model object, attach blocks describing how to evaluate, train, extract features, and post-process, and optionally persist both its behavior and state in a directory.

Constructor:
- ScoutModel.new(directory = nil, options = {})
  - directory (optional) — if provided, model behavior/state can be saved and later restored from here.
  - options — free-form hash for your parameters (e.g., hyperparameters). These are persisted to options.json in the directory and merged on restore.

Key responsibilities:
- Provide hooks to set the model’s:
  - init — how to initialize internal state (e.g., load a Python object).
  - eval — how to evaluate one sample.
  - eval_list — how to evaluate a list (batch) of samples (by default dispatches to eval).
  - extract_features / extract_features_list — how to map raw inputs to “features” the model expects.
  - post_process / post_process_list — transform raw predictions/logits to final outputs.
  - train — how to fit with accumulated training data (features and labels).

- Build and hold training data:
  - add(sample, label = nil)
  - add_list(list, labels = nil) — labels may be an Array aligned with list or a Hash mapping sample to label.
  - Internal arrays @features and @labels are filled after feature extraction.

- Persist behavior and state:
  - save — persists options, all behavior blocks (as .rb) and state (see below).
  - restore — loads behavior and options; if the model has a directory, init/load_state are called on demand.

- A directory-bound state file:
  - state_file — shorthand for directory.state; used by implementations to store learned parameters.

Execution helpers (util/run.rb):
- execute(method, *args) — run a stored Proc with arity checks.
- init { ... } / init() — define or execute the initialization method.
- eval(sample=nil) { ... } — define or run the eval method; calls extract_features and post_process around your block as needed.
- eval_list(list=nil) { ... } — define or run the list version; defaults to mapping eval unless you override.
- post_process(result=nil) { ... }, post_process_list(list=nil) { ... } — define or run post-processing.
- train { ... } / train() — define or run training using @features/@labels.
- extract_features(sample=nil) { ... }, extract_features_list(list=nil) { ... } — define or run feature extraction.

Persistence (util/save.rb):
- save — writes options.json; saves each defined Proc to a .rb file beside the state (using method_source); calls save_state if @state exists.
- restore — loads behavior (.rb), options, and sets up init/load_state/save_state blocks.
- save_state { |state_file, state| ... } — define or execute logic to persist the current @state.
- load_state { |state_file| ... } — define or execute logic to restore @state.

Minimal example (pure Ruby)
```ruby
model = ScoutModel.new
model.eval do |sample, list=nil|
  if list
    list.map { |x| x * 2 }
  else
    sample * 2
  end
end

model.eval(1)           # => 2
model.eval_list([1, 2]) # => [2, 4]
```

Persisting behavior/state
```ruby
TmpFile.with_file do |dir|
  model = ScoutModel.new dir, factor: 4
  model.eval { |x, list=nil| list ? list.map { |v| v * @options[:factor] } : x * @options[:factor] }
  model.save

  # Later
  reloaded = ScoutModel.new dir
  reloaded.eval(1)          # => 4
  reloaded.eval_list([1,2]) # => [4,8]
end
```
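
Training works the same way for plain Ruby models: collect samples with add, define a train block, and keep whatever you learn in @state. A minimal sketch, assuming the train block receives the accumulated features and labels (as in the API quick reference below) and that blocks can store learned values in @state, just as the example above keeps settings in @options:

```ruby
model = ScoutModel.new
model.extract_features { |sample| sample } # identity features

# Learn a single scale factor from (x, y) pairs and keep it in @state
model.train do |features, labels|
  ratios = features.zip(labels).map { |x, y| y.to_f / x }
  @state = ratios.sum / ratios.length
end

model.eval { |x, list=nil| list ? list.map { |v| v * @state } : x * @state }

model.add 2.0, 4.0
model.add 5.0, 10.0
model.train

model.eval(3.0)          # => 6.0
model.eval_list([1, 10]) # => [2.0, 20.0]
```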

---

## PythonModel: wrap Python classes

PythonModel specializes ScoutModel to initialize a Python class instance (via ScoutPython) and keep it in @state.

Constructor:
- PythonModel.new(dir, python_class = nil, python_module = :model, options = {})
  - dir — directory holding model.py or any Python package you want on sys.path.
  - python_class/python_module — class and module to import; if python_module omitted, defaults to :model.
  - options — additional keyword arguments passed to the Python class initializer.

Initialization:
- On init, PythonModel adjusts paths, ensures ScoutPython is initialized, and builds an instance:
  - ScoutPython.class_new_obj(python_module, python_class, **options.except(...))

From tests (python/test_base.rb):
```ruby
TmpFile.with_path do |dir|
  dir['model.py'].write <<~PY
    class TestModel:
        def __init__(self, delta):
            self.delta = delta
        def eval(self, x):
            return [e + self.delta for e in x]
  PY

  model = PythonModel.new dir, 'TestModel', :model, delta: 1

  model.eval do |sample, list=nil|
    init unless state
    if list
      state.eval(list) # Python: returns list
    else
      state.eval([sample])[0]
    end
  end

  model.eval(1)          # => 2
  model.eval_list([3,5]) # => [4,6]

  model.save
  model2 = ScoutModel.new dir # generic loader from directory works too
  model2.eval(1) # => 2

  model3 = ScoutModel.new dir, delta: 2
  model3.eval(1) # => 3
end
```

Notes:
- Behavior blocks (eval/extract_features/train/post_process) are still Ruby procs you define; inside, you can call Python methods on state.
- Options are persisted and merged on restore, allowing default hyperparameter overrides.

---

## TorchModel: PyTorch convenience

TorchModel extends PythonModel with a ready-to-use setup for PyTorch nn.Modules, simple training loops, tensor helpers, and state I/O.

Highlights:
- torch helpers (torch/helpers.rb):
  - TorchModel.init_python — imports torch and utility modules once.
  - TorchModel::Tensor — wrapper adding to_ruby/to_ruby!/del for tensor lifecycle management.
  - device(options) / dtype(options) — configure device/dtype from options (e.g., device: 'cuda').
  - tensor(obj, device, dtype) — build a torch.tensor; result responds to .to_ruby / .del.

- Save/Load (torch/load_and_save.rb):
  - TorchModel.save(state_file, state) — saves both architecture (torch.save(model)) and weights (state_dict) into state_file(.architecture).
  - TorchModel.load(state_file, state=nil) — loads architecture and then weights.
  - reset_state — clear current state and remove persisted files.

- Introspection (torch/introspection.rb):
  - get_layer(state, layer_path = nil), get_weights(state, layer_path)
  - freeze_layer(state, layer_path, requires_grad=false) — recursively freezes a submodule.

- Training loop (torch.rb):
  - Provide your nn.Module as state (e.g., via model.state = ScoutPython.torch.nn.Linear.new(1,1)).
  - Set criterion/optimizer or rely on defaults:
    - TorchModel.optimizer(model, training_args) — default SGD(lr: 0.01).
    - TorchModel.criterion(model, training_args) — default MSELoss.
  - options[:training_args] may set epochs, batch_size, learning_rate, etc.

Example (from tests/test_torch.rb)
```ruby
TorchModel.init_python
model = TorchModel.new dir
model.state = ScoutPython.torch.nn.Linear.new(1, 1)
model.criterion = ScoutPython.torch.nn.MSELoss.new()

model.extract_features { |f| [f] }
model.post_process { |v, list| list ? v.map(&:first) : v.first }

# Train y ~ 2x
model.add 5.0, [10.0]
model.add 10.0, [20.0]
model.options[:training_args][:epochs] = 1000
model.train

w = model.get_weights.to_ruby.first.first
# w between 1.8 and 2.2
```

Persist and reuse
```ruby
model.save
reloaded = ScoutModel.new dir
y = reloaded.eval(100.0) # ~ 200
```

Tips:
- Manage tensor memory with Tensor#del after large batch evaluations if needed.
- You can freeze layers by name path ("encoder.layer.0") before training.

---

## HuggingfaceModel: Transformers integration

HuggingfaceModel is a TorchModel specializing initialization and save/load to work with transformers:
- Loads a model and tokenizer via Python functions (python/scout_ai/huggingface/model.py):
  - load_model(task, checkpoint, **kwargs)
  - load_tokenizer(checkpoint, **kwargs)
- Persists using save_pretrained/from_pretrained into directory.state (a directory).

Options normalization:
- fix_options: splits options into:
  - training_args (or via training: …),
  - tokenizer_args (or via tokenizer: …),
  - plus task / checkpoint.
- Any model/tokenizer kwargs not in training_args or tokenizer_args are passed through on load.
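
For instance, grouping options when building a Hugging Face model can be spelled either way; a sketch using 'bert-base-uncased' as an illustrative checkpoint (num_train_epochs is a standard transformers TrainingArguments field):

```ruby
# Explicit option groups
model = HuggingfaceModel.new 'SequenceClassification', 'bert-base-uncased', nil,
  training_args:  { num_train_epochs: 3 },
  tokenizer_args: { model_max_length: 512, truncation: true }

# Equivalent spelling through the training:/tokenizer: aliases
model = HuggingfaceModel.new 'SequenceClassification', 'bert-base-uncased', nil,
  training:  { num_train_epochs: 3 },
  tokenizer: { model_max_length: 512, truncation: true }
```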

Save/Load:
- save_state — model.save_pretrained and tokenizer.save_pretrained into state_file dir.
- load_state — model.from_pretrained and tokenizer.from_pretrained when present.

You typically use one of its specializations:

### SequenceClassificationModel

Purpose: text classification (logits to label).

Behavior:
- eval: calls Python eval_model(model, tokenizer, texts, locate_tokens?) to produce logits (default return_logits = true).
- post_process: argmax across logits, mapping to class labels if provided.

Training:
- train: builds a TSV (text,label), constructs TrainingArguments and uses Trainer/train (python/scout_ai/huggingface/train).
- Accepts optional class_weights to weight CrossEntropy in a custom Trainer.

Example training (from tests)
```ruby
model = SequenceClassificationModel.new 'bert-base-uncased', nil, class_labels: %w(Bad Good)
model.init

10.times do
  model.add "The dog", 'Bad'
  model.add "The cat", 'Good'
end

model.train
model.eval("This is dog") # => "Bad"
model.eval("This is cat") # => "Good"
```

Notes:
- post_process maps argmax index to options[:class_labels]. Raw logits can be left to downstream code by customizing post_process.
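
If you prefer probabilities over hard labels, you can replace the default post-processing with your own block; a sketch that reuses Misc.softmax as the ExTRI2 example below does:

```ruby
# Return softmax probabilities instead of the argmax class label (sketch)
model.post_process do |logits, list|
  list ? logits.map { |row| Misc.softmax(row) } : Misc.softmax(logits)
end
```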

### CausalModel

Purpose: chat/causal generation.

Behavior:
- eval(messages, list=nil): calls Python eval_causal_lm_chat(model, tokenizer, messages, chat_template, chat_template_kwargs, generation_kwargs) to return generated text, using tokenizer.apply_chat_template when available.

Training:
- train(pairs, labels): hooks a basic RLHF pipeline (python/scout_ai/huggingface/rlhf.py) using PPO. You supply:
  - pairs: array of [messages, response] pairs,
  - labels: rewards for each pair.
- After training, it reloads state from disk.

Usage example (test/test_causal.rb):
```ruby
model = CausalModel.new 'mistralai/Mistral-7B-Instruct-v0.3'
model.init
model.eval([
  {role: :system, content: "You are a calculator, just reply with the answer"},
  {role: :user, content: " 1 + 2 ="}
])
# => "3"
```
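
Training follows the same shape: supply [messages, response] pairs along with a reward per pair. A sketch with made-up data and rewards (the PPO details live in python/scout_ai/huggingface/rlhf.py):

```ruby
# Hypothetical preference data: [messages, response] pairs plus one reward per pair
pairs = [
  [[{role: :user, content: "1 + 2 ="}], "3"],
  [[{role: :user, content: "1 + 2 ="}], "I would rather not say"]
]
rewards = [1.0, -1.0]

model.train pairs, rewards # runs the PPO pipeline, then reloads the updated state from disk
```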

### NextTokenModel

Purpose: next-token fine-tuning for Causal LM.

Adds a custom train block that:
- Builds a tokenized dataset from a list of strings.
- Trains with a simple language modeling loop (python/scout_ai/huggingface/train/next_token.py).
- Writes checkpoints under directory/output.

From tests (huggingface/causal/test_next_token.rb):
```ruby
model = NextTokenModel.new model_name, tmp_dir, training_num_train_epochs: 1000, training_learning_rate: 0.1

chat = Chat.setup []
chat.user "say hi"
pp model.eval chat # generation before training

state, tok = model.init
tok.pad_token = tok.eos_token

train_texts = ["say hi, no!", "say hi, hi", ...]
model.add_list train_texts.shuffle
model.train

pp model.eval chat # improved generations
model.save
reloaded = PythonModel.new tmp_dir
pp reloaded.eval chat
```

---

## Feature extraction and post-processing

A key pattern is to keep evaluation logic generic and tailor feature extraction and post‑processing for each task.

- extract_features(sample) and extract_features_list(list) let you shape inputs into the structure your model consumes.
- post_process(result) or post_process_list(list) convert raw outputs to your final format (e.g., argmax to label, logits to softmax).

ExTRI2 workflow example (SequenceClassification)
```ruby
# The tri_sentences task uses a Huggingface SequenceClassification model
tri_model = Rbbt.models[tri_model].find unless File.exist?(tri_model)
model = HuggingfaceModel.new 'SequenceClassification', tri_model, nil,
  tokenizer_args: { model_max_length: 512, truncation: true },
  return_logits: true

# Convert each TSV row into the sequence the model expects
model.extract_features do |_, feature_list|
  feature_list.collect do |text, tf, tg|
    text.sub("[TF]", "<TF>#{tf}</TF>").sub("[TG]", "<TG>#{tg}</TG>")
  end
end

model.init

# Evaluate as a batch (tsv.slice returns [["Text","TF","Gene"], ...])
predictions = model.eval_list tsv.slice(["Text", "TF", "Gene"]).values

# Write classifier output back to TSV
tsv.add_field "Valid score" do
  non_valid, valid = predictions.shift
  begin
    Misc.softmax([valid, non_valid]).first
  rescue
    0
  end
end

tsv.add_field "Valid" do |_, values|
  values.last > 0.5 ? "Valid" : "Non valid"
end
```

Key takeaways:
- Use extract_features to canonicalize input format independent of how your rows are structured.
- Batch evaluation with eval_list on large tables; then write back into TSV columns.
- Persist the model directory to reuse across runs.

---

## Training data management

Collect samples:
- add(sample, label=nil)
- add_list(list, labels=nil)
  - labels may be an Array aligned with list or a Hash mapping sample->label.
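
For example, a small sketch with invented samples and labels showing both label forms:

```ruby
# One sample at a time
model.add "The interface is great", "positive"
model.add "It crashes on startup", "negative"

# In bulk, with labels as a parallel Array or as a Hash keyed by sample
samples = ["Love it", "Too slow"]
model.add_list samples, ["positive", "negative"]
model.add_list samples, "Love it" => "positive", "Too slow" => "negative"
```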

In Torch/HF paths, training consumes @features/@labels after feature extraction:
- SequenceClassificationModel’s train writes a TSV dataset to disk, builds TrainingArguments, tokenizes, and runs transformers.Trainer.
- TorchModel’s train uses a simple loop with SGD and MSELoss by default (override criterion/optimizer if needed).

---

## Persistence and restore

Behavior and state are independent:
- Behavior (Ruby Procs for eval/extract_features/train/etc.) are saved to .rb sibling files in directory; they are reloaded and instance_eval’ed on restore.
- Options are persisted to options.json and merged on restore.
- State depends on implementation:
  - TorchModel: two files — state (weights) and architecture dump (.architecture).
  - HuggingfaceModel: directory with tokenizer+model via save_pretrained.
  - PythonModel: you define save_state/load_state (or rely on higher-level class).

Common methods:
- save — writes options, behavior files, and calls save_state if @state exists.
- restore — loads behavior files and options; state is lazy-initialized by calling init/load_state when used next.
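
For a plain ScoutModel (or a PythonModel whose learned state is ordinary Ruby data), you can define the persistence hooks yourself. A sketch that serializes @state as JSON, which is only one convenient choice:

```ruby
require 'json'

model.save_state do |state_file, state|
  File.write(state_file, state.to_json) # persist learned parameters
end

model.load_state do |state_file|
  @state = JSON.parse(File.read(state_file)) if File.exist?(state_file)
end

model.save # writes options.json, the behavior .rb files, and the state via the block above
```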

---

## Devices, tensors, and memory notes (PyTorch)

- Choose device automatically or pass options: { device: 'cuda' } or { device: 'cpu' }.
- TorchModel::Tensor#to_ruby converts tensors to Ruby arrays via numpy; #to_ruby! also calls .del to free GPU memory (detach, move to CPU, clear grads and storage).
- Freeze layers if fine-tuning only a head: TorchModel.freeze_layer(state, "encoder.layer.0", false).
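
Putting those pieces together, a sketch (the layer path "encoder.layer.0" is illustrative and depends on your module's structure):

```ruby
model = TorchModel.new dir, device: 'cuda' # or device: 'cpu'

# Freeze everything except the head before fine-tuning
TorchModel.freeze_layer model.state, "encoder.layer.0"

# Build a tensor on the configured device, then bring it back to Ruby and release it
t = TorchModel.tensor([1.0, 2.0], model.device, model.dtype)
p t.to_ruby # => [1.0, 2.0]
t.del       # free the underlying storage when you are done
```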

---

## Building your own specializations

You can layer new classes over PythonModel/TorchModel/HuggingfaceModel to produce high-level behaviors:

- Override initialize to:
  - Call super(...) with task/checkpoint/dir/options.
  - Provide eval blocks suited for your task (e.g., locate tokens, decode strategies).
  - Provide post_process/post_process_list.
  - Provide train with your pipeline (tokenization, trainer, or custom loop).
  - Optionally override save_state/load_state.

- Or, stick with a plain ScoutModel and define init/eval/train/… blocks directly—particularly useful for lightweight pure-Ruby or ad‑hoc model logic.
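
A skeleton of what a specialization can look like, shown here over the plain base class so the sketch stays self-contained; real specializations would typically layer over TorchModel or HuggingfaceModel in the same way:

```ruby
# Illustrative subclass: a tiny nearest-neighbor label model
class NearestLabelModel < ScoutModel
  def initialize(directory = nil, options = {})
    super(directory, options)

    extract_features { |sample| sample } # identity features

    train do |features, labels|
      @state = features.zip(labels) # remember the training pairs
    end

    eval do |sample, list = nil|
      samples = list || [sample]
      res = samples.map { |s| @state.min_by { |f, _| (f - s).abs }.last }
      list ? res : res.first
    end
  end
end

model = NearestLabelModel.new
model.add 1.0, "low"
model.add 10.0, "high"
model.train
model.eval(2.0) # => "low"
```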

---

## Patterns and recommendations

- Start simple with ScoutModel for logic prototyping; then move to PythonModel/TorchModel/Hugging Face when integrating Python models.
- Always isolate feature extraction from evaluation to keep eval focused on the lower-level API your model expects.
- Persist: pass a directory when you want to reuse a model and its learned parameters across runs; call save after training.
- For table‑driven workflows, use eval_list and TSV traversal to batch efficiently (see ExTRI2 usage).
- In TorchModel, explicitly set criterion/optimizer where the default (SGD + MSELoss) is not appropriate.

---

## API quick reference

Common (ScoutModel)
- new(directory=nil, options={})
- init { ... } / init() → @state
- eval(sample=nil) { |features| ... } → result
- eval_list(list=nil) { |list| ... } → array of results
- extract_features(sample=nil) { ... }, extract_features_list(list=nil) { ... }
- post_process(result=nil) { ... }, post_process_list(list=nil) { ... }
- train { |features, labels| ... } / train()
- add(sample, label=nil), add_list(list, labels=nil or Hash)
- save / restore
- save_state { |state_file, state| ... }, load_state { |state_file| ... }
- directory, state_file, options

PythonModel
- new(dir, python_class=nil, python_module=:model, options={})
- On init: state is an instance of the Python class.

TorchModel
- state (PyTorch nn.Module)
- criterion, optimizer, device, dtype
- TorchModel.init_python
- TorchModel.tensor(obj, device, dtype) → Tensor wrapper
- TorchModel.save(state_file, state) / TorchModel.load(state_file, state=nil)
- TorchModel.get_layer(state, path), freeze_layer(state, path, requires_grad=false)

HuggingfaceModel
- new(task=nil, checkpoint=nil, dir=nil, options={})
- options: training_args (or training: {}), tokenizer_args (or tokenizer: {})
- save_state/load_state via save_pretrained/from_pretrained

SequenceClassificationModel
- class_labels (optional)
- train(texts, labels)
- eval(text or list of texts) → label(s) or your post_process

CausalModel
- eval(messages) → generated text
- train(pairs, rewards) — RLHF pipeline

NextTokenModel
- train(texts) — next-token fine-tuning loop

---

## CLI

No dedicated “model” CLI commands are shipped in scout-ai. You will typically:
- Invoke models programmatically from Ruby code, or
- Use them inside Workflows (see ExTRI2 below), then drive training/eval via Workflow’s CLI (scout workflow task …).

Refer to the Workflow documentation for CLI usage if you integrate models into tasks.

---

## Example: using a Hugging Face classifier inside a Workflow (ExTRI2)

The ExTRI2 workflow builds sequence classification models to validate TRI sentences and determine Mode of Regulation (MoR). It uses HuggingfaceModel and custom feature extraction to mark [TF]/[TG] mentions:

```ruby
model = HuggingfaceModel.new 'SequenceClassification', tri_model, nil,
  tokenizer_args: { model_max_length: 512, truncation: true },
  return_logits: true

model.extract_features do |_, rows|
  rows.map do |text, tf, tg|
    text.sub("[TF]", "<TF>#{tf}</TF>").sub("[TG]", "<TG>#{tg}</TG>")
  end
end

model.init
predictions = model.eval_list tsv.slice(["Text", "TF", "Gene"]).values

tsv.add_field "Valid score" do
  non_valid, valid = predictions.shift
  Misc.softmax([valid, non_valid]).first rescue 0
end

tsv.add_field "Valid" do |_, row|
  row.last > 0.5 ? "Valid" : "Non valid"
end
```

This pattern—feature extraction tied to the row schema, batch evaluation, then TSV augmentation—is representative of how to fold models into reproducible pipelines.

---

Model provides the minimal structure needed to adapt, persist, and reuse models across Ruby and Python ecosystems, while keeping your training/evaluation logic concise and testable. Use the base hooks for clarity, leverage Torch/HF helpers when needed, and integrate with Workflows to scale out training and inference.
data/doc/RAG.md ADDED
@@ -0,0 +1,129 @@
# RAG (Retrieval-Augmented Generation) module

This document explains how to use the RAG helper provided in Scout (lib/scout/llm/rag.rb).

Audience: AI agents and developers integrating retrieval-augmented flows into other applications.

Overview
--------
LLM::RAG provides a thin helper to build a nearest-neighbor index over embedding vectors using the hnswlib library. It expects an array of fixed-size numeric vectors (Float arrays) and returns an HNSW index that can be queried with another vector to find the nearest neighbors.

The RAG.index method is intentionally small and focused:

- It requires the `hnswlib` Ruby gem at runtime (loaded inside the method).
- It uses L2 (Euclidean) distance by default.
- It sets the index dimension to the length of the first vector and initializes the HNSW index with the number of elements supplied.
- Each vector is added in order; the integer ID stored in the index is the zero-based position in the input array.

Prerequisites
-------------
- Ruby environment with the Scout gem code available.
- The `hnswlib` Ruby gem installed (the method requires it dynamically):

      gem install hnswlib

- An embedding function that produces fixed-length numeric vectors. Scout exposes LLM.embed(...) which delegates to configured backends (OpenAI, Ollama, etc.). Ensure your embedding backend is configured and working.

Basic usage
-----------
The common RAG flow is:

1. Prepare a corpus (array of documents or chunks).
2. Compute embeddings for each document.
3. Build an HNSW index from those embeddings using LLM::RAG.index.
4. For a query, compute its embedding and run a nearest-neighbor search on the index.
5. Map matched neighbor indices back to the original documents.

Example (Ruby)
--------------
This example shows a minimal end-to-end flow using Scout's LLM.embed helper to compute embeddings and LLM::RAG to build and query an index.

```ruby
# `documents` is an array of strings (documents/chunks).
documents = [
  "How to make espresso at home",
  "Machine learning: an introduction",
  "Ruby concurrency primitives and patterns",
  "Cooking guide: baking sourdough"
]

# 1) Compute embeddings for each document.
#    Use whatever embed model/backend you have configured. Pass model: if needed.
embeddings = documents.map do |doc|
  # returns an Array<Float> of fixed length
  LLM.embed(doc, model: 'mxbai-embed-large')
end

# 2) Build the HNSW index
index = LLM::RAG.index(embeddings)

# 3) For a query, compute its embedding
query = "best way to brew espresso"
query_vec = LLM.embed(query, model: 'mxbai-embed-large')

# 4) Run nearest-neighbor search
#    search_knn returns two arrays: node indices and distances/scores
k = 3
nodes, scores = index.search_knn(query_vec, k)

# 5) Map indices back to original documents
results = nodes.map { |i| documents[i] }

puts "Top #{k} results:"
results.each_with_index do |doc, idx|
  puts "#{idx + 1}. #{doc} (score=#{scores[idx]})"
end
```

Notes and best practices
------------------------
- Vector dimensionality: All vectors passed to LLM::RAG.index must have identical length. The code inspects `data.first.length` to determine the index dimension.
- Index IDs: The HNSW index stores integer IDs equal to the input array index. Keep a mapping from those indices to your document IDs/metadata (for instance, an array of document IDs parallel to the embeddings array).
- Persistence: The RAG helper code only constructs and populates the index in memory. The underlying `hnswlib` gem typically offers persistence APIs (save/load). To persist or reload an index, consult the `hnswlib` gem documentation for the correct methods and usage patterns.
- Memory and performance: HNSW indexes keep data in memory and can be large for many vectors. Choose your chunking strategy and max dataset size accordingly.
- Distance metric: The current implementation uses the `'l2'` (Euclidean) space. If your application needs cosine similarity, either normalize vectors before indexing (common practice) or check whether the hnswlib Ruby binding supports a cosine space and adapt accordingly.
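
If you take the normalization route, a small helper is enough; with unit-length vectors, ranking by L2 distance orders neighbors the same way cosine similarity does (a sketch):

```ruby
# Normalize each embedding to unit length before indexing and before querying
def normalize(vec)
  norm = Math.sqrt(vec.sum { |v| v * v })
  norm.zero? ? vec : vec.map { |v| v / norm }
end

index = LLM::RAG.index(embeddings.map { |e| normalize(e) })
nodes, scores = index.search_knn(normalize(query_vec), 3)
```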

Example: utility wrapper
------------------------
Here is a small utility that wraps the typical pattern and returns the top-k documents and scores for a query.

```ruby
# documents: Array of items (strings or objects). If objects, provide a `to_embedding_source` or pass a block to extract text.
# embed_opts: options forwarded to LLM.embed (e.g. model: ...)
def build_rag_index(documents, embed_opts = {})
  # compute embeddings in order
  embeddings = documents.map { |d| LLM.embed(d, embed_opts) }
  index = LLM::RAG.index(embeddings)
  [index, embeddings]
end

def rag_query(index, documents, query, k = 5, embed_opts = {})
  qvec = LLM.embed(query, embed_opts)
  nodes, scores = index.search_knn(qvec, k)
  nodes.zip(scores).map { |i, score| { doc: documents[i], score: score } }
end

# Usage:
#   index, embs = build_rag_index(documents, model: 'mxbai-embed-large')
#   top = rag_query(index, documents, 'how to make coffee', 3, model: 'mxbai-embed-large')
```

Troubleshooting
---------------
- "NoMethodError" or "uninitialized constant Hnswlib": ensure the `hnswlib` gem is installed and available to your Ruby runtime.
- Inconsistent dimensions: If you see errors related to dimension mismatch, confirm every embedding vector has the same length and is numeric.
- Mapping errors: Remember the index IDs correspond to the zero-based position in the `data` array passed to LLM::RAG.index. Keep a parallel array or map to metadata (IDs, titles, etc.).

Further integration
-------------------
- Use chunking for long documents: split long documents into smaller passages, embed each passage, and keep a mapping from passage index to parent document (see the sketch below).
- Use result reranking: after retrieval, you can rerank retrieved documents with more expensive cross-encoders or scoring functions.
- Combine with generative models: feed retrieved passages into an LLM prompt to produce answers grounded in retrieved content.
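
A minimal chunking sketch (word-based windows with overlap; the sizes are arbitrary):

```ruby
# Split documents into overlapping word windows, remembering each chunk's parent document
def chunk_documents(documents, size: 200, overlap: 50)
  chunks, parents = [], []
  documents.each_with_index do |doc, doc_idx|
    words = doc.split
    step = size - overlap
    (0...words.length).step(step) do |start|
      chunks << words[start, size].join(" ")
      parents << doc_idx
    end
  end
  [chunks, parents] # embed and index chunks; parents[i] recovers the source document
end
```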

References
----------
- lib/scout/llm/rag.rb (implementation)
- hnswlib Ruby gem (install and persistence documentation)
- Scout LLM embedding helpers (lib/scout/llm/embed.rb)