leva 0.1.9.1 → 0.1.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (63)
  1. checksums.yaml +4 -4
  2. data/README.md +21 -5
  3. data/app/assets/stylesheets/leva/application.css +3083 -15
  4. data/app/controllers/leva/application_controller.rb +1 -1
  5. data/app/controllers/leva/dataset_records_controller.rb +1 -1
  6. data/app/controllers/leva/datasets_controller.rb +6 -6
  7. data/app/controllers/leva/design_system_controller.rb +9 -0
  8. data/app/controllers/leva/experiments_controller.rb +8 -8
  9. data/app/controllers/leva/runner_results_controller.rb +1 -1
  10. data/app/controllers/leva/workbench_controller.rb +26 -15
  11. data/app/helpers/leva/application_helper.rb +7 -7
  12. data/app/jobs/leva/experiment_job.rb +1 -1
  13. data/app/jobs/leva/run_eval_job.rb +1 -1
  14. data/app/models/concerns/leva/recordable.rb +5 -5
  15. data/app/models/leva/dataset.rb +1 -1
  16. data/app/models/leva/evaluation_result.rb +1 -1
  17. data/app/models/leva/experiment.rb +1 -1
  18. data/app/models/leva/prompt.rb +1 -1
  19. data/app/views/layouts/leva/application.html.erb +23 -24
  20. data/app/views/leva/dataset_records/index.html.erb +70 -43
  21. data/app/views/leva/dataset_records/show.html.erb +115 -25
  22. data/app/views/leva/datasets/_dataset.html.erb +11 -18
  23. data/app/views/leva/datasets/_form.html.erb +18 -14
  24. data/app/views/leva/datasets/edit.html.erb +16 -4
  25. data/app/views/leva/datasets/index.html.erb +33 -41
  26. data/app/views/leva/datasets/new.html.erb +15 -4
  27. data/app/views/leva/datasets/show.html.erb +120 -139
  28. data/app/views/leva/design_system/index.html.erb +1731 -0
  29. data/app/views/leva/experiments/_experiment.html.erb +46 -31
  30. data/app/views/leva/experiments/_form.html.erb +62 -35
  31. data/app/views/leva/experiments/edit.html.erb +17 -3
  32. data/app/views/leva/experiments/index.html.erb +41 -36
  33. data/app/views/leva/experiments/new.html.erb +52 -4
  34. data/app/views/leva/experiments/show.html.erb +155 -98
  35. data/app/views/leva/runner_results/show.html.erb +271 -54
  36. data/app/views/leva/workbench/_evaluation_area.html.erb +18 -4
  37. data/app/views/leva/workbench/_prompt_content.html.erb +124 -73
  38. data/app/views/leva/workbench/_prompt_form.html.erb +24 -23
  39. data/app/views/leva/workbench/_prompt_sidebar.html.erb +57 -12
  40. data/app/views/leva/workbench/_results_section.html.erb +274 -112
  41. data/app/views/leva/workbench/_top_bar.html.erb +16 -6
  42. data/app/views/leva/workbench/edit.html.erb +46 -15
  43. data/app/views/leva/workbench/index.html.erb +5 -8
  44. data/app/views/leva/workbench/new.html.erb +74 -42
  45. data/config/routes.rb +11 -9
  46. data/db/migrate/20240813173033_create_leva_dataset_records.rb +1 -0
  47. data/db/migrate/20240813173035_create_leva_experiments.rb +2 -0
  48. data/db/migrate/{20240816201419_create_leva_runner_results.rb → 20240813173040_create_leva_runner_results.rb} +4 -1
  49. data/db/migrate/20240813173050_create_leva_evaluation_results.rb +3 -3
  50. data/lib/generators/leva/eval_generator.rb +4 -4
  51. data/lib/generators/leva/runner_generator.rb +4 -4
  52. data/lib/generators/leva/templates/runner.rb.erb +20 -0
  53. data/lib/leva/version.rb +1 -1
  54. data/lib/leva.rb +24 -2
  55. metadata +5 -11
  56. data/db/migrate/20240816201433_update_leva_evaluation_results.rb +0 -8
  57. data/db/migrate/20240821163608_make_experiment_optional_for_runner_results.rb +0 -6
  58. data/db/migrate/20240821181934_add_prompt_to_leva_runner_results.rb +0 -5
  59. data/db/migrate/20240821183153_add_runner_and_evaluator_to_leva_experiments.rb +0 -6
  60. data/db/migrate/20240821191713_add_actual_result_to_leva_dataset_records.rb +0 -5
  61. data/db/migrate/20240822143201_remove_actual_result_from_leva_runner_results.rb +0 -5
  62. data/db/migrate/20240912183556_add_runner_class_to_leva_runner_results.rb +0 -5
  63. data/lib/tasks/auto_annotate_models.rake +0 -59
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: e84cd5316b32e9ce352eb79cab83629db28c658ab39883d6c6af9fdec16d3735
- data.tar.gz: d81584c99f7306bd9b238cf35dfa596927b1df4a1c5630b2462b88e058b7cef1
+ metadata.gz: bfa8fc5e8dcf215b553b499c40665619cf7c2828dd5596afa280d1f9f0a023bd
+ data.tar.gz: f7546abb593c08067becf5fd5b28f43acf03d330788440dde8bb016525996338
  SHA512:
- metadata.gz: 52b642176baa9374085b6625022cbf7a9e283b6772d14ef67abd36d0e1984c49d229d823c8a30a975e8ad26605827a394a093569a4efd4da3909d2dd84d9648f
- data.tar.gz: a59024a95bf2ed8cc26fe303361001a0b512d0f7f5a2b77d1480dbad5a146aad774f3c1d356821f76c128601be2ecd597e835f4c5ce088d9e6e288cdc273c2e2
+ metadata.gz: aa5ceddeadac67abb428d7ca4f98800c97eaad7706fca815f3ff93e50fa0bcca3dab1f108712fba1c9a1f919d45d1bb8e720d10c715850a186b495aa4f218aae
+ data.tar.gz: bbaf3caa414aaa676f0ac5038e20189da01babb2ddb2cdd025d21f0cbb0a31932dc5e22c51835b163057b22074f3473f43116367a35a60d0214f54b86b4e15e5
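For readers who want to check digests like these locally, the sketch below shows one way to compare SHA256 and SHA512 values against a parsed `checksums.yaml`. It is a minimal illustration, not RubyGems' own verification code: `verify_checksums` and its arguments are hypothetical names, and the file contents are stand-in strings rather than the real `metadata.gz` and `data.tar.gz` bytes.

```ruby
require "digest"

# Hypothetical helper (not part of RubyGems or leva).
# files:     { "metadata.gz" => bytes, "data.tar.gz" => bytes }
# checksums: { "SHA256" => { name => hex }, "SHA512" => { name => hex } }
# Returns a list of "ALGORITHM name" strings for every digest that differs.
def verify_checksums(files, checksums)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }
  mismatches = []

  checksums.each do |algorithm, entries|
    digest_class = algorithms.fetch(algorithm)
    entries.each do |name, expected|
      actual = digest_class.hexdigest(files.fetch(name))
      mismatches << "#{algorithm} #{name}" unless actual == expected
    end
  end

  mismatches
end

# Stand-in contents; a real check would read the bytes out of the .gem archive.
files = { "metadata.gz" => "stand-in bytes", "data.tar.gz" => "more stand-in bytes" }
checksums = {
  "SHA256" => files.transform_values { |bytes| Digest::SHA256.hexdigest(bytes) },
  "SHA512" => files.transform_values { |bytes| Digest::SHA512.hexdigest(bytes) }
}

puts verify_checksums(files, checksums).empty? # prints "true"
```

An empty result means every listed digest matched. Any entry in the returned list names the algorithm and file that differ.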
data/README.md CHANGED
@@ -1,11 +1,11 @@
  # Leva - Flexible Evaluation Framework for Language Models

  [![Gem Version](https://badge.fury.io/rb/leva.svg)](https://badge.fury.io/rb/leva)
+ [![CI](https://github.com/kieranklaassen/leva/actions/workflows/ci.yml/badge.svg)](https://github.com/kieranklaassen/leva/actions/workflows/ci.yml)

  Leva is a Ruby on Rails framework for evaluating Language Models (LLMs) using ActiveRecord datasets on production models. It provides a flexible structure for creating experiments, managing datasets, and implementing various evaluation logic on production data with security in mind.

- ![Leva - Workbench- Google Chrome](https://github.com/user-attachments/assets/ee487941-e11b-4c2a-983b-771ef27dd73c)
- ![Leva - rty- Google Chrome](https://github.com/user-attachments/assets/f9986a12-731b-4747-9f86-5ac6fffd5cbc)
+ ![Leva - Workbench- Google Chrome@2x](https://github.com/user-attachments/assets/1631a8f7-0634-4554-8f8b-e643062378a8)

  ## Installation

@@ -28,13 +28,25 @@ rails leva:install:migrations
  rails db:migrate
  ```

+ Mount the Leva engine in your application's routes file:
+
+ ```ruby
+ # config/routes.rb
+ Rails.application.routes.draw do
+ mount Leva::Engine => "/leva"
+ # your other routes...
+ end
+ ```
+
+ The Leva UI will then be available at `/leva` in your application.
+
  ## Usage

  ### 1. Setting up Datasets

  First, create a dataset and add any ActiveRecord records you want to evaluate against. To make your models compatible with Leva, include the `Leva::Recordable` concern in your model:

- ````ruby
+ ```ruby
  class TextContent < ApplicationRecord
  include Leva::Recordable

@@ -71,7 +83,11 @@ class TextContent < ApplicationRecord
  end
  end

- dataset = Leva::Dataset.create(name: "Sentiment Analysis Dataset") dataset.add_record TextContent.create(text: "I love this product!", expected_label: "Positive") dataset.add_record TextContent.create(text: "Terrible experience", expected_label: "Negative") dataset.add_record TextContent.create(text: "It's ok", expected_label: "Neutral")
+ dataset = Leva::Dataset.create(name: "Sentiment Analysis Dataset")
+ dataset.add_record TextContent.create(text: "I love this product!", expected_label: "Positive")
+ dataset.add_record TextContent.create(text: "Terrible experience", expected_label: "Negative")
+ dataset.add_record TextContent.create(text: "It's ok", expected_label: "Neutral")
+ ```

  ### 2. Implementing Runs

@@ -79,7 +95,7 @@ Create a run class to handle the execution of your inference logic:

  ```bash
  rails generate leva:runner sentiment
- ````
+ ```

  ```ruby
  class SentimentRun < Leva::BaseRun