leva 0.1.9.1 → 0.1.11
- checksums.yaml +4 -4
- data/README.md +21 -5
- data/app/assets/stylesheets/leva/application.css +3083 -15
- data/app/controllers/leva/application_controller.rb +1 -1
- data/app/controllers/leva/dataset_records_controller.rb +1 -1
- data/app/controllers/leva/datasets_controller.rb +6 -6
- data/app/controllers/leva/design_system_controller.rb +9 -0
- data/app/controllers/leva/experiments_controller.rb +8 -8
- data/app/controllers/leva/runner_results_controller.rb +1 -1
- data/app/controllers/leva/workbench_controller.rb +26 -15
- data/app/helpers/leva/application_helper.rb +7 -7
- data/app/jobs/leva/experiment_job.rb +1 -1
- data/app/jobs/leva/run_eval_job.rb +1 -1
- data/app/models/concerns/leva/recordable.rb +5 -5
- data/app/models/leva/dataset.rb +1 -1
- data/app/models/leva/evaluation_result.rb +1 -1
- data/app/models/leva/experiment.rb +1 -1
- data/app/models/leva/prompt.rb +1 -1
- data/app/views/layouts/leva/application.html.erb +23 -24
- data/app/views/leva/dataset_records/index.html.erb +70 -43
- data/app/views/leva/dataset_records/show.html.erb +115 -25
- data/app/views/leva/datasets/_dataset.html.erb +11 -18
- data/app/views/leva/datasets/_form.html.erb +18 -14
- data/app/views/leva/datasets/edit.html.erb +16 -4
- data/app/views/leva/datasets/index.html.erb +33 -41
- data/app/views/leva/datasets/new.html.erb +15 -4
- data/app/views/leva/datasets/show.html.erb +120 -139
- data/app/views/leva/design_system/index.html.erb +1731 -0
- data/app/views/leva/experiments/_experiment.html.erb +46 -31
- data/app/views/leva/experiments/_form.html.erb +62 -35
- data/app/views/leva/experiments/edit.html.erb +17 -3
- data/app/views/leva/experiments/index.html.erb +41 -36
- data/app/views/leva/experiments/new.html.erb +52 -4
- data/app/views/leva/experiments/show.html.erb +155 -98
- data/app/views/leva/runner_results/show.html.erb +271 -54
- data/app/views/leva/workbench/_evaluation_area.html.erb +18 -4
- data/app/views/leva/workbench/_prompt_content.html.erb +124 -73
- data/app/views/leva/workbench/_prompt_form.html.erb +24 -23
- data/app/views/leva/workbench/_prompt_sidebar.html.erb +57 -12
- data/app/views/leva/workbench/_results_section.html.erb +274 -112
- data/app/views/leva/workbench/_top_bar.html.erb +16 -6
- data/app/views/leva/workbench/edit.html.erb +46 -15
- data/app/views/leva/workbench/index.html.erb +5 -8
- data/app/views/leva/workbench/new.html.erb +74 -42
- data/config/routes.rb +11 -9
- data/db/migrate/20240813173033_create_leva_dataset_records.rb +1 -0
- data/db/migrate/20240813173035_create_leva_experiments.rb +2 -0
- data/db/migrate/{20240816201419_create_leva_runner_results.rb → 20240813173040_create_leva_runner_results.rb} +4 -1
- data/db/migrate/20240813173050_create_leva_evaluation_results.rb +3 -3
- data/lib/generators/leva/eval_generator.rb +4 -4
- data/lib/generators/leva/runner_generator.rb +4 -4
- data/lib/generators/leva/templates/runner.rb.erb +20 -0
- data/lib/leva/version.rb +1 -1
- data/lib/leva.rb +24 -2
- metadata +5 -11
- data/db/migrate/20240816201433_update_leva_evaluation_results.rb +0 -8
- data/db/migrate/20240821163608_make_experiment_optional_for_runner_results.rb +0 -6
- data/db/migrate/20240821181934_add_prompt_to_leva_runner_results.rb +0 -5
- data/db/migrate/20240821183153_add_runner_and_evaluator_to_leva_experiments.rb +0 -6
- data/db/migrate/20240821191713_add_actual_result_to_leva_dataset_records.rb +0 -5
- data/db/migrate/20240822143201_remove_actual_result_from_leva_runner_results.rb +0 -5
- data/db/migrate/20240912183556_add_runner_class_to_leva_runner_results.rb +0 -5
- data/lib/tasks/auto_annotate_models.rake +0 -59
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bfa8fc5e8dcf215b553b499c40665619cf7c2828dd5596afa280d1f9f0a023bd
+  data.tar.gz: f7546abb593c08067becf5fd5b28f43acf03d330788440dde8bb016525996338
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: aa5ceddeadac67abb428d7ca4f98800c97eaad7706fca815f3ff93e50fa0bcca3dab1f108712fba1c9a1f919d45d1bb8e720d10c715850a186b495aa4f218aae
+  data.tar.gz: bbaf3caa414aaa676f0ac5038e20189da01babb2ddb2cdd025d21f0cbb0a31932dc5e22c51835b163057b22074f3473f43116367a35a60d0214f54b86b4e15e5
data/README.md
CHANGED
@@ -1,11 +1,11 @@
 # Leva - Flexible Evaluation Framework for Language Models
 
 [](https://badge.fury.io/rb/leva)
+[](https://github.com/kieranklaassen/leva/actions/workflows/ci.yml)
 
 Leva is a Ruby on Rails framework for evaluating Language Models (LLMs) using ActiveRecord datasets on production models. It provides a flexible structure for creating experiments, managing datasets, and implementing various evaluation logic on production data with security in mind.
 
-
+
 
 ## Installation
 
@@ -28,13 +28,25 @@ rails leva:install:migrations
 rails db:migrate
 ```
 
+Mount the Leva engine in your application's routes file:
+
+```ruby
+# config/routes.rb
+Rails.application.routes.draw do
+  mount Leva::Engine => "/leva"
+  # your other routes...
+end
+```
+
+The Leva UI will then be available at `/leva` in your application.
+
 ## Usage
 
 ### 1. Setting up Datasets
 
 First, create a dataset and add any ActiveRecord records you want to evaluate against. To make your models compatible with Leva, include the `Leva::Recordable` concern in your model:
 
-
+```ruby
 class TextContent < ApplicationRecord
   include Leva::Recordable
 
@@ -71,7 +83,11 @@ class TextContent < ApplicationRecord
   end
 end
 
-dataset = Leva::Dataset.create(name: "Sentiment Analysis Dataset")
+dataset = Leva::Dataset.create(name: "Sentiment Analysis Dataset")
+dataset.add_record TextContent.create(text: "I love this product!", expected_label: "Positive")
+dataset.add_record TextContent.create(text: "Terrible experience", expected_label: "Negative")
+dataset.add_record TextContent.create(text: "It's ok", expected_label: "Neutral")
+```
 
 ### 2. Implementing Runs
 
@@ -79,7 +95,7 @@ Create a run class to handle the execution of your inference logic:
 
 ```bash
 rails generate leva:runner sentiment
-
+```
 
 ```ruby
 class SentimentRun < Leva::BaseRun