leva 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 10570fd9a6cdbf979cf9538963033a752ef0cb7d4d1966e9a002408f875a5aa2
4
+ data.tar.gz: e194a81e5e53ddf22b3aee902aab09191ccf39836385c885692f95b11fbcdd46
5
+ SHA512:
6
+ metadata.gz: 7b534a0aceff9d67f1a0f00d3b70993fec440633ef20005adf5ac866772bd57899fb7412073a7deaa2a7dfac856def72bc51a94ed6a5eb26a21818ec405fddfc
7
+ data.tar.gz: 42551d07b819337c1ac780aa542fb41af2218e257c9f1bf5d11d21dd351e64f5c6942d2354cbce2dead1b0232e62140fe16a1c5a308c6ba2b9fe8659e3b5b88b
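The SHA256 values above can be compared against a downloaded copy of the gem. A `.gem` file is a tar archive that contains `metadata.gz`, `data.tar.gz`, and `checksums.yaml.gz`, so once it is unpacked each member can be hashed and checked. The following is a minimal sketch; the helper name `checksum_matches?` and the file locations in the usage comment are assumptions for illustration only.

```ruby
require "digest"

# Expected SHA256 values copied from checksums.yaml above.
EXPECTED_SHA256 = {
  "metadata.gz" => "10570fd9a6cdbf979cf9538963033a752ef0cb7d4d1966e9a002408f875a5aa2",
  "data.tar.gz" => "e194a81e5e53ddf22b3aee902aab09191ccf39836385c885692f95b11fbcdd46"
}.freeze

# Hypothetical helper: true when the given bytes hash to the expected hex digest.
def checksum_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex
end

# Usage, assuming the gem archive was unpacked into the current directory:
#   EXPECTED_SHA256.each do |name, hex|
#     status = checksum_matches?(File.binread(name), hex) ? "ok" : "MISMATCH"
#     puts "#{name}: #{status}"
#   end
```

The same approach would apply to the SHA512 entries with `Digest::SHA512`.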
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright Kieran Klaassen
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,152 @@
1
+ # Leva - Flexible Evaluation Framework for Language Models
2
+
3
+ Leva is a Ruby on Rails framework for evaluating Language Models (LLMs) using ActiveRecord datasets. It provides a flexible structure for creating experiments, managing datasets, and implementing various evaluation logic.
4
+
5
+ ## Installation
6
+
7
+ Add this line to your application's Gemfile:
8
+
9
+ ```ruby
10
+ gem 'leva'
11
+ ```
12
+
13
+ And then execute:
14
+
15
+ ```bash
16
+ $ bundle install
17
+ ```
18
+
19
+ ## Usage
20
+
21
+ ### 1. Setting up Datasets
22
+
23
+ First, create a dataset and add any ActiveRecord records:
24
+
25
+ ```ruby
26
+ dataset = Leva::Dataset.create(name: "Sentiment Analysis Dataset")
27
+
28
+ dataset.records << TextContent.create(text: "I love this product!", expected_label: "Positive")
29
+ dataset.records << TextContent.create(text: "Terrible experience", expected_label: "Negative")
30
+ dataset.records << TextContent.create(text: "It's ok", expected_label: "Neutral")
31
+ ```
32
+
33
+ > In this case the TextContent model is the ActiveRecord model from your own application.
34
+
35
+ ### 2. Implementing Evals
36
+
37
+ Create evals by adding new files in `app/evals/`. Each eval implements both the evaluation logic and how to run it. Here are some examples:
38
+
39
+ ```bash
40
+ $ rails generate leva:eval Sentiment
41
+ ```
42
+
43
+ #### Sentiment Evaluation (app/evals/sentiment_eval.rb)
44
+
45
+ ```ruby
46
+ class SentimentEval < Leva::BaseEval
47
+ leva_dataset_record_class "TextContent"
48
+
49
+ def run_each(record)
50
+ prediction = label_sentiment(record.text)
51
+ score = calculate_score(prediction, record.expected_label)
52
+
53
+ Leva::Result.new(
54
+ label: 'sentiment',
55
+ prediction: prediction,
56
+ score: score
57
+ )
58
+ end
59
+
60
+ private
61
+
62
+ def label_sentiment(text)
63
+ # Simple keyword-based sentiment logic; in practice, call an LLM here to label the sentiment
64
+ text = text.downcase
65
+ if text.include?('love')
66
+ 'Positive'
67
+ elsif text.include?('terrible')
68
+ 'Negative'
69
+ else
70
+ 'Neutral'
71
+ end
72
+ end
73
+
74
+ def calculate_score(prediction, expected)
75
+ prediction == expected ? 1.0 : 0.0
76
+ end
77
+ end
78
+ ```
79
+
80
+ ### 3. Running Experiments
81
+
82
+ You can run experiments with different evals:
83
+
84
+ ```ruby
85
+ sentiment_experiment = Leva::Experiment.create!(name: "Sentiment Analysis", dataset: dataset)
86
+ SentimentEval.run_experiment(sentiment_experiment)
87
+ ```
88
+
89
+ You can also run an experiment with a prompt so you can use an LLM to evaluate the dataset:
90
+
91
+ ```ruby
92
+ prompt = Leva::Prompt.create!(
93
+ name: "Sentiment Analysis",
94
+ version: 1,
95
+ system_prompt: "You are an expert at analyzing text and returning the sentiment.",
96
+ user_prompt: "Please analyze the following text and return the sentiment as Positive, Negative, or Neutral. \n\n {{TEXT}}",
97
+ metadata: {
98
+ model: "gpt-4o",
99
+ temperature: 0.5
100
+ }
101
+ )
102
+
103
+ sentiment_experiment = Leva::Experiment.create!(
104
+ name: "Sentiment Analysis with LLM",
105
+ dataset: dataset,
106
+ prompt: prompt
107
+ )
108
+
109
+ SentimentEval.run_experiment(sentiment_experiment)
110
+ ```
111
+
112
+ ### 4. Analyzing Results
113
+
114
+ After the experiments are complete, analyze the results:
115
+
116
+ ```ruby
117
+
118
+ results = experiment.evaluation_results
119
+ average_score = results.average(:score)
120
+ count = results.count
121
+
122
+ puts "Experiment: #{experiment.name}"
123
+ puts "Average Score: #{average_score}"
124
+ puts "Number of Evaluations: #{count}"
125
+ ```
126
+
127
+ ## Configuration
128
+
129
+ If your evals require API keys or other configurations, ensure you set these up in your Rails credentials or environment variables.
130
+
131
+ ## Leva's Components
132
+
133
+ ### Classes
134
+
135
+ - `Leva::BaseEval`: The base class for all evals. Override the `run_each` method in your eval classes.
136
+ - `Leva::Result`: The result of an evaluation.
137
+
138
+ ### Models
139
+
140
+ - `Leva::Dataset`: Represents a collection of data to be evaluated.
141
+ - `Leva::DatasetRecord`: Represents individual records within a dataset.
142
+ - `Leva::Experiment`: Represents a single run of an evaluation on a dataset.
143
+ - `Leva::EvaluationResult`: Stores the results of each evaluation.
144
+ - `Leva::Prompt`: Represents a prompt for an LLM.
145
+
146
+ ## Contributing
147
+
148
+ Bug reports and pull requests are welcome on GitHub at https://github.com/kieranklaassen/leva.
149
+
150
+ ## License
151
+
152
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
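The file list in this release's metadata does not include `Leva::BaseEval`, `Leva::Result`, or a generator, so the README's usage cannot be checked against shipped code. As a rough illustration of the flow the README describes (an eval's `run_each` is called once per record and the scores are then averaged), here is a plain-Ruby sketch that runs without Rails. `SketchEval`, `SketchSentimentEval`, `SketchResult`, `Record`, and `run_all` are hypothetical stand-ins, not the gem's API.

```ruby
# Hypothetical stand-ins for the README's Leva::Result and TextContent.
SketchResult = Struct.new(:label, :prediction, :score, keyword_init: true)
Record = Struct.new(:text, :expected_label, keyword_init: true)

# Hypothetical base class: subclasses implement run_each(record).
class SketchEval
  # Runs the eval over every record and collects one result per record.
  def run_all(records)
    records.map { |record| run_each(record) }
  end

  def run_each(_record)
    raise NotImplementedError, "#{self.class} must implement run_each"
  end
end

# Mirrors the README's keyword-based SentimentEval.
class SketchSentimentEval < SketchEval
  def run_each(record)
    prediction = label_sentiment(record.text)
    score = prediction == record.expected_label ? 1.0 : 0.0
    SketchResult.new(label: "sentiment", prediction: prediction, score: score)
  end

  private

  def label_sentiment(text)
    text = text.downcase
    if text.include?("love")
      "Positive"
    elsif text.include?("terrible")
      "Negative"
    else
      "Neutral"
    end
  end
end

records = [
  Record.new(text: "I love this product!", expected_label: "Positive"),
  Record.new(text: "Terrible experience", expected_label: "Negative"),
  Record.new(text: "It's ok", expected_label: "Neutral")
]

results = SketchSentimentEval.new.run_all(records)
average_score = results.sum(&:score) / results.size

puts "Average Score: #{average_score}"
puts "Number of Evaluations: #{results.size}"
```

In a Rails application the records and results would presumably be ActiveRecord rows rather than structs, with the average computed in the database as in the README's `results.average(:score)`.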
data/Rakefile ADDED
@@ -0,0 +1,8 @@
1
+ require "bundler/setup"
2
+
3
+ APP_RAKEFILE = File.expand_path("test/dummy/Rakefile", __dir__)
4
+ load "rails/tasks/engine.rake"
5
+
6
+ load "rails/tasks/statistics.rake"
7
+
8
+ require "bundler/gem_tasks"
data/app/assets/config/leva_manifest.js ADDED
@@ -0,0 +1 @@
1
+ //= link_directory ../stylesheets/leva .css
data/app/assets/stylesheets/leva/application.css ADDED
@@ -0,0 +1,15 @@
1
+ /*
2
+ * This is a manifest file that'll be compiled into application.css, which will include all the files
3
+ * listed below.
4
+ *
5
+ * Any CSS and SCSS file within this directory, lib/assets/stylesheets, vendor/assets/stylesheets,
6
+ * or any plugin's vendor/assets/stylesheets directory can be referenced here using a relative path.
7
+ *
8
+ * You're free to add application-wide styles to this file and they'll appear at the bottom of the
9
+ * compiled file so the styles you add here take precedence over styles defined in any other CSS/SCSS
10
+ * files in this directory. Styles in this file should be added after the last require_* statement.
11
+ * It is generally better to create a new file per style scope.
12
+ *
13
+ *= require_tree .
14
+ *= require_self
15
+ */
data/app/controllers/leva/application_controller.rb ADDED
@@ -0,0 +1,4 @@
1
+ module Leva
2
+ class ApplicationController < ActionController::Base
3
+ end
4
+ end
data/app/helpers/leva/application_helper.rb ADDED
@@ -0,0 +1,4 @@
1
+ module Leva
2
+ module ApplicationHelper
3
+ end
4
+ end
data/app/jobs/leva/application_job.rb ADDED
@@ -0,0 +1,4 @@
1
+ module Leva
2
+ class ApplicationJob < ActiveJob::Base
3
+ end
4
+ end
data/app/mailers/leva/application_mailer.rb ADDED
@@ -0,0 +1,6 @@
1
+ module Leva
2
+ class ApplicationMailer < ActionMailer::Base
3
+ default from: "from@example.com"
4
+ layout "mailer"
5
+ end
6
+ end
data/app/models/leva/application_record.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Leva
2
+ class ApplicationRecord < ActiveRecord::Base
3
+ self.abstract_class = true
4
+ end
5
+ end
data/app/models/leva/dataset.rb ADDED
@@ -0,0 +1,16 @@
1
+ # == Schema Information
2
+ #
3
+ # Table name: leva_datasets
4
+ #
5
+ # id :integer not null, primary key
6
+ # name :string
7
+ # created_at :datetime not null
8
+ # updated_at :datetime not null
9
+ #
10
+ module Leva
11
+ class Dataset < ApplicationRecord
12
+ has_many :dataset_records, dependent: :destroy
13
+ has_many :records, through: :dataset_records, source: :recordable
14
+ has_many :experiments, dependent: :destroy
15
+ end
16
+ end
data/app/models/leva/dataset_record.rb ADDED
@@ -0,0 +1,26 @@
1
+ # == Schema Information
2
+ #
3
+ # Table name: leva_dataset_records
4
+ #
5
+ # id :integer not null, primary key
6
+ # recordable_type :string not null
7
+ # created_at :datetime not null
8
+ # updated_at :datetime not null
9
+ # dataset_id :integer not null
10
+ # recordable_id :integer not null
11
+ #
12
+ # Indexes
13
+ #
14
+ # index_leva_dataset_records_on_dataset_id (dataset_id)
15
+ # index_leva_dataset_records_on_recordable (recordable_type,recordable_id)
16
+ #
17
+ # Foreign Keys
18
+ #
19
+ # dataset_id (dataset_id => datasets.id)
20
+ #
21
+ module Leva
22
+ class DatasetRecord < ApplicationRecord
23
+ belongs_to :dataset
24
+ belongs_to :recordable, polymorphic: true
25
+ end
26
+ end
data/app/models/leva/evaluation_result.rb ADDED
@@ -0,0 +1,29 @@
1
+ # == Schema Information
2
+ #
3
+ # Table name: leva_evaluation_results
4
+ #
5
+ # id :integer not null, primary key
6
+ # label :string
7
+ # prediction :string
8
+ # score :float
9
+ # created_at :datetime not null
10
+ # updated_at :datetime not null
11
+ # dataset_record_id :integer not null
12
+ # experiment_id :integer not null
13
+ #
14
+ # Indexes
15
+ #
16
+ # index_leva_evaluation_results_on_dataset_record_id (dataset_record_id)
17
+ # index_leva_evaluation_results_on_experiment_id (experiment_id)
18
+ #
19
+ # Foreign Keys
20
+ #
21
+ # dataset_record_id (dataset_record_id => dataset_records.id)
22
+ # experiment_id (experiment_id => experiments.id)
23
+ #
24
+ module Leva
25
+ class EvaluationResult < ApplicationRecord
26
+ belongs_to :experiment
27
+ belongs_to :dataset_record
28
+ end
29
+ end
data/app/models/leva/experiment.rb ADDED
@@ -0,0 +1,29 @@
1
+ # == Schema Information
2
+ #
3
+ # Table name: leva_experiments
4
+ #
5
+ # id :integer not null, primary key
6
+ # metadata :text
7
+ # name :string
8
+ # status :integer
9
+ # created_at :datetime not null
10
+ # updated_at :datetime not null
11
+ # dataset_id :integer not null
12
+ # prompt_id :integer not null
13
+ #
14
+ # Indexes
15
+ #
16
+ # index_leva_experiments_on_dataset_id (dataset_id)
17
+ # index_leva_experiments_on_prompt_id (prompt_id)
18
+ #
19
+ # Foreign Keys
20
+ #
21
+ # dataset_id (dataset_id => datasets.id)
22
+ # prompt_id (prompt_id => prompts.id)
23
+ #
24
+ module Leva
25
+ class Experiment < ApplicationRecord
26
+ belongs_to :dataset
27
+ belongs_to :prompt
28
+ end
29
+ end
data/app/models/leva/prompt.rb ADDED
@@ -0,0 +1,17 @@
1
+ # == Schema Information
2
+ #
3
+ # Table name: leva_prompts
4
+ #
5
+ # id :integer not null, primary key
6
+ # metadata :text
7
+ # name :string
8
+ # system_prompt :text
9
+ # user_prompt :text
10
+ # version :integer
11
+ # created_at :datetime not null
12
+ # updated_at :datetime not null
13
+ #
14
+ module Leva
15
+ class Prompt < ApplicationRecord
16
+ end
17
+ end
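The README's example prompt embeds a `{{TEXT}}` placeholder in `user_prompt`, but the `Prompt` model above defines no rendering logic. One simple way such a template could be filled is sketched below in plain Ruby. The `render_prompt` helper, its string-keyed hash, and the choice to leave unknown placeholders untouched are assumptions, not part of this release.

```ruby
# Hypothetical helper: replaces {{NAME}} placeholders with values from a hash.
# Placeholders without a matching key are left untouched.
def render_prompt(template, variables)
  template.gsub(/\{\{(\w+)\}\}/) do |placeholder|
    key = Regexp.last_match(1)
    variables.fetch(key) { placeholder }
  end
end

# Template text taken from the README's Leva::Prompt example.
user_prompt = "Please analyze the following text and return the sentiment " \
              "as Positive, Negative, or Neutral. \n\n {{TEXT}}"

rendered = render_prompt(user_prompt, { "TEXT" => "I love this product!" })
puts rendered
```

Leaving unmatched placeholders in place makes a missing variable visible in the rendered prompt; raising an error instead would be an equally reasonable choice.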
data/app/views/layouts/leva/application.html.erb ADDED
@@ -0,0 +1,17 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <title>Leva</title>
5
+ <%= csrf_meta_tags %>
6
+ <%= csp_meta_tag %>
7
+
8
+ <%= yield :head %>
9
+
10
+ <%= stylesheet_link_tag "leva/application", media: "all" %>
11
+ </head>
12
+ <body>
13
+
14
+ <%= yield %>
15
+
16
+ </body>
17
+ </html>
data/config/routes.rb ADDED
@@ -0,0 +1,2 @@
1
+ Leva::Engine.routes.draw do
2
+ end
data/db/migrate/20240813172916_create_leva_datasets.rb ADDED
@@ -0,0 +1,9 @@
1
+ class CreateLevaDatasets < ActiveRecord::Migration[7.2]
2
+ def change
3
+ create_table :leva_datasets do |t|
4
+ t.string :name
5
+
6
+ t.timestamps
7
+ end
8
+ end
9
+ end
data/db/migrate/20240813173033_create_leva_dataset_records.rb ADDED
@@ -0,0 +1,10 @@
1
+ class CreateLevaDatasetRecords < ActiveRecord::Migration[7.2]
2
+ def change
3
+ create_table :leva_dataset_records do |t|
4
+ t.references :dataset, null: false, foreign_key: true
5
+ t.references :recordable, polymorphic: true, null: false
6
+
7
+ t.timestamps
8
+ end
9
+ end
10
+ end
data/db/migrate/20240813173050_create_leva_evaluation_results.rb ADDED
@@ -0,0 +1,13 @@
1
+ class CreateLevaEvaluationResults < ActiveRecord::Migration[7.2]
2
+ def change
3
+ create_table :leva_evaluation_results do |t|
4
+ t.references :experiment, null: false, foreign_key: true
5
+ t.references :dataset_record, null: false, foreign_key: true
6
+ t.string :prediction
7
+ t.float :score
8
+ t.string :label
9
+
10
+ t.timestamps
11
+ end
12
+ end
13
+ end
data/db/migrate/20240813173105_create_leva_prompts.rb ADDED
@@ -0,0 +1,13 @@
1
+ class CreateLevaPrompts < ActiveRecord::Migration[7.2]
2
+ def change
3
+ create_table :leva_prompts do |t|
4
+ t.string :name
5
+ t.integer :version
6
+ t.text :system_prompt
7
+ t.text :user_prompt
8
+ t.text :metadata
9
+
10
+ t.timestamps
11
+ end
12
+ end
13
+ end
data/db/migrate/20240813173222_create_leva_experiments.rb ADDED
@@ -0,0 +1,13 @@
1
+ class CreateLevaExperiments < ActiveRecord::Migration[7.2]
2
+ def change
3
+ create_table :leva_experiments do |t|
4
+ t.string :name
5
+ t.references :dataset, null: false, foreign_key: true
6
+ t.references :prompt, null: false, foreign_key: true
7
+ t.integer :status
8
+ t.text :metadata
9
+
10
+ t.timestamps
11
+ end
12
+ end
13
+ end
data/lib/leva/engine.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Leva
2
+ class Engine < ::Rails::Engine
3
+ isolate_namespace Leva
4
+ end
5
+ end
data/lib/leva/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module Leva
2
+ VERSION = "0.1.0"
3
+ end
data/lib/leva.rb ADDED
@@ -0,0 +1,6 @@
1
+ require "leva/version"
2
+ require "leva/engine"
3
+
4
+ module Leva
5
+ # Your code goes here...
6
+ end
data/lib/tasks/auto_annotate_models.rake ADDED
@@ -0,0 +1,59 @@
1
+ # NOTE: only doing this in development as some production environments (Heroku)
2
+ # NOTE: are sensitive to local FS writes, and besides -- it's just not proper
3
+ # NOTE: to have a dev-mode tool do its thing in production.
4
+ if Rails.env.development?
5
+ require 'annotate'
6
+ task :set_annotation_options do
7
+ # You can override any of these by setting an environment variable of the
8
+ # same name.
9
+ Annotate.set_defaults(
10
+ 'active_admin' => 'false',
11
+ 'additional_file_patterns' => [],
12
+ 'routes' => 'false',
13
+ 'models' => 'true',
14
+ 'position_in_routes' => 'before',
15
+ 'position_in_class' => 'before',
16
+ 'position_in_test' => 'before',
17
+ 'position_in_fixture' => 'before',
18
+ 'position_in_factory' => 'before',
19
+ 'position_in_serializer' => 'before',
20
+ 'show_foreign_keys' => 'true',
21
+ 'show_complete_foreign_keys' => 'false',
22
+ 'show_indexes' => 'true',
23
+ 'simple_indexes' => 'false',
24
+ 'model_dir' => 'app/models',
25
+ 'root_dir' => '',
26
+ 'include_version' => 'false',
27
+ 'require' => '',
28
+ 'exclude_tests' => 'false',
29
+ 'exclude_fixtures' => 'false',
30
+ 'exclude_factories' => 'false',
31
+ 'exclude_serializers' => 'false',
32
+ 'exclude_scaffolds' => 'true',
33
+ 'exclude_controllers' => 'true',
34
+ 'exclude_helpers' => 'true',
35
+ 'exclude_sti_subclasses' => 'false',
36
+ 'ignore_model_sub_dir' => 'false',
37
+ 'ignore_columns' => nil,
38
+ 'ignore_routes' => nil,
39
+ 'ignore_unknown_models' => 'false',
40
+ 'hide_limit_column_types' => 'integer,bigint,boolean',
41
+ 'hide_default_column_types' => 'json,jsonb,hstore',
42
+ 'skip_on_db_migrate' => 'false',
43
+ 'format_bare' => 'true',
44
+ 'format_rdoc' => 'false',
45
+ 'format_yard' => 'false',
46
+ 'format_markdown' => 'false',
47
+ 'sort' => 'false',
48
+ 'force' => 'false',
49
+ 'frozen' => 'false',
50
+ 'classified_sort' => 'true',
51
+ 'trace' => 'false',
52
+ 'wrapper_open' => nil,
53
+ 'wrapper_close' => nil,
54
+ 'with_comment' => 'true'
55
+ )
56
+ end
57
+
58
+ Annotate.load_tasks
59
+ end
data/lib/tasks/leva_tasks.rake ADDED
@@ -0,0 +1,4 @@
1
+ # desc "Explaining what the task does"
2
+ # task :leva do
3
+ # # Task goes here
4
+ # end
metadata ADDED
@@ -0,0 +1,89 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: leva
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - Kieran Klaassen
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2024-08-13 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rails
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: 7.2.0
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: 7.2.0
27
+ description: Leva is a Ruby on Rails framework for evaluating Language Models (LLMs)
28
+ using ActiveRecord datasets. It provides a flexible structure for creating experiments,
29
+ managing datasets, and implementing various evaluation logic.
30
+ email:
31
+ - kieranklaassen@gmail.com
32
+ executables: []
33
+ extensions: []
34
+ extra_rdoc_files: []
35
+ files:
36
+ - MIT-LICENSE
37
+ - README.md
38
+ - Rakefile
39
+ - app/assets/config/leva_manifest.js
40
+ - app/assets/stylesheets/leva/application.css
41
+ - app/controllers/leva/application_controller.rb
42
+ - app/helpers/leva/application_helper.rb
43
+ - app/jobs/leva/application_job.rb
44
+ - app/mailers/leva/application_mailer.rb
45
+ - app/models/leva/application_record.rb
46
+ - app/models/leva/dataset.rb
47
+ - app/models/leva/dataset_record.rb
48
+ - app/models/leva/evaluation_result.rb
49
+ - app/models/leva/experiment.rb
50
+ - app/models/leva/prompt.rb
51
+ - app/views/layouts/leva/application.html.erb
52
+ - config/routes.rb
53
+ - db/migrate/20240813172916_create_leva_datasets.rb
54
+ - db/migrate/20240813173033_create_leva_dataset_records.rb
55
+ - db/migrate/20240813173050_create_leva_evaluation_results.rb
56
+ - db/migrate/20240813173105_create_leva_prompts.rb
57
+ - db/migrate/20240813173222_create_leva_experiments.rb
58
+ - lib/leva.rb
59
+ - lib/leva/engine.rb
60
+ - lib/leva/version.rb
61
+ - lib/tasks/auto_annotate_models.rake
62
+ - lib/tasks/leva_tasks.rake
63
+ homepage: https://github.com/kieranklaassen/leva
64
+ licenses:
65
+ - MIT
66
+ metadata:
67
+ homepage_uri: https://github.com/kieranklaassen/leva
68
+ source_code_uri: https://github.com/kieranklaassen/leva
69
+ changelog_uri: https://github.com/kieranklaassen/leva/blob/main/CHANGELOG.md
70
+ post_install_message:
71
+ rdoc_options: []
72
+ require_paths:
73
+ - lib
74
+ required_ruby_version: !ruby/object:Gem::Requirement
75
+ requirements:
76
+ - - ">="
77
+ - !ruby/object:Gem::Version
78
+ version: '0'
79
+ required_rubygems_version: !ruby/object:Gem::Requirement
80
+ requirements:
81
+ - - ">="
82
+ - !ruby/object:Gem::Version
83
+ version: '0'
84
+ requirements: []
85
+ rubygems_version: 3.4.10
86
+ signing_key:
87
+ specification_version: 4
88
+ summary: Flexible Evaluation Framework for Language Models in Rails
89
+ test_files: []