ai_safety_rails 0.1.0
- checksums.yaml +7 -0
- data/LICENSE.txt +21 -0
- data/README.md +104 -0
- data/exe/ai_safety_rails +42 -0
- data/lib/ai_safety_rails/evaluation/runner.rb +130 -0
- data/lib/ai_safety_rails/evaluation/set_loader.rb +70 -0
- data/lib/ai_safety_rails/evaluation.rb +8 -0
- data/lib/ai_safety_rails/guardrails/input/pii_redactor.rb +34 -0
- data/lib/ai_safety_rails/guardrails/middleware.rb +42 -0
- data/lib/ai_safety_rails/guardrails/output/schema_validator.rb +37 -0
- data/lib/ai_safety_rails/guardrails.rb +19 -0
- data/lib/ai_safety_rails/railtie.rb +13 -0
- data/lib/ai_safety_rails/version.rb +5 -0
- data/lib/ai_safety_rails.rb +19 -0
- data/lib/generators/ai_safety_rails/eval_set/eval_set_generator.rb +15 -0
- data/lib/generators/ai_safety_rails/eval_set/templates/eval_set.yml.tt +17 -0
- data/lib/generators/ai_safety_rails/guardrail/guardrail_generator.rb +21 -0
- data/lib/generators/ai_safety_rails/guardrail/templates/guardrail.rb.tt +17 -0
- data/lib/generators/ai_safety_rails/guardrail/templates/guardrail_spec.rb.tt +11 -0
- metadata +108 -0
checksums.yaml
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
---
|
|
2
|
+
SHA256:
|
|
3
|
+
metadata.gz: 6f0ee60e8a0819932e47216f981755ed81ff9a18f4ca48c58d5a132490c40bcf
|
|
4
|
+
data.tar.gz: 22c71785b1a1b4b09ec9540636b528d30dc4786429854c4ee18d1400ba9d2cee
|
|
5
|
+
SHA512:
|
|
6
|
+
metadata.gz: 640ec5d1dd9d82d7b9832bacc1631d623b7a9b6121fb670f6c809bbcc1069edc75cc0c395a1deb3d78aa73f50620629d83bde6ec20be00168833317a797cc647
|
|
7
|
+
data.tar.gz: 78c0e454f76cf880aff2eaca85aa2c59b480945cbd763e6ce94e6c92468d64ced5f79ee9f795d96cebb14b373ef5a0de25b9f31cce3ce6122c4d0d7142b6e896
|
data/LICENSE.txt
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
data/README.md
ADDED
|
@@ -0,0 +1,104 @@
|
|
|
1
|
+
# ai_safety_rails
|
|
2
|
+
|
|
3
|
+
A Ruby gem that adds **(1) a guardrails/safety layer** around LLM calls and **(2) an evaluation/regression harness** for prompts and models. Provider-agnostic: works with [RubyLLM](https://github.com/crmne/ruby_llm) or any client that exposes a simple request/response interface.
|
|
4
|
+
|
|
5
|
+
## Features
|
|
6
|
+
|
|
7
|
+
### Part 1 – Guardrails (middleware-style)
|
|
8
|
+
|
|
9
|
+
- **Input guardrails**
|
|
10
|
+
- **PII redaction:** mask or strip emails, phone numbers, SSN-like patterns (configurable regexes).
|
|
11
|
+
- Optional prompt-injection heuristics (e.g. "ignore previous instructions" blocklist).
|
|
12
|
+
- Optional max input length and rate limiting (in-memory or Redis).
|
|
13
|
+
- **Output guardrails**
|
|
14
|
+
- **Schema validation:** validate LLM output against a JSON schema (via `json_schemer`).
|
|
15
|
+
- Optional blocklist/allowlist for topics or sensitive keywords.
|
|
16
|
+
- **Integration:** Wraps any callable client: `input guardrails → call client → output guardrails → return`. Sync (and optionally async) support.
|
|
17
|
+
|
|
18
|
+
### Part 2 – Evaluation harness
|
|
19
|
+
|
|
20
|
+
- **Test sets:** Define evaluation sets (YAML/JSON) with input + expected output or criteria (e.g. "must include key X", "valid JSON", "no PII").
|
|
21
|
+
- **Runner:** Runs all examples, records latency, token usage (if exposed), pass/fail per criterion.
|
|
22
|
+
- **Regression:** Save baseline to JSON; compare runs and exit non-zero if metrics regress beyond a threshold.
|
|
23
|
+
- **CI-friendly:** CLI and Rake tasks (e.g. `bundle exec ai_safety_rails eval path/to/evals`).
|
|
24
|
+
|
|
25
|
+
### Part 3 – Rails-friendly (optional)
|
|
26
|
+
|
|
27
|
+
- **Generators:** `rails g ai_safety_rails:guardrail pii_redaction`, `rails g ai_safety_rails:eval_set support_tickets`.
|
|
28
|
+
- **Config:** Optional `config/guardrails.yml` and eval sets under `config/llm_evals/`.
|
|
29
|
+
- **Audit logging:** Optional hook to log when a guardrail fired (Rails logger or audit table).
|
|
30
|
+
|
|
31
|
+
## Installation
|
|
32
|
+
|
|
33
|
+
```ruby
|
|
34
|
+
# Gemfile
|
|
35
|
+
gem "ai_safety_rails"
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
bundle install
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
## Usage
|
|
43
|
+
|
|
44
|
+
### Guardrails middleware
|
|
45
|
+
|
|
46
|
+
Wrap any LLM client (callable) with guardrails:
|
|
47
|
+
|
|
48
|
+
```ruby
|
|
49
|
+
client = ->(input) { YourLLMClient.chat(input) }
|
|
50
|
+
|
|
51
|
+
guarded = AiSafetyRails::Guardrails::Middleware.wrap(client,
|
|
52
|
+
input_guardrails: [
|
|
53
|
+
AiSafetyRails::Guardrails::Input::PiiRedactor.new
|
|
54
|
+
],
|
|
55
|
+
output_guardrails: [
|
|
56
|
+
AiSafetyRails::Guardrails::Output::SchemaValidator.new(schema: my_json_schema_hash)
|
|
57
|
+
]
|
|
58
|
+
)
|
|
59
|
+
|
|
60
|
+
response = guarded.call("Hello, my email is user@example.com")
|
|
61
|
+
# Input is redacted before calling the client; output is validated before returning.
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
### Evaluation harness
|
|
65
|
+
|
|
66
|
+
1. Define an eval set (e.g. `config/llm_evals/support_tickets.yaml`):
|
|
67
|
+
|
|
68
|
+
```yaml
|
|
69
|
+
name: support_tickets
|
|
70
|
+
description: Support ticket classification
|
|
71
|
+
examples:
|
|
72
|
+
- id: 1
|
|
73
|
+
input: "Customer cannot login"
|
|
74
|
+
expectations:
|
|
75
|
+
- type: valid_json
|
|
76
|
+
- type: has_key
|
|
77
|
+
key: category
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
2. Run evals via CLI or Rake:
|
|
81
|
+
|
|
82
|
+
```bash
|
|
83
|
+
bundle exec ai_safety_rails eval config/llm_evals
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
### Rails
|
|
87
|
+
|
|
88
|
+
When Rails is present, generators and config are available:
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
rails g ai_safety_rails:guardrail pii_redaction
|
|
92
|
+
rails g ai_safety_rails:eval_set support_tickets
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
Optional `config/guardrails.yml` is loaded automatically.
|
|
96
|
+
|
|
97
|
+
## Development
|
|
98
|
+
|
|
99
|
+
- **Tests:** `bundle exec rake test` (minitest)
|
|
100
|
+
- **Eval CLI:** `bundle exec exe/ai_safety_rails eval path/to/evals`
|
|
101
|
+
|
|
102
|
+
## License
|
|
103
|
+
|
|
104
|
+
MIT.
|
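The README lists prompt-injection heuristics as an optional input guardrail, but no such class appears in this release's file list. A minimal sketch of what one could look like, assuming the only contract is the `process(input)` method that the middleware calls. `InjectionBlocklist`, `BlockedInputError`, and the phrase list are hypothetical names, not part of the gem:

```ruby
# Hypothetical input guardrail: rejects inputs containing known injection phrases.
# Not part of ai_safety_rails 0.1.0; it only follows the `process(input)` contract.
class BlockedInputError < StandardError; end

class InjectionBlocklist
  DEFAULT_PHRASES = [
    "ignore previous instructions",
    "disregard the system prompt"
  ].freeze

  def initialize(phrases: DEFAULT_PHRASES)
    # Downcase once so matching is case-insensitive.
    @phrases = phrases.map(&:downcase)
  end

  # Returns the input unchanged, or raises if a blocked phrase is found.
  def process(input)
    return input unless input.is_a?(String)

    # Collapse runs of whitespace so "ignore   previous" still matches.
    normalized = input.downcase.gsub(/\s+/, " ")
    hit = @phrases.find { |phrase| normalized.include?(phrase) }
    raise BlockedInputError, "Blocked phrase detected: #{hit}" if hit

    input
  end
end

guardrail = InjectionBlocklist.new
puts guardrail.process("Summarize this ticket")
```

A plain substring blocklist like this is easy to evade, so it is probably best treated as a cheap first filter rather than a complete defense.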
data/exe/ai_safety_rails
ADDED
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
#!/usr/bin/env ruby
|
|
2
|
+
# frozen_string_literal: true
|
|
3
|
+
|
|
4
|
+
require "bundler/setup"
|
|
5
|
+
require "json"
|
|
6
|
+
require "ai_safety_rails"
|
|
7
|
+
|
|
8
|
+
def main
|
|
9
|
+
cmd = ARGV[0]
|
|
10
|
+
case cmd
|
|
11
|
+
when "eval"
|
|
12
|
+
run_eval(ARGV[1] || "config/llm_evals")
|
|
13
|
+
when "version"
|
|
14
|
+
puts "ai_safety_rails #{AiSafetyRails::VERSION}"
|
|
15
|
+
else
|
|
16
|
+
puts "Usage: ai_safety_rails eval <path> # run evaluation sets"
|
|
17
|
+
puts " ai_safety_rails version # show version"
|
|
18
|
+
exit(1)
|
|
19
|
+
end
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
def run_eval(path)
|
|
23
|
+
unless File.exist?(path)
|
|
24
|
+
warn "Path does not exist: #{path}"
|
|
25
|
+
exit(1)
|
|
26
|
+
end
|
|
27
|
+
|
|
28
|
+
# Default client: echo back input (for testing without an LLM)
|
|
29
|
+
client = ->(input) { input.is_a?(String) ? input : input.to_s }
|
|
30
|
+
runner = AiSafetyRails::Evaluation::Runner.new(client: client)
|
|
31
|
+
results = runner.run(path)
|
|
32
|
+
puts JSON.pretty_generate(results)
|
|
33
|
+
|
|
34
|
+
# Exit 1 if any run has failures
|
|
35
|
+
runs = results["runs"] || []
|
|
36
|
+
all_passed = runs.all? do |run|
|
|
37
|
+
(run["summary"] || {})["failed"].to_i.zero?
|
|
38
|
+
end
|
|
39
|
+
exit(all_passed ? 0 : 1)
|
|
40
|
+
end
|
|
41
|
+
|
|
42
|
+
main
|
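The CLI's pass/fail decision in `run_eval` reduces to a pure function over the runner's result hash. The sketch below isolates that rule so it can be checked without invoking the executable. `exit_status_for` is a hypothetical helper name; the hash shape is taken from the `Runner` output in this release:

```ruby
# Hypothetical helper mirroring the exit-code rule in exe/ai_safety_rails:
# 0 when every run reports zero failures, 1 otherwise.
def exit_status_for(results)
  runs = results["runs"] || []
  all_passed = runs.all? do |run|
    (run["summary"] || {})["failed"].to_i.zero?
  end
  all_passed ? 0 : 1
end

passing = { "runs" => [{ "summary" => { "failed" => 0 } }] }
failing = { "runs" => [{ "summary" => { "failed" => 2 } }] }

puts exit_status_for(passing)
puts exit_status_for(failing)
```

An empty `runs` array also yields 0, because `all?` is vacuously true. A CI job pointed at an existing but empty directory would therefore pass.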
data/lib/ai_safety_rails/evaluation/runner.rb
ADDED
|
@@ -0,0 +1,130 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "json"
|
|
4
|
+
|
|
5
|
+
module AiSafetyRails
|
|
6
|
+
module Evaluation
|
|
7
|
+
# Runs an evaluation set against a callable client and reports pass/fail and latency.
|
|
8
|
+
class Runner
|
|
9
|
+
def initialize(client:, loader: SetLoader.new)
|
|
10
|
+
@client = client
|
|
11
|
+
@loader = loader
|
|
12
|
+
end
|
|
13
|
+
|
|
14
|
+
def run(path)
|
|
15
|
+
sets = path.is_a?(Array) ? path : @loader.load(path)
|
|
16
|
+
sets = [sets] unless sets.is_a?(Array)
|
|
17
|
+
results = sets.map { |set| run_set(set) }
|
|
18
|
+
flatten_results(results)
|
|
19
|
+
end
|
|
20
|
+
|
|
21
|
+
def run_set(set)
|
|
22
|
+
examples = set["examples"] || []
|
|
23
|
+
name = set["name"] || "unnamed"
|
|
24
|
+
results = []
|
|
25
|
+
examples.each do |ex|
|
|
26
|
+
result = run_example(ex, set)
|
|
27
|
+
results << result
|
|
28
|
+
end
|
|
29
|
+
{
|
|
30
|
+
"set" => name,
|
|
31
|
+
"description" => set["description"],
|
|
32
|
+
"results" => results,
|
|
33
|
+
"summary" => summarize(results)
|
|
34
|
+
}
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
private
|
|
38
|
+
|
|
39
|
+
def run_example(example, _set)
|
|
40
|
+
input = example["input"]
|
|
41
|
+
expectations = example["expectations"] || []
|
|
42
|
+
start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
|
43
|
+
begin
|
|
44
|
+
output = @client.call(input)
|
|
45
|
+
rescue StandardError => e
|
|
46
|
+
output = nil
|
|
47
|
+
error = e.message
|
|
48
|
+
end
|
|
49
|
+
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
|
|
50
|
+
|
|
51
|
+
criterion_results = expectations.map do |exp|
|
|
52
|
+
check_expectation(exp, output, error: error)
|
|
53
|
+
end
|
|
54
|
+
|
|
55
|
+
{
|
|
56
|
+
"id" => example["id"],
|
|
57
|
+
"latency_seconds" => elapsed.round(4),
|
|
58
|
+
"passed" => criterion_results.all? { |r| r["passed"] },
|
|
59
|
+
"criteria" => criterion_results,
|
|
60
|
+
"error" => error
|
|
61
|
+
}
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
def check_expectation(exp, output, error: nil)
|
|
65
|
+
type = exp["type"] || "custom"
|
|
66
|
+
passed = false
|
|
67
|
+
message = nil
|
|
68
|
+
|
|
69
|
+
if error
|
|
70
|
+
passed = false
|
|
71
|
+
message = error
|
|
72
|
+
else
|
|
73
|
+
case type
|
|
74
|
+
when "valid_json"
|
|
75
|
+
passed = valid_json?(output)
|
|
76
|
+
message = passed ? "Valid JSON" : "Invalid or missing JSON"
|
|
77
|
+
when "has_key"
|
|
78
|
+
data = parse_json_safe(output)
|
|
79
|
+
key = exp["key"]
|
|
80
|
+
passed = data.is_a?(Hash) && data.key?(key)
|
|
81
|
+
message = passed ? "Has key #{key}" : "Missing key #{key}"
|
|
82
|
+
when "not_contains_pii"
|
|
83
|
+
passed = !contains_pii?(output.to_s)
|
|
84
|
+
message = passed ? "No PII detected" : "PII may be present"
|
|
85
|
+
else
|
|
86
|
+
passed = false
|
|
87
|
+
message = "Unknown expectation type: #{type}"
|
|
88
|
+
end
|
|
89
|
+
end
|
|
90
|
+
|
|
91
|
+
{ "type" => type, "passed" => passed, "message" => message }
|
|
92
|
+
end
|
|
93
|
+
|
|
94
|
+
def valid_json?(output)
|
|
95
|
+
parse_json_safe(output) != nil
|
|
96
|
+
end
|
|
97
|
+
|
|
98
|
+
def parse_json_safe(output)
|
|
99
|
+
return output if output.is_a?(Hash) || output.is_a?(Array)
|
|
100
|
+
JSON.parse(output.to_s)
|
|
101
|
+
rescue JSON::ParserError, TypeError
|
|
102
|
+
nil
|
|
103
|
+
end
|
|
104
|
+
|
|
105
|
+
def contains_pii?(text)
|
|
106
|
+
AiSafetyRails::Guardrails::Input::PiiRedactor::DEFAULT_PATTERNS.values.any? { |re| re.match?(text) }
|
|
107
|
+
end
|
|
108
|
+
|
|
109
|
+
def summarize(results)
|
|
110
|
+
total = results.size
|
|
111
|
+
passed = results.count { |r| r["passed"] }
|
|
112
|
+
latencies = results.filter_map { |r| r["latency_seconds"] }
|
|
113
|
+
{
|
|
114
|
+
"total" => total,
|
|
115
|
+
"passed" => passed,
|
|
116
|
+
"failed" => total - passed,
|
|
117
|
+
"accuracy" => total.positive? ? (passed.to_f / total).round(4) : 0,
|
|
118
|
+
"avg_latency_seconds" => latencies.any? ? (latencies.sum / latencies.size).round(4) : nil
|
|
119
|
+
}
|
|
120
|
+
end
|
|
121
|
+
|
|
122
|
+
def flatten_results(results)
|
|
123
|
+
{
|
|
124
|
+
"runs" => results,
|
|
125
|
+
"timestamp" => Time.now.iso8601
|
|
126
|
+
}
|
|
127
|
+
end
|
|
128
|
+
end
|
|
129
|
+
end
|
|
130
|
+
end
|
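Because the CLI's default client echoes its input, the README's sample eval set would fail both of its expectations: "Customer cannot login" is not JSON. A standalone sketch of the `valid_json` and `has_key` checks, re-implemented here for illustration. `parse_json_safe` and `valid_json?` mirror the private helpers in the hunk above, `key_present?` is a hypothetical name, and none of this is the gem's own code path:

```ruby
require "json"

# Parse output as JSON, returning nil when it cannot be parsed.
def parse_json_safe(output)
  return output if output.is_a?(Hash) || output.is_a?(Array)

  JSON.parse(output.to_s)
rescue JSON::ParserError, TypeError
  nil
end

# Treats any non-nil parse result as valid JSON.
def valid_json?(output)
  !parse_json_safe(output).nil?
end

# True only when the output parses to an object containing the key.
def key_present?(output, key)
  data = parse_json_safe(output)
  data.is_a?(Hash) && data.key?(key)
end

# Same behavior as the CLI's default client for string input.
echo_client = ->(input) { input.to_s }

plain = echo_client.call("Customer cannot login")
json  = echo_client.call('{"category": "login"}')

puts valid_json?(plain)
puts valid_json?(json)
puts key_present?(json, "category")
```

One consequence of comparing the parse result against `nil`: the JSON literal `null` parses to `nil` and is reported as invalid.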
data/lib/ai_safety_rails/evaluation/set_loader.rb
ADDED
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "pathname"
|
|
4
|
+
require "yaml"
|
|
5
|
+
require "json"
|
|
6
|
+
|
|
7
|
+
module AiSafetyRails
|
|
8
|
+
module Evaluation
|
|
9
|
+
# Loads evaluation sets from YAML or JSON files/directories.
|
|
10
|
+
class SetLoader
|
|
11
|
+
def load(path)
|
|
12
|
+
path = Pathname(path)
|
|
13
|
+
if path.directory?
|
|
14
|
+
load_directory(path)
|
|
15
|
+
else
|
|
16
|
+
load_file(path)
|
|
17
|
+
end
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
def load_file(path)
|
|
21
|
+
path = Pathname(path)
|
|
22
|
+
content = File.read(path)
|
|
23
|
+
data = case path.extname.downcase
|
|
24
|
+
when ".yaml", ".yml" then YAML.safe_load(content)
|
|
25
|
+
when ".json" then JSON.parse(content)
|
|
26
|
+
else raise ArgumentError, "Unsupported format: #{path.extname}"
|
|
27
|
+
end
|
|
28
|
+
normalize_set(data, path.to_s)
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
def load_directory(dir)
|
|
32
|
+
dir = Pathname(dir)
|
|
33
|
+
sets = []
|
|
34
|
+
Dir[dir.join("**/*.{yaml,yml,json}")].each do |f|
|
|
35
|
+
sets << load_file(f)
|
|
36
|
+
end
|
|
37
|
+
sets
|
|
38
|
+
end
|
|
39
|
+
|
|
40
|
+
private
|
|
41
|
+
|
|
42
|
+
def normalize_set(data, source = nil)
|
|
43
|
+
{
|
|
44
|
+
"name" => data["name"] || File.basename(source, ".*"),
|
|
45
|
+
"description" => data["description"] || "",
|
|
46
|
+
"examples" => Array(data["examples"]).map { |ex| normalize_example(ex) },
|
|
47
|
+
"source" => source
|
|
48
|
+
}
|
|
49
|
+
end
|
|
50
|
+
|
|
51
|
+
def normalize_example(ex)
|
|
52
|
+
case ex
|
|
53
|
+
when Hash
|
|
54
|
+
{
|
|
55
|
+
"id" => ex["id"] || ex["id"].to_s,
|
|
56
|
+
"input" => ex["input"],
|
|
57
|
+
"expectations" => Array(ex["expectations"] || ex["expected"]).map { |e| normalize_expectation(e) }
|
|
58
|
+
}
|
|
59
|
+
else
|
|
60
|
+
{ "id" => nil, "input" => ex.to_s, "expectations" => [] }
|
|
61
|
+
end
|
|
62
|
+
end
|
|
63
|
+
|
|
64
|
+
def normalize_expectation(exp)
|
|
65
|
+
return { "type" => "custom", "raw" => exp } unless exp.is_a?(Hash)
|
|
66
|
+
exp.transform_keys(&:to_s)
|
|
67
|
+
end
|
|
68
|
+
end
|
|
69
|
+
end
|
|
70
|
+
end
|
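The loader accepts either `expectations` or `expected` per example and wraps non-hash expectations as `custom`. A self-contained sketch of that normalization applied to an inline YAML string. The helper names follow the hunk above, but the code is a simplified re-implementation (the `id` fallback is omitted) and the sample YAML is made up:

```ruby
require "yaml"

# Non-hash expectations become a "custom" entry carrying the raw value.
def normalize_expectation(exp)
  return { "type" => "custom", "raw" => exp } unless exp.is_a?(Hash)

  exp.transform_keys(&:to_s)
end

# Accepts "expectations" or, as a fallback, "expected".
def normalize_example(ex)
  {
    "id" => ex["id"],
    "input" => ex["input"],
    "expectations" => Array(ex["expectations"] || ex["expected"]).map { |e| normalize_expectation(e) }
  }
end

sample = <<~YAML
  name: support_tickets
  examples:
    - id: 1
      input: "Customer cannot login"
      expected:
        - type: has_key
          key: category
        - "mentions login"
YAML

data = YAML.safe_load(sample)
examples = Array(data["examples"]).map { |ex| normalize_example(ex) }
puts examples.first["expectations"].inspect
```

`Array()` applied to a single Hash yields key/value pairs, so a lone expectation written as a mapping rather than a one-item list would be split into `custom` entries. Writing expectations as a YAML list, as the README and generator template do, avoids this.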
data/lib/ai_safety_rails/guardrails/input/pii_redactor.rb
ADDED
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module AiSafetyRails
|
|
4
|
+
module Guardrails
|
|
5
|
+
module Input
|
|
6
|
+
# Redacts PII from input text using configurable patterns.
|
|
7
|
+
# Default patterns: email, US phone (10-digit NANP format, optional +1 prefix), SSN-like (XXX-XX-XXXX).
|
|
8
|
+
class PiiRedactor < Guardrails::InputGuardrail
|
|
9
|
+
DEFAULT_PATTERNS = {
|
|
10
|
+
email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
|
|
11
|
+
phone: /\b(?:\+?1[-.\s]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}\b/,
|
|
12
|
+
ssn: /\b\d{3}-\d{2}-\d{4}\b/
|
|
13
|
+
}.freeze
|
|
14
|
+
|
|
15
|
+
DEFAULT_REPLACEMENT = "[REDACTED]"
|
|
16
|
+
|
|
17
|
+
def initialize(patterns: nil, replacement: DEFAULT_REPLACEMENT)
|
|
18
|
+
@patterns = patterns || DEFAULT_PATTERNS
|
|
19
|
+
@replacement = replacement
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
def process(input)
|
|
23
|
+
return input unless input.is_a?(String)
|
|
24
|
+
|
|
25
|
+
result = input.dup
|
|
26
|
+
@patterns.each_value do |regex|
|
|
27
|
+
result = result.gsub(regex, @replacement)
|
|
28
|
+
end
|
|
29
|
+
result
|
|
30
|
+
end
|
|
31
|
+
end
|
|
32
|
+
end
|
|
33
|
+
end
|
|
34
|
+
end
|
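To see what the default patterns catch, the sketch below copies the three regexes from the hunk above into a standalone module so it runs without the gem installed. `PiiSketch` and `redact` are hypothetical names used only for this illustration:

```ruby
# Patterns copied verbatim from PiiRedactor::DEFAULT_PATTERNS in this release.
module PiiSketch
  PATTERNS = {
    email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/,
    phone: /\b(?:\+?1[-.\s]?)?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}\b/,
    ssn: /\b\d{3}-\d{2}-\d{4}\b/
  }.freeze

  # Applies each pattern in turn, feeding the result of one into the next.
  def self.redact(text, patterns: PATTERNS, replacement: "[REDACTED]")
    patterns.each_value.reduce(text) { |acc, re| acc.gsub(re, replacement) }
  end
end

puts PiiSketch.redact("Hello, my email is user@example.com")
puts PiiSketch.redact("SSN 123-45-6789, phone 555-123-4567")
# Any bare 10-digit number also matches the phone pattern.
puts PiiSketch.redact("Order 1234567890 shipped")
```

The last call shows a false positive: a 10-digit order number is redacted as a phone number. In the gem, passing `patterns:` replaces the defaults rather than merging with them, so a caller who wants to drop the phone pattern would pass only the entries they need.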
data/lib/ai_safety_rails/guardrails/middleware.rb
ADDED
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module AiSafetyRails
|
|
4
|
+
module Guardrails
|
|
5
|
+
# Wraps an LLM client (callable) with input and output guardrails.
|
|
6
|
+
# Flow: input_guardrails → client.call(input) → output_guardrails → return.
|
|
7
|
+
class Middleware
|
|
8
|
+
def initialize(client:, input_guardrails: [], output_guardrails: [])
|
|
9
|
+
@client = client
|
|
10
|
+
@input_guardrails = Array(input_guardrails)
|
|
11
|
+
@output_guardrails = Array(output_guardrails)
|
|
12
|
+
end
|
|
13
|
+
|
|
14
|
+
def call(input)
|
|
15
|
+
processed_input = run_input_guardrails(input)
|
|
16
|
+
response = @client.call(processed_input)
|
|
17
|
+
run_output_guardrails(response)
|
|
18
|
+
response
|
|
19
|
+
end
|
|
20
|
+
|
|
21
|
+
class << self
|
|
22
|
+
def wrap(client, input_guardrails: [], output_guardrails: [])
|
|
23
|
+
new(
|
|
24
|
+
client: client,
|
|
25
|
+
input_guardrails: input_guardrails,
|
|
26
|
+
output_guardrails: output_guardrails
|
|
27
|
+
)
|
|
28
|
+
end
|
|
29
|
+
end
|
|
30
|
+
|
|
31
|
+
private
|
|
32
|
+
|
|
33
|
+
def run_input_guardrails(input)
|
|
34
|
+
@input_guardrails.reduce(input) { |acc, g| g.process(acc) }
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
def run_output_guardrails(output)
|
|
38
|
+
@output_guardrails.each { |g| g.validate(output) }
|
|
39
|
+
end
|
|
40
|
+
end
|
|
41
|
+
end
|
|
42
|
+
end
|
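The flow in `Middleware#call` can be sketched with duck-typed guardrails. `SketchMiddleware` below is a stand-in that follows the same three steps as the hunk above. It is not the gem's class, and `StripWhitespace`, `Upcase`, and `NonEmpty` are hypothetical guardrails:

```ruby
# Stand-in for the middleware flow; mirrors the steps in Middleware#call.
class SketchMiddleware
  def initialize(client:, input_guardrails: [], output_guardrails: [])
    @client = client
    @input_guardrails = Array(input_guardrails)
    @output_guardrails = Array(output_guardrails)
  end

  def call(input)
    # Input guardrails chain: each one receives the previous one's result.
    processed = @input_guardrails.reduce(input) { |acc, g| g.process(acc) }
    response = @client.call(processed)
    # Output guardrails only validate; their return values are discarded.
    @output_guardrails.each { |g| g.validate(response) }
    response
  end
end

class StripWhitespace
  def process(input)
    input.strip
  end
end

class Upcase
  def process(input)
    input.upcase
  end
end

class NonEmpty
  def validate(output)
    raise ArgumentError, "empty response" if output.to_s.empty?
  end
end

echo = ->(input) { "echo: #{input}" }
guarded = SketchMiddleware.new(
  client: echo,
  input_guardrails: [StripWhitespace.new, Upcase.new],
  output_guardrails: [NonEmpty.new]
)
puts guarded.call("  hello ")
```

Because the client's response is returned as-is after validation, an output guardrail in this design can reject a response by raising but cannot rewrite it.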
data/lib/ai_safety_rails/guardrails/output/schema_validator.rb
ADDED
|
@@ -0,0 +1,37 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "json"
|
|
4
|
+
require "json_schemer"
|
|
5
|
+
|
|
6
|
+
module AiSafetyRails
|
|
7
|
+
module Guardrails
|
|
8
|
+
module Output
|
|
9
|
+
# Validates LLM output against a JSON schema. Expects output to be parseable JSON.
|
|
10
|
+
class SchemaValidator < Guardrails::OutputGuardrail
|
|
11
|
+
def initialize(schema:, parse_json: true)
|
|
12
|
+
@schema = schema.is_a?(Hash) ? schema : JSON.parse(File.read(schema))
|
|
13
|
+
@parse_json = parse_json
|
|
14
|
+
@schemer = JSONSchemer.schema(@schema)
|
|
15
|
+
end
|
|
16
|
+
|
|
17
|
+
def validate(output)
|
|
18
|
+
data = @parse_json ? parse_output(output) : output
|
|
19
|
+
errors = @schemer.validate(data).to_a
|
|
20
|
+
return if errors.empty?
|
|
21
|
+
|
|
22
|
+
messages = errors.map { |e| e["error"] || e[:error] || e.to_s }
|
|
23
|
+
raise AiSafetyRails::ValidationError, "Schema validation failed: #{messages.join('; ')}"
|
|
24
|
+
end
|
|
25
|
+
|
|
26
|
+
private
|
|
27
|
+
|
|
28
|
+
def parse_output(output)
|
|
29
|
+
return output if output.is_a?(Hash) || output.is_a?(Array)
|
|
30
|
+
JSON.parse(output.to_s)
|
|
31
|
+
rescue JSON::ParserError => e
|
|
32
|
+
raise AiSafetyRails::ValidationError, "Output is not valid JSON: #{e.message}"
|
|
33
|
+
end
|
|
34
|
+
end
|
|
35
|
+
end
|
|
36
|
+
end
|
|
37
|
+
end
|
data/lib/ai_safety_rails/guardrails.rb
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module AiSafetyRails
|
|
4
|
+
module Guardrails
|
|
5
|
+
# Base for input guardrails: process input string (or messages) before calling the LLM.
|
|
6
|
+
class InputGuardrail
|
|
7
|
+
def process(input)
|
|
8
|
+
raise NotImplementedError, "#{self.class}#process(input) must be implemented"
|
|
9
|
+
end
|
|
10
|
+
end
|
|
11
|
+
|
|
12
|
+
# Base for output guardrails: validate or transform response after the LLM returns.
|
|
13
|
+
class OutputGuardrail
|
|
14
|
+
def validate(output)
|
|
15
|
+
raise NotImplementedError, "#{self.class}#validate(output) must be implemented"
|
|
16
|
+
end
|
|
17
|
+
end
|
|
18
|
+
end
|
|
19
|
+
end
|
data/lib/ai_safety_rails/railtie.rb
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module AiSafetyRails
|
|
4
|
+
class Railtie < Rails::Railtie
|
|
5
|
+
config.after_initialize do
|
|
6
|
+
# Optional: load config/guardrails.yml when present
|
|
7
|
+
config_path = Rails.root.join("config", "guardrails.yml")
|
|
8
|
+
if config_path.exist?
|
|
9
|
+
Rails.application.config.ai_safety_rails = Rails.application.config_for(:guardrails)
|
|
10
|
+
end
|
|
11
|
+
end
|
|
12
|
+
end
|
|
13
|
+
end
|
data/lib/ai_safety_rails.rb
ADDED
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "ai_safety_rails/version"
|
|
4
|
+
require "ai_safety_rails/guardrails"
|
|
5
|
+
require "ai_safety_rails/guardrails/middleware"
|
|
6
|
+
require "ai_safety_rails/guardrails/input/pii_redactor"
|
|
7
|
+
require "ai_safety_rails/guardrails/output/schema_validator"
|
|
8
|
+
require "ai_safety_rails/evaluation"
|
|
9
|
+
require "ai_safety_rails/evaluation/runner"
|
|
10
|
+
require "ai_safety_rails/evaluation/set_loader"
|
|
11
|
+
|
|
12
|
+
# Optional Rails integration (loads only when Rails is defined)
|
|
13
|
+
require "ai_safety_rails/railtie" if defined?(Rails::Railtie)
|
|
14
|
+
|
|
15
|
+
module AiSafetyRails
|
|
16
|
+
class Error < StandardError; end
|
|
17
|
+
class GuardrailError < Error; end
|
|
18
|
+
class ValidationError < GuardrailError; end
|
|
19
|
+
end
|
data/lib/generators/ai_safety_rails/eval_set/eval_set_generator.rb
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "rails/generators/named_base"
|
|
4
|
+
|
|
5
|
+
module AiSafetyRails
|
|
6
|
+
class EvalSetGenerator < Rails::Generators::NamedBase
|
|
7
|
+
source_root File.expand_path("templates", __dir__)
|
|
8
|
+
desc "Generate an evaluation set under config/llm_evals/"
|
|
9
|
+
|
|
10
|
+
def create_eval_set
|
|
11
|
+
empty_directory "config/llm_evals"
|
|
12
|
+
template "eval_set.yml.tt", "config/llm_evals/#{file_name}.yml"
|
|
13
|
+
end
|
|
14
|
+
end
|
|
15
|
+
end
|
data/lib/generators/ai_safety_rails/eval_set/templates/eval_set.yml.tt
ADDED
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
# Evaluation set: <%= file_name %>
|
|
2
|
+
# Run with: bundle exec ai_safety_rails eval config/llm_evals/<%= file_name %>.yml
|
|
3
|
+
# Or: rake "evals:run[config/llm_evals/<%= file_name %>.yml]"
|
|
4
|
+
|
|
5
|
+
name: <%= file_name %>
|
|
6
|
+
description: "<%= class_name %> evaluation examples"
|
|
7
|
+
|
|
8
|
+
examples:
|
|
9
|
+
- id: 1
|
|
10
|
+
input: "Example prompt or user message"
|
|
11
|
+
expectations:
|
|
12
|
+
- type: valid_json
|
|
13
|
+
- id: 2
|
|
14
|
+
input: "Another example"
|
|
15
|
+
expectations:
|
|
16
|
+
- type: has_key
|
|
17
|
+
key: category
|
data/lib/generators/ai_safety_rails/guardrail/guardrail_generator.rb
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "rails/generators/named_base"
|
|
4
|
+
|
|
5
|
+
module AiSafetyRails
|
|
6
|
+
class GuardrailGenerator < Rails::Generators::NamedBase
|
|
7
|
+
source_root File.expand_path("templates", __dir__)
|
|
8
|
+
desc "Generate a guardrail config snippet (e.g. pii_redaction)"
|
|
9
|
+
|
|
10
|
+
def create_guardrail
|
|
11
|
+
template "guardrail.rb.tt", "app/guardrails/#{file_name}_guardrail.rb"
|
|
12
|
+
template "guardrail_spec.rb.tt", "spec/guardrails/#{file_name}_guardrail_spec.rb" if spec_style?
|
|
13
|
+
end
|
|
14
|
+
|
|
15
|
+
private
|
|
16
|
+
|
|
17
|
+
def spec_style?
|
|
18
|
+
File.directory?(Rails.root.join("spec"))
|
|
19
|
+
end
|
|
20
|
+
end
|
|
21
|
+
end
|
data/lib/generators/ai_safety_rails/guardrail/templates/guardrail.rb.tt
ADDED
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
# Custom guardrail: <%= file_name %>
|
|
4
|
+
# Register in config/guardrails.yml or wrap your LLM client with AiSafetyRails::Guardrails::Middleware.
|
|
5
|
+
#
|
|
6
|
+
# Example:
|
|
7
|
+
# input_guardrails: [AiSafetyRails::Guardrails::Input::PiiRedactor.new]
|
|
8
|
+
# output_guardrails: [AiSafetyRails::Guardrails::Output::SchemaValidator.new(schema: MySchema)]
|
|
9
|
+
|
|
10
|
+
module Guardrails
|
|
11
|
+
class <%= class_name %>Guardrail
|
|
12
|
+
def process(input)
|
|
13
|
+
# Input guardrail: process input before calling the LLM
|
|
14
|
+
input
|
|
15
|
+
end
|
|
16
|
+
end
|
|
17
|
+
end
|
metadata
ADDED
|
@@ -0,0 +1,108 @@
|
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
|
2
|
+
name: ai_safety_rails
|
|
3
|
+
version: !ruby/object:Gem::Version
|
|
4
|
+
version: 0.1.0
|
|
5
|
+
platform: ruby
|
|
6
|
+
authors:
|
|
7
|
+
- Raj Panchal
|
|
8
|
+
autorequire:
|
|
9
|
+
bindir: exe
|
|
10
|
+
cert_chain: []
|
|
11
|
+
date: 2026-02-17 00:00:00.000000000 Z
|
|
12
|
+
dependencies:
|
|
13
|
+
- !ruby/object:Gem::Dependency
|
|
14
|
+
name: json_schemer
|
|
15
|
+
requirement: !ruby/object:Gem::Requirement
|
|
16
|
+
requirements:
|
|
17
|
+
- - "~>"
|
|
18
|
+
- !ruby/object:Gem::Version
|
|
19
|
+
version: '0.2'
|
|
20
|
+
type: :runtime
|
|
21
|
+
prerelease: false
|
|
22
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
23
|
+
requirements:
|
|
24
|
+
- - "~>"
|
|
25
|
+
- !ruby/object:Gem::Version
|
|
26
|
+
version: '0.2'
|
|
27
|
+
- !ruby/object:Gem::Dependency
|
|
28
|
+
name: minitest
|
|
29
|
+
requirement: !ruby/object:Gem::Requirement
|
|
30
|
+
requirements:
|
|
31
|
+
- - "~>"
|
|
32
|
+
- !ruby/object:Gem::Version
|
|
33
|
+
version: '5.0'
|
|
34
|
+
type: :development
|
|
35
|
+
prerelease: false
|
|
36
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
37
|
+
requirements:
|
|
38
|
+
- - "~>"
|
|
39
|
+
- !ruby/object:Gem::Version
|
|
40
|
+
version: '5.0'
|
|
41
|
+
- !ruby/object:Gem::Dependency
|
|
42
|
+
name: rake
|
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
|
44
|
+
requirements:
|
|
45
|
+
- - "~>"
|
|
46
|
+
- !ruby/object:Gem::Version
|
|
47
|
+
version: '13.0'
|
|
48
|
+
type: :development
|
|
49
|
+
prerelease: false
|
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
51
|
+
requirements:
|
|
52
|
+
- - "~>"
|
|
53
|
+
- !ruby/object:Gem::Version
|
|
54
|
+
version: '13.0'
|
|
55
|
+
description: Provider-agnostic guardrails (PII redaction, schema validation) and evaluation/regression
|
|
56
|
+
harness for LLM prompts and models.
|
|
57
|
+
email:
|
|
58
|
+
- rajpanchal2810@gmail.com
|
|
59
|
+
executables:
|
|
60
|
+
- ai_safety_rails
|
|
61
|
+
extensions: []
|
|
62
|
+
extra_rdoc_files: []
|
|
63
|
+
files:
|
|
64
|
+
- LICENSE.txt
|
|
65
|
+
- README.md
|
|
66
|
+
- exe/ai_safety_rails
|
|
67
|
+
- lib/ai_safety_rails.rb
|
|
68
|
+
- lib/ai_safety_rails/evaluation.rb
|
|
69
|
+
- lib/ai_safety_rails/evaluation/runner.rb
|
|
70
|
+
- lib/ai_safety_rails/evaluation/set_loader.rb
|
|
71
|
+
- lib/ai_safety_rails/guardrails.rb
|
|
72
|
+
- lib/ai_safety_rails/guardrails/input/pii_redactor.rb
|
|
73
|
+
- lib/ai_safety_rails/guardrails/middleware.rb
|
|
74
|
+
- lib/ai_safety_rails/guardrails/output/schema_validator.rb
|
|
75
|
+
- lib/ai_safety_rails/railtie.rb
|
|
76
|
+
- lib/ai_safety_rails/version.rb
|
|
77
|
+
- lib/generators/ai_safety_rails/eval_set/eval_set_generator.rb
|
|
78
|
+
- lib/generators/ai_safety_rails/eval_set/templates/eval_set.yml.tt
|
|
79
|
+
- lib/generators/ai_safety_rails/guardrail/guardrail_generator.rb
|
|
80
|
+
- lib/generators/ai_safety_rails/guardrail/templates/guardrail.rb.tt
|
|
81
|
+
- lib/generators/ai_safety_rails/guardrail/templates/guardrail_spec.rb.tt
|
|
82
|
+
homepage: https://github.com/rajpanchal01/ai_safety_rails
|
|
83
|
+
licenses:
|
|
84
|
+
- MIT
|
|
85
|
+
metadata:
|
|
86
|
+
homepage_uri: https://github.com/rajpanchal01/ai_safety_rails
|
|
87
|
+
source_code_uri: https://github.com/rajpanchal01/ai_safety_rails
|
|
88
|
+
changelog_uri: https://github.com/rajpanchal01/ai_safety_rails/blob/main/CHANGELOG.md
|
|
89
|
+
post_install_message:
|
|
90
|
+
rdoc_options: []
|
|
91
|
+
require_paths:
|
|
92
|
+
- lib
|
|
93
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
|
94
|
+
requirements:
|
|
95
|
+
- - ">="
|
|
96
|
+
- !ruby/object:Gem::Version
|
|
97
|
+
version: '3.0'
|
|
98
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
|
99
|
+
requirements:
|
|
100
|
+
- - ">="
|
|
101
|
+
- !ruby/object:Gem::Version
|
|
102
|
+
version: '0'
|
|
103
|
+
requirements: []
|
|
104
|
+
rubygems_version: 3.0.9
|
|
105
|
+
signing_key:
|
|
106
|
+
specification_version: 4
|
|
107
|
+
summary: LLM guardrails and evaluation harness for Ruby
|
|
108
|
+
test_files: []
|
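The `~> 0.2` constraint on `json_schemer` in the metadata above admits any 0.x release from 0.2 upward and excludes 1.0 and later. A short sketch using RubyGems' own requirement classes to check this; the version strings are arbitrary samples chosen for illustration:

```ruby
require "rubygems"

# The pessimistic constraint declared for json_schemer in this gemspec.
requirement = Gem::Requirement.new("~> 0.2")

%w[0.1.9 0.2.0 0.2.25 0.9.1 1.0.0 2.0.0].each do |v|
  puts "#{v}: #{requirement.satisfied_by?(Gem::Version.new(v))}"
end
```

Applications that already depend on a 1.x or later `json_schemer` would hit a resolution conflict with this release.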