braintrust 0.0.3 → 0.0.5

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 127f10c355ef8d5b0968dcb3197d9612a68455087ad704fa17e5dcb41512ad6d
-   data.tar.gz: 0e1d31073d9d71f43a74f7d4b37cea8644b119afc282501ee4b311c6cad059ad
+   metadata.gz: 6321acf7b780922ed97ea3cc57dde47a52947a10650a082dcfd9af780056d99a
+   data.tar.gz: 67c181e53537829931de704c7503cc056646652f9c1a61d914bc1ee0b7af69a2
  SHA512:
-   metadata.gz: 654fae04c4cf51fa32b27864b92ac832e3e37472bfaabe20871aa1899ba027ae4bff6e0a054f833fcb7afe3ef0d3870479ecb824f4c7af8180ca8ea65b21a41c
-   data.tar.gz: c03683f9793b38477986ade0694f38178434b70c1eea7a1870c2b80a89ad45278fe54f5c7f880eec22719ca0698fab0ad8de5efba5123ee1692c18b0a258d94c
+   metadata.gz: bb8546fdbf0a448016a1d31ceb8729a40be59e0d8d081ef275f763a11dbb2f5df0134ec52fc3b1c15c41d9dcdf42fbbe6becaf00ab7ac882c8f2f7e173a9a61f
+   data.tar.gz: 41e6d13504302a3b3ec26697cb50ce4736040d910c8e293d37088311922daa77274fc4214c86543ca21209bf56e82b2f6f00d5fbc2d7d0d0baad6f1e77cc48ff
data/README.md CHANGED
@@ -1,7 +1,7 @@
  # Braintrust Ruby SDK

  [![Gem Version](https://img.shields.io/gem/v/braintrust.svg)](https://rubygems.org/gems/braintrust)
- [![Documentation](https://img.shields.io/badge/docs-rubydoc.info-blue.svg)](https://rubydoc.info/gems/braintrust)
+ [![Documentation](https://img.shields.io/badge/docs-gemdocs.org-blue.svg)](https://gemdocs.org/gems/braintrust/)
  ![Beta](https://img.shields.io/badge/status-beta-yellow)

  ## Overview
@@ -171,6 +171,56 @@ puts "View trace at: #{Braintrust::Trace.permalink(root_span)}"
  OpenTelemetry.tracer_provider.shutdown
  ```

+ ### Attachments
+
+ Attachments allow you to log binary data (images, PDFs, audio, etc.) as part of your traces. This is particularly useful for multimodal AI applications like vision models.
+
+ ```ruby
+ require "braintrust"
+ require "braintrust/trace/attachment"
+
+ Braintrust.init
+
+ tracer = OpenTelemetry.tracer_provider.tracer("vision-app")
+
+ tracer.in_span("analyze-image") do |span|
+   # Create attachment from file
+   att = Braintrust::Trace::Attachment.from_file(
+     Braintrust::Trace::Attachment::IMAGE_PNG,
+     "./photo.png"
+   )
+
+   # Build message with attachment (OpenAI/Anthropic format)
+   messages = [
+     {
+       role: "user",
+       content: [
+         {type: "text", text: "What's in this image?"},
+         att.to_h # Converts to {"type" => "base64_attachment", "content" => "data:..."}
+       ]
+     }
+   ]
+
+   # Log to trace
+   span.set_attribute("braintrust.input_json", JSON.generate(messages))
+ end
+
+ OpenTelemetry.tracer_provider.shutdown
+ ```
+
+ You can create attachments from bytes, files, or URLs:
+
+ ```ruby
+ # From bytes
+ att = Braintrust::Trace::Attachment.from_bytes("image/jpeg", image_data)
+
+ # From file
+ att = Braintrust::Trace::Attachment.from_file("application/pdf", "./doc.pdf")
+
+ # From URL
+ att = Braintrust::Trace::Attachment.from_url("https://example.com/image.png")
+ ```
+
  ## Features

  - **Evaluations**: Run systematic evaluations of your AI systems with custom scoring functions
@@ -187,13 +237,14 @@ Check out the [`examples/`](./examples/) directory for complete working examples
  - [trace.rb](./examples/trace.rb) - Manual span creation and tracing
  - [openai.rb](./examples/openai.rb) - Automatically trace OpenAI API calls
  - [anthropic.rb](./examples/anthropic.rb) - Automatically trace Anthropic API calls
+ - [trace/trace_attachments.rb](./examples/trace/trace_attachments.rb) - Log attachments (images, PDFs) in traces
  - [eval/dataset.rb](./examples/eval/dataset.rb) - Run evaluations using datasets stored in Braintrust
  - [eval/remote_functions.rb](./examples/eval/remote_functions.rb) - Use remote scoring functions

  ## Documentation

  - [Braintrust Documentation](https://www.braintrust.dev/docs)
- - [API Documentation](https://rubydoc.info/gems/braintrust)
+ - [API Documentation](https://gemdocs.org/gems/braintrust/)

  ## Contributing

@@ -4,14 +4,18 @@ module Braintrust
    # Configuration object that reads from environment variables
    # and allows overriding with explicit options
    class Config
-     attr_reader :api_key, :org_name, :default_project, :app_url, :api_url
+     attr_reader :api_key, :org_name, :default_project, :app_url, :api_url,
+       :filter_ai_spans, :span_filter_funcs

-     def initialize(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil)
+     def initialize(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil,
+       filter_ai_spans: nil, span_filter_funcs: nil)
        @api_key = api_key
        @org_name = org_name
        @default_project = default_project
        @app_url = app_url
        @api_url = api_url
+       @filter_ai_spans = filter_ai_spans
+       @span_filter_funcs = span_filter_funcs || []
      end

      # Create a Config from environment variables, with option overrides
@@ -21,14 +25,27 @@ module Braintrust
      # @param default_project [String, nil] Default project (overrides BRAINTRUST_DEFAULT_PROJECT env var)
      # @param app_url [String, nil] App URL (overrides BRAINTRUST_APP_URL env var)
      # @param api_url [String, nil] API URL (overrides BRAINTRUST_API_URL env var)
+     # @param filter_ai_spans [Boolean, nil] Enable AI span filtering (overrides BRAINTRUST_OTEL_FILTER_AI_SPANS env var)
+     # @param span_filter_funcs [Array<Proc>, nil] Custom span filter functions
      # @return [Config] the created config
-     def self.from_env(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil)
+     def self.from_env(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil,
+       filter_ai_spans: nil, span_filter_funcs: nil)
+       # Parse filter_ai_spans from ENV if not explicitly provided
+       env_filter_ai_spans = ENV["BRAINTRUST_OTEL_FILTER_AI_SPANS"]
+       filter_ai_spans_value = if filter_ai_spans.nil?
+         env_filter_ai_spans&.downcase == "true"
+       else
+         filter_ai_spans
+       end
+
        new(
          api_key: api_key || ENV["BRAINTRUST_API_KEY"],
          org_name: org_name || ENV["BRAINTRUST_ORG_NAME"],
          default_project: default_project || ENV["BRAINTRUST_DEFAULT_PROJECT"],
          app_url: app_url || ENV["BRAINTRUST_APP_URL"] || "https://www.braintrust.dev",
-         api_url: api_url || ENV["BRAINTRUST_API_URL"] || "https://api.braintrust.dev"
+         api_url: api_url || ENV["BRAINTRUST_API_URL"] || "https://api.braintrust.dev",
+         filter_ai_spans: filter_ai_spans_value,
+         span_filter_funcs: span_filter_funcs
        )
      end
    end
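The precedence rule added to `from_env` in the hunk above can be exercised in isolation. The following is a minimal stand-alone sketch, not code from the gem: `resolve_filter_ai_spans` is a hypothetical helper, and it takes the environment value as a parameter (an assumption made here) so that it can be run without touching `ENV`.

```ruby
# Hypothetical helper (not part of the braintrust gem) mirroring the
# precedence shown in the diff: an explicit argument wins; otherwise the
# environment string is compared case-insensitively against "true".
def resolve_filter_ai_spans(explicit, env_value)
  return explicit unless explicit.nil?

  env_value&.downcase == "true"
end

puts resolve_filter_ai_spans(nil, "TRUE")   # environment value enables filtering
puts resolve_filter_ai_spans(false, "true") # explicit false overrides the environment
puts resolve_filter_ai_spans(nil, nil)      # unset variable means filtering is off
```

Because the check is `nil?` rather than truthiness, an explicit `false` still overrides an environment value of `"true"`. Any environment value other than a case variant of `"true"` (for example `"1"` or `"yes"`) resolves to `false` under this comparison.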
@@ -9,6 +9,170 @@ require "opentelemetry/sdk"
  require "json"

  module Braintrust
+   # Evaluation framework for testing AI systems with custom test cases and scoring functions.
+   #
+   # The Eval module provides tools for running systematic evaluations of your AI systems. An
+   # evaluation consists of:
+   # - **Cases**: Test inputs with optional expected outputs
+   # - **Task**: The code/model being evaluated
+   # - **Scorers**: Functions that judge the quality of outputs
+   #
+   # @example Basic evaluation with inline cases
+   #   require "braintrust"
+   #
+   #   Braintrust.init
+   #
+   #   # Define a simple task (the code being evaluated)
+   #   task = ->(input) { input.include?("a") ? "fruit" : "vegetable" }
+   #
+   #   # Run evaluation with inline cases
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "food-classifier",
+   #     cases: [
+   #       {input: "apple", expected: "fruit"},
+   #       {input: "carrot", expected: "vegetable"},
+   #       {input: "banana", expected: "fruit"}
+   #     ],
+   #     task: task,
+   #     scorers: [
+   #       # Named scorer with Eval.scorer
+   #       Braintrust::Eval.scorer("exact_match") do |input, expected, output|
+   #         output == expected ? 1.0 : 0.0
+   #       end
+   #     ]
+   #   )
+   #
+   # @example Different ways to define scorers (recommended patterns)
+   #   # Method reference (auto-uses method name as scorer name)
+   #   def exact_match(input, expected, output)
+   #     output == expected ? 1.0 : 0.0
+   #   end
+   #
+   #   # Named scorer with Eval.scorer
+   #   case_insensitive = Braintrust::Eval.scorer("case_insensitive") do |input, expected, output|
+   #     output.downcase == expected.downcase ? 1.0 : 0.0
+   #   end
+   #
+   #   # Callable class with name method
+   #   class FuzzyMatch
+   #     def name
+   #       "fuzzy_match"
+   #     end
+   #
+   #     def call(input, expected, output, metadata = {})
+   #       threshold = metadata[:threshold] || 0.8
+   #       # scoring logic here
+   #       1.0
+   #     end
+   #   end
+   #
+   #   # Anonymous lambda that returns named score object
+   #   multi_score = ->(input, expected, output) {
+   #     [
+   #       {name: "exact_match", score: output == expected ? 1.0 : 0.0},
+   #       {name: "length_match", score: output.length == expected.length ? 1.0 : 0.0}
+   #     ]
+   #   }
+   #
+   #   # All can be used together
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "scorer-examples",
+   #     cases: [{input: "test", expected: "test"}],
+   #     task: ->(input) { input },
+   #     scorers: [method(:exact_match), case_insensitive, FuzzyMatch.new, multi_score]
+   #   )
+   #
+   # @example Different ways to define tasks
+   #   # Lambda
+   #   task_lambda = ->(input) { "result" }
+   #
+   #   # Proc
+   #   task_proc = proc { |input| "result" }
+   #
+   #   # Method reference
+   #   def my_task(input)
+   #     "result"
+   #   end
+   #   task_method = method(:my_task)
+   #
+   #   # Callable class
+   #   class MyTask
+   #     def call(input)
+   #       "result"
+   #     end
+   #   end
+   #   task_class = MyTask.new
+   #
+   #   # All of these can be used as the task parameter
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "task-examples",
+   #     cases: [{input: "test"}],
+   #     task: task_lambda, # or task_proc, task_method, task_class
+   #     scorers: [
+   #       Braintrust::Eval.scorer("my_scorer") { |input, expected, output| 1.0 }
+   #     ]
+   #   )
+   #
+   # @example Using datasets instead of inline cases
+   #   # Fetch cases from a dataset stored in Braintrust
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "with-dataset",
+   #     dataset: "my-dataset-name", # fetches from same project
+   #     task: ->(input) { "result" },
+   #     scorers: [
+   #       Braintrust::Eval.scorer("my_scorer") { |input, expected, output| 1.0 }
+   #     ]
+   #   )
+   #
+   #   # Or with more options
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "with-dataset-options",
+   #     dataset: {
+   #       name: "my-dataset",
+   #       project: "other-project",
+   #       version: "1.0",
+   #       limit: 100
+   #     },
+   #     task: ->(input) { "result" },
+   #     scorers: [
+   #       Braintrust::Eval.scorer("my_scorer") { |input, expected, output| 1.0 }
+   #     ]
+   #   )
+   #
+   # @example Using metadata and tags
+   #   Braintrust::Eval.run(
+   #     project: "my-project",
+   #     experiment: "with-metadata",
+   #     cases: [
+   #       {
+   #         input: "apple",
+   #         expected: "fruit",
+   #         tags: ["tropical", "sweet"],
+   #         metadata: {threshold: 0.9, category: "produce"}
+   #       }
+   #     ],
+   #     task: ->(input) { "fruit" },
+   #     scorers: [
+   #       # Scorer can access case metadata
+   #       Braintrust::Eval.scorer("threshold_match") do |input, expected, output, metadata|
+   #         threshold = metadata[:threshold] || 0.5
+   #         # scoring logic using threshold
+   #         1.0
+   #       end
+   #     ],
+   #     # Experiment-level tags and metadata
+   #     tags: ["v1", "production"],
+   #     metadata: {
+   #       model: "gpt-4",
+   #       temperature: 0.7,
+   #       version: "1.0.0"
+   #     }
+   #   )
    module Eval
      class << self
        # Create a scorer with a name and callable
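The documentation added above shows scorers returning either a single number or an array of `{name:, score:}` hashes. The sketch below illustrates one way a caller could flatten both shapes into a single hash. It is only an illustration under stated assumptions: `normalize_scores` is a hypothetical helper, and the diff does not show how the gem itself handles scorer return values.

```ruby
# Hypothetical helper, not the SDK's implementation: flatten whatever a
# scorer returned into a {name => score} hash.
def normalize_scores(default_name, result)
  case result
  when Numeric
    # A bare number is filed under the scorer's own name.
    {default_name => result.to_f}
  when Array
    # An array of {name:, score:} hashes contributes one entry per element.
    result.to_h { |entry| [entry.fetch(:name), entry.fetch(:score).to_f] }
  else
    raise ArgumentError, "unsupported scorer result: #{result.inspect}"
  end
end

# Two scorers in the styles shown in the documentation above.
exact = ->(_input, expected, output) { output == expected ? 1.0 : 0.0 }
multi = ->(_input, expected, output) {
  [
    {name: "exact_match", score: output == expected ? 1.0 : 0.0},
    {name: "length_match", score: output.length == expected.length ? 1.0 : 0.0}
  ]
}

p normalize_scores("exact", exact.call("q", "test", "test"))
p normalize_scores("multi", multi.call("q", "test", "tent"))
```

Keying every score by name keeps single-score and multi-score scorers interchangeable for whatever code aggregates results afterwards.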
@@ -6,7 +6,7 @@ module Braintrust
    # State object that holds Braintrust configuration
    # Thread-safe global state management
    class State
-     attr_reader :api_key, :org_name, :org_id, :default_project, :app_url, :api_url, :proxy_url, :logged_in
+     attr_reader :api_key, :org_name, :org_id, :default_project, :app_url, :api_url, :proxy_url, :logged_in, :config

      @mutex = Mutex.new
      @global_state = nil
@@ -20,15 +20,20 @@ module Braintrust
      # @param blocking_login [Boolean] whether to block and login synchronously (default: false)
      # @param enable_tracing [Boolean] whether to enable OpenTelemetry tracing (default: true)
      # @param tracer_provider [TracerProvider, nil] Optional tracer provider to use
+     # @param filter_ai_spans [Boolean, nil] Enable AI span filtering
+     # @param span_filter_funcs [Array<Proc>, nil] Custom span filter functions
+     # @param exporter [Exporter, nil] Optional exporter override (for testing)
      # @return [State] the created state
-     def self.from_env(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil, blocking_login: false, enable_tracing: true, tracer_provider: nil)
+     def self.from_env(api_key: nil, org_name: nil, default_project: nil, app_url: nil, api_url: nil, blocking_login: false, enable_tracing: true, tracer_provider: nil, filter_ai_spans: nil, span_filter_funcs: nil, exporter: nil)
        require_relative "config"
        config = Config.from_env(
          api_key: api_key,
          org_name: org_name,
          default_project: default_project,
          app_url: app_url,
-         api_url: api_url
+         api_url: api_url,
+         filter_ai_spans: filter_ai_spans,
+         span_filter_funcs: span_filter_funcs
        )
        new(
          api_key: config.api_key,
@@ -38,11 +43,13 @@ module Braintrust
          api_url: config.api_url,
          blocking_login: blocking_login,
          enable_tracing: enable_tracing,
-         tracer_provider: tracer_provider
+         tracer_provider: tracer_provider,
+         config: config,
+         exporter: exporter
        )
      end

-     def initialize(api_key: nil, org_name: nil, org_id: nil, default_project: nil, app_url: nil, api_url: nil, proxy_url: nil, blocking_login: false, enable_tracing: true, tracer_provider: nil)
+     def initialize(api_key: nil, org_name: nil, org_id: nil, default_project: nil, app_url: nil, api_url: nil, proxy_url: nil, blocking_login: false, enable_tracing: true, tracer_provider: nil, config: nil, exporter: nil)
        # Instance-level mutex for thread-safe login
        @login_mutex = Mutex.new
        raise ArgumentError, "api_key is required" if api_key.nil? || api_key.empty?
@@ -55,6 +62,7 @@ module Braintrust
        @api_url = api_url
        @proxy_url = proxy_url
        @logged_in = false
+       @config = config

        # Perform login after state setup
        if blocking_login
@@ -66,7 +74,7 @@ module Braintrust
        # Setup tracing if requested
        if enable_tracing
          require_relative "trace"
-         Trace.setup(self, tracer_provider)
+         Trace.setup(self, tracer_provider, exporter: exporter)
        end
      end

@@ -0,0 +1,138 @@
+ # frozen_string_literal: true
+
+ require "base64"
+ require "net/http"
+ require "uri"
+
+ module Braintrust
+   module Trace
+     # Attachment represents binary data (images, audio, PDFs, etc.) that can be logged
+     # as part of traces in Braintrust. Attachments are stored securely and can be viewed
+     # in the Braintrust UI.
+     #
+     # Attachments are particularly useful for multimodal AI applications, such as vision
+     # models that process images.
+     #
+     # @example Create attachment from file
+     #   att = Braintrust::Trace::Attachment.from_file("image/png", "./photo.png")
+     #   data_url = att.to_data_url
+     #   # => "..."
+     #
+     # @example Create attachment from bytes
+     #   att = Braintrust::Trace::Attachment.from_bytes("image/jpeg", image_bytes)
+     #   message = att.to_message
+     #   # => {"type" => "base64_attachment", "content" => "data:image/jpeg;base64,..."}
+     #
+     # @example Use in a trace span
+     #   att = Braintrust::Trace::Attachment.from_file("image/png", "./photo.png")
+     #   messages = [
+     #     {
+     #       role: "user",
+     #       content: [
+     #         {type: "text", text: "What's in this image?"},
+     #         att.to_h # Converts to {"type" => "base64_attachment", "content" => "..."}
+     #       ]
+     #     }
+     #   ]
+     #   span.set_attribute("braintrust.input_json", JSON.generate(messages))
+     class Attachment
+       # Common MIME type constants for convenience
+       IMAGE_PNG = "image/png"
+       IMAGE_JPEG = "image/jpeg"
+       IMAGE_JPG = "image/jpg"
+       IMAGE_GIF = "image/gif"
+       IMAGE_WEBP = "image/webp"
+       TEXT_PLAIN = "text/plain"
+       APPLICATION_PDF = "application/pdf"
+
+       # @!visibility private
+       def initialize(content_type, data)
+         @content_type = content_type
+         @data = data
+       end
+
+       # Creates an attachment from raw bytes.
+       #
+       # @param content_type [String] MIME type of the data (e.g., "image/png")
+       # @param data [String] Binary data as a string
+       # @return [Attachment] New attachment instance
+       #
+       # @example
+       #   image_data = File.binread("photo.png")
+       #   att = Braintrust::Trace::Attachment.from_bytes("image/png", image_data)
+       def self.from_bytes(content_type, data)
+         new(content_type, data)
+       end
+
+       # Creates an attachment by reading from a file.
+       #
+       # @param content_type [String] MIME type of the file (e.g., "image/png")
+       # @param path [String] Path to the file to read
+       # @return [Attachment] New attachment instance
+       # @raise [Errno::ENOENT] If the file does not exist
+       #
+       # @example
+       #   att = Braintrust::Trace::Attachment.from_file("image/png", "./photo.png")
+       def self.from_file(content_type, path)
+         data = File.binread(path)
+         new(content_type, data)
+       end
+
+       # Creates an attachment by fetching data from a URL.
+       #
+       # The content type is inferred from the Content-Type header in the HTTP response.
+       # If the header is not present, it falls back to "application/octet-stream".
+       #
+       # @param url [String] URL to fetch
+       # @return [Attachment] New attachment instance
+       # @raise [StandardError] If the HTTP request fails
+       #
+       # @example
+       #   att = Braintrust::Trace::Attachment.from_url("https://example.com/image.png")
+       def self.from_url(url)
+         uri = URI.parse(url)
+         response = Net::HTTP.get_response(uri)
+
+         unless response.is_a?(Net::HTTPSuccess)
+           raise StandardError, "Failed to fetch URL: #{response.code} #{response.message}"
+         end
+
+         content_type = response.content_type || "application/octet-stream"
+         new(content_type, response.body)
+       end
+
+       # Converts the attachment to a data URL format.
+       #
+       # @return [String] Data URL in the format "data:<content-type>;base64,<encoded-data>"
+       #
+       # @example
+       #   att = Braintrust::Trace::Attachment.from_bytes("image/png", image_data)
+       #   att.to_data_url
+       #   # => "..."
+       def to_data_url
+         encoded = Base64.strict_encode64(@data)
+         "data:#{@content_type};base64,#{encoded}"
+       end
+
+       # Converts the attachment to a message format suitable for LLM APIs.
+       #
+       # @return [Hash] Message hash with "type" and "content" keys
+       #
+       # @example
+       #   att = Braintrust::Trace::Attachment.from_bytes("image/png", image_data)
+       #   att.to_message
+       #   # => {"type" => "base64_attachment", "content" => "data:image/png;base64,..."}
+       def to_message
+         {
+           "type" => "base64_attachment",
+           "content" => to_data_url
+         }
+       end
+
+       # Alias for {#to_message}. Converts the attachment to a hash representation.
+       #
+       # @return [Hash] Same as {#to_message}
+       alias_method :to_h, :to_message
+     end
+   end
+ end
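The data URL encoding used by `to_data_url` above can be tried without installing the gem. The sketch below reproduces the same `Base64.strict_encode64` step in a free-standing function and pairs it with a decoder. The decoder `parse_data_url` is a hypothetical addition for illustration only; the diff does not show any decoding method in the gem.

```ruby
require "base64"

# Same encoding step as Attachment#to_data_url in the diff above,
# written as a free-standing function so it runs without the gem.
def to_data_url(content_type, data)
  "data:#{content_type};base64,#{Base64.strict_encode64(data)}"
end

# Hypothetical inverse, added here only for illustration; the diff does
# not show a decoder in the gem.
def parse_data_url(data_url)
  header, encoded = data_url.split(",", 2)
  content_type = header.delete_prefix("data:").delete_suffix(";base64")
  [content_type, Base64.strict_decode64(encoded)]
end

bytes = "\x89PNG\r\n\x1A\n".b # the 8-byte PNG signature as binary data
url = to_data_url("image/png", bytes)
content_type, decoded = parse_data_url(url)

puts url
puts content_type
puts decoded == bytes
```

`strict_encode64` emits no line breaks, which keeps the whole data URL on one line and therefore safe to embed in a JSON string such as the `braintrust.input_json` span attribute.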