ai-agents 0.8.0 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 6391e30443ff9e226e6b3bf3f629f3cd996cbb7fe4479c7a7dc437c82544ae2f
- data.tar.gz: 22fc6ee4f3130006c1dd1f6c8fb704a176457d35602a434088d25cea5dd3c949
+ metadata.gz: df386be7e27f87111901954d72e4caa3e26a1d789ec113fbf9e2da9d2f87587e
+ data.tar.gz: f05b6827852966d0514abae61c7732a34ea92aeebe30855132ec8bee39c1e4c2
  SHA512:
- metadata.gz: ae3866cfbec885088c5b41b0e91bbc8532fc3115ed981c3428f56c09b40c1a75bc9bb1277ed1991f19cee4c43118c3f9a5d9ea8d6ecc07d29ad03117a77666e6
- data.tar.gz: b7de2e98dc3ce52b4c80b7b1bcfa4fb1f07f62176d861ab61b9a1092d2de4273657d54edfa18d6afd83acccee21c89e465f18539b8966c5f9645a673b4e5a3c4
+ metadata.gz: 2180b6b495519d34ff4762cd027d6079fb39ad91117242f6f366779fd7360c7bbb55521761e151ad246eee3004363de9bf768ca953a8e69a1e39e2f2dfeb5345
+ data.tar.gz: 41dadb09fd62a2ce47b063c6b85685f1efa8be73dbee467e83ad69cdb32793926aaadacbce7bfa820960c0d5f35e57325a7078733c19bef0e1121076bc6a9002
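The SHA256/SHA512 values in `checksums.yaml` can be recomputed locally with Ruby's stdlib to verify a downloaded gem. A minimal sketch; the file path is an assumption for illustration, not part of the diff:

```ruby
require "digest"

# Recompute a checksum for comparison against checksums.yaml.
# Point the path at a locally unpacked gem artifact (hypothetical path).
def sha256_hex(path)
  Digest::SHA256.file(path).hexdigest
end

# The same digest API works on raw strings, e.g. the standard "abc" test vector:
digest = Digest::SHA256.hexdigest("abc")
# => "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
```

`Digest::SHA512` works identically for the second block of checksums.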
data/.env.example ADDED
@@ -0,0 +1,6 @@
+ export OPENAI_API_KEY=sk-xxx
+ export OPENAI_MODEL=gpt-4.1-nano
+ export RUN_LIVE_LLM=true
+ export LANGFUSE_PUBLIC_KEY=pk-lf-xxx
+ export LANGFUSE_SECRET_KEY=sk-lf-xxx
+ export LANGFUSE_HOST=https://cloud.langfuse.com
data/CHANGELOG.md CHANGED
@@ -7,6 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [0.9.0] - 2026-02-09
+
+ ### Added
+ - **OpenTelemetry Instrumentation**: Optional OTel tracing for LLM calls, tool executions, and agent handoffs
+   - `Agents::Instrumentation.install(runner, tracer:)` registers tracing callbacks on any runner
+   - Produces nested spans: `agents.run` → `agents.run.tool.*` → `agents.run.generation` (GENERATION)
+   - Compatible with Langfuse and other OTel-compatible backends out of the box
+   - Supports `langfuse.session.id` via `context[:session_id]` for session grouping
+   - Custom static and dynamic span attributes via `span_attributes` and `attribute_provider`
+   - Idempotent installation with thread-safe mutex guard
+   - No hard dependency on `opentelemetry-api` — gracefully no-ops if the gem is absent
+ - **New callback events**: `on_chat_created` and `on_llm_call_complete` for hooking into the RubyLLM chat lifecycle
+ - Instrumentation guide at `docs/guides/instrumentation.md`
+ - Live instrumentation smoke tests against real LLM providers
+
+ ### Changed
+ - `CallbackManager` extended to support `chat_created` and `llm_call_complete` event types
+ - `Runner` and `ToolWrapper` now fire the new lifecycle callbacks
+
  ## [0.8.0] - 2026-01-07
 
  ### Added
data/CLAUDE.md CHANGED
@@ -237,3 +237,29 @@ The SDK includes a comprehensive callback system for monitoring agent execution
  Callbacks are thread-safe and non-blocking. If a callback raises an exception, it won't interrupt agent execution. The system uses a centralized CallbackManager for efficient event handling.
 
  For detailed callback documentation, see `docs/concepts/callbacks.md`.
+
+ ## OpenTelemetry Instrumentation
+
+ The SDK includes optional OpenTelemetry instrumentation (`lib/agents/instrumentation/`) that produces spans compatible with Langfuse and other OTel backends via `Agents::Instrumentation.install(runner, tracer:, ...)`.
+
+ ### Rules for Instrumentation Code
+
+ 1. **Double-counting prevention**: NEVER set `gen_ai.request.model` on container spans (`agents.run`, `agents.tool.*`). Only `agents.llm_call` GENERATION spans get this attribute. Langfuse sums costs from every span with this attribute — setting it on both parent and child causes double counting.
+
+ 2. **Langfuse needs BOTH trace-level and observation-level I/O**: Always set both `langfuse.trace.input`/`langfuse.trace.output` AND `langfuse.observation.input`/`langfuse.observation.output` on the root span. Trace-level shows at the top of the page; observation-level shows when you click the span in the sidebar. Missing one causes null/undefined display.
+
+ 3. **Never set attributes to empty strings**: Langfuse renders empty string attributes as "undefined". If a value is empty/nil, do NOT set the attribute at all. Use guards like `unless output.empty?`.
+
+ 4. **Hash/Array content must use `.to_json`, not `.to_s`**: When `response.content` is a Hash (from `response_schema` structured output), Ruby's `.to_s` produces `{"key" => "value"}` which is unreadable. Always check `content.is_a?(Hash) || content.is_a?(Array)` and use `.to_json`.
+
+ 5. **LLM span input = `chat.messages[0...-1]` as JSON**: Use `format_chat_messages(chat)` which returns the full chat history (excluding the current response) as a JSON array of `{role, content}` messages. This naturally includes tool results since they are part of `chat.messages`. Do NOT concatenate tool results into a flat string — keep the structured role separation.
+
+ 6. **Don't use `.delete` for shared tracing state**: If a value in `context[:__otel_tracing]` needs to be read by multiple callbacks, use `tracing[:key]` not `tracing.delete(:key)`. Delete is a destructive side-effect that breaks subsequent reads.
+
+ 7. **Per-call LLM spans via `on_end_message`**: Individual GENERATION spans are created by hooking into RubyLLM's `chat.on_end_message` (registered in `on_chat_created`). Each span is created and immediately finished. There is no `current_llm_span` in tracing state — only `current_tool_span` needs single-slot tracking.
+
+ 8. **Conversation history deduplication**: `Runner#last_message_matches?` checks if the last restored message already matches the current input. If so, uses `chat.complete` instead of `chat.ask(input)` to avoid sending the user message twice.
+
+ ### Reference: Chatwoot Instrumentation
+
+ The Chatwoot codebase at `~/work/chatwoot` has a working reference implementation in `lib/integrations/llm_instrumentation_spans.rb` and `lib/integrations/llm_instrumentation.rb`. Key patterns: `messages.to_json` for observation input, `message.content.to_s` for output, `chat.messages[0...-1]` for history.
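Rule 4 above (`.to_json` over `.to_s` for structured content) is easy to demonstrate in isolation. A small sketch with made-up content, mirroring the guard the rule prescribes:

```ruby
require "json"

# Hypothetical structured-output content, as a Ruby Hash.
content = { "status" => "ok", "items" => [1, 2] }

# The guard from rule 4: JSON-encode Hash/Array, stringify everything else.
serialized =
  if content.is_a?(Hash) || content.is_a?(Array)
    content.to_json
  else
    content.to_s
  end
# serialized is '{"status":"ok","items":[1,2]}', valid JSON,
# whereas Hash#to_s yields Ruby inspect-style output that no JSON parser accepts.
```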
data/README.md CHANGED
@@ -209,6 +209,22 @@ Agents.configure do |config|
  end
  ```
 
+ ## 🔍 Observability
+
+ Optional OpenTelemetry instrumentation for tracing agent execution, compatible with
+ [Langfuse](https://langfuse.com) and other OTel backends.
+
+ ```ruby
+ require 'agents/instrumentation'
+
+ tracer = OpenTelemetry.tracer_provider.tracer('my-app')
+ runner = Agents::Runner.with_agents(triage, billing, support)
+
+ Agents::Instrumentation.install(runner, tracer: tracer)
+ ```
+
+ See the [Instrumentation Guide](docs/guides/instrumentation.md) for setup details.
+
  ## 🤝 Contributing
 
  1. Fork the repository
data/docs/guides/instrumentation.md ADDED
@@ -0,0 +1,268 @@
+ ---
+ layout: default
+ title: OpenTelemetry Instrumentation
+ parent: Guides
+ nav_order: 7
+ ---
+
+ # OpenTelemetry Instrumentation
+
+ Trace agent execution, LLM calls, tool usage, and handoffs using OpenTelemetry. Compatible with [Langfuse](https://langfuse.com) and any OTel-compatible backend.
+
+ ## Overview
+
+ The `Agents::Instrumentation` module produces OTel spans that give you full visibility into agent execution:
+
+ - **LLM generation spans** with model name, token counts, and input/output
+ - **Tool execution spans** with arguments and results
+ - **Agent container spans** grouping related LLM and tool calls
+ - **Handoff events** recording agent-to-agent transfers
+
+ Spans follow the [GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) and include Langfuse-specific attributes for rich rendering in the Langfuse dashboard.
+
+ ## Setup
+
+ ### 1. Install dependencies
+
+ Add to your Gemfile:
+
+ ```ruby
+ gem "opentelemetry-sdk"
+ gem "opentelemetry-exporter-otlp"
+ ```
+
+ Then run `bundle install`.
+
+ ### 2. Configure the OTel SDK
+
+ ```ruby
+ require "opentelemetry-sdk"
+ require "opentelemetry-exporter-otlp"
+
+ OpenTelemetry::SDK.configure do |c|
+   c.add_span_processor(
+     OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(
+       OpenTelemetry::Exporter::OTLP::Exporter.new(
+         endpoint: "https://your-otel-endpoint/v1/traces",
+         headers: { "Authorization" => "Bearer YOUR_TOKEN" }
+       )
+     )
+   )
+ end
+ ```
+
+ ### 3. Install on a runner
+
+ ```ruby
+ require "agents/instrumentation"
+
+ tracer = OpenTelemetry.tracer_provider.tracer("my-app")
+ runner = Agents::Runner.with_agents(triage, billing, support)
+
+ Agents::Instrumentation.install(runner, tracer: tracer)
+ ```
+
+ That's it. Every `runner.run(...)` call now produces OTel spans.
+
+ ## Span Hierarchy
+
+ ```
+ root (agents.run)
+ ├── agents.run.agent.Calculator    # container span per agent (no model attr)
+ │   ├── agents.run.generation      # GENERATION: model, tokens, I/O
+ │   ├── agents.run.tool.add        # TOOL: arguments + result
+ │   └── agents.run.generation      # second LLM call after tool result
+ ├── agents.run.agent.Support       # after handoff
+ │   └── agents.run.generation
+ └── agents.run.handoff             # point event on root span
+ ```
+
+ **Only GENERATION spans carry `gen_ai.request.model`**. This prevents Langfuse from double-counting costs when it sums token usage across spans with a model attribute.
+
+ ## Configuration Options
+
+ ### `trace_name`
+
+ Custom name for the root span (default: `"agents.run"`):
+
+ ```ruby
+ Agents::Instrumentation.install(runner,
+   tracer: tracer,
+   trace_name: "customer_support.run"
+ )
+ ```
+
+ Child spans derive their names: `customer_support.run.generation`, `customer_support.run.tool.add_numbers`, etc.
+
+ ### `span_attributes`
+
+ Static attributes applied to the root span:
+
+ ```ruby
+ Agents::Instrumentation.install(runner,
+   tracer: tracer,
+   span_attributes: {
+     "langfuse.trace.tags" => '["production","v2"]',
+     "langfuse.session.id" => session_id
+   }
+ )
+ ```
+
+ ### `attribute_provider`
+
+ A lambda that receives the context wrapper and returns dynamic attributes:
+
+ ```ruby
+ Agents::Instrumentation.install(runner,
+   tracer: tracer,
+   attribute_provider: ->(ctx) {
+     {
+       "langfuse.user.id" => ctx.context[:user_id].to_s,
+       "langfuse.session.id" => ctx.context[:session_id].to_s
+     }
+   }
+ )
+ ```
+
+ ## Langfuse Integration
+
+ ### Endpoint and Authentication
+
+ Langfuse accepts OTel traces at `{LANGFUSE_HOST}/api/public/otel/v1/traces`. Authentication uses HTTP Basic with your public and secret keys:
+
+ ```ruby
+ require "base64"
+
+ langfuse_host = ENV["LANGFUSE_HOST"] # e.g. "https://cloud.langfuse.com"
+ langfuse_pk = ENV["LANGFUSE_PUBLIC_KEY"]
+ langfuse_sk = ENV["LANGFUSE_SECRET_KEY"]
+
+ auth_token = Base64.strict_encode64("#{langfuse_pk}:#{langfuse_sk}")
+
+ OpenTelemetry::SDK.configure do |c|
+   c.add_span_processor(
+     OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(
+       OpenTelemetry::Exporter::OTLP::Exporter.new(
+         endpoint: "#{langfuse_host}/api/public/otel/v1/traces",
+         headers: { "Authorization" => "Basic #{auth_token}" }
+       )
+     )
+   )
+ end
+ ```
+
+ ### Attribute Mapping
+
+ The instrumentation sets Langfuse-specific attributes that map to the Langfuse UI:
+
+ | Attribute | Set On | Langfuse Display |
+ |-----------|--------|------------------|
+ | `langfuse.trace.input` | Root span | Trace input (top of page) |
+ | `langfuse.trace.output` | Root span | Trace output (top of page) |
+ | `langfuse.observation.input` | All spans | Observation input (sidebar click) |
+ | `langfuse.observation.output` | All spans | Observation output (sidebar click) |
+ | `langfuse.observation.type` | Tool spans | `"tool"` type indicator |
+ | `langfuse.user.id` | Root span (via attribute_provider) | User filter/display |
+ | `langfuse.session.id` | Root span (via attribute_provider) | Session grouping |
+ | `langfuse.trace.tags` | Root span (via span_attributes) | Trace tags |
+ | `gen_ai.request.model` | Generation spans only | Model name + cost calculation |
+ | `gen_ai.usage.input_tokens` | Generation spans | Token usage |
+ | `gen_ai.usage.output_tokens` | Generation spans | Token usage |
+
+ ### EU vs US Cloud
+
+ - **US**: `https://cloud.langfuse.com`
+ - **EU**: `https://eu.cloud.langfuse.com`
+
+ Set `LANGFUSE_HOST` accordingly. Self-hosted instances use your own URL.
+
+ ## Complete Example
+
+ ```ruby
+ require "agents"
+ require "agents/instrumentation"
+ require "opentelemetry-sdk"
+ require "opentelemetry-exporter-otlp"
+ require "base64"
+
+ # --- Configure Agents ---
+ Agents.configure do |config|
+   config.openai_api_key = ENV["OPENAI_API_KEY"]
+   config.default_model = "gpt-4o-mini"
+ end
+
+ # --- Configure OTel with Langfuse ---
+ langfuse_host = ENV.fetch("LANGFUSE_HOST", "https://cloud.langfuse.com")
+ auth_token = Base64.strict_encode64(
+   "#{ENV["LANGFUSE_PUBLIC_KEY"]}:#{ENV["LANGFUSE_SECRET_KEY"]}"
+ )
+
+ OpenTelemetry::SDK.configure do |c|
+   c.add_span_processor(
+     OpenTelemetry::SDK::Trace::Export::BatchSpanProcessor.new(
+       OpenTelemetry::Exporter::OTLP::Exporter.new(
+         endpoint: "#{langfuse_host}/api/public/otel/v1/traces",
+         headers: { "Authorization" => "Basic #{auth_token}" }
+       )
+     )
+   )
+ end
+
+ tracer = OpenTelemetry.tracer_provider.tracer("my-app")
+
+ # --- Build agents ---
+ triage = Agents::Agent.new(name: "Triage", instructions: "Route users...")
+ billing = Agents::Agent.new(name: "Billing", instructions: "Handle billing...")
+ support = Agents::Agent.new(name: "Support", instructions: "Technical support...")
+
+ triage.register_handoffs(billing, support)
+ billing.register_handoffs(triage)
+ support.register_handoffs(triage)
+
+ # --- Create runner with instrumentation ---
+ runner = Agents::Runner.with_agents(triage, billing, support)
+
+ Agents::Instrumentation.install(runner,
+   tracer: tracer,
+   trace_name: "customer_support",
+   attribute_provider: ->(ctx) {
+     {
+       "langfuse.user.id" => ctx.context[:user_id].to_s,
+       "langfuse.session.id" => ctx.context[:session_id].to_s
+     }
+   }
+ )
+
+ # --- Run conversations ---
+ result = runner.run("I have a billing question",
+   context: { user_id: "user_123", session_id: "sess_456" })
+
+ puts result.output
+
+ # Ensure spans are flushed before exit
+ at_exit { OpenTelemetry.tracer_provider.force_flush }
+ ```
+
+ ## Troubleshooting
+
+ ### "undefined" values in Langfuse
+
+ Langfuse renders empty string attributes as "undefined". The instrumentation guards against this by not setting attributes when values are nil or empty. If you see "undefined", check that your agents are producing output content.
+
+ ### Double-counted costs
+
+ If token costs appear inflated, verify that `gen_ai.request.model` is only set on GENERATION spans, not on container or root spans. The built-in instrumentation handles this correctly. If you set custom `span_attributes` that include `gen_ai.request.model`, costs will be double-counted.
+
+ ### Empty spans / missing data
+
+ - Ensure `opentelemetry-sdk` is installed (not just `opentelemetry-api`)
+ - Call `OpenTelemetry.tracer_provider.force_flush` before process exit
+ - Verify your OTLP endpoint is reachable and credentials are correct
+ - Check that `Agents::Instrumentation.install` returns the runner (returns nil if OTel is unavailable)
+
+ ### Spans not appearing in Langfuse
+
+ - Verify the endpoint includes `/api/public/otel/v1/traces`
+ - Check that the Authorization header uses `Basic` (not `Bearer`) with base64-encoded `pk:sk`
+ - Use `BatchSpanProcessor` for production; `SimpleSpanProcessor` can be useful for debugging
+ - **SSL CRL errors on Ruby 3.4+**: The OTLP exporter silently fails when SSL certificate revocation list (CRL) checks fail. The exporter reports SUCCESS but no data arrives. Fix by passing `ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE` to the exporter in development, or ensure your system CA certificates are up to date
data/docs/guides.md CHANGED
@@ -18,3 +18,4 @@ Practical guides for building real-world applications with the AI Agents library
  - **[State Persistence](guides/state-persistence.html)** - Managing conversation state and context across sessions and processes
  - **[Structured Output](guides/structured-output.html)** - Enforcing JSON schema validation for reliable agent responses
  - **[Custom Request Headers](guides/request-headers.html)** - Adding custom HTTP headers for authentication, tracking, and provider-specific features
+ - **[OpenTelemetry Instrumentation](guides/instrumentation.html)** - Trace agent execution with Langfuse and other OTel backends
@@ -3,8 +3,13 @@
 
  require "json"
  require "readline"
+ require "securerandom"
  require_relative "../../lib/agents"
  require_relative "agents_factory"
+ require_relative "../../lib/agents/instrumentation"
+ require "opentelemetry-sdk"
+ require "opentelemetry-exporter-otlp"
+ require "base64"
 
  # Simple ISP Customer Support Demo
  class ISPSupportDemo
@@ -27,10 +32,15 @@ class ISPSupportDemo
    # Setup real-time callbacks for UI feedback
    setup_callbacks
 
-   @context = {}
+   # Setup OpenTelemetry instrumentation with Langfuse
+   setup_instrumentation
+
+   @session_id = SecureRandom.uuid
+   @context = { session_id: @session_id }
    @current_status = ""
 
    puts green("🏢 Welcome to ISP Customer Support!")
+   puts dim_text("Session ID: #{@session_id}")
    puts dim_text("Type '/help' for commands or 'exit' to quit.")
    puts
  end
@@ -89,6 +99,33 @@ class ISPSupportDemo
 
  private
 
+ def setup_instrumentation
+   host = ENV["LANGFUSE_HOST"]
+   pub_key = ENV["LANGFUSE_PUBLIC_KEY"]
+   sec_key = ENV["LANGFUSE_SECRET_KEY"]
+   unless host && pub_key && sec_key
+     return puts dim_text("⚠️ Langfuse env vars not set — running without instrumentation.")
+   end
+
+   configure_otel_exporter(host, pub_key, sec_key)
+   tracer = OpenTelemetry.tracer_provider.tracer("isp-support-demo")
+   Agents::Instrumentation.install(@runner, tracer: tracer, trace_name: "isp-support")
+   puts green("📡 Langfuse instrumentation enabled — traces → #{host}")
+ end
+
+ def configure_otel_exporter(host, pub_key, sec_key)
+   endpoint = "#{host}/api/public/otel/v1/traces"
+   auth = Base64.strict_encode64("#{pub_key}:#{sec_key}")
+   otlp = OpenTelemetry::Exporter::OTLP::Exporter.new(
+     endpoint: endpoint,
+     headers: { "Authorization" => "Basic #{auth}" },
+     ssl_verify_mode: OpenSSL::SSL::VERIFY_NONE
+   )
+   OpenTelemetry::SDK.configure do |c|
+     c.add_span_processor(OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(otlp))
+   end
+ end
+
  def setup_callbacks
    @callback_messages = []
 
@@ -132,6 +169,7 @@ class ISPSupportDemo
    case input.downcase
    when "exit", "quit"
      dump_context_and_quit
+     flush_traces
      puts "👋 Goodbye!"
      :exit
    when "/help"
@@ -155,6 +193,13 @@ class ISPSupportDemo
      end
    end
 
+ def flush_traces
+   OpenTelemetry.tracer_provider.force_flush
+   puts dim_text("📡 Traces flushed to Langfuse.")
+ rescue StandardError
+   # OTel not configured — nothing to flush
+ end
+
  def dump_context_and_quit
    project_root = File.expand_path("../..", __dir__)
    tmp_directory = File.join(project_root, "tmp")
@@ -50,7 +50,9 @@ module Agents
      tool_start: [],
      tool_complete: [],
      agent_thinking: [],
-     agent_handoff: []
+     agent_handoff: [],
+     llm_call_complete: [],
+     chat_created: []
    }
  end
 
@@ -164,6 +166,31 @@ module Agents
    self
  end
 
+ # Register a callback for LLM call completion events.
+ # Called after each LLM call completes with model and token usage info.
+ #
+ # @param block [Proc] Callback block that receives (agent_name, model, response, context_wrapper)
+ # @return [self] For method chaining
+ def on_llm_call_complete(&block)
+   return self unless block
+
+   @callbacks_mutex.synchronize { @callbacks[:llm_call_complete] << block }
+   self
+ end
+
+ # Register a callback for chat created events.
+ # Called when a RubyLLM Chat object is created or reconfigured after handoff.
+ # Useful for registering per-message hooks (e.g. on_end_message) on the chat.
+ #
+ # @param block [Proc] Callback block that receives (chat, agent_name, model, context_wrapper)
+ # @return [self] For method chaining
+ def on_chat_created(&block)
+   return self unless block
+
+   @callbacks_mutex.synchronize { @callbacks[:chat_created] << block }
+   self
+ end
+
  private
 
  # Build agent registry from provided agents only.
@@ -20,13 +20,20 @@ module Agents
    tool_complete
    agent_thinking
    agent_handoff
+   llm_call_complete
+   chat_created
  ].freeze
 
  def initialize(callbacks = {})
    @callbacks = callbacks.dup.freeze
  end
 
- # Generic method to emit any callback event type
+ # Generic method to emit any callback event type.
+ # Handles arity-aware dispatch: lambdas with strict arity receive only the
+ # arguments they expect (extra trailing args are sliced off), while procs
+ # and blocks (which have flexible arity) receive all arguments.
+ # This ensures backwards compatibility when new arguments (e.g. context_wrapper)
+ # are appended to existing callback signatures.
  #
  # @param event_type [Symbol] The type of event to emit
  # @param args [Array] Arguments to pass to callbacks
@@ -34,7 +41,8 @@ module Agents
    callback_list = @callbacks[event_type] || []
 
    callback_list.each do |callback|
-     callback.call(*args)
+     safe_args = arity_safe_args(callback, args)
+     callback.call(*safe_args)
    rescue StandardError => e
      # Log callback errors but don't let them crash execution
      warn "Callback error for #{event_type}: #{e.message}"
@@ -53,5 +61,22 @@ module Agents
      emit(event_type, *args)
    end
  end
+
+ private
+
+ # Returns args sliced to fit the callback's accepted parameter count.
+ #
+ # Non-lambda procs/blocks silently ignore extra args, so they always get all args.
+ # Lambdas enforce strict argument counts and will raise ArgumentError on extras —
+ # even lambdas with optional params (e.g. ->(a, b, c = nil) {}) have a max.
+ # We inspect #parameters to compute the max and slice accordingly.
+ # Lambdas with a *rest parameter accept unlimited args, so we pass everything.
+ def arity_safe_args(callback, args)
+   return args unless callback.lambda?
+   return args if callback.parameters.any? { |type, _| type == :rest }
+
+   max = callback.parameters.count { |type, _| %i[req opt].include?(type) }
+   args.first(max)
+ end
  end
 end
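The slicing behavior documented in `arity_safe_args` can be exercised standalone. This sketch reuses the method body from the diff, defined at top level purely for demonstration; the callback names are made up:

```ruby
# Reproduction of the diff's arity_safe_args, outside its class for demonstration.
def arity_safe_args(callback, args)
  return args unless callback.lambda?
  return args if callback.parameters.any? { |type, _| type == :rest }

  max = callback.parameters.count { |type, _| %i[req opt].include?(type) }
  args.first(max)
end

strict  = ->(agent, input) { [agent, input] }    # lambda: accepts at most 2 args
relaxed = proc { |agent, input| [agent, input] } # proc: silently ignores extras
splat   = ->(*events) { events }                 # rest param: unlimited args

a = arity_safe_args(strict,  %w[triage hello ctx]) # => ["triage", "hello"]
b = arity_safe_args(relaxed, %w[triage hello ctx]) # => ["triage", "hello", "ctx"]
c = arity_safe_args(splat,   %w[triage hello ctx]) # => ["triage", "hello", "ctx"]
```

This is why appending `context_wrapper` to existing callback signatures does not break older two-argument lambda callbacks.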
@@ -0,0 +1,35 @@
+ # frozen_string_literal: true
+
+ module Agents
+   module Instrumentation
+     # OpenTelemetry attribute name constants for LLM observability.
+     # These follow the GenAI semantic conventions and Langfuse's OTel attribute mapping.
+     #
+     # @see https://langfuse.com/integrations/native/opentelemetry#property-mapping
+     module Constants
+       # Span names
+       SPAN_RUN = "agents.run"
+       SPAN_LLM_CALL = "agents.llm_call"
+       SPAN_TOOL = "agents.tool.%s"
+       EVENT_HANDOFF = "agents.handoff"
+
+       # GenAI semantic conventions (ONLY on generation spans)
+       ATTR_GEN_AI_REQUEST_MODEL = "gen_ai.request.model"
+       ATTR_GEN_AI_PROVIDER = "gen_ai.provider.name"
+       ATTR_GEN_AI_USAGE_INPUT = "gen_ai.usage.input_tokens"
+       ATTR_GEN_AI_USAGE_OUTPUT = "gen_ai.usage.output_tokens"
+
+       # Langfuse trace-level attributes
+       ATTR_LANGFUSE_USER_ID = "langfuse.user.id"
+       ATTR_LANGFUSE_SESSION_ID = "langfuse.session.id"
+       ATTR_LANGFUSE_TRACE_TAGS = "langfuse.trace.tags"
+       ATTR_LANGFUSE_TRACE_INPUT = "langfuse.trace.input"
+       ATTR_LANGFUSE_TRACE_OUTPUT = "langfuse.trace.output"
+
+       # Langfuse observation-level attributes
+       ATTR_LANGFUSE_OBS_TYPE = "langfuse.observation.type"
+       ATTR_LANGFUSE_OBS_INPUT = "langfuse.observation.input"
+       ATTR_LANGFUSE_OBS_OUTPUT = "langfuse.observation.output"
+     end
+   end
+ end
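The `%s` placeholder in `SPAN_TOOL` (and the templates derived from `trace_name` at runtime) is filled with `Kernel#format`. A quick sketch; the tool name is made up:

```ruby
SPAN_TOOL = "agents.tool.%s" # as defined in Constants

span_name = format(SPAN_TOOL, "add_numbers")
# => "agents.tool.add_numbers"

# TracingCallbacks builds the same shape from a custom trace_name:
tool_template = "customer_support.tool.%s"
custom_name = format(tool_template, "add_numbers")
# => "customer_support.tool.add_numbers"
```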
@@ -0,0 +1,339 @@
+ # frozen_string_literal: true
+
+ require "json"
+
+ module Agents
+   module Instrumentation
+     # Produces OTel spans for agent execution, compatible with Langfuse.
+     #
+     # Span hierarchy:
+     #   root (<trace_name>)
+     #   ├── agent.<name>     ← container per agent (no gen_ai.request.model)
+     #   │   ├── .generation  ← GENERATION with model + tokens
+     #   │   └── .tool.<name> ← TOOL observation
+     #   └── .handoff         ← point event on root
+     #
+     # Only GENERATION spans carry gen_ai.request.model to avoid Langfuse double-counting costs.
+     # Tracing state lives in context[:__otel_tracing], unique per run (thread-safe).
+     class TracingCallbacks
+       include Constants
+
+       def initialize(tracer:, trace_name: SPAN_RUN, span_attributes: {}, attribute_provider: nil)
+         @tracer = tracer
+         @trace_name = trace_name
+         @llm_span_name = "#{trace_name}.generation"
+         @tool_span_name = "#{trace_name}.tool.%s"
+         @agent_span_name = "#{trace_name}.agent.%s"
+         @handoff_event_name = "#{trace_name}.handoff"
+         @span_attributes = span_attributes
+         @attribute_provider = attribute_provider
+       end
+
+       def on_run_start(agent_name, input, context_wrapper)
+         attributes = build_root_attributes(agent_name, input, context_wrapper)
+
+         root_span = @tracer.start_span(@trace_name, attributes: attributes)
+         root_context = OpenTelemetry::Trace.context_with_span(root_span)
+
+         store_tracing_state(context_wrapper,
+                             root_span: root_span,
+                             root_context: root_context,
+                             current_tool_span: nil,
+                             current_agent_name: nil,
+                             current_agent_span: nil,
+                             current_agent_context: nil)
+       end
+
+       def on_agent_thinking(agent_name, input, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         tracing[:pending_llm_input] = input.to_s
+
+         return if tracing[:current_agent_name] == agent_name
+
+         start_agent_span(tracing, agent_name)
+       end
+
+       # No-op: LLM spans are handled by on_end_message hook (see on_chat_created).
+       # Kept because the callback interface requires it.
+       def on_llm_call_complete(_agent_name, _model, _response, _context_wrapper); end
+
+       def on_agent_complete(_agent_name, _result, _error, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         finish_agent_span(tracing)
+       end
+
+       def on_chat_created(chat, agent_name, model, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         chat.on_end_message do |message|
+           handle_end_message(chat, agent_name, model, message, context_wrapper)
+         end
+       end
+
+       def on_tool_start(tool_name, args, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         span_name = format(@tool_span_name, tool_name)
+         attributes = {
+           ATTR_LANGFUSE_OBS_TYPE => "tool",
+           ATTR_LANGFUSE_OBS_INPUT => serialize_output(args)
+         }
+
+         parent = handoff_tool?(tool_name) ? tracing[:root_context] : parent_context(tracing)
+         tool_span = @tracer.start_span(
+           span_name,
+           with_parent: parent,
+           attributes: attributes
+         )
+
+         tracing[:current_tool_span] = tool_span
+       end
+
+       def on_tool_complete(_tool_name, result, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         tool_span = tracing[:current_tool_span]
+         return unless tool_span
+
+         tool_span.set_attribute(ATTR_LANGFUSE_OBS_OUTPUT, serialize_output(result))
+         tool_span.finish
+         tracing[:current_tool_span] = nil
+       end
+
+       def on_agent_handoff(from_agent, to_agent, reason, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         tracing[:root_span]&.add_event(
+           @handoff_event_name,
+           attributes: {
+             "handoff.from" => from_agent,
+             "handoff.to" => to_agent,
+             "handoff.reason" => reason.to_s
+           }
+         )
+       end
+
+       def on_run_complete(_agent_name, result, context_wrapper)
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         finish_dangling_spans(tracing)
+
+         root_span = tracing[:root_span]
+         return unless root_span
+
+         set_run_output_attributes(root_span, result)
+         set_run_error_status(root_span, result)
+
+         root_span.finish
+         cleanup_tracing_state(context_wrapper)
+       end
+
+       private
+
+       def handle_end_message(chat, _agent_name, model, message, context_wrapper)
+         return unless message.respond_to?(:role) && message.role == :assistant
+
+         tracing = tracing_state(context_wrapper)
+         return unless tracing
+
+         input = format_chat_messages(chat)
+         attrs = {}
+         attrs[ATTR_LANGFUSE_OBS_INPUT] = input if input
+         llm_span = @tracer.start_span(@llm_span_name, with_parent: parent_context(tracing), attributes: attrs)
+
+         llm_span.set_attribute(ATTR_GEN_AI_REQUEST_MODEL, model) if model
+         set_llm_response_attributes(llm_span, message)
+
+         output = llm_output_text(message)
+         tracing[:last_agent_output] = output unless output.empty?
+
+         llm_span.finish
+       end
+
+       def finish_dangling_spans(tracing)
+         if tracing[:current_tool_span]
+           tracing[:current_tool_span].finish
+           tracing[:current_tool_span] = nil
+         end
+         finish_agent_span(tracing)
+       end
+
+       def set_run_output_attributes(root_span, result)
+         return unless result.respond_to?(:output)
+
+         output_text = serialize_output(result.output)
+         return if output_text.empty?
+
+         root_span.set_attribute(ATTR_LANGFUSE_TRACE_OUTPUT, output_text)
+         root_span.set_attribute(ATTR_LANGFUSE_OBS_OUTPUT, output_text)
+       end
+
+       def set_run_error_status(root_span, result)
+         return unless result.respond_to?(:error)
+
+         error = result.error
+         return unless error
+
+         root_span.record_exception(error)
+         root_span.status = OpenTelemetry::Trace::Status.error(error.message)
+       end
+
+       def set_llm_response_attributes(span, response)
+         if response.respond_to?(:input_tokens) && response.input_tokens
+           span.set_attribute(ATTR_GEN_AI_USAGE_INPUT, response.input_tokens)
+         end
+         if response.respond_to?(:output_tokens) && response.output_tokens
+           span.set_attribute(ATTR_GEN_AI_USAGE_OUTPUT, response.output_tokens)
+         end
+         output = llm_output_text(response)
+         span.set_attribute(ATTR_LANGFUSE_OBS_OUTPUT, output) unless output.empty?
+       end
+
+       # Falls back to formatting tool calls when response has no text content,
+       # and uses .to_json for Hash/Array (structured output) to avoid Ruby's .to_s format.
+       def llm_output_text(response)
+         return format_tool_calls(response) unless response.respond_to?(:content)
+
+         content = response.content
+         return format_tool_calls(response) if content.nil?
+
+         text = content.is_a?(Hash) || content.is_a?(Array) ? content.to_json : content.to_s
+         return format_tool_calls(response) if text.empty?
+
+         text
+       end
+
+       # Excludes the last message (current response) — returns what was sent to the LLM.
+       def format_chat_messages(chat)
+         return nil unless chat.respond_to?(:messages)
+
+         messages = chat.messages
+         return nil if messages.nil? || messages.empty?
+
+         messages[0...-1].map { |m| format_single_message(m) }.to_json
+       end
+
+       def format_single_message(msg)
+         text = serialize_content(msg.content)
+         text = append_tool_calls(msg, text)
+         { role: msg.role.to_s, content: text }
+       end
+
+       def serialize_content(content)
+         content.is_a?(Hash) || content.is_a?(Array) ? content.to_json : content.to_s
+       end
+
+       def append_tool_calls(msg, text)
+         return text unless msg.role == :assistant && msg.respond_to?(:tool_calls) && msg.tool_calls&.any?
+
+         calls = msg.tool_calls.values.map { |tc| "#{tc.name}(#{tc.arguments.to_json})" }.join(", ")
+         text.empty? ? "Tool calls: #{calls}" : "#{text}\nTool calls: #{calls}"
+       end
+
+       def serialize_output(value)
+         value.is_a?(Hash) || value.is_a?(Array) ? value.to_json : value.to_s
244
+ end
245
+
246
+ def format_tool_calls(response)
247
+ return "" unless response.respond_to?(:tool_calls) && response.tool_calls&.any?
248
+
249
+ calls = response.tool_calls.values.map do |tc|
250
+ "#{tc.name}(#{serialize_output(tc.arguments)})"
251
+ end
252
+ "Tool calls: #{calls.join(", ")}"
253
+ end
254
+
255
+ def start_agent_span(tracing, agent_name)
256
+ finish_agent_span(tracing) # close previous agent span if missed
257
+
258
+ span_name = format(@agent_span_name, agent_name)
259
+ attrs = { "agent.name" => agent_name }
260
+ input = tracing[:pending_llm_input]
261
+ attrs[ATTR_LANGFUSE_OBS_INPUT] = input if input && !input.empty?
262
+
263
+ agent_span = @tracer.start_span(span_name,
264
+ with_parent: tracing[:root_context],
265
+ attributes: attrs)
266
+ agent_context = OpenTelemetry::Trace.context_with_span(agent_span)
267
+
268
+ tracing[:current_agent_name] = agent_name
269
+ tracing[:current_agent_span] = agent_span
270
+ tracing[:current_agent_context] = agent_context
271
+ tracing[:last_agent_output] = nil
272
+ end
273
+
274
+ def finish_agent_span(tracing)
275
+ return unless tracing[:current_agent_span]
276
+
277
+ last_output = tracing[:last_agent_output]
278
+ if last_output && !last_output.empty?
279
+ tracing[:current_agent_span].set_attribute(ATTR_LANGFUSE_OBS_OUTPUT, last_output)
280
+ end
281
+
282
+ tracing[:current_agent_span].finish
283
+ tracing[:current_agent_name] = nil
284
+ tracing[:current_agent_span] = nil
285
+ tracing[:current_agent_context] = nil
286
+ tracing[:last_agent_output] = nil
287
+ end
288
+
289
+ def parent_context(tracing)
290
+ tracing[:current_agent_context] || tracing[:root_context]
291
+ end
292
+
293
+ def handoff_tool?(tool_name)
294
+ tool_name.to_s.start_with?("handoff_to_")
295
+ end
296
+
297
+ def build_root_attributes(agent_name, input, context_wrapper)
298
+ attributes = @span_attributes.dup
299
+ apply_session_id(attributes, context_wrapper)
300
+ apply_input(attributes, input)
301
+ attributes["agent.name"] = agent_name
302
+ apply_dynamic_attributes(attributes, context_wrapper)
303
+ attributes
304
+ end
305
+
306
+ def apply_session_id(attributes, context_wrapper)
307
+ session_id = context_wrapper&.context&.dig(:session_id)&.to_s
308
+ attributes[ATTR_LANGFUSE_SESSION_ID] = session_id if session_id && !session_id.empty?
309
+ end
310
+
311
+ def apply_input(attributes, input)
312
+ serialized_input = serialize_output(input)
313
+ return if serialized_input.empty?
314
+
315
+ attributes[ATTR_LANGFUSE_TRACE_INPUT] = serialized_input
316
+ attributes[ATTR_LANGFUSE_OBS_INPUT] = serialized_input
317
+ end
318
+
319
+ def apply_dynamic_attributes(attributes, context_wrapper)
320
+ return unless @attribute_provider
321
+
322
+ dynamic_attrs = @attribute_provider.call(context_wrapper)
323
+ attributes.merge!(dynamic_attrs) if dynamic_attrs.is_a?(Hash)
324
+ end
325
+
326
+ def store_tracing_state(context_wrapper, **state)
327
+ context_wrapper.context[:__otel_tracing] = state
328
+ end
329
+
330
+ def tracing_state(context_wrapper)
331
+ context_wrapper&.context&.dig(:__otel_tracing)
332
+ end
333
+
334
+ def cleanup_tracing_state(context_wrapper)
335
+ context_wrapper.context.delete(:__otel_tracing)
336
+ end
337
+ end
338
+ end
339
+ end
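The serialization helpers above (`serialize_output`, `serialize_content`, `llm_output_text`) all apply one rule: Hash and Array values are JSON-encoded so tracing backends see `{"a":1}` rather than Ruby's inspect form `{:a=>1}`; everything else goes through `to_s`. A standalone sketch of that rule, in plain Ruby and independent of the gem:

```ruby
require "json"

# Re-implementation (for illustration only, not the gem's code) of the
# structured-output serialization rule used by the tracing callbacks:
# JSON for Hash/Array, to_s for everything else (nil becomes "").
def serialize_output(value)
  value.is_a?(Hash) || value.is_a?(Array) ? value.to_json : value.to_s
end

serialize_output({ answer: 42 }) # JSON text, not Ruby's {:answer=>42}
serialize_output([1, 2])
serialize_output("plain text")
serialize_output(nil)            # empty string, which callers treat as "no output"
```

The empty-string convention matters: the callbacks test `output.empty?` before setting span attributes, so `nil` outputs simply produce no attribute.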
@@ -0,0 +1,109 @@
+# frozen_string_literal: true
+
+require_relative "instrumentation/constants"
+require_relative "instrumentation/tracing_callbacks"
+
+module Agents
+  # Optional OpenTelemetry instrumentation for the ai-agents gem.
+  # Emits OTel spans for LLM calls, tool executions, and agent handoffs
+  # that render correctly in Langfuse and other OTel-compatible backends.
+  #
+  # The gem only emits spans — the consumer configures the OTel exporter
+  # and provides a tracer. The opentelemetry-api gem is NOT declared as a
+  # dependency; consumers must include it in their own bundle.
+  #
+  # @example Basic usage
+  #   require 'agents/instrumentation'
+  #
+  #   tracer = OpenTelemetry.tracer_provider.tracer('my_app')
+  #   runner = Agents::Runner.with_agents(triage, billing, support)
+  #
+  #   Agents::Instrumentation.install(runner, tracer: tracer)
+  #
+  # @example With custom trace name
+  #   Agents::Instrumentation.install(runner,
+  #     tracer: tracer,
+  #     trace_name: 'customer_support.run'
+  #   )
+  #
+  # @example With Langfuse attributes
+  #   Agents::Instrumentation.install(runner,
+  #     tracer: tracer,
+  #     span_attributes: { 'langfuse.trace.tags' => ['v2'].to_json },
+  #     attribute_provider: ->(ctx) {
+  #       { 'langfuse.user.id' => ctx.context[:account_id].to_s }
+  #     }
+  #   )
+  module Instrumentation
+    INSTALL_MUTEX = Mutex.new
+    private_constant :INSTALL_MUTEX
+
+    INSTRUMENTATION_FLAG_IVAR = :@__agents_otel_instrumentation_installed
+    private_constant :INSTRUMENTATION_FLAG_IVAR
+
+    # Install OTel tracing on a runner via callbacks.
+    # No-op if opentelemetry-api is not available.
+    # Idempotent per runner instance: first install wins.
+    #
+    # Session grouping: set `context[:session_id]` when calling `runner.run()`.
+    # TracingCallbacks automatically reads it per-request and sets `langfuse.session.id`.
+    #
+    # @param runner [Agents::AgentRunner] The runner to instrument
+    # @param tracer [OpenTelemetry::Trace::Tracer] OTel tracer instance
+    # @param trace_name [String] Name for the root span (default: "agents.run")
+    # @param span_attributes [Hash] Static attributes applied to the root span
+    # @param attribute_provider [Proc, nil] Lambda receiving context_wrapper, returning dynamic attributes
+    # @return [Agents::AgentRunner, nil] The runner (for chaining), or nil if OTel is unavailable
+    def self.install(runner, tracer:, trace_name: Constants::SPAN_RUN, span_attributes: {},
+                     attribute_provider: nil)
+      return unless otel_available?
+
+      INSTALL_MUTEX.synchronize do
+        return runner if instrumentation_installed?(runner)
+
+        callbacks = TracingCallbacks.new(
+          tracer: tracer,
+          trace_name: trace_name,
+          span_attributes: span_attributes,
+          attribute_provider: attribute_provider
+        )
+
+        register_callbacks(runner, callbacks)
+        mark_instrumentation_installed(runner)
+      end
+      runner
+    end
+
+    # Callback event types that are forwarded from the runner to TracingCallbacks.
+    TRACED_EVENTS = CallbackManager::EVENT_TYPES
+    private_constant :TRACED_EVENTS
+
+    # Register all tracing callback handlers on the runner.
+    def self.register_callbacks(runner, callbacks)
+      TRACED_EVENTS.each do |event|
+        runner.public_send(:"on_#{event}") { |*args| callbacks.public_send(:"on_#{event}", *args) }
+      end
+    end
+    private_class_method :register_callbacks
+
+    def self.instrumentation_installed?(runner)
+      runner.instance_variable_get(INSTRUMENTATION_FLAG_IVAR)
+    end
+    private_class_method :instrumentation_installed?
+
+    def self.mark_instrumentation_installed(runner)
+      runner.instance_variable_set(INSTRUMENTATION_FLAG_IVAR, true)
+    end
+    private_class_method :mark_instrumentation_installed
+
+    # Check if the opentelemetry-api gem is available.
+    #
+    # @return [Boolean] true if opentelemetry-api can be loaded
+    def self.otel_available?
+      require "opentelemetry-api"
+      true
+    rescue LoadError
+      false
+    end
+  end
+end
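`Instrumentation.install` guards against double installation with a shared `Mutex` plus a per-runner instance-variable flag. A minimal plain-Ruby sketch of that install-once pattern, with no gem dependency; `Installer`, `registrations`, and the `:@__installed` flag are illustrative names, not the gem's:

```ruby
# Sketch of the install-once guard: a shared Mutex serializes concurrent
# installs, and a per-object ivar flag makes repeat installs a no-op.
class Installer
  MUTEX = Mutex.new
  FLAG = :@__installed

  # Counts how many times "registration" actually ran.
  def self.registrations
    @registrations ||= 0
  end

  def self.install(target)
    MUTEX.synchronize do
      return target if target.instance_variable_get(FLAG)

      @registrations = registrations + 1 # stand-in for callback registration
      target.instance_variable_set(FLAG, true)
    end
    target
  end
end

runner = Object.new
Installer.install(runner)
Installer.install(runner) # second call short-circuits inside the mutex
```

Marking the runner itself (rather than keeping a global registry) means the guard is per instance, so separate runners can each be instrumented once.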
data/lib/agents/runner.rb CHANGED
@@ -104,6 +104,8 @@ module Agents
       apply_headers(chat, current_headers)
       configure_chat_for_agent(chat, current_agent, context_wrapper, replace: false)
       restore_conversation_history(chat, context_wrapper)
+      input_already_in_history = last_message_matches?(chat, input)
+      context_wrapper.callback_manager.emit_chat_created(chat, current_agent.name, current_agent.model, context_wrapper)
 
       loop do
         current_turn += 1
@@ -112,16 +114,24 @@ module Agents
         # Get response from LLM (RubyLLM handles tool execution with halting based handoff detection)
         result = if current_turn == 1
                    # Emit agent thinking event for initial message
-                   context_wrapper.callback_manager.emit_agent_thinking(current_agent.name, input)
-                   chat.ask(input)
+                   context_wrapper.callback_manager.emit_agent_thinking(current_agent.name, input, context_wrapper)
+                   # If conversation history already ends with this user message (e.g. passed
+                   # in via context from an external system), use complete to avoid duplicating it.
+                   input_already_in_history ? chat.complete : chat.ask(input)
                  else
                    # Emit agent thinking event for continuation
-                   context_wrapper.callback_manager.emit_agent_thinking(current_agent.name, "(continuing conversation)")
+                   context_wrapper.callback_manager.emit_agent_thinking(current_agent.name, "(continuing conversation)",
+                                                                        context_wrapper)
                    chat.complete
                  end
         response = result
         track_usage(response, context_wrapper)
 
+        # Emit LLM call complete event with model and response for instrumentation
+        context_wrapper.callback_manager.emit_llm_call_complete(
+          current_agent.name, current_agent.model, response, context_wrapper
+        )
+
         # Check for handoff via RubyLLM's halt mechanism
         if response.is_a?(RubyLLM::Tool::Halt) && context_wrapper.context[:pending_handoff]
           handoff_info = context_wrapper.context.delete(:pending_handoff)
@@ -155,7 +165,8 @@ module Agents
           context_wrapper.callback_manager.emit_agent_complete(current_agent.name, nil, nil, context_wrapper)
 
           # Emit agent handoff event
-          context_wrapper.callback_manager.emit_agent_handoff(current_agent.name, next_agent.name, "handoff")
+          context_wrapper.callback_manager.emit_agent_handoff(current_agent.name, next_agent.name, "handoff",
+                                                              context_wrapper)
 
           # Switch to new agent - store agent name for persistence
           current_agent = next_agent
@@ -166,6 +177,9 @@ module Agents
           agent_headers = Helpers::Headers.normalize(current_agent.headers)
           current_headers = Helpers::Headers.merge(agent_headers, runtime_headers)
           apply_headers(chat, current_headers)
+          context_wrapper.callback_manager.emit_chat_created(
+            chat, current_agent.name, current_agent.model, context_wrapper
+          )
 
           # Force the new agent to respond to the conversation context
           # This ensures the user gets a response from the new agent
@@ -409,6 +423,21 @@ module Agents
       chat
     end
 
+    # Check if the last message in the chat already matches the user's input.
+    # This happens when an external system (e.g. Chatwoot) includes the current
+    # user message in the conversation history passed via context.
+    #
+    # TODO: This .to_s == .to_s comparison is a best-effort safety net and is
+    # brittle for edge cases (trailing whitespace, Hash/JSON round-tripping).
+    # The proper fix is for callers to pass nil when input is already present
+    # in conversation history, similar to the handoff continuation path.
+    def last_message_matches?(chat, input)
+      return false unless input && chat.respond_to?(:messages)
+
+      last_msg = chat.messages.last
+      last_msg && last_msg.role == :user && last_msg.content.to_s == input.to_s
+    end
+
     def apply_headers(chat, headers)
       return if headers.empty?
 
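The new `last_message_matches?` guard in `runner.rb` decides between `chat.ask(input)` and `chat.complete` on the first turn. A self-contained sketch of that check, using `Struct`s in place of the real RubyLLM chat and message objects (illustrative stand-ins, not the gem's classes):

```ruby
# Stand-ins for RubyLLM's chat/message objects, just enough for the check.
Message = Struct.new(:role, :content)
Chat = Struct.new(:messages)

# Mirrors the dedup check from the diff above: true when the history already
# ends with this exact user message, so the runner should call chat.complete
# instead of chat.ask(input) and avoid sending the message twice.
# (!! normalizes nil/false to a strict boolean.)
def last_message_matches?(chat, input)
  return false unless input && chat.respond_to?(:messages)

  last_msg = chat.messages.last
  !!(last_msg && last_msg.role == :user && last_msg.content.to_s == input.to_s)
end

chat = Chat.new([
                  Message.new(:user, "hi"),
                  Message.new(:assistant, "hello"),
                  Message.new(:user, "refund status?")
                ])
last_message_matches?(chat, "refund status?") # => true: use chat.complete
last_message_matches?(chat, "new question")   # => false: use chat.ask
```

As the TODO in the diff notes, the string comparison is a best-effort safety net; whitespace or Hash/JSON round-tripping differences will defeat it.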
@@ -47,14 +47,14 @@ module Agents
     def call(args)
       tool_context = ToolContext.new(run_context: @context_wrapper)
 
-      @context_wrapper.callback_manager.emit_tool_start(@tool.name, args)
+      @context_wrapper.callback_manager.emit_tool_start(@tool.name, args, @context_wrapper)
 
       begin
         result = @tool.execute(tool_context, **args.transform_keys(&:to_sym))
-        @context_wrapper.callback_manager.emit_tool_complete(@tool.name, result)
+        @context_wrapper.callback_manager.emit_tool_complete(@tool.name, result, @context_wrapper)
         result
       rescue StandardError => e
-        @context_wrapper.callback_manager.emit_tool_complete(@tool.name, "ERROR: #{e.message}")
+        @context_wrapper.callback_manager.emit_tool_complete(@tool.name, "ERROR: #{e.message}", @context_wrapper)
         raise
       end
     end
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module Agents
-  VERSION = "0.8.0"
+  VERSION = "0.9.0"
 end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: ai-agents
 version: !ruby/object:Gem::Version
-  version: 0.8.0
+  version: 0.9.0
 platform: ruby
 authors:
 - Shivam Mishra
@@ -32,6 +32,7 @@ extensions: []
 extra_rdoc_files: []
 files:
 - ".claude/commands/bump-version.md"
+- ".env.example"
 - ".rspec"
 - ".rubocop.yml"
 - CHANGELOG.md
@@ -58,6 +59,7 @@ files:
 - docs/concepts/tools.md
 - docs/guides.md
 - docs/guides/agent-as-tool-pattern.md
+- docs/guides/instrumentation.md
 - docs/guides/multi-agent-systems.md
 - docs/guides/rails-integration.md
 - docs/guides/request-headers.md
@@ -106,6 +108,9 @@ files:
 - lib/agents/helpers.rb
 - lib/agents/helpers/headers.rb
 - lib/agents/helpers/message_extractor.rb
+- lib/agents/instrumentation.rb
+- lib/agents/instrumentation/constants.rb
+- lib/agents/instrumentation/tracing_callbacks.rb
 - lib/agents/result.rb
 - lib/agents/run_context.rb
 - lib/agents/runner.rb