tracekit 0.1.0 → 0.2.0 (RubyGems)

This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (7)

checksums.yaml +4 -4
data/README.md +92 -1
data/lib/tracekit/security/detector.rb +35 -47
data/lib/tracekit/security/patterns.rb +29 -7
data/lib/tracekit/snapshots/client.rb +353 -6
data/lib/tracekit/version.rb +1 -1
metadata +6 -6

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: '058e090e61dfe2cb69c18c210dcf53e593f2a5f055e43f178e7a5d58f52f28e6'
-  data.tar.gz: 680249888fb2ae4a22a2e53549c45e6630fe35c9de9d85d8a8011a8fcc5feac2
+  metadata.gz: 337c03bb944151dd73e06b09bb8576cab74a557747fee9a19ca8db62ea70be00
+  data.tar.gz: 1561ed0d41ead50afe77cf45a2690c654bc7103d8ace66d377e28e92d89829ce
 SHA512:
-  metadata.gz: 1b1ab7651b55e88f4f53dd82c9634fea099937c4b66e2c0707474942ae6505b07ba1ff2a0be1b4707e776cfba85840b1c3de86810aa57e39925e0a801fa946d9
-  data.tar.gz: 2fcdbc362f4176c6907d9dc4c0089f79548d738e739248657b0c8274fe57ca020199b7d39a047bae5a6846ca842a0a8c6f549f1001cec6dfbd9d729b534a21a6
+  metadata.gz: 46a4977761f03b79ddeb24be2505ce8fdb2104f874efaa499aa61c1b077666e36f49ad37941af5b49b97ba9db59202c4e09248fe5ec899c332f2731ec7fb3ec1
+  data.tar.gz: 423417972e701e2eff1a6aa6466421525b81d2cadd09bc02523a61e14591243d78caa2516da17c860750e5aa7e3f8a795608fa998f843a1e5907e59f049639a6

data/README.md CHANGED

@@ -286,6 +286,97 @@ sdk.capture_snapshot("user-login", {
 })
 ```
+## Kill Switch
+TraceKit provides a server-side toggle to disable code monitoring per service without deploying code changes.
+### How It Works
+When the kill switch is enabled for your service, the SDK sets `@kill_switch_active = true` and suppresses all snapshot captures. The SDK detects kill switch state through two channels:
+1. **Polling** — The SDK checks for kill switch state on every poll cycle (default 30s)
+2. **SSE** — Real-time kill switch events are received instantly via Server-Sent Events
+```ruby
+sdk = Tracekit.sdk
+# No code changes needed — captures are automatically suppressed
+sdk.capture_snapshot("checkout-start", {
+  userId: 123,
+  amount: 99.99
+})
+# When kill switch is active, this is a no-op
+```
+### Behavior When Active
+- All `capture_snapshot` calls become no-ops (zero overhead)
+- Polling frequency reduces from 30s to **60s** to minimize server load
+- Distributed tracing and metrics continue to function normally
+- When the kill switch is disabled, captures resume automatically on the next poll cycle
+### Controlling the Kill Switch
+- **Dashboard**: Toggle code monitoring on/off per service in the TraceKit dashboard
+- **API**: `POST /api/services/:name/kill-switch` with `{"enabled": true}` or `{"enabled": false}`
+## SSE Real-time Updates
+The SDK supports Server-Sent Events (SSE) for receiving breakpoint changes and kill switch events in real time, without waiting for the next poll cycle.
+### How It Works
+1. The SDK auto-discovers the SSE endpoint from the poll response
+2. A background thread opens a persistent SSE connection
+3. Breakpoint activations/deactivations and kill switch events are applied instantly
+4. If SSE fails, the SDK falls back to polling seamlessly
+```ruby
+# SSE is enabled automatically when code monitoring is active.
+# No additional configuration needed.
+Tracekit.configure do |config|
+  config.enable_code_monitoring = true
+  config.code_monitoring_poll_interval = 30  # Polling still runs as fallback
+end
+```
+### Events Received via SSE
+| Event | Description |
+|-------|-------------|
+| `breakpoint.activated` | A breakpoint was enabled — start capturing |
+| `breakpoint.deactivated` | A breakpoint was disabled — stop capturing |
+| `kill_switch.enabled` | Code monitoring disabled for this service |
+| `kill_switch.disabled` | Code monitoring re-enabled for this service |
+## Circuit Breaker
+The circuit breaker protects your application if the TraceKit backend becomes unreachable.
+### How It Works
+1. The SDK tracks consecutive snapshot capture failures
+2. After **3 failures within 60 seconds**, code monitoring is automatically paused
+3. After a **5-minute cooldown**, the circuit breaker resets and captures resume
+```ruby
+# No configuration needed — circuit breaker is built into the SDK.
+# Thread-safe implementation via Mutex.
+sdk = Tracekit.sdk
+sdk.capture_snapshot("process-data", { batch_size: 100 })
+# If backend is down, circuit breaker trips after 3 failures
+# Captures resume automatically after 5-minute cooldown
+```
+### Behavior When Tripped
+- All `capture_snapshot` calls become no-ops (zero overhead)
+- Distributed tracing and metrics continue to function normally
+- The SDK automatically retries after the cooldown period
+- Thread-safe via `Mutex` — safe for multi-threaded Ruby applications (Puma, Sidekiq)
 ## Distributed Tracing
 The SDK automatically:
@@ -506,4 +597,4 @@ Built on [OpenTelemetry](https://opentelemetry.io/) - the industry standard for
 ---
 **Repository**: git@github.com:Tracekit-Dev/ruby-sdk.git
-**Version**: v0.1.0
+**Version**: v0.2.0
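
The SSE flow described in the README changes above (an event name, a data payload, and a blank line that ends the event) can be sketched as a small framing parser. This is a minimal illustration, not the gem's code: `SseFrameParser` is a hypothetical name, and the `kill_switch` event name is taken from the handler in the `client.rb` changes in this diff. Unlike a per-chunk `each_line` loop, the sketch buffers a trailing partial line so that an event split across two network chunks is still parsed whole.

```ruby
# Hypothetical helper, not part of the tracekit gem: incrementally parses
# Server-Sent Events framing ("event:" / "data:" lines, blank line ends an event).
class SseFrameParser
  def initialize
    @buffer = +""      # holds a trailing partial line between chunks
    @event_type = nil
    @data_lines = []
  end

  # Feed a raw chunk; yields [event_type, data] for each completed event.
  def feed(chunk)
    @buffer << chunk
    while (newline_index = @buffer.index("\n"))
      line = @buffer.slice!(0..newline_index).chomp
      if line.empty?
        # Blank line: the event is complete, so hand it to the caller.
        yield(@event_type, @data_lines.join("\n")) if @event_type
        @event_type = nil
        @data_lines = []
      elsif line.start_with?("event:")
        @event_type = line.delete_prefix("event:").strip
      elsif line.start_with?("data:")
        @data_lines << line.delete_prefix("data:").sub(/\A /, "")
      end
    end
  end
end

events = []
parser = SseFrameParser.new
# The event is split mid-line across two chunks on purpose.
parser.feed("event: kill_sw") { |type, data| events << [type, data] }
parser.feed("itch\ndata: {\"enabled\": true}\n\n") { |type, data| events << [type, data] }
```

After both calls, `events` holds a single `kill_switch` entry whose data is the JSON payload, ready to be passed to `JSON.parse`.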

data/lib/tracekit/security/detector.rb CHANGED

@@ -2,15 +2,30 @@
 module Tracekit
   module Security
-    # Detects and redacts sensitive data (PII, credentials) from variable snapshots
+    # Detects and redacts sensitive data (PII, credentials) from variable snapshots.
+    # Uses typed [REDACTED:type] markers. PII scrubbing is enabled by default.
     class Detector
       SecurityFlag = Struct.new(:type, :category, :severity, :variable, :redacted, keyword_init: true)
       ScanResult = Struct.new(:sanitized_variables, :security_flags, keyword_init: true)
+      attr_accessor :pii_scrubbing
+      # @param pii_scrubbing [Boolean] whether PII scrubbing is enabled (default: true)
+      # @param custom_patterns [Array<Hash>] custom patterns, each with :pattern (Regexp) and :marker (String)
+      def initialize(pii_scrubbing: true, custom_patterns: [])
+        @pii_scrubbing = pii_scrubbing
+        @custom_patterns = custom_patterns.map { |p| [p[:pattern], p[:marker]] }
+      end
       def scan(variables)
         sanitized = {}
         flags = []
+        # If PII scrubbing is disabled, return as-is
+        unless @pii_scrubbing
+          return ScanResult.new(sanitized_variables: variables.dup, security_flags: [])
+        end
         variables.each do |key, value|
           sanitized_value, detected_flags = scan_value(key, value)
           sanitized[key] = sanitized_value
@@ -26,58 +41,31 @@ module Tracekit
         return ["[NULL]", []] if value.nil?
         flags = []
-        value_str = value.to_s
-        # Check PII
-        if Patterns::EMAIL.match?(value_str)
-          flags << SecurityFlag.new(type: "pii", category: "email", severity: "medium", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        if Patterns::SSN.match?(value_str)
-          flags << SecurityFlag.new(type: "pii", category: "ssn", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        if Patterns::CREDIT_CARD.match?(value_str)
-          flags << SecurityFlag.new(type: "pii", category: "credit_card", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
+        # Check variable name for sensitive keywords (word-boundary matching)
+        if Patterns::SENSITIVE_NAME.match?(key.to_s)
+          flags << SecurityFlag.new(type: "sensitive_name", category: "name", severity: "medium", variable: key, redacted: true)
+          return ["[REDACTED:sensitive_name]", flags]
         end
-        if Patterns::PHONE.match?(value_str)
-          flags << SecurityFlag.new(type: "pii", category: "phone", severity: "medium", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        # Check Credentials
-        if Patterns::API_KEY.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "api_key", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        if Patterns::AWS_KEY.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "aws_key", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        if Patterns::STRIPE_KEY.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "stripe_key", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
-        if Patterns::PASSWORD.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "password", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
-        end
+        # Serialize value to string for deep scanning
+        value_str = value.to_s
-        if Patterns::JWT.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "jwt", severity: "high", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
+        # Check built-in patterns with typed markers
+        Patterns::PATTERN_MARKERS.each do |pattern, marker|
+          if pattern.match?(value_str)
+            category = marker.match(/REDACTED:(\w+)/)[1]
+            flags << SecurityFlag.new(type: "sensitive_data", category: category, severity: "high", variable: key, redacted: true)
+            return [marker, flags]
+          end
         end
-        if Patterns::PRIVATE_KEY.match?(value_str)
-          flags << SecurityFlag.new(type: "credential", category: "private_key", severity: "critical", variable: key, redacted: true)
-          return ["[REDACTED]", flags]
+        # Check custom patterns
+        @custom_patterns.each do |pattern, marker|
+          if pattern.match?(value_str)
+            flags << SecurityFlag.new(type: "custom", category: "custom", severity: "high", variable: key, redacted: true)
+            return [marker, flags]
+          end
         end
         [value, flags]
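
The first-match-wins redaction with typed markers and custom patterns shown in this hunk can be sketched in isolation. The code below is an illustrative re-implementation under stated assumptions, not the gem's `Detector` class: `BUILT_IN` and `redact` are hypothetical names, only two of the built-in patterns are copied over, and the variable-name check (`SENSITIVE_NAME`) and security flags are left out for brevity.

```ruby
# Illustrative re-implementation, not the gem's Detector class.
# Two of the built-in patterns from the patterns.rb diff, with their markers.
BUILT_IN = {
  /\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b/ => "[REDACTED:email]",
  /\b\d{3}-\d{2}-\d{4}\b/ => "[REDACTED:ssn]"
}.freeze

# custom: array of { pattern: Regexp, marker: String }, the same shape as the
# custom_patterns argument in the diff. Built-in rules are tried first.
def redact(variables, custom: [])
  rules = BUILT_IN.to_a + custom.map { |c| [c[:pattern], c[:marker]] }
  variables.to_h do |key, value|
    next [key, "[NULL]"] if value.nil?
    # The first rule whose pattern matches the stringified value wins.
    _pattern, marker = rules.find { |pattern, _| pattern.match?(value.to_s) }
    [key, marker || value]
  end
end

result = redact(
  { contact: "ada@example.com", order: "ORD-12345", count: 3, note: nil },
  custom: [{ pattern: /\bORD-\d+\b/, marker: "[REDACTED:order_id]" }]
)
```

Here `result` maps `contact` to `[REDACTED:email]`, `order` to the custom `[REDACTED:order_id]` marker, leaves `count` untouched, and replaces the `nil` value with `[NULL]`, mirroring the order of checks in the hunk.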

data/lib/tracekit/security/patterns.rb CHANGED

@@ -2,21 +2,43 @@
 module Tracekit
   module Security
-    # Regex patterns for detecting sensitive data in snapshots
+    # Regex patterns for detecting sensitive data in snapshots.
+    # 13 standard patterns with typed [REDACTED:type] markers.
     module Patterns
       # PII Patterns
-      EMAIL = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/
+      EMAIL = /\b[A-Za-z0-9._%+\-]+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}\b/
       SSN = /\b\d{3}-\d{2}-\d{4}\b/
       CREDIT_CARD = /\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/
       PHONE = /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/
       # Credential Patterns
-      API_KEY = /(api[_-]?key|apikey|access[_-]?key)[\s:=]+['"  ]?([a-zA-Z0-9_-]{20,})['"]?/i
+      API_KEY = /(?:api[_\-]?key|apikey)\s*[:=]\s*['"]?[A-Za-z0-9_\-]{20,}/i
       AWS_KEY = /AKIA[0-9A-Z]{16}/
-      STRIPE_KEY = /sk_live_[0-9a-zA-Z]{24}/
-      PASSWORD = /(password|pwd|pass)[\s:=]+['" ]?([^\s'" ]{6,})['"]?/i
-      JWT = /eyJ[a-zA-Z0-9_-]+\.eyJ[a-zA-Z0-9_-]+\.[a-zA-Z0-9_-]+/
-      PRIVATE_KEY = /-----BEGIN (RSA |EC )?PRIVATE KEY-----/
+      AWS_SECRET = /aws.{0,20}secret.{0,20}[A-Za-z0-9\/+=]{40}/i
+      OAUTH_TOKEN = /(?:bearer\s+)[A-Za-z0-9._~+\/=\-]{20,}/i
+      STRIPE_KEY = /sk_live_[0-9a-zA-Z]{10,}/
+      PASSWORD = /(?:password|passwd|pwd)\s*[=:]\s*['"]?[^\s'"]{6,}/i
+      JWT = /eyJ[a-zA-Z0-9_\-]+\.eyJ[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+/
+      PRIVATE_KEY = /-----BEGIN (?:RSA |EC )?PRIVATE KEY-----/
+      # Letter-boundary pattern: \b treats _ as a word character, so a \b-anchored pattern would miss names like api_key and user_token
+      SENSITIVE_NAME = /(?:^|[^a-zA-Z])(?:password|passwd|pwd|secret|token|key|credential|api_key|apikey)(?:[^a-zA-Z]|$)/i
+      # Mapping of pattern -> typed redaction marker
+      PATTERN_MARKERS = {
+        EMAIL => "[REDACTED:email]",
+        SSN => "[REDACTED:ssn]",
+        CREDIT_CARD => "[REDACTED:credit_card]",
+        PHONE => "[REDACTED:phone]",
+        AWS_KEY => "[REDACTED:aws_key]",
+        AWS_SECRET => "[REDACTED:aws_secret]",
+        OAUTH_TOKEN => "[REDACTED:oauth_token]",
+        STRIPE_KEY => "[REDACTED:stripe_key]",
+        PASSWORD => "[REDACTED:password]",
+        JWT => "[REDACTED:jwt]",
+        PRIVATE_KEY => "[REDACTED:private_key]",
+        API_KEY => "[REDACTED:api_key]"
+      }.freeze
     end
   end
 end
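
The reason `SENSITIVE_NAME` uses non-letter boundaries instead of `\b` can be checked directly. In the sketch below, `SENSITIVE_NAME` is copied from the hunk above, while `WORD_BOUNDARY` is a hypothetical `\b`-based variant written only for comparison; the sample names are made up.

```ruby
# SENSITIVE_NAME copied from the patterns.rb diff above.
SENSITIVE_NAME = /(?:^|[^a-zA-Z])(?:password|passwd|pwd|secret|token|key|credential|api_key|apikey)(?:[^a-zA-Z]|$)/i
# Hypothetical \b-based variant, for comparison only.
WORD_BOUNDARY = /\b(?:password|token|key)\b/i

names = %w[user_token api_key keyboard monkey tokenizer password2]

# Underscores and digits count as boundaries here, letters do not.
letter_hits = names.select { |name| SENSITIVE_NAME.match?(name) }
# \b sees "_" and digits as word characters, so it finds no boundary inside these names.
word_hits = names.select { |name| WORD_BOUNDARY.match?(name) }
```

With these inputs, `letter_hits` is `["user_token", "api_key", "password2"]` and `word_hits` is empty: the letter-boundary form catches snake_case names without flagging `keyboard`, `monkey`, or `tokenizer`.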

data/lib/tracekit/snapshots/client.rb CHANGED

@@ -9,7 +9,12 @@ module Tracekit
   module Snapshots
     # Client for code monitoring - polls breakpoints and captures snapshots
     class Client
-      def initialize(api_key, base_url, service_name, poll_interval_seconds = 30)
+      # Opt-in capture limits (all disabled by default: nil = unlimited)
+      attr_accessor :capture_depth    # nil = unlimited depth (default)
+      attr_accessor :max_payload      # nil = unlimited payload bytes (default)
+      attr_accessor :capture_timeout  # nil = no timeout seconds (default)
+      def initialize(api_key, base_url, service_name, poll_interval_seconds = 30, **opts)
         @api_key = api_key
         @base_url = base_url
         @service_name = service_name
@@ -17,6 +22,31 @@ module Tracekit
         @breakpoints_cache = Concurrent::Hash.new
         @registration_cache = Concurrent::Hash.new
+        # Opt-in capture limits
+        @capture_depth = opts[:capture_depth]
+        @max_payload = opts[:max_payload]
+        @capture_timeout = opts[:capture_timeout]
+        # Kill switch: server-initiated monitoring disable
+        @kill_switch_active = false
+        @normal_poll_interval = poll_interval_seconds
+        # SSE (Server-Sent Events) real-time updates
+        @sse_endpoint = nil
+        @sse_active = false
+        @sse_thread = nil
+        # Circuit breaker state (Mutex-protected for thread safety)
+        cb_config = opts[:circuit_breaker] || {}
+        @cb_mutex = Mutex.new
+        @cb_failure_timestamps = []
+        @cb_state = "closed"
+        @cb_opened_at = nil
+        @cb_max_failures = cb_config[:max_failures] || 3
+        @cb_window_seconds = cb_config[:window_seconds] || 60
+        @cb_cooldown_seconds = cb_config[:cooldown_seconds] || 300
+        @pending_events = []
         # Start polling timer
         @poll_task = Concurrent::TimerTask.new(execution_interval: poll_interval_seconds) do
           fetch_active_breakpoints
@@ -27,8 +57,22 @@ module Tracekit
         fetch_active_breakpoints
       end
-      # Captures a snapshot at the caller's location
+      # Captures a snapshot at the caller's location.
+      # Crash isolation: rescues all exceptions so TraceKit never crashes the host app.
       def capture_snapshot(label, variables, caller_location = nil)
+        begin
+          do_capture_snapshot(label, variables, caller_location)
+        rescue => e
+          warn "TraceKit: error in capture_snapshot: #{e.message}" if ENV["DEBUG"]
+        end
+      end
+      private
+      def do_capture_snapshot(label, variables, caller_location)
+        # Kill switch: skip all capture when server has disabled monitoring
+        return if @kill_switch_active
         # Extract caller information
         caller_location ||= caller_locations(1, 1).first
         file_path = caller_location.path
@@ -47,6 +91,11 @@ module Tracekit
         return if breakpoint.expire_at && Time.now > breakpoint.expire_at
         return if breakpoint.max_captures > 0 && breakpoint.capture_count >= breakpoint.max_captures
+        # Apply opt-in capture depth limit
+        if @capture_depth && @capture_depth > 0
+          variables = limit_depth(variables, 0)
+        end
         # Scan for security issues
         scan_result = @security_detector.scan(variables)
@@ -79,17 +128,65 @@ module Tracekit
           captured_at: Time.now.utc.iso8601
         )
-        # Submit asynchronously
-        Thread.new { submit_snapshot(snapshot) }
+        # Apply opt-in max payload limit
+        serialized = JSON.generate(snapshot.to_h)
+        if @max_payload && @max_payload > 0 && serialized.bytesize > @max_payload
+          snapshot = Snapshot.new(
+            breakpoint_id: breakpoint.id,
+            service_name: @service_name,
+            file_path: file_path,
+            function_name: function_name,
+            label: label,
+            line_number: line_number,
+            variables: { "_truncated" => true, "_payload_size" => serialized.bytesize, "_max_payload" => @max_payload },
+            security_flags: [],
+            stack_trace: stack_trace,
+            trace_id: trace_id,
+            span_id: span_id,
+            captured_at: Time.now.utc.iso8601
+          )
+        end
+        # Submit asynchronously (with optional timeout)
+        if @capture_timeout && @capture_timeout > 0
+          thread = Thread.new { submit_snapshot(snapshot) }
+          unless thread.join(@capture_timeout)
+            warn "TraceKit: capture timeout exceeded (#{@capture_timeout}s)" if ENV["DEBUG"]
+            thread.kill
+          end
+        else
+          Thread.new { submit_snapshot(snapshot) }
+        end
       end
+      public
       # Shuts down the client
       def shutdown
         @poll_task&.shutdown
+        close_sse
       end
       private
+      # Limit variable nesting depth (opt-in)
+      def limit_depth(data, current_depth)
+        return { "_truncated" => true, "_depth" => current_depth } if current_depth >= @capture_depth
+        case data
+        when Hash
+          result = {}
+          data.each do |k, v|
+            result[k] = limit_depth(v, current_depth + 1)
+          end
+          result
+        when Array
+          data.map { |item| limit_depth(item, current_depth + 1) }
+        else
+          data
+        end
+      end
       def fetch_active_breakpoints
         url = "#{@base_url}/sdk/snapshots/active/#{@service_name}"
         uri = URI(url)
@@ -106,11 +203,36 @@ module Tracekit
         data = JSON.parse(response.body, symbolize_names: true)
         update_breakpoint_cache(data[:breakpoints]) if data[:breakpoints]
+        # SSE auto-discovery: if polling response includes sse_endpoint, start SSE connection
+        if data[:sse_endpoint] && !@sse_active
+          @sse_endpoint = data[:sse_endpoint]
+          start_sse_thread(@sse_endpoint)
+        end
+        # Handle kill switch state (missing field = false for backward compat)
+        new_kill_state = data[:kill_switch] == true
+        if new_kill_state && !@kill_switch_active
+          warn "TraceKit: Code monitoring disabled by server kill switch. Polling at reduced frequency."
+          reschedule_polling(60)
+        elsif !new_kill_state && @kill_switch_active
+          warn "TraceKit: Code monitoring re-enabled by server."
+          reschedule_polling(@normal_poll_interval)
+        end
+        @kill_switch_active = new_kill_state
       rescue => e
         # Silently ignore errors fetching breakpoints
         warn "Error fetching breakpoints: #{e.message}" if ENV["DEBUG"]
       end
+      def reschedule_polling(interval_seconds)
+        @poll_task&.shutdown
+        @poll_task = Concurrent::TimerTask.new(execution_interval: interval_seconds) do
+          fetch_active_breakpoints
+        end
+        @poll_task.execute
+      end
       def update_breakpoint_cache(breakpoints)
         @breakpoints_cache.clear
@@ -181,6 +303,9 @@ module Tracekit
       end
       def submit_snapshot(snapshot)
+        # Circuit breaker check
+        return unless circuit_breaker_should_allow?
         uri = URI("#{@base_url}/sdk/snapshots/capture")
         http = Net::HTTP.new(uri.host, uri.port)
         http.use_ssl = uri.scheme == "https"
@@ -192,11 +317,233 @@ module Tracekit
         })
         request.body = JSON.generate(snapshot.to_h)
-        http.request(request)
+        response = http.request(request)
+        # Server error (5xx) -- count as circuit breaker failure
+        if response.is_a?(Net::HTTPServerError)
+          queue_circuit_breaker_event if circuit_breaker_record_failure
+        end
+      rescue SocketError, Errno::ECONNREFUSED, Errno::EHOSTUNREACH,
+             Errno::ETIMEDOUT, Net::OpenTimeout, Net::ReadTimeout => e
+        # Network/timeout error -- count as circuit breaker failure
+        warn "Error submitting snapshot: #{e.message}" if ENV["DEBUG"]
+        queue_circuit_breaker_event if circuit_breaker_record_failure
       rescue => e
-        # Silently ignore snapshot submission errors
+        # Other errors -- do NOT count as circuit breaker failure
         warn "Error submitting snapshot: #{e.message}" if ENV["DEBUG"]
       end
+      # Start SSE connection in a daemon thread
+      def start_sse_thread(endpoint)
+        close_sse # Close any existing SSE connection
+        @sse_thread = Thread.new do
+          begin
+            connect_sse(endpoint)
+          rescue => e
+            warn "TraceKit: SSE thread error: #{e.message}" if ENV["DEBUG"]
+            @sse_active = false
+          end
+        end
+        @sse_thread.abort_on_exception = false
+      end
+      # Connect to the SSE endpoint for real-time breakpoint updates.
+      # Falls back to polling if SSE connection fails or disconnects.
+      # Crash isolation: all exceptions are rescued so TraceKit never crashes the host app.
+      def connect_sse(endpoint)
+        full_url = "#{@base_url}#{endpoint}"
+        uri = URI(full_url)
+        http = Net::HTTP.new(uri.host, uri.port)
+        http.use_ssl = uri.scheme == "https"
+        http.read_timeout = 0 # No timeout for SSE (long-lived connection)
+        http.open_timeout = 10
+        request = Net::HTTP::Get.new(uri.path)
+        request["X-API-Key"] = @api_key
+        request["Accept"] = "text/event-stream"
+        request["Cache-Control"] = "no-cache"
+        http.request(request) do |response|
+          unless response.is_a?(Net::HTTPSuccess)
+            warn "TraceKit: SSE connection failed with HTTP #{response.code}, falling back to polling" if ENV["DEBUG"]
+            @sse_active = false
+            return
+          end
+          @sse_active = true
+          warn "TraceKit: SSE connected to #{endpoint}" if ENV["DEBUG"]
+          event_type = nil
+          event_data = ""
+          response.read_body do |chunk|
+            chunk.each_line do |line|
+              line = line.chomp
+              if line.start_with?("event:")
+                event_type = line.sub(/^event:\s*/, "").strip
+              elsif line.start_with?("data:")
+                event_data += line.sub(/^data:\s*/, "")
+              elsif line.empty? && event_type
+                # Empty line signals end of event -- process it
+                handle_sse_event(event_type, event_data)
+                event_type = nil
+                event_data = ""
+              end
+            end
+          end
+        end
+        # Connection closed cleanly
+        @sse_active = false
+        warn "TraceKit: SSE connection closed, falling back to polling" if ENV["DEBUG"]
+      rescue SocketError, Errno::ECONNREFUSED, Errno::EHOSTUNREACH,
+             Errno::ETIMEDOUT, Net::OpenTimeout, Net::ReadTimeout,
+             IOError, EOFError => e
+        warn "TraceKit: SSE connection error: #{e.message}, falling back to polling" if ENV["DEBUG"]
+        @sse_active = false
+      rescue => e
+        warn "TraceKit: SSE unexpected error: #{e.message}" if ENV["DEBUG"]
+        @sse_active = false
+      end
+      # Handle a parsed SSE event
+      def handle_sse_event(event_type, data_str)
+        case event_type
+        when "init"
+          payload = JSON.parse(data_str, symbolize_names: true)
+          update_breakpoint_cache(payload[:breakpoints]) if payload[:breakpoints]
+          # Update kill switch from init event
+          if payload.key?(:kill_switch)
+            new_kill_state = payload[:kill_switch] == true
+            if new_kill_state && !@kill_switch_active
+              warn "TraceKit: Code monitoring disabled by server kill switch."
+              close_sse
+            end
+            @kill_switch_active = new_kill_state
+          end
+        when "breakpoint_created", "breakpoint_updated"
+          bp_data = JSON.parse(data_str, symbolize_names: true)
+          upsert_breakpoint(bp_data)
+        when "breakpoint_deleted"
+          bp_data = JSON.parse(data_str, symbolize_names: true)
+          remove_breakpoint(bp_data[:id])
+        when "kill_switch"
+          payload = JSON.parse(data_str, symbolize_names: true)
+          @kill_switch_active = payload[:enabled] == true
+          if @kill_switch_active
+            warn "TraceKit: Code monitoring disabled by server kill switch via SSE."
+            close_sse
+          end
+        when "heartbeat"
+          # No action needed -- keeps connection alive
+        else
+          warn "TraceKit: Unknown SSE event type: #{event_type}" if ENV["DEBUG"]
+        end
+      rescue JSON::ParserError => e
+        warn "TraceKit: SSE JSON parse error for '#{event_type}': #{e.message}" if ENV["DEBUG"]
+      rescue => e
+        warn "TraceKit: SSE event handling error: #{e.message}" if ENV["DEBUG"]
+      end
+      # Upsert a single breakpoint into the cache
+      def upsert_breakpoint(bp_data)
+        bp = BreakpointConfig.new(
+          id: bp_data[:id],
+          file_path: bp_data[:file_path],
+          line_number: bp_data[:line_number],
+          function_name: bp_data[:function_name],
+          label: bp_data[:label],
+          enabled: bp_data[:enabled],
+          max_captures: bp_data[:max_captures] || 0,
+          capture_count: bp_data[:capture_count] || 0,
+          expire_at: bp_data[:expire_at] ? Time.parse(bp_data[:expire_at]) : nil
+        )
+        # Key by function + label
+        if bp.label && bp.function_name
+          label_key = "#{bp.function_name}:#{bp.label}"
+          @breakpoints_cache[label_key] = bp
+        end
+        # Key by file + line
+        line_key = "#{bp.file_path}:#{bp.line_number}"
+        @breakpoints_cache[line_key] = bp
+      end
+      # Remove a breakpoint from the cache by ID
+      def remove_breakpoint(breakpoint_id)
+        return unless breakpoint_id
+        @breakpoints_cache.delete_if { |_key, bp| bp.id == breakpoint_id }
+      end
+      # Close the active SSE connection
+      def close_sse
+        @sse_active = false
+        if @sse_thread&.alive?
+          @sse_thread.kill
+          @sse_thread = nil
+        end
+      end
+      def circuit_breaker_should_allow?
+        @cb_mutex.synchronize do
+          return true if @cb_state == "closed"
+          # Check cooldown
+          if @cb_opened_at && (Time.now.to_f - @cb_opened_at) >= @cb_cooldown_seconds
+            @cb_state = "closed"
+            @cb_failure_timestamps.clear
+            @cb_opened_at = nil
+            warn "TraceKit: Code monitoring resumed"
+            return true
+          end
+          false
+        end
+      end
+      def circuit_breaker_record_failure
+        @cb_mutex.synchronize do
+          now = Time.now.to_f
+          @cb_failure_timestamps << now
+          # Prune old timestamps
+          cutoff = now - @cb_window_seconds
+          @cb_failure_timestamps.reject! { |ts| ts <= cutoff }
+          if @cb_failure_timestamps.size >= @cb_max_failures && @cb_state == "closed"
+            @cb_state = "open"
+            @cb_opened_at = now
+            warn "TraceKit: Code monitoring paused (#{@cb_max_failures} capture failures in #{@cb_window_seconds}s). Auto-resumes in #{@cb_cooldown_seconds / 60} min."
+            return true
+          end
+          false
+        end
+      end
+      def queue_circuit_breaker_event
+        @cb_mutex.synchronize do
+          @pending_events << {
+            type: "circuit_breaker_tripped",
+            service_name: @service_name,
+            failure_count: @cb_max_failures,
+            window_seconds: @cb_window_seconds,
+            cooldown_seconds: @cb_cooldown_seconds,
+            timestamp: Time.now.utc.iso8601
+          }
+        end
+      end
     end
   end
 end
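
The circuit breaker logic added in this file (a sliding window of failure timestamps, an open state, and a cooldown) can be exercised on its own. The following is a minimal sketch, assuming the defaults from the diff (3 failures, 60-second window, 300-second cooldown); `SlidingWindowBreaker` and its injectable `clock` are hypothetical and exist only so the behavior can be checked without sleeping.

```ruby
# Hypothetical standalone class, not the gem's Client: a sliding-window
# circuit breaker with an injectable clock.
class SlidingWindowBreaker
  def initialize(max_failures: 3, window_seconds: 60, cooldown_seconds: 300, clock: -> { Time.now.to_f })
    @max_failures = max_failures
    @window_seconds = window_seconds
    @cooldown_seconds = cooldown_seconds
    @clock = clock
    @mutex = Mutex.new
    @failures = []
    @opened_at = nil
  end

  # True when a request may proceed; closes the breaker once the cooldown has elapsed.
  def allow?
    @mutex.synchronize do
      next true if @opened_at.nil?
      next false if @clock.call - @opened_at < @cooldown_seconds
      @opened_at = nil
      @failures.clear
      true
    end
  end

  # Records a failure; returns true only on the call that trips the breaker.
  def record_failure
    @mutex.synchronize do
      now = @clock.call
      @failures << now
      # Drop failures that have aged out of the window.
      @failures.reject! { |ts| ts <= now - @window_seconds }
      next false if @opened_at || @failures.size < @max_failures
      @opened_at = now
      true
    end
  end
end

now = 0.0
breaker = SlidingWindowBreaker.new(clock: -> { now })
tripped = [breaker.record_failure, breaker.record_failure, breaker.record_failure]
blocked = !breaker.allow?
now += 300
resumed = breaker.allow?
```

Because failures are pruned by age rather than reset on success, the breaker counts failures inside the window, not strictly consecutive ones. In the run above the third failure trips the breaker, requests are blocked immediately afterwards, and they are allowed again once the simulated clock has advanced by the cooldown.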

data/lib/tracekit/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Tracekit
-  VERSION = "0.1.0"
+  VERSION = "0.2.0"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: tracekit
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - TraceKit
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date: 2026-02-04 00:00:00.000000000 Z
+date: 2026-03-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: opentelemetry-sdk
@@ -173,7 +173,7 @@ metadata:
   homepage_uri: https://github.com/Tracekit-Dev/ruby-sdk
   source_code_uri: https://github.com/Tracekit-Dev/ruby-sdk
   changelog_uri: https://github.com/Tracekit-Dev/ruby-sdk/blob/main/CHANGELOG.md
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -188,8 +188,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.0.3.1
-signing_key:
+rubygems_version: 3.5.3
+signing_key:
 specification_version: 4
 summary: TraceKit Ruby SDK - OpenTelemetry-based APM for Ruby applications
 test_files: []