RubyGems - botrytis - Versions diffs - 0.1.0 → 1.0.0 - Mend

botrytis 0.1.0 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

checksums.yaml +4 -4
data/.claude/settings.local.json +19 -0
data/CLAUDE.md +85 -0
data/README.md +258 -1
data/debug_step_args.rb +6 -0
data/future_tests.md +141 -0
data/lib/botrytis/cucumber.rb +151 -0
data/lib/botrytis/formatter.rb +75 -0
data/lib/botrytis/semantic_match_generator.rb +39 -1
data/lib/botrytis/semantic_matcher.rb +261 -0
data/lib/botrytis/version.rb +1 -1
metadata +8 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 98b2aad1fbfde0c8929d30456317928a419706b22bb3e27b04427ec31be28fd9
-  data.tar.gz: 365b2b618522e77e4f913e03759267555880deebf27e86a2e68cc52a014a791b
+  metadata.gz: fdc6ddc981fa9cd6507c3534d0849cc7c2e773cadd5360e3daed1b91c4b28039
+  data.tar.gz: 2010aea7e2644b3e6c0cc43abd6ef73cf2f92165af64270d9085d285f539abfe
 SHA512:
-  metadata.gz: 51d072e08697f9988c3ee17df20b484798e63c0ad7eebb12d75296de40018156a249e851fda897d939cc2e60efbaffdfbd1624d08e43b92a3a8aa58f293fbbc4
-  data.tar.gz: 50b91ccf5c4deb8d9b0015a26e910e816fda2f3b1a8a10a331bd4478cdd3c03b456c34c4e73f5cc2edff954d8eebedc0135ddb61c451c445bf7291201ecc571f
+  metadata.gz: 98ab98ee673362e2d352a1dd2f6b6ac3913611204a1e156b17915541d2873e0a8d1cdc310879f39e50aaa026d3fc9b4f1b629e0c0606fae555738cbfaac0b2e8
+  data.tar.gz: bffbf314f6a14f2bb398ed4a52c5d716dc849d5229c69e580133dd25581d661bbf5765859ec7f54b3e0edb3e38ff6d5d3695fb7dd0403e65a578860dd99c8982

data/.claude/settings.local.json ADDED

@@ -0,0 +1,19 @@
+{
+  "permissions": {
+    "allow": [
+      "Bash(bundle exec rake:*)",
+      "Bash(bundle exec rspec:*)",
+      "Bash(bundle exec irb:*)",
+      "Bash(bundle exec cucumber:*)",
+      "Bash(bundle exec ruby:*)",
+      "Bash(mkdir:*)",
+      "Bash(cp:*)",
+      "Bash(touch:*)",
+      "Bash(BOTRYTIS_LIVE_API=true bundle exec cucumber --name \"Authentication variations\")",
+      "Bash(bundle exec rails generate model:*)",
+      "Bash(bundle exec rails generate:*)",
+      "Bash(ls:*)"
+    ],
+    "deny": []
+  }
+}

data/CLAUDE.md ADDED

@@ -0,0 +1,85 @@
+# CLAUDE.md
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+## Project Overview
+Botrytis is a Ruby gem that provides LLM-powered semantic matching for Cucumber step definitions. It enables fuzzy matching of Cucumber steps using large language models, making BDD tests more flexible by matching similar but not exact step text.
+## Architecture
+### Core Components
+- **SemanticMatcher** (`lib/botrytis/semantic_matcher.rb`): Main matching engine that finds semantic matches between step text and available step definitions using LLMs
+- **SemanticMatchGenerator** (`lib/botrytis/semantic_match_generator.rb`): LLM interaction layer built on Sublayer for generating semantic matches
+- **Configuration** (`lib/botrytis/configuration.rb`): Configurable settings for LLM provider, model, confidence thresholds, and caching
+- **Formatter** (`lib/botrytis/formatter.rb`): Custom Cucumber formatter integration
+### Key Dependencies
+- **cucumber**: Core Cucumber framework (>= 9)
+- **sublayer**: LLM abstraction layer (>= 0.2.8) for AI provider interactions
+- **rspec**: Testing framework
+### Configuration System
+The gem supports configuration of:
+- LLM provider (default: :openai)
+- Model name (default: "gpt-4o")
+- Confidence threshold (default: 0.7)
+- Caching enabled/disabled (default: true)
+- Cache directory (default: ".botrytis_cache")
+## Development Commands
+### Testing
+```bash
+# Run all tests
+bundle exec rake spec
+# or
+bundle exec rspec
+# Run specific test files
+bundle exec rspec spec/botrytis_spec.rb
+# Run with specific options
+bundle exec rspec --fail-fast
+```
+### Building and Installation
+```bash
+# Build the gem
+bundle exec rake build
+# Install locally
+bundle exec rake install:local
+# Clean build artifacts
+bundle exec rake clean
+```
+### Development Setup
+```bash
+# Install dependencies
+bundle install
+# Run interactive console with gem loaded
+bundle exec irb -r botrytis
+```
+## Semantic Matching Flow
+1. Step text is compared against available step definition patterns
+2. If caching is enabled, check for cached results first
+3. Query LLM through Sublayer with step text and available patterns
+4. LLM returns confidence score and best match pattern
+5. If confidence meets threshold, return Cucumber::Glue::StepMatch
+6. Cache result if caching enabled
+## File Structure Notes
+- Main entry point: `lib/botrytis.rb`
+- Core logic in `lib/botrytis/` directory
+- RSpec tests in `spec/` directory
+- Gem configuration in `botrytis.gemspec`
+- Rake tasks defined in `Rakefile` (default task is `:spec`)

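The "Semantic Matching Flow" listed in CLAUDE.md (cache lookup, LLM query, threshold check, cache write) can be sketched in isolation as follows. This is a minimal illustration and not the gem's code: `FlowSketch`, its constructor arguments, and the block standing in for the Sublayer-backed LLM call are all hypothetical, and the LLM result is assumed to be a hash with string keys `pattern` and `confidence`.

```ruby
require 'digest'
require 'fileutils'
require 'json'

# Hypothetical stand-in for the matcher; all names are illustrative only.
class FlowSketch
  attr_reader :llm_calls

  def initialize(cache_dir:, threshold: 0.7, &llm)
    @cache_dir = cache_dir
    @threshold = threshold
    @llm = llm          # block standing in for the real LLM call (assumption)
    @llm_calls = 0
    FileUtils.mkdir_p(@cache_dir)
  end

  # Returns the matched pattern, or nil when confidence is below the threshold.
  def match(step_text, patterns)
    result = read_cache(step_text, patterns) || query_and_cache(step_text, patterns)
    result['confidence'].to_f >= @threshold ? result['pattern'] : nil
  end

  private

  # Cache key built from the step text plus the sorted pattern list.
  def cache_file(step_text, patterns)
    key = Digest::MD5.hexdigest("#{step_text}-#{patterns.sort.join('-')}")
    File.join(@cache_dir, "#{key}.json")
  end

  # Returns the parsed cached result, or nil when nothing is cached yet.
  def read_cache(step_text, patterns)
    file = cache_file(step_text, patterns)
    JSON.parse(File.read(file)) if File.exist?(file)
  end

  # Calls the stand-in LLM once and stores its answer for later runs.
  def query_and_cache(step_text, patterns)
    @llm_calls += 1
    result = @llm.call(step_text, patterns)
    File.write(cache_file(step_text, patterns), JSON.generate(result))
    result
  end
end
```

In this sketch a second call with the same step text and patterns is served from the cache file, so the stand-in LLM block runs only once.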
data/README.md CHANGED

@@ -1 +1,258 @@
-# Botrytis BDD
+# Botrytis
+[![Gem Version](https://badge.fury.io/rb/botrytis.svg)](https://badge.fury.io/rb/botrytis)
+**LLM-powered semantic matching for your Cucumber steps**
+Botrytis makes your BDD tests more flexible by using Large Language Models to match semantically similar Cucumber steps, even when they don't match exactly.
+## What it does
+Instead of your Cucumber tests failing when step text doesn't match exactly:
+```gherkin
+# ❌ This fails without Botrytis
+Given the user has authenticated successfully  # No matching step definition
+# ✅ But this step definition exists:
+Given(/^the user has logged in to their account$/) do
+  # implementation
+end
+```
+With Botrytis, the LLM understands that "authenticated successfully" and "logged in to their account" are semantically equivalent, so your test passes!
+## Features
+- 🧠 **Semantic step matching** using OpenAI, Claude, or Gemini
+- 🎯 **Confidence-based matching** with configurable thresholds
+- ⚡ **Intelligent caching** to avoid repeated LLM calls
+- 🔄 **Parameter extraction** from semantically matched steps
+- 📊 **Match reporting** shows how many fuzzy matches were found
+- 🧪 **Live API testing** mode for development
+## Installation
+Add this line to your application's Gemfile:
+```ruby
+gem 'botrytis'
+```
+And then execute:
+```bash
+$ bundle install
+```
+Or install it yourself as:
+```bash
+$ gem install botrytis
+```
+## Quick Start
+1. **Add to your Cucumber support files**:
+```ruby
+# features/support/env.rb
+require 'botrytis/cucumber'
+```
+2. **Configure your LLM provider** (create `features/support/botrytis.rb`):
+```ruby
+require 'botrytis'
+Botrytis.configure do |config|
+  config.llm_provider = :openai  # or :claude, :gemini
+  config.model_name = "gpt-4o"
+  config.confidence_threshold = 0.7
+  config.cache_enabled = true
+end
+```
+3. **Set your API key** (e.g., in `.env` or environment):
+```bash
+export OPENAI_API_KEY=your_api_key_here
+```
+4. **Run your tests** and see semantic matching in action!
+```bash
+$ bundle exec cucumber
+# Output shows:
+# 6 scenarios (6 passed)
+# 24 steps (24 passed)
+# 🎯 Botrytis Semantic Matching Summary: 10 fuzzy matches found
+```
+## Examples
+### Authentication Variations
+```gherkin
+# All of these match the same step definition:
+Given the user has logged in to their account      # Exact match
+Given the user has authenticated successfully      # Semantic match ✨
+Given the user has signed in to their account      # Semantic match ✨
+```
+### Action Variations
+```gherkin
+# Step definition:
+When(/^they click the "([^"]*)" button$/) do |button_name|
+  # implementation
+end
+# These all work:
+When they click the "Buy Now" button               # Exact match
+When they press the "Buy Now" button               # Semantic match ✨
+When they tap the "Buy Now" button                 # Semantic match ✨
+When they hit the purchase button                  # Semantic match ✨
+When they mash the buy button                      # Semantic match ✨
+When they gently caress the "Buy Now" button       # Semantic match ✨
+```
+### Assertion Variations
+```gherkin
+# Step definition:
+Then(/^they should see a confirmation message$/) do
+  # implementation
+end
+# These all work:
+Then they should see a confirmation message         # Exact match
+Then they should view a confirmation message        # Semantic match ✨
+Then they receive a success notification           # Semantic match ✨
+Then they get a notification                       # Semantic match ✨
+```
+## Configuration
+```ruby
+Botrytis.configure do |config|
+  # LLM Provider (required)
+  config.llm_provider = :openai     # :openai, :claude, or :gemini
+  # Model name (required)
+  config.model_name = "gpt-4o"      # or "claude-3-sonnet", "gemini-pro", etc.
+  # Confidence threshold (0.0 - 1.0)
+  config.confidence_threshold = 0.7  # Only matches above this confidence
+  # Caching
+  config.cache_enabled = true        # Cache LLM responses
+  config.cache_directory = ".botrytis_cache"  # Cache location
+end
+```
+### LLM Provider Setup
+**OpenAI**:
+```ruby
+config.llm_provider = :openai
+config.model_name = "gpt-4o"  # or "gpt-4", "gpt-3.5-turbo"
+# Set OPENAI_API_KEY environment variable
+```
+**Claude**:
+```ruby
+config.llm_provider = :claude
+config.model_name = "claude-3-sonnet-20240229"
+# Set ANTHROPIC_API_KEY environment variable
+```
+**Gemini**:
+```ruby
+config.llm_provider = :gemini
+config.model_name = "gemini-pro"
+# Set GOOGLE_API_KEY environment variable
+```
+## Development & Testing
+### Running Tests
+```bash
+# Run all tests with mocked responses (fast)
+bundle exec cucumber
+# Run with live API calls (requires API key)
+BOTRYTIS_LIVE_API=true bundle exec cucumber
+# Run RSpec unit tests
+bundle exec rspec
+```
+### Understanding the Output
+When semantic matching occurs, you'll see a summary at the end:
+```bash
+🎯 Botrytis Semantic Matching Summary: 10 fuzzy matches found
+```
+This tells you how many steps were matched semantically vs. exactly.
+### Cache Management
+Botrytis caches LLM responses to improve performance:
+```bash
+# Clear cache
+rm -rf .botrytis_cache
+# Disable caching for development
+Botrytis.configure do |config|
+  config.cache_enabled = false
+end
+```
+## How It Works
+1. **Step Execution**: When Cucumber can't find an exact step match, Botrytis intervenes
+2. **LLM Query**: The step text and available step patterns are sent to your configured LLM
+3. **Semantic Analysis**: The LLM analyzes semantic similarity and extracts parameters
+4. **Confidence Check**: Only matches above the confidence threshold are used
+5. **Execution**: The matched step definition runs with extracted parameters
+6. **Caching**: Results are cached to avoid repeated API calls
+## Requirements
+- Ruby 3.1.0 or higher
+- Cucumber 9.0 or higher
+- Sublayer 0.2.8 or higher
+- API key for your chosen LLM provider
+## Contributing
+Bug reports and pull requests are welcome on GitHub at https://github.com/sublayerapp/botrytis.
+### Development Setup
+```bash
+git clone https://github.com/sublayerapp/botrytis.git
+cd botrytis
+bundle install
+# Run tests
+bundle exec rake spec
+bundle exec cucumber
+# Build gem
+bundle exec rake build
+```
+## License
+The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+## Why "Botrytis"?
+Botrytis is a genus of fungi known for being both beneficial and parasitic - much like how this gem helps your tests pass by being a little "fuzzy" with step matching! 🍄

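The README configures the gem through a `Botrytis.configure` block, but `lib/botrytis/configuration.rb` is not part of this diff. The following is a minimal sketch of how a block-style configuration API of this kind is commonly written in Ruby, assuming the defaults listed in CLAUDE.md. The `ConfigSketch` module is hypothetical and is not the gem's implementation.

```ruby
# Hypothetical module illustrating a block-style configure API.
module ConfigSketch
  class Configuration
    attr_accessor :llm_provider, :model_name, :confidence_threshold,
                  :cache_enabled, :cache_directory

    # Defaults taken from the list in CLAUDE.md.
    def initialize
      @llm_provider = :openai
      @model_name = 'gpt-4o'
      @confidence_threshold = 0.7
      @cache_enabled = true
      @cache_directory = '.botrytis_cache'
    end
  end

  # Lazily creates a single shared configuration object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the shared configuration so callers can override defaults.
  def self.configure
    yield configuration
  end
end

ConfigSketch.configure do |config|
  config.cache_enabled = false
end
```

With this shape, settings that the block does not touch keep their defaults.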
data/debug_step_args.rb ADDED

@@ -0,0 +1,6 @@
+require 'cucumber'
+# Create a simple test to see what step_arguments should look like
+puts "Testing step argument format..."
+# Let's run a very simple cucumber test and see what gets created

data/future_tests.md ADDED

@@ -0,0 +1,141 @@
+# Future Test Ideas for Botrytis
+## Parameter Translation & Extraction Testing
+These are advanced test scenarios for future development that go beyond the basic semantic matching demonstrated in the blog post.
+### Complex Parameter Extraction
+Test the ability to extract parameters from semantically similar but structurally different steps.
+**Example:**
+- **Defined step**: `When I (buy|sell) (\d+) (.*) for \$(\d+\.\d+)`
+- **Test cases**:
+  - "When I purchase 3 bananas for $2.50" → should match with params `["purchase", "3", "bananas", "2.50"]`
+  - "When I acquire 5 oranges for $10.00" → should match with params `["acquire", "5", "oranges", "10.00"]`
+  - "When I sell 2 cars for $15000.00" → should match with params `["sell", "2", "cars", "15000.00"]`
+### Text-to-Number Parameter Conversion
+Test semantic understanding of different number representations.
+**Example:**
+- **Defined step**: `Given I have (\d+) apples`
+- **Test cases**:
+  - "Given I have five apples" → should extract `"5"`
+  - "Given I possess a dozen apples" → should extract `"12"`
+  - "Given I own several apples" → should handle ambiguous quantities
+### Date/Time Parameter Semantic Matching
+Test interpretation of different time expressions.
+**Example:**
+- **Defined step**: `When I schedule a meeting for (\d{4}-\d{2}-\d{2})`
+- **Test cases**:
+  - "When I schedule a meeting for tomorrow" → should convert to actual date
+  - "When I schedule a meeting for next Friday" → should convert to appropriate date
+  - "When I schedule a meeting for Christmas" → should handle holiday conversion
+### Multiple Parameter Reordering
+Test ability to match steps where parameters appear in different orders.
+**Example:**
+- **Defined step**: `Given (\w+) has (\d+) (.*) in their (\w+)`
+- **Test cases**:
+  - "Given Alice has 5 books in their backpack" → `["Alice", "5", "books", "backpack"]`
+  - "Given there are 3 pencils in Bob's drawer" → should reorder to `["Bob", "3", "pencils", "drawer"]`
+## Advanced Confidence Testing
+### Confidence Threshold Edge Cases
+Test behavior at various confidence levels to validate threshold settings.
+- Steps that should match at 0.9+ confidence
+- Steps that should match at 0.7-0.8 confidence
+- Steps that should be rejected below 0.7 confidence
+- Borderline cases that test threshold boundaries
+### Ambiguous Step Resolution
+Test handling when multiple step definitions could match with similar confidence.
+**Example:**
+- **Defined steps**:
+  - `When I click the save button`
+  - `When I click the submit button`
+- **Ambiguous input**: "When I click the confirm button"
+- **Expected behavior**: Either pick highest confidence match or request clarification
+## Performance & Scale Testing
+### Large Step Definition Sets
+Test performance with hundreds of step definitions to ensure semantic matching scales.
+### Caching Effectiveness
+Validate that caching improves performance for repeated semantic matches.
+### LLM Provider Comparison
+Test semantic matching quality across different LLM providers (OpenAI, Anthropic, local models).
+## Error Handling & Resilience
+### LLM Service Failures
+Test graceful degradation when LLM service is unavailable:
+- Should fall back to exact matching
+- Should provide helpful error messages
+- Should not crash the test suite
+### Malformed LLM Responses
+Test handling of unexpected LLM response formats:
+- Invalid JSON responses
+- Missing required fields
+- Confidence scores outside 0.0-1.0 range
+### Network Timeout Scenarios
+Test behavior under poor network conditions:
+- Slow LLM responses
+- Connection timeouts
+- Retry logic validation
+## Integration with BDD Tools
+### Multiple Cucumber Versions
+Test compatibility across different versions of Cucumber gem.
+### Other BDD Frameworks
+Explore integration with:
+- RSpec feature specs
+- Turnip
+- Spinach
+### IDE Integration
+Test semantic matching in development environments:
+- Step definition discovery in IDEs
+- Autocomplete with semantic suggestions
+- Real-time matching feedback
+## Real-World Scenario Testing
+### Business Domain Vocabularies
+Test semantic matching within specific business contexts:
+- E-commerce scenarios (buy/purchase/order)
+- Financial scenarios (pay/transfer/deposit)
+- Healthcare scenarios (diagnose/treat/prescribe)
+### Multi-language Step Definitions
+Test semantic matching across different natural languages:
+- English variations
+- Formal vs informal language
+- Technical vs business terminology
+## Security & Privacy Considerations
+### Sensitive Data in Steps
+Ensure no sensitive information is sent to LLM providers:
+- Test with steps containing mock credentials
+- Validate data sanitization
+- Test privacy-preserving modes
+### LLM Provider Data Retention
+Understand and test implications of different LLM providers' data policies.
+## Conclusion
+These advanced test scenarios will help ensure Botrytis becomes a robust, production-ready tool for semantic step matching. They build upon the basic functionality demonstrated in the blog post examples.

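The "Malformed LLM Responses" and "Confidence Threshold Edge Cases" items in future_tests.md both depend on treating the confidence field defensively. A minimal sketch of such a check follows. The helpers `parse_confidence` and `accept_match?` are hypothetical and do not appear in the gem.

```ruby
# Hypothetical helper: returns a Float in 0.0..1.0, or nil for anything else.
def parse_confidence(raw)
  value = Float(raw.to_s.strip)
  (0.0..1.0).cover?(value) ? value : nil
rescue ArgumentError
  # Non-numeric or empty input (including nil) is treated as "no confidence".
  nil
end

# Hypothetical helper: true only for a valid confidence at or above the threshold.
def accept_match?(raw_confidence, threshold = 0.7)
  confidence = parse_confidence(raw_confidence)
  !confidence.nil? && confidence >= threshold
end
```

A value exactly at the threshold is accepted, mirroring the `>=` comparison used in `SemanticMatcher#find_match`. Out-of-range or non-numeric values are rejected rather than clamped.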
data/lib/botrytis/cucumber.rb ADDED

@@ -0,0 +1,151 @@
+require 'cucumber'
+require 'botrytis'
+module Botrytis
+  module CucumberIntegration
+    @@step_definitions = []
+    @@semantic_matcher = nil
+    @@semantic_matches_count = 0
+    @@total_step_attempts = 0
+    def self.install!
+      # Hook into step matching process instead of undefined creation
+      Cucumber::Glue::RegistryAndMore.prepend(SemanticStepMatcher)
+      # Hook into step definition registration to collect them
+      Cucumber::Glue::StepDefinition.prepend(StepDefinitionCollector)
+      # Initialize semantic matcher
+      @@semantic_matcher = Botrytis::SemanticMatcher.new
+    end
+    def self.step_definitions
+      @@step_definitions
+    end
+    def self.semantic_matcher
+      @@semantic_matcher
+    end
+    def self.record_semantic_match!
+      @@semantic_matches_count += 1
+    end
+    def self.record_step_attempt!
+      @@total_step_attempts += 1
+    end
+    def self.semantic_matches_count
+      @@semantic_matches_count
+    end
+    def self.total_step_attempts
+      @@total_step_attempts
+    end
+    def self.print_summary
+      if @@semantic_matches_count > 0
+        puts "\n🎯 Botrytis Semantic Matching Summary: #{@@semantic_matches_count} fuzzy matches found"
+      end
+    end
+    module StepDefinitionCollector
+      def initialize(*args)
+        super
+        # Add this step definition to our collection
+        Botrytis::CucumberIntegration.add_step_definition(self)
+      end
+    end
+    def self.add_step_definition(step_def)
+      # Create an adapter to make the step definition compatible with semantic matcher
+      adapter = StepDefinitionAdapter.new(step_def)
+      @@step_definitions << adapter
+    end
+    # Adapter to make modern Cucumber StepDefinition compatible with semantic matcher
+    class StepDefinitionAdapter
+      def initialize(step_def)
+        @step_def = step_def
+      end
+      def regexp_source
+        @step_def.expression.to_s
+      end
+      def proc
+        # Create a proc that delegates to the step definition
+        lambda { |*args| @step_def.invoke(nil, *args) }
+      end
+      def method_missing(method, *args, &block)
+        @step_def.send(method, *args, &block)
+      end
+      def respond_to_missing?(method, include_private = false)
+        @step_def.respond_to?(method, include_private)
+      end
+    end
+    module SemanticStepMatcher
+      def step_matches(name_to_match)
+        # First try the normal step matching
+        matches = super(name_to_match)
+        if matches.any?
+          return matches
+        end
+        # If no exact matches, try semantic matching
+        # Track semantic match attempts
+        Botrytis::CucumberIntegration.record_step_attempt!
+        semantic_match = attempt_semantic_match(name_to_match)
+        if semantic_match
+          # Semantic match found
+          Botrytis::CucumberIntegration.record_semantic_match!
+          return [semantic_match]
+        else
+          # No semantic match found
+          return matches # Return empty array, which will result in undefined step
+        end
+      end
+      private
+      def attempt_semantic_match(step_name)
+        step_definitions = Botrytis::CucumberIntegration.step_definitions
+        semantic_matcher = Botrytis::CucumberIntegration.semantic_matcher
+        return nil if step_definitions.empty? || semantic_matcher.nil?
+        # Use the semantic matcher to find a match
+        match = semantic_matcher.find_match(step_name, step_definitions)
+        return match if match
+        nil
+      end
+    end
+  end
+end
+# Install the semantic matching when this file is loaded
+Botrytis::CucumberIntegration.install!
+# Custom StepMatch class for semantic matches that avoids display corruption
+# The core issue is that Cucumber tries to highlight parameters in step text
+# based on parameter positions from the constructed text, but the positions
+# don't align with the original step text, causing garbled display.
+class SemanticStepMatch < Cucumber::StepMatch
+  def replace_arguments(step_name, format, colour)
+    # For semantic matches, don't try to replace/highlight arguments
+    # Just return the original step name to avoid garbled text from
+    # parameter position mismatches
+    step_name
+  end
+end
+# Print summary at exit
+at_exit do
+  Botrytis::CucumberIntegration.print_summary
+end

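cucumber.rb relies on `Module#prepend` so that `step_matches` first delegates to the original lookup via `super` and only falls back when that lookup returns nothing. The pattern can be sketched without Cucumber. `ExactRegistry`, `FuzzyFallback`, and the synonym table below are hypothetical stand-ins, with a word substitution taking the place of the LLM call.

```ruby
# Hypothetical stand-in for a registry that only knows exact lookups.
class ExactRegistry
  def initialize(steps)
    @steps = steps
  end

  def step_matches(text)
    @steps.select { |step| step == text }
  end
end

# Prepended module: try the original lookup first, then a fallback.
module FuzzyFallback
  # Toy substitute for semantic matching (assumption for illustration).
  SYNONYMS = { 'press' => 'click', 'tap' => 'click' }.freeze

  def step_matches(text)
    exact = super                 # original ExactRegistry#step_matches
    return exact if exact.any?

    normalised = text.split.map { |word| SYNONYMS.fetch(word, word) }.join(' ')
    super(normalised)             # retry the original lookup on rewritten text
  end
end

ExactRegistry.prepend(FuzzyFallback)
```

Because the module is prepended, its method sits ahead of the class's own in the lookup chain, so existing callers get the fallback without any change on their side.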
data/lib/botrytis/formatter.rb ADDED

@@ -0,0 +1,75 @@
+require 'cucumber/formatter/console'
+require_relative '../botrytis'
+module Botrytis
+  class Formatter
+    include Cucumber::Formatter::Console
+    def initialize(config)
+      @config = config
+      # Initialize Botrytis configuration if not already done
+      unless defined?(Botrytis.configuration)
+        Botrytis.configure do |botrytis_config|
+          botrytis_config.confidence_threshold = 0.7
+          botrytis_config.cache_enabled = false
+          botrytis_config.llm_provider = :openai
+          botrytis_config.model_name = "gpt-4o"
+        end
+      end
+      @semantic_matcher = SemanticMatcher.new
+      @step_definitions = []
+      # Use the modern event system to collect step definitions
+      config.on_event(:step_definition_registered) do |event|
+        @step_definitions << event.step_definition
+      end
+      # Register semantic matcher once all setup is done
+      config.on_event(:test_run_started) do |event|
+        register_semantic_matcher
+      end
+    end
+    def register_semantic_matcher
+      # Find the correct StepDefinition class to monkey patch
+      step_def_class = if defined?(Cucumber::Glue::StepDefinition)
+                         Cucumber::Glue::StepDefinition
+                       elsif defined?(Cucumber::StepDefinition)
+                         Cucumber::StepDefinition
+                       else
+                         # Try to find any step definition class
+                         Cucumber.constants.select do |c|
+                           Cucumber.const_get(c).is_a?(Class) && c.to_s.include?('Step')
+                         end.first&.then { |c| Cucumber.const_get(c) }
+                       end
+      return unless step_def_class
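+      # Keep the original method in a local: inside define_method, instance
+      # variables resolve against the step definition, not this formatter.
+      original_match_method = step_def_class.instance_method(:match)
+      semantic_matcher = @semantic_matcher
+      step_definitions = @step_definitions
+      step_def_class.define_method(:match) do |step_name|
+        result = original_match_method.bind(self).call(step_name)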
+        if result.nil?
+          puts "\n🥒 Botrytis is looking for a fuzzy match for: \"#{step_name}\""
+          match = semantic_matcher.find_match(step_name, step_definitions)
+          if match
+            puts "✅ Found a semantic match!"
+            return match
+          else
+            puts "❌ No semantic match found"
+          end
+        end
+        result
+      end
+    end
+  end
+end

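formatter.rb wraps an existing method by taking it with `instance_method`, redefining it with `define_method`, and calling the original through `bind`. Inside a `define_method` block, `self` is the instance the method is called on, so the original method has to be reachable through a local variable captured by the closure. A minimal sketch using a hypothetical `Greeter` class:

```ruby
# Hypothetical class whose method returns nil for unknown input.
class Greeter
  KNOWN = %w[alice bob].freeze

  def greet(name)
    "hello #{name}" if KNOWN.include?(name)
  end
end

# Capture the original as an UnboundMethod in a local variable.
original_greet = Greeter.instance_method(:greet)

Greeter.define_method(:greet) do |name|
  # self is the Greeter instance here, so bind the original to it.
  result = original_greet.bind(self).call(name)
  result.nil? ? "fallback for #{name}" : result
end
```

The wrapper only steps in when the original returns `nil`, which is the same condition the formatter uses before attempting a semantic match.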
data/lib/botrytis/semantic_match_generator.rb CHANGED

@@ -6,7 +6,45 @@ module Botrytis
       name: "step_match_result",
       description: "Results of semantic matching for a cucumber step",
       attributes: [
-        { name: "match_found", description: "
+        { name: "step_text_analysis", description: "Analysis of the step text, including semantic meaning and intent" },
+        { name: "match_found", description: "Indicates if a match was found, either the string yes or no" },
+        { name: "best_match_pattern", description: "The pattern that best matches semantically" },
+        { name: "confidence", description: "Confidence score of the match (0.0 - 1.0)" },
+        { name: "parameter_values", description: "A comma separated list of parameter values extracted from the match" }
       ]
+      def initialize(step_text:, available_patterns:)
+        @step_text = step_text
+        @available_patterns = available_patterns
+      end
+      def generate
+        super
+      end
+      def prompt
+        <<-PROMPT
+        You are a semantic matcher for Cucumber step definitions. Your task is to determine if a step text semantically matches one of the available regex patterns, even if it doesn't match exactly.
+        Step Text: "#{@step_text}"
+        Available Step Definition Patterns:
+        #{@available_patterns.join("\n")}
+        For each pattern, consider:
+        1. The semantic meaning/intent of the step
+        2. The structure of the pattern
+        3. Any parameters that would need to be extracted
+        Choose the pattern that best matches the step text semantically.
+        If no pattern is a good semantic match, indicate that no match was found.
+        If you find a match, extract any parameters that would be captured by the pattern. And return them as a comma separated list.
+        For example, if the pattern is "I have (\\d+) cucumbers" and the step is "I have 5 cucumbers",
+        the parameter value would be "5".
+        Provide your confidence in the match as a value between 0.0 (no confidence) and 1.0 (absolute certainty).
+        PROMPT
+      end
   end
 end

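The generator's output fields are all strings: `match_found` is "yes" or "no", `confidence` is a number rendered as text, and `parameter_values` is a comma-separated list. A minimal sketch of turning those fields into usable arguments follows. `MatchFields` and `usable_parameters` are hypothetical names, and the struct only imitates the shape of the generated result.

```ruby
# Hypothetical struct imitating the shape of the generator's result.
MatchFields = Struct.new(:match_found, :confidence, :parameter_values, keyword_init: true)

# Returns an array of parameters, or nil when the match should be ignored.
def usable_parameters(fields, threshold: 0.7)
  return nil unless fields.match_found == 'yes'
  return nil unless fields.confidence.to_f >= threshold

  # nil and "" both become an empty parameter list.
  fields.parameter_values.to_s.split(',').map(&:strip)
end
```

One consequence of the comma-separated format is that a single parameter containing a comma is split into several values, so this representation suits simple arguments such as button names and numbers.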
data/lib/botrytis/semantic_matcher.rb ADDED

@@ -0,0 +1,261 @@
+require 'digest'
+require 'fileutils'
+require 'json'
+require 'ostruct'
+require 'botrytis/semantic_match_generator'
+module Botrytis
+  class SemanticMatcher
+    def initialize
+      ensure_cache_directory if Botrytis.configuration.cache_enabled
+    end
+    def find_match(step_text, available_step_definitions)
+      patterns = available_step_definitions.map do  |step_def|
+        {
+          pattern: step_def.regexp_source,
+          proc: step_def.proc,
+          step_def: step_def
+        }
+      end
+      # Filter out test verification steps for semantic matching
+      # These are steps that end with "should have executed" or similar test patterns
+      business_patterns = patterns.reject do |p|
+        pattern_text = p[:pattern].to_s
+        pattern_text.include?("should have executed") ||
+        pattern_text.include?("configured for testing") ||
+        pattern_text.include?("test") ||
+        pattern_text.include?("verification")
+      end
+      # Use business patterns for LLM matching, but keep all patterns for final matching
+      query_patterns = business_patterns.empty? ? patterns : business_patterns
+      if Botrytis.configuration.cache_enabled
+        cache_result = check_cache(step_text, patterns.map { |p| p[:pattern] })
+        return cache_result if cache_result
+      end
+      match_result = query_llm(step_text, query_patterns.map { |p| p[:pattern] })
+      save_to_cache(step_text, patterns.map { |p| p[:pattern] }, match_result) if Botrytis.configuration.cache_enabled
+      if match_result.match_found == "yes" && match_result.confidence.to_f >= Botrytis.configuration.confidence_threshold
+        # Handle different pattern formats from LLM response
+        # The LLM might return escaped quotes or slightly different formats
+        matching_pattern = patterns.find do |p|
+          original_pattern = p[:pattern]
+          llm_pattern = match_result.best_match_pattern
+          # Try exact match first
+          original_pattern == llm_pattern ||
+          # Try with/without surrounding slashes
+          original_pattern == "/#{llm_pattern}/" ||
+          original_pattern.gsub(/^\/|\/$/,'') == llm_pattern.gsub(/^\/|\/$/,'') ||
+          # Try with unescaped quotes (LLM might escape them)
+          original_pattern == llm_pattern.gsub(/\\\"/, '"') ||
+          original_pattern.gsub(/^\/|\/$/,'') == llm_pattern.gsub(/^\/|\/$/,'').gsub(/\\\"/, '"')
+        end
+        if matching_pattern
+          # Convert comma-separated parameter_values string to array
+          parameter_values = if match_result.parameter_values.nil? || match_result.parameter_values.empty?
+                              []
+                            else
+                              match_result.parameter_values.split(',').map(&:strip)
+                            end
+          return create_match_result(matching_pattern[:step_def], step_text, parameter_values)
+        end
+      end
+      nil
+    end
+    private
+    def query_llm(step_text, patterns)
+      generator = SemanticMatchGenerator.new(
+        step_text: step_text,
+        available_patterns: patterns
+      )
+      case Botrytis.configuration.llm_provider
+      when :openai
+        Sublayer.configuration.ai_provider = Sublayer::Providers::OpenAI
+      when :claude
+        Sublayer.configuration.ai_provider = Sublayer::Providers::Claude
+      when :gemini
+        Sublayer.configuration.ai_provider = Sublayer::Providers::Gemini
+      end
+      Sublayer.configuration.ai_model = Botrytis.configuration.model_name
+      begin
+        result = generator.generate
+        result
+      rescue => e
+        # LLM API Error occurred, falling back to no match
+        # Return a "no match" response if API fails
+        OpenStruct.new(
+          match_found: "no",
+          best_match_pattern: "",
+          confidence: "0.0",
+          parameter_values: ""
+        )
+      end
+    end
+    def ensure_cache_directory
+      FileUtils.mkdir_p(Botrytis.configuration.cache_directory) unless Dir.exist?(Botrytis.configuration.cache_directory)
+    end
+    def cache_key(step_text, patterns)
+      Digest::MD5.hexdigest("#{step_text}-#{patterns.sort.join('-')}")
+    end
+    def check_cache(step_text, patterns)
+      key = cache_key(step_text, patterns)
+      cache_file = File.join(Botrytis.configuration.cache_directory, "#{key}.json")
+      if File.exist?(cache_file)
+        data = JSON.parse(File.read(cache_file))
+      end
+      nil
+    end
+    def create_match_result(step_definition, step_text, parameter_values)
+      # Instead of trying to manually create step arguments, let's use Cucumber's
+      # normal matching mechanism by having the step definition actually match
+      # a constructed step text that would produce the right parameters
+      begin
+        # Try to construct a step text that the step definition would actually match
+        constructed_step_text = construct_matching_step_text_for_step_def(step_definition, parameter_values)
+        # Use the step definition's normal matching mechanism
+        if step_definition.respond_to?(:arguments_from)
+          # This is the normal way Cucumber creates step matches
+          step_arguments = step_definition.arguments_from(constructed_step_text)
+          return SemanticStepMatch.new(step_definition, step_text, step_arguments)
+        else
+          # Fallback to manual creation
+          step_arguments = create_original_step_arguments(step_definition, parameter_values)
+          return SemanticStepMatch.new(step_definition, step_text, step_arguments)
+        end
+      rescue => e
+        # Error in create_match_result, falling back
+        # Fallback to simple creation with empty arguments
+        return SemanticStepMatch.new(step_definition, step_text, [])
+      end
+    end
+    def create_proper_step_arguments(step_definition, step_text, parameter_values)
+      # Create step arguments that don't interfere with display formatting
+      # For semantic matching, we just need the parameter values to be passed to the step
+      # We don't need complex MatchData objects since Cucumber will handle display
+      return parameter_values || []
+    end
+    def construct_matching_step_text_for_step_def(step_definition, parameter_values)
+      # This creates a step text that would actually match the step definition's regex
+      # and produce the desired parameter values
+      if parameter_values.nil? || parameter_values.empty?
+        # For steps without parameters, just use the regexp source without anchors
+        if step_definition.respond_to?(:regexp_source)
+          source = step_definition.regexp_source.to_s
+          return source.gsub(/^\/\^/, '').gsub(/\$\/$/, '').gsub(/[\^$\/]/, '')
+        end
+      end
+      # For the button example: /^they click the "([^"]*)" button$/
+      # We want to produce: they click the "Buy Now" button
+      # So that when matched, it captures "Buy Now"
+      if step_definition.respond_to?(:expression) && step_definition.expression.is_a?(Regexp)
+        regex = step_definition.expression
+      elsif step_definition.respond_to?(:regexp_source)
+        source = step_definition.regexp_source.to_s.gsub(/^\/\^/, '').gsub(/\$\/$/, '')
+        regex = Regexp.new("^#{source}$")
+      else
+        # Fallback - return something simple
+        return parameter_values.join(' ')
+      end
+      # Better approach: build step text by understanding the regex structure
+      pattern = regex.source
+      # For simple cases like they click the "([^"]*)" button
+      # We want to replace ([^"]*) with the actual parameter value
+      if pattern.include?('"([^"]*)"') && parameter_values.length == 1
+        # Handle quoted parameter patterns specifically
+        result = pattern.gsub(/\([^)]+\)/, parameter_values[0])
+      else
+        # Fallback to general replacement
+        parameter_values.each do |value|
+          pattern = pattern.sub(/\([^)]+\)/, value)
+        end
+        result = pattern
+      end
+      # Clean up anchors and regex chars
+      result.gsub(/[\^$]/, '').gsub(/^\//, '').gsub(/\/$/, '')
+    end
+    def create_original_step_arguments(step_definition, parameter_values)
+      # For semantic matching, create simple argument objects that work with Cucumber
+      # This avoids the complex text construction that causes display issues
+      return [] if parameter_values.nil? || parameter_values.empty?
+      # Create simple argument objects that just hold the parameter values
+      # without trying to map them to text positions
+      parameter_values.map do |value|
+        # Create a minimal object that responds to the methods Cucumber expects
+        StepArgument.new(value)
+      end
+    end
+    # Minimal step argument class for semantic matching
+    class StepArgument
+      def initialize(value)
+        @value = value
+      end
+      def group(index = 0)
+        index == 0 ? @value : nil
+      end
+      def to_s
+        @value.to_s
+      end
+      def value
+        @value
+      end
+      def captures
+        [@value]
+      end
+    end
+    def construct_matching_step_text(regex, parameter_values)
+      # This is a simple approach: take the regex pattern and substitute
+      # capture groups with our parameter values
+      pattern = regex.source
+      # Replace capture groups like ([^"]*) with actual values
+      parameter_values.each_with_index do |value, index|
+        # Replace the first capture group with the parameter value
+        pattern = pattern.sub(/\([^)]+\)/, value)
+      end
+      # Clean up the pattern to make it a valid step text
+      pattern = pattern.gsub(/[\^$]/, '') # Remove anchors
+      pattern
+    end
+  end
+end
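
As shown in the diff above, `check_cache` parses the cached JSON into `data` and then falls through to `nil`, so a cache hit is never returned to the caller. A minimal sketch of a file cache that does return hits, reusing the same MD5 key scheme, might look like the following. `CACHE_DIR`, `cache_path`, `write_cache`, and `read_cache` are hypothetical names for illustration, not part of the gem, and a temporary directory stands in for `Botrytis.configuration.cache_directory`.

```ruby
require "digest"
require "json"
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for Botrytis.configuration.cache_directory.
CACHE_DIR = Dir.mktmpdir("botrytis-cache-sketch")

# Same key scheme as the diff: step text plus the sorted patterns.
def cache_key(step_text, patterns)
  Digest::MD5.hexdigest("#{step_text}-#{patterns.sort.join('-')}")
end

def cache_path(step_text, patterns)
  File.join(CACHE_DIR, "#{cache_key(step_text, patterns)}.json")
end

# Store a JSON-serializable result under the computed key.
def write_cache(step_text, patterns, result)
  FileUtils.mkdir_p(CACHE_DIR)
  File.write(cache_path(step_text, patterns), JSON.generate(result))
end

# Return the parsed result on a hit, nil on a miss or unreadable entry.
def read_cache(step_text, patterns)
  path = cache_path(step_text, patterns)
  return nil unless File.exist?(path)

  JSON.parse(File.read(path))
rescue JSON::ParserError
  nil
end

write_cache("they press Buy Now", ["pattern b", "pattern a"], { "confidence" => "0.9" })
cached = read_cache("they press Buy Now", ["pattern a", "pattern b"])
```

Because the patterns are sorted before hashing, the lookup does not depend on the order in which step definitions were registered.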

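The capture-group substitution used by `construct_matching_step_text` and `construct_matching_step_text_for_step_def` can be sketched in isolation as below. This is an illustrative variant, not the gem's code: `fill_capture_groups` is a hypothetical name, and it uses the block form of `gsub` so that backslashes in a parameter value are inserted literally rather than being interpreted as back-references. It shares the original's limitation that the group pattern `\([^)]+\)` does not handle nested groups or character classes containing `)`.

```ruby
# Replace each simple capture group in the regex source with the next
# parameter value, then strip the leading ^ and trailing $ anchors.
def fill_capture_groups(regex, values)
  remaining = values.dup
  text = regex.source.gsub(/\([^)]+\)/) do |group|
    # Leave the group untouched if we have run out of values.
    remaining.empty? ? group : remaining.shift.to_s
  end
  text.sub(/\A\^/, "").sub(/\$\z/, "")
end

regex = /^they click the "([^"]*)" button$/
text = fill_capture_groups(regex, ["Buy Now"])
match = regex.match(text)
puts text
puts match[1]
```

Stripping only the leading and trailing anchors, rather than every `^` and `$` in the string, keeps those characters intact when they appear inside a parameter value.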
data/lib/botrytis/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Botrytis
-  VERSION = "0.1.0"
+  VERSION = "1.0.0"
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: botrytis
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 1.0.0
 platform: ruby
 authors:
 - Scott Werner
@@ -59,13 +59,20 @@ executables: []
 extensions: []
 extra_rdoc_files: []
 files:
+- ".claude/settings.local.json"
 - ".rspec"
+- CLAUDE.md
 - LICENSE
 - README.md
 - Rakefile
+- debug_step_args.rb
+- future_tests.md
 - lib/botrytis.rb
 - lib/botrytis/configuration.rb
+- lib/botrytis/cucumber.rb
+- lib/botrytis/formatter.rb
 - lib/botrytis/semantic_match_generator.rb
+- lib/botrytis/semantic_matcher.rb
 - lib/botrytis/version.rb
 - sig/botrytis.rbs
 homepage: https://github.com/sublayerapp/botrytis