
ruby-skill-bench 0.1.0 → 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

checksums.yaml +4 -4
data/README.md +86 -0
data/lib/skill_bench/cli/compare_command.rb +91 -0
data/lib/skill_bench/cli/help_printer.rb +9 -1
data/lib/skill_bench/cli/run_command.rb +6 -4
data/lib/skill_bench/cli.rb +7 -4
data/lib/skill_bench/clients/all.rb +1 -0
data/lib/skill_bench/clients/providers/mock.rb +56 -0
data/lib/skill_bench/commands/run.rb +6 -2
data/lib/skill_bench/config/applier.rb +1 -0
data/lib/skill_bench/config/defaults.rb +1 -0
data/lib/skill_bench/config/facade_readers.rb +7 -0
data/lib/skill_bench/config/json_loader.rb +3 -3
data/lib/skill_bench/config/store.rb +5 -0
data/lib/skill_bench/config.rb +10 -1
data/lib/skill_bench/delta_report.rb +20 -0
data/lib/skill_bench/execution/source_path_resolver.rb +59 -3
data/lib/skill_bench/registry/pack_resolver.rb +119 -0
data/lib/skill_bench/services/agent_spawner_service.rb +114 -0
data/lib/skill_bench/services/compare_option_parser.rb +55 -0
data/lib/skill_bench/services/comparison_reporter.rb +97 -0
data/lib/skill_bench/services/comparison_runner.rb +49 -0
data/lib/skill_bench/services/context_loader_service.rb +42 -0
data/lib/skill_bench/services/error_response_builder.rb +119 -0
data/lib/skill_bench/services/eval_resolver.rb +33 -0
data/lib/skill_bench/services/exit_code_calculator.rb +39 -0
data/lib/skill_bench/services/judge_params_builder.rb +54 -0
data/lib/skill_bench/services/manifest_finder.rb +36 -0
data/lib/skill_bench/services/output_formatter.rb +28 -0
data/lib/skill_bench/services/prompt_builder_service.rb +98 -0
data/lib/skill_bench/services/provider_resolver.rb +73 -0
data/lib/skill_bench/services/runner_service.rb +84 -315
data/lib/skill_bench/services/skill_resolver.rb +37 -9
data/lib/skill_bench/services/skill_resolver_service.rb +70 -0
data/lib/skill_bench/services/source_path_resolver_service.rb +45 -0
data/lib/skill_bench/services/trend_recorder_service.rb +67 -0
data/lib/skill_bench/services/variant_parser.rb +32 -0
data/lib/skill_bench/services/variant_resolver.rb +63 -0
data/lib/skill_bench/version.rb +1 -1
metadata +23 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 58713d379ef4db5ce99a695309440159115257fee5b995ed6c6b8f1cbdca13b7
-  data.tar.gz: 52cab6f8913582728c66fa1d34d3109958c5c5c33bbf9ffd5969f5b5cb13908e
+  metadata.gz: d3c4edfe40e04251d2e7b758e7c630ee9affaa9e8170ceb0fa379d61bacc81e6
+  data.tar.gz: e9ef2eb8ef7a524d607c6e44705df772feec8939a376b516adff032eeeb8b535
 SHA512:
-  metadata.gz: 74cc26703d9cb9da5362ed0450987e02af86847025c7c687d6519b191f921a300bce66a166a5596a956bedb7c6b52c45cc29a003bf3f93df8e491565608ca9af
-  data.tar.gz: 1dce2943558b3c0672be0950905e140892546928f0ddb89ddcd5b6457d99e9b562915f77239e3ab71d290737ed1705977f5f4abfc4f5eed505ec9783fa28d46f
+  metadata.gz: b92554c769e34205d1c197bd67a9ca2ae61876b83c5429e202c667831100470fa9f1ed48a297ea184855e33e7ac3945fb513909b2344634078b8090750325dc9
+  data.tar.gz: 7ae92f1331f2061cccf42a1f27f80cbe41c73d54d0909499900efa84ad3984edada8e7df10b5a018717861234974918cdfac80b5242483df108272093eec8deb

data/README.md CHANGED

@@ -7,6 +7,21 @@
 *A high-fidelity evaluation engine for benchmarking AI agent skills across any stack (Rails-first, but extensible).*
+## Part of the AI Skill Ecosystem
+This repo is one of 6 in a composable AI skill ecosystem:
+| Repo | Role |
+|------|------|
+| [`ruby-core-skills`](https://github.com/igmarin/ruby-core-skills) | 15 shared Ruby skills + process discipline |
+| [`rails-agent-skills`](https://github.com/igmarin/rails-agent-skills) | 28 Rails-specific skills + 9 agents |
+| [`hanakai-yaku`](https://github.com/igmarin/hanakai-yaku) | 35 Hanami/dry-rb skills + 10 agents |
+| [`agnostic-planning-skills`](https://github.com/igmarin/agnostic-planning-skills) | 10 planning skills + 4 agents |
+| [`agent-mcp-runtime`](https://github.com/igmarin/agent-mcp-runtime) | Rust CLI runtime (pack resolution, MCP) |
+| [**`ruby-skill-bench`**](https://github.com/igmarin/ruby-skill-bench) | Benchmark/eval engine |
+See the [Ecosystem Overview](https://github.com/igmarin/agent-mcp-runtime/blob/main/docs/ecosystem.md) for the full architecture.
 ---
 ## Features
@@ -343,6 +358,77 @@ Both skill contexts are concatenated and sent to the agent. The judge evaluates
 ---
+## Multi-Repo Skill Benchmarking
+Skills in the ecosystem are split across multiple repos:
+- `ruby-core-skills` — 15 shared Ruby skills (DDD, patterns, process discipline)
+- `rails-agent-skills` — 28 Rails-specific skills
+- `hanakai-yaku` — 35 Hanami/dry-rb skills
+To benchmark a skill from an external repo, use the `--skill` flag:
+```bash
+# Benchmark a core skill
+skill-bench run evals/skills/write-yard-docs/basic \
+  --skill /path/to/ruby-core-skills/skills/patterns/write-yard-docs
+# Benchmark a Rails skill
+skill-bench run evals/skills/code-review/pr-review \
+  --skill /path/to/rails-agent-skills/skills/code-quality/code-review
+```
+### Config-Based Multi-Repo Resolution
+Configure `skill_sources` in `skill-bench.json` to automatically resolve skills across repos without `--skill` every time:
+```json
+{
+  "provider": "openai",
+  "model": "gpt-4o",
+  "skill_sources": {
+    "core": "../ruby-core-skills/skills",
+    "rails": "../rails-agent-skills/skills",
+    "hanami": "../hanakai-yaku/skills"
+  }
+}
+```
+Each key is a source name (for logging), each value is a path to a `skills/` directory. When a skill is not found locally, SkillBench iterates through `skill_sources` and uses the first match.
+### Pack-Based Resolution (`--pack`)
+Resolve skills via the ecosystem registry manifest (from `agent-mcp-runtime`):
+```bash
+# Run an eval using the Rails pack's version of code-review
+skill-bench run evals/skills/code-review/basic \
+  --skill code-review \
+  --pack rails
+# Override the default registry manifest path
+skill-bench run evals/skills/code-review/basic \
+  --skill code-review \
+  --pack rails \
+  --registry-manifest /path/to/registry.json
+```
+### Variant Comparison (`compare`)
+Compare the same skill across two pack variants to measure context-dependent performance:
+```bash
+skill-bench compare code-review \
+  --variant-a "pack:rails" \
+  --variant-b "pack:hanami" \
+  --eval evals/skills/code-review/basic
+```
+The `--variant` spec supports two forms:
+- `pack:<name>` — resolve via registry manifest
+- `/absolute/path` or `relative/path` — use a direct path
+---
 ## File Reference: What Lives on Disk
 SkillBench creates and manages three files in your project. Understanding them helps you iterate faster.
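
Reviewer sketch for the README hunk above: the two variant spec forms (`pack:<name>` and a direct path) can be told apart with a prefix check. The helper name `parse_variant_spec` and the returned hash shape are hypothetical; the gem's own `Services::VariantParser` is not shown in this part of the diff and may behave differently.

```ruby
# Hypothetical helper for illustration only; not the gem's Services::VariantParser.
# Classifies a variant spec as either a registry pack or a filesystem path.
def parse_variant_spec(spec)
  text = spec.to_s.strip
  raise ArgumentError, 'variant spec must not be empty' if text.empty?

  if text.start_with?('pack:')
    name = text.delete_prefix('pack:')
    raise ArgumentError, 'pack name must not be empty' if name.empty?

    { type: :pack, name: name }
  else
    # Absolute and relative paths are treated the same way here.
    { type: :path, path: text }
  end
end

p parse_variant_spec('pack:rails')
p parse_variant_spec('../rails-agent-skills/skills/code-quality/code-review')
```

Rejecting an empty pack name early keeps a typo such as `--variant-a "pack:"` from reaching the registry lookup.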

data/lib/skill_bench/cli/compare_command.rb ADDED

@@ -0,0 +1,91 @@
+# frozen_string_literal: true
+require_relative '../services/compare_option_parser'
+require_relative '../services/variant_parser'
+require_relative '../services/comparison_runner'
+require_relative '../services/comparison_reporter'
+require_relative '../services/exit_code_calculator'
+module SkillBench
+  module Cli
+    # Handles the `skill-bench compare` command.
+    # Runs the same eval with two skill variants and reports the comparison.
+    class CompareCommand
+      # Parses argv and executes the comparison.
+      #
+      # @param argv [Array<String>] Raw CLI arguments
+      # @return [Integer] Exit code
+      def self.call(argv)
+        new(argv).call
+      end
+      # @param argv [Array<String>] Raw CLI arguments
+      def initialize(argv)
+        @argv = argv
+      end
+      # Parses options, runs both variants, and prints a comparison report.
+      #
+      # @return [Integer] Exit code (0 if both pass, 1 otherwise)
+      def call
+        options = Services::CompareOptionParser.call(@argv)
+        skill_name = @argv.shift
+        return error_missing_skill unless skill_name
+        return error_missing_variant_a unless options[:variant_a]
+        return error_missing_variant_b unless options[:variant_b]
+        return error_missing_eval unless options[:eval]
+        variant_a = Services::VariantParser.call(options[:variant_a])
+        variant_b = Services::VariantParser.call(options[:variant_b])
+        puts "--- Running Variant A: #{options[:variant_a]} ---"
+        puts "--- Running Variant B: #{options[:variant_b]} ---"
+        results = Services::ComparisonRunner.call(
+          variant_a,
+          variant_b,
+          skill_name,
+          options[:eval]
+        )
+        Services::ComparisonReporter.call(
+          results[:result_a],
+          results[:result_b],
+          options[:variant_a],
+          options[:variant_b]
+        )
+        Services::ExitCodeCalculator.call(results[:result_a], results[:result_b])
+      rescue SkillBench::HelpRequested
+        0
+      rescue StandardError => e
+        warn "Error: #{e.message}"
+        1
+      end
+      private
+      def error_missing_skill
+        warn 'Error: skill name is required'
+        warn 'Usage: skill-bench compare <skill-name> --variant-a <spec> --variant-b <spec> --eval <path>'
+        1
+      end
+      def error_missing_variant_a
+        warn 'Error: --variant-a is required'
+        1
+      end
+      def error_missing_variant_b
+        warn 'Error: --variant-b is required'
+        1
+      end
+      def error_missing_eval
+        warn 'Error: --eval is required'
+        1
+      end
+    end
+  end
+end

data/lib/skill_bench/cli/help_printer.rb CHANGED

@@ -19,11 +19,19 @@ module SkillBench
               Providers: #{providers}
               --force    Overwrite existing config file
-            run <eval> --skill <name> [--skill <name>] [--format FORMAT]
+            run <eval> --skill <name> [--skill <name>] [--format FORMAT] [--pack NAME]
               Run an evaluation
               --skill    Skill to use (can be specified multiple times)
+              --pack     Pack context for registry-based skill resolution
+              --registry-manifest PATH  Path to registry.json manifest
               --format   Output format: human, json, junit (default: human)
+            compare <skill-name> --variant-a SPEC --variant-b SPEC --eval PATH
+              Compare the same skill across two pack variants
+              --variant-a  First variant (e.g., "pack:rails" or "/path/to/skill")
+              --variant-b  Second variant (e.g., "pack:hanami")
+              --eval       Path to the eval directory
             skill new <name> [--mode MODE] [--template TYPE]
               Create a new skill
               --mode     simple, advanced, or rails (default: simple)

data/lib/skill_bench/cli/run_command.rb CHANGED

@@ -29,7 +29,7 @@ module SkillBench
         eval_name = @argv.shift
         return error_missing_eval unless eval_name
-        return error_missing_skill if options[:skill_names].empty?
+        return error_missing_skill if options[:skill_names].empty? && !options[:pack]
         options[:eval_name] = eval_name
         exec_options = options.reject { |key| key == :format }
@@ -48,6 +48,8 @@ module SkillBench
         OptionParser.new do |opts|
           opts.banner = 'Usage: skill-bench run <eval> [options]'
           opts.on('--skill NAME', 'Skill to use (can be specified multiple times)') { |v| options[:skill_names] << v }
+          opts.on('--pack NAME', 'Pack context for skill resolution') { |v| options[:pack] = v }
+          opts.on('--registry-manifest PATH', 'Path to registry.json manifest') { |v| options[:registry_manifest] = v }
           opts.on('--format FORMAT', 'Output format (human, json, junit)') { |v| options[:format] = v.to_sym }
           opts.on('-h', '--help', 'Prints this help') do
             puts opts
@@ -58,13 +60,13 @@ module SkillBench
       def error_missing_eval
         warn 'Error: eval name is required'
-        warn 'Usage: skill-bench run <eval> --skill <name>'
+        warn 'Usage: skill-bench run <eval> [--skill <name>] [--pack <name>]'
         1
       end
       def error_missing_skill
-        warn 'Error: skill name is required'
-        warn 'Usage: skill-bench run <eval> --skill <name>'
+        warn 'Error: skill name or pack is required'
+        warn 'Usage: skill-bench run <eval> --skill <name> [--pack <name>]'
         1
       end
     end
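
Reviewer sketch for the hunk above: a standalone version of the option parsing that shows how the relaxed guard lets `run` proceed with `--pack` alone. The method name `parse_run_options` and the initial options hash (including the `:human` default) are assumptions; only the four `opts.on` flags and the guard expression come from the diff.

```ruby
require 'optparse'

# Hypothetical wrapper; the real RunCommand builds its options elsewhere.
def parse_run_options(argv)
  options = { skill_names: [], format: :human } # assumed defaults
  parser = OptionParser.new do |opts|
    opts.on('--skill NAME') { |v| options[:skill_names] << v }
    opts.on('--pack NAME') { |v| options[:pack] = v }
    opts.on('--registry-manifest PATH') { |v| options[:registry_manifest] = v }
    opts.on('--format FORMAT') { |v| options[:format] = v.to_sym }
  end
  positional = parser.parse(argv) # non-destructive; returns leftover arguments
  [options, positional]
end

# Same condition as the updated guard in the diff.
def missing_skill?(options)
  options[:skill_names].empty? && !options[:pack]
end

options, positional = parse_run_options(%w[evals/skills/code-review/basic --pack rails])
puts "eval=#{positional.first} pack=#{options[:pack]} missing_skill=#{missing_skill?(options)}"
```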

data/lib/skill_bench/cli.rb CHANGED

@@ -2,6 +2,7 @@
 require_relative 'cli/init_command'
 require_relative 'cli/run_command'
+require_relative 'cli/compare_command'
 require_relative 'cli/skill_command'
 require_relative 'cli/eval_command'
 require_relative 'cli/help_printer'
@@ -18,6 +19,7 @@ module SkillBench
     # @param argv [Array<String>] Raw CLI arguments.
     # @return [Integer] Exit code.
     def self.call(argv)
+      Config.reset
       new(argv).call
     end
@@ -35,10 +37,11 @@ module SkillBench
       subcommand = @argv.shift
       case subcommand
-      when 'init'  then Cli::InitCommand.call(@argv)
-      when 'run'   then Cli::RunCommand.call(@argv)
-      when 'skill' then Cli::SkillCommand.call(@argv)
-      when 'eval'  then Cli::EvalCommand.call(@argv)
+      when 'init'    then Cli::InitCommand.call(@argv)
+      when 'run'     then Cli::RunCommand.call(@argv)
+      when 'compare' then Cli::CompareCommand.call(@argv)
+      when 'skill'   then Cli::SkillCommand.call(@argv)
+      when 'eval'    then Cli::EvalCommand.call(@argv)
       when '-h', '--help', 'help'
         help.call
       else

data/lib/skill_bench/clients/all.rb CHANGED

@@ -17,3 +17,4 @@ require_relative 'providers/opencode'
 require_relative 'providers/groq'
 require_relative 'providers/deepseek'
 require_relative 'providers/openrouter'
+require_relative 'providers/mock'

data/lib/skill_bench/clients/providers/mock.rb ADDED

@@ -0,0 +1,56 @@
+# frozen_string_literal: true
+require_relative '../provider_registry'
+require 'json'
+module SkillBench
+  module Clients
+    module Providers
+      # Mock LLM client for testing and local validation.
+      class Mock
+        SkillBench::Clients::ProviderRegistry.register(:mock, self)
+        # Mock call implementation to simulate LLM responses for test suites.
+        #
+        # @param system_prompt [String] system prompt instructions.
+        # @param messages [Array<Hash>] chat history messages.
+        # @param _options [Hash] additional keyword options.
+        # @return [Hash] mock response hash.
+        def self.call(system_prompt:, messages:, **_options)
+          _ = system_prompt
+          prompt = messages.first[:content] || messages.first['content'] || ''
+          # Parse dimensions from prompt
+          dimensions = {}
+          prompt.scan(/-\s+([^:]+):\s+max_score=(\d+)/).each do |name, max_score|
+            max = max_score.to_i
+            # Give baseline slightly lower score than context to simulate improvement
+            is_context = prompt.match?(/## Skill Context\s+\S+/)
+            score = is_context ? (max * 0.95).round : (max * 0.8).round
+            dimensions[name] = {
+              'score' => score,
+              'max_score' => max,
+              'reasoning' => "Mock evaluation for #{name}"
+            }
+          end
+          dimensions['correctness'] = { 'score' => 8, 'max_score' => 10, 'reasoning' => 'Mock correctness' } if dimensions.empty?
+          content = {
+            'dimensions' => dimensions,
+            'overall_reasoning' => 'Mock evaluation overall reasoning'
+          }.to_json
+          {
+            success: true,
+            response: {
+              message: {
+                content: content
+              }
+            }
+          }
+        end
+      end
+    end
+  end
+end
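
Reviewer sketch for the mock provider above: the scoring rule can be isolated to see what the two regular expressions match. The method name `mock_scores` and the sample prompts are hypothetical; the patterns and the 0.95 / 0.8 ratios are copied from the diff. Real judge prompts in the gem may be laid out differently.

```ruby
# Patterns copied from the Mock provider in the diff.
DIMENSION_PATTERN = /-\s+([^:]+):\s+max_score=(\d+)/
CONTEXT_PATTERN = /## Skill Context\s+\S+/

# Hypothetical helper: returns { dimension name => mock score }.
def mock_scores(prompt)
  ratio = prompt.match?(CONTEXT_PATTERN) ? 0.95 : 0.8
  prompt.scan(DIMENSION_PATTERN).to_h do |name, max_score|
    [name, (max_score.to_i * ratio).round]
  end
end

# Made-up prompts, used only to exercise the patterns.
baseline = "Rubric:\n- correctness: max_score=20\n- style: max_score=40\n"
with_context = "## Skill Context\nUse YARD tags.\n\n#{baseline}"

p mock_scores(baseline)
p mock_scores(with_context)
```

Because the context check needs non-whitespace text after the `## Skill Context` heading, a prompt with an empty context section is scored at the baseline ratio.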

data/lib/skill_bench/commands/run.rb CHANGED

@@ -9,11 +9,15 @@ module SkillBench
       # Run an eval with specified skill(s)
       # @param eval_name [String] Name of eval to run (e.g., 'test-eval' or 'evals/test-eval')
       # @param skill_names [Array<String>] Names of skills to use
+      # @param pack [String, nil] Optional pack name for registry-based skill resolution
+      # @param registry_manifest [String, nil] Optional path to registry.json manifest
       # @return [Hash] Result with pass/fail and score
-      def self.run(eval_name:, skill_names:)
+      def self.run(eval_name:, skill_names:, pack: nil, registry_manifest: nil)
         Services::RunnerService.call(
           eval_name: eval_name,
-          skill_names: skill_names
+          skill_names: skill_names,
+          pack: pack,
+          registry_manifest: registry_manifest
         )
       end
     end

data/lib/skill_bench/config/applier.rb CHANGED

@@ -41,6 +41,7 @@ module SkillBench
         assign_current_provider
         @store.assign_max_execution_time(@data[:max_execution_time]) if @data.key?(:max_execution_time)
         @store.assign_allowed_commands(@data[:allowed_commands]) if @data.key?(:allowed_commands)
+        @store.skill_sources = @data[:skill_sources] if @data.key?(:skill_sources)
       end
       def apply_provider_values

data/lib/skill_bench/config/defaults.rb CHANGED

@@ -19,6 +19,7 @@ module SkillBench
           current_llm_provider: :openai,
           max_execution_time: 30,
           allowed_commands: nil,
+          skill_sources: {},
           llm_providers_config: {
             openai: { api_key: nil, model: 'gpt-4o' },
             anthropic: { api_key: nil, model: 'claude-sonnet-4-20250514' },

data/lib/skill_bench/config/facade_readers.rb CHANGED

@@ -32,6 +32,13 @@ module SkillBench
         store.llm_providers_config
       end
+      # Returns skill sources mapping.
+      #
+      # @return [Hash, nil] skill source name → directory path
+      def skill_sources
+        store.skill_sources
+      end
       # Returns the API key for the current LLM provider.
       #
       # @return [String, nil] API key for the current provider

data/lib/skill_bench/config/json_loader.rb CHANGED

@@ -29,9 +29,9 @@ module SkillBench
         data = JSON.parse(File.read(@path), symbolize_names: true)
         return warn_invalid_config unless data.is_a?(Hash)
-        success(data.slice(:current_llm_provider, :max_execution_time, :allowed_commands)
-                    .compact
-                    .merge(providers: normalized_providers(data[:providers])))
+        success_data = data.slice(:current_llm_provider, :max_execution_time, :allowed_commands, :skill_sources).compact
+        success_data[:current_llm_provider] ||= data[:provider] if data.key?(:provider)
+        success(success_data.merge(providers: normalized_providers(data[:providers])))
       rescue JSON::ParserError => e
         log_parse_error(e)
         failure('Failed to parse config file')
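
Reviewer sketch for the loader hunk above: the new key selection and the `provider` alias can be exercised in isolation. The method name `select_config` and the constant `ALLOWED_KEYS` are hypothetical; the `slice`, `compact`, and `||=` steps mirror the added lines. Provider normalization and the success/failure wrappers are left out.

```ruby
require 'json'

# Keys the updated loader keeps, per the diff.
ALLOWED_KEYS = %i[current_llm_provider max_execution_time allowed_commands skill_sources].freeze

# Hypothetical helper covering only the selection step.
def select_config(raw_json)
  data = JSON.parse(raw_json, symbolize_names: true)
  return {} unless data.is_a?(Hash)

  selected = data.slice(*ALLOWED_KEYS).compact
  # `provider` acts as a fallback; an explicit current_llm_provider wins.
  selected[:current_llm_provider] ||= data[:provider] if data.key?(:provider)
  selected
end

raw = '{"provider": "openai", "model": "gpt-4o", "skill_sources": {"core": "../ruby-core-skills/skills"}}'
p select_config(raw)
```

In this selection step the top-level `model` key from the README example is dropped, since it is not in the sliced key list.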

data/lib/skill_bench/config/store.rb CHANGED

@@ -24,6 +24,11 @@ module SkillBench
       # @return [Hash, nil] provider configuration by provider name
       attr_accessor :llm_providers_config
+      # Returns skill sources mapping.
+      #
+      # @return [Hash, nil] skill source name → directory path
+      attr_accessor :skill_sources
       # Initializes a new configuration store with empty provider settings.
       def initialize
         @llm_providers_config = {}

data/lib/skill_bench/config.rb CHANGED

@@ -74,7 +74,9 @@ module SkillBench
         @store = Config::Store.new
         apply_defaults
         apply_json_config(home_config_path)
-        apply_json_config(Pathname.new(Dir.pwd).join(CONFIG_FILENAME))
+        local_path = Pathname.new(Dir.pwd).join(CONFIG_FILENAME)
+        is_workspace_file = File.exist?(File.join(Dir.pwd, 'ruby-skill-bench.gemspec'))
+        apply_json_config(local_path) unless defined?(Minitest) && is_workspace_file
         apply_env_overrides
       end
@@ -122,6 +124,13 @@ module SkillBench
         store.llm_providers_config || {}
       end
+      # Returns skill sources mapping.
+      #
+      # @return [Hash, nil] skill source name → directory path
+      def skill_sources
+        store.skill_sources || {}
+      end
       # Returns API key from configuration.
       #
       # @return [String, nil] API key

data/lib/skill_bench/delta_report.rb CHANGED

@@ -49,6 +49,26 @@ module SkillBench
       { success: false, response: { error: { message: e.message } } }
     end
+    # Compatibility methods for ComparisonReporter
+    # Returns the list of dimensions from the context run.
+    #
+    # @return [Array<Object>] List of objects responding to name and score
+    def dimensions
+      return [] unless context_dimensions
+      context_dimensions.map do |name, dim_hash|
+        Struct.new(:name, :score).new(name.to_s, dim_hash[:score] || dim_hash['score'])
+      end
+    end
+    # Returns the total context score.
+    #
+    # @return [Numeric, nil]
+    def total
+      context_total
+    end
     private
     attr_reader :baseline, :context

data/lib/skill_bench/execution/source_path_resolver.rb CHANGED

@@ -1,5 +1,7 @@
 # frozen_string_literal: true
+require 'pathname'
 module SkillBench
   module Execution
     # Resolves the source skill or workflow path for a given evaluation target.
@@ -8,6 +10,8 @@ module SkillBench
       #
       # @param eval_folder_path [String] Relative path to the eval directory.
       # @param skill_path [String, nil] Optional explicit override for the source directory.
+      # @param skill_sources [Hash] Optional skill source name → directory path mapping for fallback.
+      #   When provided and local resolution does not yield an existing path, each source is checked.
       # @return [String, nil] The resolved source path relative to the evaluator repo root, or nil if unmappable.
       # @example Infer a skill source path (NEW format):
       #   SkillBench::Execution::SourcePathResolver.call(
@@ -19,12 +23,57 @@ module SkillBench
       #     eval_folder_path: 'evals/skills/code-quality/rails-code-review/review-order'
       #   )
       #   # => "skills/code-quality/rails-code-review"
-      def self.call(eval_folder_path:, skill_path: nil)
+      def self.call(eval_folder_path:, skill_path: nil, skill_sources: {})
         return skill_path if skill_path && !skill_path.empty?
-        segments = eval_folder_path.to_s.split('/').reject(&:empty?)
+        segments = Pathname.new(eval_folder_path.to_s).each_filename.to_a
+        local = resolve_skills_path(segments) || resolve_workflows_path(segments)
+        unless local.nil? || skill_sources.empty?
+          skill_name = extract_skill_name(segments)
+          return local unless skill_name
+          return local if skill_exists_at?(local)
+          skill_sources.each_value do |source_path|
+            candidate = find_skill_in_source(source_path, skill_name)
+            return candidate if candidate
+          end
+        end
+        local
+      end
+      # Extracts the skill name from the eval path segments.
+      #
+      # @param segments [Array<String>] Path segments
+      # @return [String, nil] Skill name or nil
+      def self.extract_skill_name(segments)
+        index = segments.rindex('skills')
+        return nil unless index
+        remaining = segments[(index + 1)..]
+        return nil if remaining.empty?
-        resolve_skills_path(segments) || resolve_workflows_path(segments)
+        remaining[0]
+      end
+      # Finds a skill directory within a source path by name.
+      #
+      # @param source_path [String] Root directory containing skill categories
+      # @param skill_name [String] Name of the skill to find
+      # @return [String, nil] Path to the skill directory or nil
+      def self.find_skill_in_source(source_path, skill_name)
+        return nil unless source_path && Dir.exist?(source_path)
+        Dir.glob(File.join(source_path, '*')).each do |entry|
+          next unless Dir.exist?(entry)
+          candidate = File.join(entry, skill_name)
+          return candidate if Dir.exist?(candidate) && File.exist?(File.join(candidate, 'SKILL.md'))
+        end
+        nil
       end
       private_class_method def self.resolve_skills_path(segments)
@@ -55,6 +104,13 @@ module SkillBench
         workflow_name = segments[index + 1]
         "workflows/#{workflow_name}" if workflow_name
       end
+      private_class_method def self.skill_exists_at?(path)
+        return false unless path
+        full_path = path.end_with?('SKILL.md') ? path : File.join(path, 'SKILL.md')
+        File.exist?(full_path)
+      end
     end
   end
 end
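
Reviewer sketch for the resolver above: the fallback lookup expects a `<source>/<category>/<skill>/SKILL.md` layout and stops at the first source that contains the skill. `find_skill_in_source` follows the method in the diff; `first_match` and the temporary directory layout are hypothetical and exist only to make the example runnable.

```ruby
require 'tmpdir'
require 'fileutils'

# Mirrors the lookup in the diff: scan each category directory for <skill>/SKILL.md.
def find_skill_in_source(source_path, skill_name)
  return nil unless source_path && Dir.exist?(source_path)

  Dir.glob(File.join(source_path, '*')).each do |entry|
    next unless Dir.exist?(entry)

    candidate = File.join(entry, skill_name)
    return candidate if File.exist?(File.join(candidate, 'SKILL.md'))
  end
  nil
end

# Hypothetical wrapper: first source (in hash order) that has the skill wins.
def first_match(skill_sources, skill_name)
  skill_sources.each_value do |source_path|
    candidate = find_skill_in_source(source_path, skill_name)
    return candidate if candidate
  end
  nil
end

Dir.mktmpdir do |root|
  core = File.join(root, 'ruby-core-skills', 'skills')
  skill_dir = File.join(core, 'patterns', 'write-yard-docs')
  FileUtils.mkdir_p(skill_dir)
  File.write(File.join(skill_dir, 'SKILL.md'), "# write-yard-docs\n")

  sources = { 'rails' => File.join(root, 'missing'), 'core' => core }
  puts first_match(sources, 'write-yard-docs')
end
```

One detail worth checking when reading the diff: `extract_skill_name` returns the first segment after `skills`, so for a categorized eval path such as `evals/skills/code-quality/rails-code-review/review-order` the name passed to the lookup would be `code-quality` rather than `rails-code-review`.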