ruby-ai-gem-context 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: b0dc306be58bfc24b0b508a23cd444aba6682c041ca60ef46b8db016aafed892
4
+ data.tar.gz: 5ff91acacb8dcb6d4b4d137800b0fd04ec7da93a1600e8ac5d55c517b8cc9b8b
5
+ SHA512:
6
+ metadata.gz: 3fd5f1c0470fdd08b11f5a944d63b544e7234b3d80645d329263862794d77b24b4c464df59ddaedfcca93c1d014ae5551685125c2fb0b3cedd3e3a84da338a51
7
+ data.tar.gz: c4e22122c4e218ee4735309c57543a8bc75861992abe4a43f4eea7b148df38aae72414771a553aecc32237b7ee5f2137a5d4b5d28e19467087b2e76bfac71471
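The digests listed in `checksums.yaml` can be compared against the unpacked archive members. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted to disk; the helper name `checksum_matches?` is hypothetical and is not part of RubyGems.

```ruby
require "digest"

# Hypothetical helper: compares a file's SHA256 hex digest with an expected value.
# Whitespace and letter case in the expected value are normalized before comparing.
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end
```

A call such as `checksum_matches?("data.tar.gz", expected)` would take `expected` from the `SHA256` section above; swapping in `Digest::SHA512` would cover the second section.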
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --format documentation
2
+ --color
3
+ --require spec_helper
data/CHANGELOG.md ADDED
@@ -0,0 +1,81 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [Unreleased]
9
+
10
+ ## [0.4.0] - 2026-02-27
11
+
12
+ ### Added
13
+
14
+ - **Multi-platform support**: Generate context files for Claude, Cursor, Windsurf, Codex, and llm.txt
15
+ - **Interactive rake tasks** with colored terminal UI using `tty-prompt`:
16
+ - `rake ai_context:setup` - Scaffolds boilerplate context files for chosen platforms
17
+ - `rake ai_context:generate` - Fills context files with AI-generated content
18
+ - `rake ai_context:generate_for_gem[name]` - Generate context for third-party gems
19
+ - `rake ai_context:list` - List generated context files
20
+ - `rake ai_context:clear` - Remove generated files
21
+ - **RubyLLM integration** for AI generation with support for multiple providers (Anthropic, OpenAI, etc.)
22
+ - **Platform system** with dedicated classes for each AI tool:
23
+ - `Platforms::Claude` - CLAUDE.md
24
+ - `Platforms::Cursor` - .cursorrules
25
+ - `Platforms::Windsurf` - .windsurfrules
26
+ - `Platforms::Codex` - AGENTS.md
27
+ - `Platforms::LlmTxt` - llm.txt
28
+ - **ConfigFile** tracks generated files in `.ai_context/config.yml`
29
+ - **FileCollector** scans project files with configurable include/exclude patterns
30
+ - **Collision handling** - Backup, skip, or overwrite existing files
31
+ - **Model documentation** with guidance on choosing models for quality vs. cost
32
+ - **Review reminder** emphasizing the importance of manually reviewing AI output
33
+
34
+ ### Changed
35
+
36
+ - Complete architecture redesign focused on interactive rake tasks
37
+ - Replaced custom AI providers with RubyLLM gem
38
+ - Files now placed at native locations (project root) instead of managed subfolder
39
+ - Simplified configuration to focus on model selection and file patterns
40
+
41
+ ### Removed
42
+
43
+ - CLI binary (`exe/ruby_ai_gem_context`) - replaced by rake tasks
44
+ - Custom provider classes (Anthropic, OpenAI) - now uses RubyLLM
45
+ - Extractors (README, Code, Test) - simplified to FileCollector
46
+ - Cache system - regenerate as needed
47
+ - Scanner and Aggregator - no longer needed for new architecture
48
+ - AuthorContext and SkillBuilder - simplified approach
49
+
50
+ ## [0.3.0] - 2026-02-13
51
+
52
+ ### Added
53
+
54
+ - CLI binary with commands: `generate`, `init`, `list`, `clear`, `help`
55
+ - AI-powered generation using Anthropic or OpenAI providers
56
+ - AuthorContext detection from gems
57
+ - Generator with extractors
58
+ - SkillBuilder for installing to ~/.claude/skills/
59
+ - Cache system
60
+
61
+ ### Changed
62
+
63
+ - Architecture overhaul to full generation pipeline
64
+
65
+ ## [0.2.0] - 2026-02-12
66
+
67
+ ### Added
68
+
69
+ - Agent Skills Standard Support
70
+
71
+ ### Changed
72
+
73
+ - Default output path to `.claude/skills/`
74
+
75
+ ## [0.1.0] - 2026-02-03
76
+
77
+ ### Added
78
+
79
+ - Initial release with Scanner and Aggregator
80
+ - Rails integration via Railtie
81
+ - Basic rake tasks
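The 0.4.0 entry pairs each platform with a single file at the project root. A rough sketch of that lookup, assuming a plain hash keyed by symbol; the constant `PLATFORM_FILES` and the method `context_file_for` are illustrative names and do not reproduce the gem's `Platforms::*` classes.

```ruby
# Illustrative mapping, copied from the platform list in the changelog.
PLATFORM_FILES = {
  claude: "CLAUDE.md",
  cursor: ".cursorrules",
  windsurf: ".windsurfrules",
  codex: "AGENTS.md",
  llm_txt: "llm.txt"
}.freeze

# Hypothetical lookup: returns the context file path under project_root.
# Raises ArgumentError for a platform key that is not in the mapping.
def context_file_for(platform_key, project_root = Dir.pwd)
  filename = PLATFORM_FILES.fetch(platform_key.to_sym) do
    raise ArgumentError, "unknown platform: #{platform_key}"
  end
  File.join(project_root, filename)
end
```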
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 Alvaro Delgado
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,236 @@
1
+ # 🤖 ruby-ai-gem-context
2
+
3
+ **Make AI coding assistants actually understand your Ruby project.**
4
+
5
+ Stop explaining your codebase over and over. This gem generates context files that AI assistants read *before* they help you — so they already know your architecture, conventions, and patterns.
6
+
7
+ [![Gem Version](https://badge.fury.io/rb/ruby-ai-gem-context.svg)](https://rubygems.org/gems/ruby-ai-gem-context)
8
+ [![MIT License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE.txt)
9
+
10
+ ---
11
+
12
+ ## The Problem
13
+
14
+ Every time you start a conversation with an AI coding assistant, you waste time explaining:
15
+ - "We use service objects, not fat models"
16
+ - "Authentication is handled by Devise with custom strategies"
17
+ - "Don't suggest RSpec — we use Minitest"
18
+
19
+ **The solution**: Context files that AI assistants read automatically.
20
+
21
+ ## Supported Platforms
22
+
23
+ | Platform | File | Description |
24
+ |----------|------|-------------|
25
+ | Claude Code | `CLAUDE.md` | Project context for Claude AI |
26
+ | Cursor | `.cursorrules` | Rules and context for Cursor IDE |
27
+ | Windsurf | `.windsurfrules` | Rules for Windsurf IDE |
28
+ | OpenAI Codex/ChatGPT | `AGENTS.md` | Documentation for AI agents |
29
+ | llm.txt | `llm.txt` | Universal LLM context (like robots.txt for AI) |
30
+
31
+ ## Installation
32
+
33
+ Add to your Gemfile:
34
+
35
+ ```ruby
36
+ gem 'ruby-ai-gem-context'
37
+ ```
38
+
39
+ Or install directly:
40
+
41
+ ```bash
42
+ gem install ruby-ai-gem-context
43
+ ```
44
+
45
+ ## ⚡ Quick Start
46
+
47
+ ```bash
48
+ # 1. Setup: Create boilerplate files for your chosen platforms
49
+ rake ai_context:setup
50
+
51
+ # 2. Generate: Fill files with AI-generated content from your code
52
+ rake ai_context:generate
53
+
54
+ # 3. Review and edit the generated files!
55
+ ```
56
+
57
+ That's it. Your AI assistant now understands your project *before* you ask your first question.
58
+
59
+ ## 🛠️ Rake Tasks
60
+
61
+ ### `rake ai_context:setup`
62
+
63
+ Interactive wizard to create boilerplate context files:
64
+
65
+ ```
66
+ $ rake ai_context:setup
67
+
68
+ ══════════════════════════════════════════════════
69
+ AI Context Setup
70
+ ══════════════════════════════════════════════════
71
+
72
+ Which platforms do you want to generate context files for?
73
+ ◉ Claude Code (CLAUDE.md)
74
+ ◯ Cursor (.cursorrules)
75
+ ◉ Windsurf (.windsurfrules)
76
+ ◯ OpenAI Codex / ChatGPT (AGENTS.md)
77
+ ◯ llm.txt (llm.txt)
78
+
79
+ ✓ Created CLAUDE.md
80
+ ✓ Created .windsurfrules
81
+ ```
82
+
83
+ If a file already exists, you'll be asked whether to skip, backup, or overwrite it.
84
+
85
+ ### `rake ai_context:generate`
86
+
87
+ Fill your context files with AI-generated content:
88
+
89
+ ```
90
+ $ rake ai_context:generate
91
+
92
+ Where should we read source files from?
93
+ › Project root (scan everything)
94
+ Specific folders (you'll provide a list)
95
+
96
+ Will scan 47 files (23.5 KB)
97
+ File types: .rb: 35, .rake: 5, .md: 7
98
+
99
+ Using model: claude-sonnet-4-20250514
100
+
101
+ ⠋ Generating Claude Code context...
102
+ ✓ Generated CLAUDE.md
103
+
104
+ ⚠ IMPORTANT: Review the generated files carefully!
105
+ AI-generated context is a first draft. The quality of your future
106
+ AI interactions depends on the accuracy of these files.
107
+ ```
108
+
109
+ ### `rake ai_context:generate_for_gem[gem_name]`
110
+
111
+ Generate context for a third-party gem:
112
+
113
+ ```bash
114
+ rake ai_context:generate_for_gem[devise]
115
+ ```
116
+
117
+ This reads the installed gem's source and generates context files saved to `.ai_context/gems/devise/`.
118
+
119
+ ### Other Tasks
120
+
121
+ ```bash
122
+ rake ai_context:list # List generated context files
123
+ rake ai_context:clear # Remove generated files
124
+ ```
125
+
126
+ ## ⚙️ Configuration
127
+
128
+ Configure via Ruby:
129
+
130
+ ```ruby
131
+ RubyAiGemContext.configure do |config|
132
+ # AI model to use (see "Choosing a Model" below)
133
+ config.model = "claude-sonnet-4-20250514"
134
+
135
+ # Generation parameters
136
+ config.temperature = 0.3
137
+ config.max_tokens = 4000
138
+
139
+ # File patterns to scan
140
+ config.include_patterns = %w[**/*.rb README* CHANGELOG*]
141
+ config.exclude_patterns = %w[vendor/**/* spec/**/* test/**/*]
142
+ end
143
+ ```
144
+
145
+ Or in Rails (`config/application.rb`):
146
+
147
+ ```ruby
148
+ config.ruby_ai_gem_context.model = "gpt-4o"
149
+ ```
150
+
151
+ ## 🧠 Choosing a Model
152
+
153
+ The model you choose directly affects the quality of generated context:
154
+
155
+ | Model | Quality | Speed | Cost | Best For |
156
+ |-------|---------|-------|------|----------|
157
+ | `claude-sonnet-4-20250514` | High | Fast | Medium | **Recommended** - good balance |
158
+ | `claude-opus-4-20250514` | Highest | Slow | High | Complex codebases |
159
+ | `claude-haiku-4-20250514` | Good | Fastest | Low | Quick iterations |
160
+ | `gpt-4o` | High | Fast | Medium | OpenAI preference |
161
+ | `gpt-4o-mini` | Good | Fastest | Low | Budget-conscious |
162
+
163
+ **Important**: Quality matters! The context files help AI understand your codebase. Poor context leads to poor AI assistance. Consider using a higher-quality model and reviewing the output carefully.
164
+
165
+ ### Setting Your API Key
166
+
167
+ The gem uses [RubyLLM](https://github.com/crmne/ruby_llm) which auto-detects your API key from environment variables:
168
+
169
+ ```bash
170
+ # For Anthropic Claude models
171
+ export ANTHROPIC_API_KEY=sk-ant-...
172
+
173
+ # For OpenAI models
174
+ export OPENAI_API_KEY=sk-...
175
+ ```
176
+
177
+ ## ⚠️ Review Your Generated Files
178
+
179
+ **This is important**: AI-generated context is a first draft, not a finished product.
180
+
181
+ The quality of your future AI interactions depends on accurate context files. Take time to:
182
+
183
+ 1. **Read through each generated file** — Does it accurately describe your project?
184
+ 2. **Fix inaccuracies** — Remove wrong assumptions, correct misunderstandings
185
+ 3. **Add project-specific details** — Coding conventions, architecture decisions, gotchas
186
+ 4. **Remove generic content** — Replace boilerplate with specifics
187
+ 5. **Keep it updated** — Regenerate or manually update as your project evolves
188
+
189
+ > 💡 **Think of it as an investment**: 30 minutes reviewing context now saves hours of correcting AI mistakes later.
190
+
191
+ ## 🏗️ Architecture
192
+
193
+ ```
194
+ lib/ruby_ai_gem_context/
195
+ ├── configuration.rb # Global settings
196
+ ├── platform.rb # Base class for platforms
197
+ ├── platforms/
198
+ │ ├── claude.rb # CLAUDE.md
199
+ │ ├── cursor.rb # .cursorrules
200
+ │ ├── windsurf.rb # .windsurfrules
201
+ │ ├── codex.rb # AGENTS.md
202
+ │ └── llm_txt.rb # llm.txt
203
+ ├── config_file.rb # Track generated files in .ai_context/config.yml
204
+ ├── file_collector.rb # Scan and read project files
205
+ ├── generator.rb # AI generation via RubyLLM
206
+ ├── interactive.rb # Terminal UI with tty-prompt
207
+ ├── railtie.rb # Rails integration
208
+ └── tasks/
209
+ └── ruby_ai_gem_context.rake
210
+ ```
211
+
212
+ ## 📋 Requirements
213
+
214
+ - Ruby 3.0+
215
+ - Rails 7.0+ (optional, for Railtie integration)
216
+
217
+ ## 🧑‍💻 Development
218
+
219
+ ```bash
220
+ git clone https://github.com/AAlvAAro/ruby-ai-gem-context
221
+ cd ruby-ai-gem-context
222
+ bundle install
223
+ rake spec
224
+ ```
225
+
226
+ ## 🤝 Contributing
227
+
228
+ 1. Fork it
229
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
230
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
231
+ 4. Push to the branch (`git push origin my-new-feature`)
232
+ 5. Create a Pull Request
233
+
234
+ ## 📄 License
235
+
236
+ This gem is available as open source under the terms of the [MIT License](LICENSE.txt).
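The README states that an existing file triggers a choice between skip, backup, and overwrite. One way this could look in plain Ruby is sketched here; the method name `write_context_file`, the return symbols, and the `.bak` suffix are all assumptions made for illustration, not the gem's behavior.

```ruby
require "fileutils"

# Hypothetical helper; policy names mirror the README's skip/backup/overwrite options.
# Returns :created, :skipped, :backed_up, or :overwritten.
def write_context_file(path, content, on_collision: :skip)
  unless File.exist?(path)
    File.write(path, content)
    return :created
  end

  case on_collision
  when :skip
    :skipped
  when :backup
    FileUtils.cp(path, "#{path}.bak") # assumed backup naming
    File.write(path, content)
    :backed_up
  when :overwrite
    File.write(path, content)
    :overwritten
  else
    raise ArgumentError, "unknown policy: #{on_collision}"
  end
end
```

Defaulting to `:skip` keeps hand-edited context files safe unless the caller opts into replacing them.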
data/Rakefile ADDED
@@ -0,0 +1,59 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "rspec/core/rake_task"
5
+
6
+ RSpec::Core::RakeTask.new(:spec)
7
+
8
+ task default: :spec
9
+
10
+ # Load gem tasks for non-Rails usage
11
+ namespace :ruby_ai_gem_context do
12
+ require_relative "lib/ruby_ai_gem_context"
13
+
14
+ desc "Aggregate mcp_context/ from all gems into .ai_context/"
15
+ task :aggregate do
16
+ puts "Scanning gems for mcp_context/ folders..."
17
+
18
+ result = RubyAiGemContext::Aggregator.aggregate
19
+ gems = result["gems"]
20
+
21
+ if gems.empty?
22
+ puts "No gems with mcp_context/ folders found."
23
+ else
24
+ puts "\nAggregated context from #{gems.size} gem(s):"
25
+ gems.each do |name, info|
26
+ puts " - #{name} (#{info['version']}): #{info['files'].size} file(s)"
27
+ end
28
+ puts "\nContext written to: #{RubyAiGemContext.configuration.output_path}/"
29
+ end
30
+ end
31
+
32
+ desc "Remove the .ai_context/ folder"
33
+ task :clear do
34
+ output_path = RubyAiGemContext.configuration.output_path
35
+ if File.exist?(output_path)
36
+ RubyAiGemContext::Aggregator.clear
37
+ puts "Removed #{output_path}/"
38
+ else
39
+ puts "#{output_path}/ does not exist."
40
+ end
41
+ end
42
+
43
+ desc "List all gems that have mcp_context/ folders"
44
+ task :list do
45
+ puts "Scanning gems for mcp_context/ folders..."
46
+
47
+ gems = RubyAiGemContext::Scanner.list_gems_with_context
48
+
49
+ if gems.empty?
50
+ puts "No gems with mcp_context/ folders found."
51
+ else
52
+ puts "\nFound #{gems.size} gem(s) with mcp_context/:"
53
+ gems.each do |gem_info|
54
+ puts " - #{gem_info[:name]} (#{gem_info[:version]})"
55
+ puts " Path: #{gem_info[:path]}"
56
+ end
57
+ end
58
+ end
59
+ end
@@ -0,0 +1,97 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RubyAiGemContext
4
+ # Tracks generated context files in .ai_context/config.yml
5
+ #
6
+ # This helps us know:
7
+ # - Which files we generated vs user-created
8
+ # - When they were generated
9
+ # - Which model was used
10
+ class ConfigFile
11
+ CONFIG_FILENAME = "config.yml"
12
+
13
+ attr_reader :project_root
14
+
15
+ def initialize(project_root = Dir.pwd)
16
+ @project_root = project_root
17
+ end
18
+
19
+ def config_dir
20
+ File.join(project_root, RubyAiGemContext.configuration.config_dir)
21
+ end
22
+
23
+ def config_path
24
+ File.join(config_dir, CONFIG_FILENAME)
25
+ end
26
+
27
+ # Load existing config or return empty structure
28
+ def load
29
+ return default_config unless File.exist?(config_path)
30
+
31
+ YAML.safe_load(File.read(config_path), permitted_classes: [Time, Symbol]) || default_config
32
+ rescue StandardError
33
+ default_config
34
+ end
35
+
36
+ # Save config to file
37
+ def save(config)
38
+ FileUtils.mkdir_p(config_dir)
39
+ File.write(config_path, YAML.dump(config))
40
+ end
41
+
42
+ # Record that a file was generated
43
+ def record_generated(platform_key, model: nil)
44
+ config = load
45
+ config["generated"] ||= {}
46
+ config["generated"][platform_key.to_s] = {
47
+ "created_at" => Time.now.utc.iso8601,
48
+ "model" => model || RubyAiGemContext.configuration.model,
49
+ "path" => RubyAiGemContext.platform(platform_key).path
50
+ }
51
+ save(config)
52
+ end
53
+
54
+ # Check if we generated a specific platform's file
55
+ def generated?(platform_key)
56
+ config = load
57
+ config.dig("generated", platform_key.to_s).is_a?(Hash)
58
+ end
59
+
60
+ # Get info about a generated file
61
+ def generated_info(platform_key)
62
+ config = load
63
+ config.dig("generated", platform_key.to_s)
64
+ end
65
+
66
+ # List all platforms we've generated files for
67
+ def generated_platforms
68
+ config = load
69
+ (config["generated"] || {}).keys.map(&:to_sym)
70
+ end
71
+
72
+ # Record source folders used for generation
73
+ def record_source_folders(folders)
74
+ config = load
75
+ config["source_folders"] = folders
76
+ config["last_generated_at"] = Time.now.utc.iso8601
77
+ save(config)
78
+ end
79
+
80
+ # Get previously used source folders
81
+ def source_folders
82
+ config = load
83
+ config["source_folders"]
84
+ end
85
+
86
+ private
87
+
88
+ def default_config
89
+ {
90
+ "version" => RubyAiGemContext::VERSION,
91
+ "generated" => {},
92
+ "source_folders" => nil,
93
+ "last_generated_at" => nil
94
+ }
95
+ end
96
+ end
97
+ end
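The record/load cycle above can be tried in isolation. This is a minimal sketch under stated assumptions: `record_entry` is a hypothetical stand-in rather than the gem's `ConfigFile` class, and the `yaml`, `fileutils`, and `time` requires are written out because the excerpt does not show where the gem loads them (on older Rubies, `Time#iso8601` comes from the `time` standard library).

```ruby
require "fileutils"
require "time"
require "yaml"

# Hypothetical stand-in for ConfigFile#record_generated: merges one entry
# into the YAML file at config_path and returns the updated hash.
def record_entry(config_path, platform_key, model:, path:)
  config =
    if File.exist?(config_path)
      YAML.safe_load(File.read(config_path), permitted_classes: [Time, Symbol]) || {}
    else
      {}
    end

  config["generated"] ||= {}
  config["generated"][platform_key.to_s] = {
    "created_at" => Time.now.utc.iso8601,
    "model" => model,
    "path" => path
  }

  FileUtils.mkdir_p(File.dirname(config_path))
  File.write(config_path, YAML.dump(config))
  config
end
```

Keys are stored as strings, which is why the original reads them back with `dig("generated", platform_key.to_s)`.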
@@ -0,0 +1,61 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RubyAiGemContext
4
+ class Configuration
5
+ # RubyLLM model configuration
6
+ attr_accessor :model, :temperature, :max_tokens
7
+
8
+ # File patterns to include/exclude when scanning
9
+ attr_accessor :include_patterns, :exclude_patterns
10
+
11
+ # Config directory for tracking generated files
12
+ attr_accessor :config_dir
13
+
14
+ def initialize
15
+ # Default to Claude Sonnet - good balance of quality and cost
16
+ @model = "claude-sonnet-4-20250514"
17
+ @temperature = 0.3
18
+ @max_tokens = 4000
19
+
20
+ # Default file patterns
21
+ @include_patterns = %w[
22
+ **/*.rb
23
+ **/*.rake
24
+ **/*.gemspec
25
+ **/Gemfile
26
+ **/Rakefile
27
+ README*
28
+ CHANGELOG*
29
+ LICENSE*
30
+ ]
31
+
32
+ @exclude_patterns = %w[
33
+ vendor/**/*
34
+ node_modules/**/*
35
+ tmp/**/*
36
+ log/**/*
37
+ .git/**/*
38
+ coverage/**/*
39
+ spec/**/*
40
+ test/**/*
41
+ ]
42
+
43
+ @config_dir = ".ai_context"
44
+ end
45
+
46
+ # Supported models with descriptions for documentation
47
+ def self.supported_models
48
+ {
49
+ # Anthropic Claude models
50
+ "claude-sonnet-4-20250514" => "Claude Sonnet 4 - Recommended balance of quality and cost",
51
+ "claude-opus-4-20250514" => "Claude Opus 4 - Highest quality, higher cost",
52
+ "claude-haiku-4-20250514" => "Claude Haiku 4 - Fast and cheap, lower quality",
53
+
54
+ # OpenAI models
55
+ "gpt-4o" => "GPT-4o - OpenAI's flagship model",
56
+ "gpt-4o-mini" => "GPT-4o Mini - Faster and cheaper",
57
+ "gpt-4-turbo" => "GPT-4 Turbo - Previous generation"
58
+ }
59
+ end
60
+ end
61
+ end
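The README calls `RubyAiGemContext.configure` with a block, but the module-level methods are not part of this excerpt. A common way to wire such a block to a settings object is sketched below; `DemoContext`, `Settings`, and `reset_configuration!` are placeholder names, and the defaults are copied from the `Configuration#initialize` shown above.

```ruby
# Hypothetical module illustrating the configure-block pattern.
module DemoContext
  Settings = Struct.new(:model, :temperature, :max_tokens)

  class << self
    # Lazily builds one shared settings object with the default values.
    def configuration
      @configuration ||= Settings.new("claude-sonnet-4-20250514", 0.3, 4000)
    end

    # Yields the shared settings so callers can override individual values.
    def configure
      yield(configuration)
      configuration
    end

    # Drops any overrides; useful between test cases.
    def reset_configuration!
      @configuration = nil
    end
  end
end

DemoContext.configure do |config|
  config.model = "gpt-4o"
end
```

Memoizing the settings object means every later call to `DemoContext.configuration` sees the overrides made inside the block.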
@@ -0,0 +1,110 @@
1
+ # frozen_string_literal: true
2
+
3
+ module RubyAiGemContext
4
+ # Collects and reads files from the project for context generation
5
+ class FileCollector
6
+ attr_reader :root_path, :folders
7
+
8
+ # @param root_path [String] Project root directory
9
+ # @param folders [Array<String>, String, nil] Specific folders to scan, or nil for whole project
10
+ def initialize(root_path = Dir.pwd, folders: nil)
11
+ @root_path = root_path
12
+ @folders = normalize_folders(folders)
13
+ end
14
+
15
+ # Collect all relevant files based on configuration
16
+ # @return [Hash] Hash with file paths as keys and contents as values
17
+ def collect
18
+ files = {}
19
+
20
+ scan_paths.each do |scan_path|
21
+ collect_from_path(scan_path, files)
22
+ end
23
+
24
+ files
25
+ end
26
+
27
+ # Get a summary of what will be collected
28
+ # @return [Hash] Stats about files to be collected
29
+ def summary
30
+ files = collect
31
+ {
32
+ total_files: files.size,
33
+ total_size: files.values.sum(&:bytesize),
34
+ by_extension: count_by_extension(files)
35
+ }
36
+ end
37
+
38
+ private
39
+
40
+ def normalize_folders(folders)
41
+ case folders
42
+ when nil then nil
43
+ when String then [folders]
44
+ when Array then folders
45
+ else raise ArgumentError, "folders must be nil, String, or Array"
46
+ end
47
+ end
48
+
49
+ def scan_paths
50
+ if folders.nil?
51
+ [root_path]
52
+ else
53
+ folders.map { |f| File.join(root_path, f) }.select { |p| File.directory?(p) }
54
+ end
55
+ end
56
+
57
+ def collect_from_path(path, files)
58
+ include_patterns.each do |pattern|
59
+ Dir.glob(File.join(path, pattern)).each do |file_path|
60
+ next unless File.file?(file_path)
61
+ next if excluded?(file_path)
62
+ next if binary?(file_path)
63
+ next if too_large?(file_path)
64
+
65
+ relative_path = file_path.sub("#{root_path}/", "")
66
+ files[relative_path] = safe_read(file_path)
67
+ end
68
+ end
69
+ end
70
+
71
+ def include_patterns
72
+ RubyAiGemContext.configuration.include_patterns
73
+ end
74
+
75
+ def exclude_patterns
76
+ RubyAiGemContext.configuration.exclude_patterns
77
+ end
78
+
79
+ def excluded?(file_path)
80
+ relative = file_path.sub("#{root_path}/", "")
81
+ exclude_patterns.any? do |pattern|
82
+ File.fnmatch?(pattern, relative, File::FNM_PATHNAME)
83
+ end
84
+ end
85
+
86
+ def binary?(file_path)
87
+ # Simple binary check: look for null bytes in first 8KB
88
+ chunk = File.read(file_path, 8192) || ""
89
+ chunk.include?("\x00")
90
+ rescue StandardError
91
+ true
92
+ end
93
+
94
+ def too_large?(file_path, max_size: 100_000)
95
+ File.size(file_path) > max_size
96
+ rescue StandardError
97
+ true
98
+ end
99
+
100
+ def safe_read(file_path)
101
+ File.read(file_path, encoding: "UTF-8").scrub
102
+ rescue StandardError => e
103
+ "# Error reading file: #{e.message}"
104
+ end
105
+
106
+ def count_by_extension(files)
107
+ files.keys.group_by { |f| File.extname(f) }.transform_values(&:count)
108
+ end
109
+ end
110
+ end
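The exclusion check relies on `File.fnmatch?` with `File::FNM_PATHNAME`, under which `*` does not cross a `/` and only the `**/` form recurses into subdirectories. The sketch below shows the difference; `excluded_by?` is a hypothetical helper that mirrors `excluded?` without depending on the gem's configuration.

```ruby
# Hypothetical helper mirroring the exclusion check above.
def excluded_by?(patterns, relative_path)
  patterns.any? do |pattern|
    File.fnmatch?(pattern, relative_path, File::FNM_PATHNAME)
  end
end

# Matched: "**/" recurses, then "*" matches the file name.
excluded_by?(%w[vendor/**/*], "vendor/bundle/gems/foo.rb")

# Not matched: "**" without a trailing "/" behaves like "*" and stops at "/".
excluded_by?(%w[vendor/**], "vendor/bundle/gems/foo.rb")

# Not matched: the pattern is anchored at the top-level "spec" directory.
excluded_by?(%w[spec/**/*], "lib/spec_helper.rb")
```

This is why the default exclude patterns are written as `vendor/**/*` rather than `vendor/**`.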