better_translate 0.5.0 → 1.0.0.1
- checksums.yaml +4 -4
- data/.env.example +14 -0
- data/.rspec +3 -0
- data/.rubocop.yml +8 -0
- data/.yardopts +10 -0
- data/CHANGELOG.md +125 -114
- data/CLAUDE.md +385 -0
- data/README.md +656 -244
- data/RELEASE_NOTES_v1.0.0.md +240 -0
- data/Rakefile +7 -1
- data/Steepfile +29 -0
- data/docs/implementation/00-overview.md +220 -0
- data/docs/implementation/01-setup_dependencies.md +668 -0
- data/docs/implementation/02-error_handling.md +65 -0
- data/docs/implementation/03-core_components.md +457 -0
- data/docs/implementation/03.5-variable_preservation.md +509 -0
- data/docs/implementation/04-provider_architecture.md +571 -0
- data/docs/implementation/05-translation_logic.md +1065 -0
- data/docs/implementation/06-main_module_api.md +122 -0
- data/docs/implementation/07-direct_translation_helpers.md +582 -0
- data/docs/implementation/08-rails_integration.md +323 -0
- data/docs/implementation/09-testing_suite.md +228 -0
- data/docs/implementation/10-documentation_examples.md +150 -0
- data/docs/implementation/11-quality_security.md +65 -0
- data/docs/implementation/12-cli_standalone.md +698 -0
- data/exe/better_translate +9 -0
- data/lib/better_translate/cache.rb +126 -0
- data/lib/better_translate/cli.rb +304 -0
- data/lib/better_translate/configuration.rb +201 -0
- data/lib/better_translate/direct_translator.rb +131 -0
- data/lib/better_translate/errors.rb +101 -0
- data/lib/better_translate/progress_tracker.rb +157 -0
- data/lib/better_translate/provider_factory.rb +45 -0
- data/lib/better_translate/providers/anthropic_provider.rb +155 -0
- data/lib/better_translate/providers/base_http_provider.rb +239 -0
- data/lib/better_translate/providers/chatgpt_provider.rb +139 -44
- data/lib/better_translate/providers/gemini_provider.rb +124 -61
- data/lib/better_translate/railtie.rb +19 -0
- data/lib/better_translate/rate_limiter.rb +93 -0
- data/lib/better_translate/strategies/base_strategy.rb +58 -0
- data/lib/better_translate/strategies/batch_strategy.rb +56 -0
- data/lib/better_translate/strategies/deep_strategy.rb +45 -0
- data/lib/better_translate/strategies/strategy_selector.rb +43 -0
- data/lib/better_translate/translator.rb +119 -284
- data/lib/better_translate/utils/hash_flattener.rb +104 -0
- data/lib/better_translate/validator.rb +105 -0
- data/lib/better_translate/variable_extractor.rb +259 -0
- data/lib/better_translate/version.rb +2 -9
- data/lib/better_translate/yaml_handler.rb +168 -0
- data/lib/better_translate.rb +97 -73
- data/lib/generators/better_translate/analyze/USAGE +12 -0
- data/lib/generators/better_translate/analyze/analyze_generator.rb +95 -0
- data/lib/generators/better_translate/install/USAGE +13 -0
- data/lib/generators/better_translate/install/install_generator.rb +72 -0
- data/lib/generators/better_translate/install/templates/README +20 -0
- data/lib/generators/better_translate/install/templates/initializer.rb.tt +79 -0
- data/lib/generators/better_translate/translate/USAGE +13 -0
- data/lib/generators/better_translate/translate/translate_generator.rb +115 -0
- data/lib/tasks/better_translate.rake +136 -0
- data/regenerate_vcr.rb +47 -0
- data/sig/better_translate/cache.rbs +28 -0
- data/sig/better_translate/cli.rbs +24 -0
- data/sig/better_translate/configuration.rbs +83 -0
- data/sig/better_translate/direct_translator.rbs +18 -0
- data/sig/better_translate/errors.rbs +46 -0
- data/sig/better_translate/progress_tracker.rbs +29 -0
- data/sig/better_translate/provider_factory.rbs +8 -0
- data/sig/better_translate/providers/anthropic_provider.rbs +27 -0
- data/sig/better_translate/providers/base_http_provider.rbs +44 -0
- data/sig/better_translate/providers/chatgpt_provider.rbs +25 -0
- data/sig/better_translate/providers/gemini_provider.rbs +22 -0
- data/sig/better_translate/railtie.rbs +7 -0
- data/sig/better_translate/rate_limiter.rbs +20 -0
- data/sig/better_translate/strategies/base_strategy.rbs +19 -0
- data/sig/better_translate/strategies/batch_strategy.rbs +13 -0
- data/sig/better_translate/strategies/deep_strategy.rbs +11 -0
- data/sig/better_translate/strategies/strategy_selector.rbs +10 -0
- data/sig/better_translate/translator.rbs +24 -0
- data/sig/better_translate/utils/hash_flattener.rbs +14 -0
- data/sig/better_translate/validator.rbs +14 -0
- data/sig/better_translate/variable_extractor.rbs +40 -0
- data/sig/better_translate/version.rbs +4 -0
- data/sig/better_translate/yaml_handler.rbs +29 -0
- data/sig/better_translate.rbs +33 -2
- data/sig/faraday.rbs +22 -0
- data/sig/generators/better_translate/analyze/analyze_generator.rbs +18 -0
- data/sig/generators/better_translate/install/install_generator.rbs +14 -0
- data/sig/generators/better_translate/translate/translate_generator.rbs +10 -0
- data/sig/optparse.rbs +9 -0
- data/sig/psych.rbs +5 -0
- data/sig/rails.rbs +34 -0
- metadata +91 -203
- data/lib/better_translate/helper.rb +0 -83
- data/lib/better_translate/providers/base_provider.rb +0 -102
- data/lib/better_translate/service.rb +0 -144
- data/lib/better_translate/similarity_analyzer.rb +0 -218
- data/lib/better_translate/utils.rb +0 -55
- data/lib/better_translate/writer.rb +0 -75
- data/lib/generators/better_translate/analyze_generator.rb +0 -57
- data/lib/generators/better_translate/install_generator.rb +0 -14
- data/lib/generators/better_translate/templates/better_translate.rb +0 -56
- data/lib/generators/better_translate/translate_generator.rb +0 -84
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b02561def13795980279169998c3a28170f0c2b1c4c7e4ffc58ae61b20b98ff5
+  data.tar.gz: d44be305f231f038918bb8dbf722db94328d9d9e71f1bcdd8bbe32aefceeb2f1
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 98e5a8fb7c3103220b6b2102a2242a8bb4d103a8e3b2df24298e3fa4a3f8e75dae3b48ebbb578a59de84e7c547f3bc12cab71a77c789aa9ab347b7d424aea113
+  data.tar.gz: 6aeaafff6b3ff756c620e229d93582ab64a48dca322d21b32d5d38a408e5f6d085ef5ba280da141d0c65b02ca54c3b5559bda29e204aff81b0a6562fac0b860f
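A digest such as the `data.tar.gz` SHA256 value in `checksums.yaml` can be compared against a local file with Ruby's standard `Digest` library. This is a minimal sketch, assuming the archive has already been unpacked from the `.gem` file; the helper name `digest_matches?` and the path in the usage note are hypothetical.

```ruby
require "digest"

# Hypothetical helper: returns true when the SHA256 digest of the file at
# +path+ equals +expected_hex+ (comparison ignores hex letter case).
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

Usage would look like `digest_matches?("data.tar.gz", "d44be305...")`, with the full hex string taken from the checksums file.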
data/.env.example
ADDED
@@ -0,0 +1,14 @@
+# API Keys for Translation Providers
+# Copy this file to .env and replace with your actual API keys
+
+# OpenAI API Key (for ChatGPT provider)
+# Get your key from: https://platform.openai.com/api-keys
+OPENAI_API_KEY=your_openai_api_key_here
+
+# Google Gemini API Key
+# Get your key from: https://aistudio.google.com/app/apikey
+GEMINI_API_KEY=your_gemini_api_key_here
+
+# Anthropic API Key (for Claude provider)
+# Get your key from: https://console.anthropic.com/settings/keys
+ANTHROPIC_API_KEY=your_anthropic_api_key_here
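The variables in `.env.example` are meant to be read from the environment at runtime. The sketch below shows one way such a lookup could work; it is not the gem's own code. The `API_KEY_VARS` mapping, the `api_key_for` helper, and the placeholder check are hypothetical, and only the three variable names come from the file above.

```ruby
# Hypothetical mapping from a provider symbol to the variable names
# listed in .env.example.
API_KEY_VARS = {
  chatgpt: "OPENAI_API_KEY",
  gemini: "GEMINI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY"
}.freeze

# Matches the untouched placeholder values shipped in .env.example.
PLACEHOLDER = /\Ayour_.*_here\z/

# Returns the API key for +provider+, raising if the provider is unknown
# or the key is missing or still set to its placeholder.
# +env+ defaults to ENV but accepts any Hash-like object for testing.
def api_key_for(provider, env = ENV)
  var = API_KEY_VARS.fetch(provider) { raise ArgumentError, "unknown provider: #{provider}" }
  key = env[var].to_s.strip
  raise KeyError, "#{var} is not set" if key.empty? || key.match?(PLACEHOLDER)

  key
end
```

Failing fast on the placeholder value is a design choice for this sketch: it surfaces a forgotten `.env` edit before any request is sent.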
data/.rspec
ADDED
data/.rubocop.yml
ADDED
data/.yardopts
ADDED
data/CHANGELOG.md
CHANGED
@@ -5,127 +5,138 @@ All notable changes to BetterTranslate will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [0.
+## [1.0.0] - 2025-10-22
 
-###
-- Custom Translation Provider support:
-  - New generator: `rails generate better_translate:provider YourProviderName`
-  - Provider registration system via `BetterTranslate::Service.register_provider`
-  - Dynamic API key configuration for custom providers
-  - Comprehensive documentation and examples for creating custom providers
-  - Template files with implementation instructions
-- Enhanced Service class:
-  - Provider registry for managing custom providers
-  - Improved error handling with better error messages
-  - Method to list all available providers
-- Updated documentation:
-  - New section in README for custom providers
-  - Step-by-step guide for implementing custom translation services
-  - Example implementation for DeepL integration
-
-## [0.4.2] - 2025-03-11
-
-### Added
-- Comprehensive YARD-style documentation across the entire codebase:
-  - Added detailed class and method documentation for all core components
-  - Added parameter and return type documentation
-  - Added usage examples for key classes and methods
-  - Improved inline comments for complex logic
-
-### Changed
-- Improved code readability and maintainability through better documentation
-- Enhanced developer experience with clearer API documentation
-
-## [0.4.1] - 2025-03-11
-
-### Fixed
-- Improved RSpec tests to ensure greater reliability:
-  - Fixed stubs for the translation providers
-  - Improved handling of HTTP requests in tests
-  - Optimized temporary YAML files for similarity tests
-  - Resolved compatibility issues with WebMock
-
-### Changed
-- Replaced the specific stubbing approach with more flexible patterns
-- Improved the test structure for the SimilarityAnalyzer
-
-## [0.4.0] - 2025-03-11
-
-### Added
-- New Translation Similarity Analyzer:
-  - Identifies similar translations across language files
-  - Generates detailed JSON reports and human-readable summaries
-  - Uses Levenshtein distance for similarity calculation
-  - Configurable similarity threshold
-- New Rails generator: `rails generate better_translate:analyze`
-  - Analyzes all YAML files in the locales directory
-  - Provides immediate feedback in the console
-  - Generates comprehensive similarity reports
-
-## [0.3.1] - 2025-03-11
+### Complete Rewrite 🎉
 
-
-- Comprehensive RSpec test suite covering:
-  - Core translation functionality
-  - LRU cache implementation
-  - Provider selection and initialization
-  - Error handling
-  - Configuration management
-- Improved documentation with badges and testing information
-
-### Changed
-- Made `translate` method public in `Service` class for better testability
-- Reorganized README.md with better structure and modern layout
-
-## [0.3.0] - 2025-03-11
-
-### Added
-- New translation helper methods:
-  - `translate_text_to_languages`: Translate single text to multiple languages
-  - `translate_texts_to_languages`: Translate multiple texts to multiple languages
-- LRU caching for improved performance
-
-### Changed
-- Enhanced error handling in translation providers
-- Improved method documentation
+This version represents a complete architectural rewrite of BetterTranslate with improved design, better testing, and enhanced features.
 
-
+**Note:** This release is not backward compatible with versions 0.x.x
 
 ### Added
-- Two-step filtering process:
-  - Global exclusions using `global_exclusions`
-  - Language-specific exclusions using `exclusions_per_language`
 
-
--
--
-
-
--
-
-
-
-
--
--
+#### Core Infrastructure
+- **Configuration System**: Type-safe configuration with comprehensive validation
+  - Support for multiple providers (ChatGPT, Gemini, Anthropic)
+  - Configurable timeouts, retries, and concurrency
+  - Translation modes: override and incremental
+  - Optional translation context for domain-specific terminology
+
+- **LRU Cache**: Intelligent caching system for improved performance
+  - Configurable capacity (default: 1000 items)
+  - Optional TTL (Time To Live) support
+  - Thread-safe with Mutex protection
+  - Cache key format: `"#{text}:#{target_lang_code}"`
+
+- **Rate Limiter**: Thread-safe request throttling
+  - Configurable delay between requests (default: 0.5s)
+  - Prevents API overload and rate limit errors
+  - Mutex-based synchronization
+
+- **Validator**: Comprehensive input validation
+  - Language code validation (2-letter ISO codes)
+  - Text validation for translation
+  - File path validation
+  - API key validation
+
+- **Error Handling**: Custom exception hierarchy
+  - `ConfigurationError`: Configuration issues
+  - `ValidationError`: Input validation failures
+  - `TranslationError`: Translation failures
+  - `ProviderError`: Provider-specific errors
+  - `ApiError`: API call failures
+  - `RateLimitError`: Rate limit exceeded
+  - `FileError`: File operation failures
+  - `YamlError`: YAML parsing errors
+  - `ProviderNotFoundError`: Unknown provider
+
+#### Provider Architecture
+- **BaseHttpProvider**: Abstract base class for HTTP-based providers
+  - Faraday-based HTTP client (required for all providers)
+  - Retry logic with exponential backoff (3 attempts, 2s base delay, 60s max)
+  - Built-in rate limiting (0.5s between requests)
+  - Configurable timeouts (default: 30s)
+
+- **ChatGPT Provider**: OpenAI GPT integration
+  - Model: GPT-5-nano
+  - Temperature: 1.0
+
+- **Gemini Provider**: Google Gemini integration
+  - Model: gemini-2.0-flash-exp
+
+- **Anthropic Provider**: Claude integration (planned)
+  - Model: claude-3-5-sonnet-20241022
+
+#### Translation Features
+- **Translation Strategies**: Automatic strategy selection based on content size
+  - Deep Translation (< 50 strings): Individual translation with detailed progress
+  - Batch Translation (≥ 50 strings): Processes in batches of 10 for performance
+
+- **Exclusion System**: Two-tier exclusion mechanism
+  - Global exclusions: Apply to all target languages (e.g., brand names)
+  - Language-specific exclusions: Exclude keys only for specific languages
+
+- **Translation Modes**:
+  - Override mode: Replaces entire target YAML files
+  - Incremental mode: Merges with existing files, only translates missing keys
+
+- **Translation Context**: Domain-specific context for improved accuracy
+  - Medical terminology
+  - Legal terminology
+  - Financial terminology
+  - E-commerce
+  - Technical documentation
+
+#### Rails Integration
+- **Install Generator**: `rails generate better_translate:install`
+  - Creates initializer with example configuration
+  - Configures all supported providers
+
+- **Translate Generator**: `rails generate better_translate:translate`
+  - Runs translation process
   - Displays progress messages
   - Integrates with existing configuration
 
-
+- **Analyze Generator**: `rails generate better_translate:analyze`
+  - Analyzes translation similarities using Levenshtein distance
+  - Generates detailed JSON reports
+  - Provides human-readable summaries
+  - Configurable similarity threshold
 
-
--
--
--
--
-
-
-
--
-
-
--
-
--
--
--
+#### Utilities
+- **HashFlattener**: Converts nested YAML to flat structure and vice versa
+  - Flatten with dot-notation keys
+  - Unflatten back to nested structure
+  - Preserves data types and structure
+
+### Development
+- **YARD Documentation**: Comprehensive documentation for all public APIs
+  - `@param` with types
+  - `@return` with types
+  - `@raise` for exceptions
+  - `@example` blocks
+
+- **RSpec Test Suite**: Full test coverage for core components
+  - Configuration tests
+  - Cache tests
+  - Rate limiter tests
+  - Validator tests
+  - Error handling tests
+  - Hash flattener tests
+
+- **RuboCop**: Code style compliance
+  - Ruby 3.0+ target
+  - Frozen string literals required
+  - Double quotes for strings
+
+### Security
+- Environment variable-based API key management
+- No hardcoded credentials
+- Input validation for all user-provided data
+- VCR cassettes with automatic API key anonymization
+
+### Performance
+- LRU caching reduces API costs
+- Batch processing for large files (≥50 strings)
+- Configurable concurrent requests (default: 3)
+- Rate limiting prevents API overload
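The changelog describes `HashFlattener` as converting nested YAML data to dot-notation keys and back. The sketch below illustrates that idea only; it is not the gem's implementation. The method names `flatten_hash` and `unflatten_hash` are hypothetical, and the sketch assumes string keys that contain no dots.

```ruby
# Flattens a nested hash into a single-level hash with dot-notation keys.
# Non-hash values (strings, numbers, arrays) are copied unchanged.
def flatten_hash(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    full_key = prefix ? "#{prefix}.#{key}" : key.to_s
    if value.is_a?(Hash)
      flat.merge!(flatten_hash(value, full_key))
    else
      flat[full_key] = value
    end
  end
end

# Rebuilds the nested hash from dot-notation keys.
def unflatten_hash(flat)
  flat.each_with_object({}) do |(dotted_key, value), nested|
    *parents, leaf = dotted_key.split(".")
    # Walk (and create) intermediate hashes, then assign the leaf value.
    target = parents.reduce(nested) { |node, part| node[part] ||= {} }
    target[leaf] = value
  end
end

nested = { "en" => { "greeting" => { "hello" => "Hello" }, "count" => 3 } }
flat = flatten_hash(nested)
# flat is { "en.greeting.hello" => "Hello", "en.count" => 3 }
```

A flat key space makes it straightforward to translate each string independently and to diff keys between locale files; one limitation of this sketch is that empty nested hashes disappear during flattening.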
data/CLAUDE.md
ADDED
@@ -0,0 +1,385 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Development Commands
+
+### Testing
+```bash
+# Run all tests (unit + integration)
+bundle exec rake spec
+# or
+bundle exec rspec
+
+# Run only unit tests (fast, no API calls)
+bundle exec rspec spec/better_translate/
+
+# Run only integration tests (with real API calls via VCR)
+bundle exec rspec spec/integration/ --tag integration
+
+# Run specific test file
+bundle exec rspec spec/better_translate_spec.rb
+
+# Run with specific example (line number)
+bundle exec rspec spec/better_translate_spec.rb:42
+```
+
+### VCR Cassettes & API Testing
+
+**Setup API Keys**:
+1. Copy `.env.example` to `.env`:
+   ```bash
+   cp .env.example .env
+   ```
+2. Edit `.env` and add your real API keys:
+   ```env
+   OPENAI_API_KEY=sk-...
+   GEMINI_API_KEY=...
+   ANTHROPIC_API_KEY=sk-ant-...
+   ```
+3. **IMPORTANT**: Never commit `.env` file (already in `.gitignore`)
+
+**VCR Cassette Modes**:
+- `:once` (default): Replay existing cassettes; record only when no cassette file exists
+- `:new_episodes`: Record new interactions, keep existing ones
+- `:all`: Re-record all cassettes (use when API changes)
+
+**Re-recording Cassettes**:
+```bash
+# Delete existing cassettes and re-record with real API calls
+rm -rf spec/vcr_cassettes/
+bundle exec rspec spec/integration/ --tag integration
+
+# Re-record specific provider
+rm -rf spec/vcr_cassettes/chatgpt/
+bundle exec rspec spec/integration/chatgpt_integration_spec.rb
+```
+
+**Cassette Location**: `spec/vcr_cassettes/`
+- Cassettes are automatically anonymized (API keys replaced with placeholders)
+- Cassettes should be committed to git for CI/CD pipelines
+- Tests run without API keys when cassettes exist
+
+### Code Quality
+```bash
+# Run RuboCop linter
+bundle exec rake rubocop
+# or
+bundle exec rubocop
+
+# Auto-fix RuboCop violations
+bundle exec rubocop -a
+
+# Run type checking with Steep
+bundle exec rake steep
+# or
+bundle exec steep check
+
+# Run default rake task (runs spec, rubocop, and steep)
+bundle exec rake
+```
+
+### Documentation
+```bash
+# Generate YARD documentation
+bundle exec yard doc
+
+# Start YARD server (view docs at http://localhost:8808)
+bundle exec yard server
+
+# Check documentation coverage
+bundle exec yard stats
+```
+
+### Security
+```bash
+# Check for security vulnerabilities in dependencies
+bundle exec bundler-audit check --update
+```
+
+### Type Checking (RBS/Steep)
+```bash
+# Run type checking
+bundle exec steep check
+
+# Type check specific files
+bundle exec steep check lib/better_translate/cache.rb
+
+# Show statistics
+bundle exec steep stats
+
+# Validate RBS syntax only
+bundle exec rbs validate
+```
+
+**RBS Files**: Type signatures are in `sig/` directory
+- All public APIs have RBS signatures
+- Steep is integrated in CI/CD pipeline
+- Default rake task includes type checking
+
+**Status**: 51 type errors remaining (down from 112 initial)
+- Most errors are related to empty collection annotations
+- All critical paths are type-checked
+- Continuous improvement in progress
+
+### Gem Management
+```bash
+# Install dependencies
+bundle install
+
+# Install gem locally for testing
+bundle exec rake install
+
+# Interactive console with gem loaded
+bin/console
+```
+
+## Architecture Overview
+
+### Provider-Based System
+The gem uses a provider architecture to support multiple AI translation services:
+
+- **BaseHttpProvider**: Abstract base class for all HTTP-based providers
+  - Uses Faraday for all HTTP connections (REQUIRED - do not use Net::HTTP or other libraries)
+  - Implements retry logic with exponential backoff (3 attempts, 2s base delay, 60s max)
+  - Handles rate limiting (0.5s between requests, thread-safe with Mutex)
+  - Configurable timeouts (default: 30s)
+
+- **Providers**:
+  - ChatGPT (OpenAI): GPT-5-nano model, temperature=1.0
+  - Google Gemini: gemini-2.0-flash-exp model
+  - Anthropic Claude: Planned support
+
+### Translation Strategies
+The gem automatically selects the optimal strategy based on content size:
+
+- **Deep Translation** (< 50 strings): Individual translation with detailed progress
+- **Batch Translation** (≥ 50 strings): Processes in batches of 10 for performance
+
+### Configuration System
+Type-safe `Configuration` class with mandatory validation:
+- Required: provider, API keys, source language, target languages, file paths
+- Optional: translation mode (override/incremental), context, caching, rate limiting
+- Validation enforced via `config.validate!` before translation
+
+### Caching System
+LRU cache implementation:
+- Default capacity: 1000 items (configurable)
+- Cache key format: `"#{text}:#{target_lang_code}"`
+- Optional TTL support
+- Thread-safe with Mutex protection
+- Toggleable via `cache_enabled` config
+
+### Exclusion System
+Two-tier exclusion mechanism:
+- **Global exclusions**: Apply to all target languages (e.g., brand names)
+- **Language-specific exclusions**: Exclude keys only for specific languages (e.g., legal text that was manually translated)
+
+### Translation Modes
+- **Override**: Replaces entire target YAML files
+- **Incremental**: Merges with existing files, only translates missing keys
+
+## Rails Integration
+
+The gem provides three generators for Rails applications:
+
+```bash
+# Generate initializer with example configuration
+rails generate better_translate:install
+
+# Run translation process
+rails generate better_translate:translate
+
+# Analyze translation similarities (Levenshtein distance)
+rails generate better_translate:analyze
+```
+
+Configuration is typically done in `config/initializers/better_translate.rb`.
+
+## Development Requirements
+
+### YARD Documentation (MANDATORY)
+ALL public methods, classes, and modules must have comprehensive YARD documentation:
+
+- Use `@param` for parameters with types (e.g., `@param text [String]`)
+- Use `@return` for return values with types
+- Use `@raise` for exceptions
+- Provide `@example` blocks for public APIs
+- Mark private methods with `@api private`
+
+Example:
+```ruby
+# Translates text to a target language
+#
+# @param text [String] The text to translate
+# @param target_lang_code [String] Language code (e.g., "it", "fr")
+# @return [String] The translated text
+# @raise [ValidationError] If input is invalid
+# @raise [TranslationError] If translation fails
+#
+# @example
+# translate("Hello", "it") #=> "Ciao"
+def translate(text, target_lang_code)
+  # ...
+end
+```
+
+### HTTP Client (MANDATORY)
+- Use Faraday for ALL HTTP connections
+- Do NOT use Net::HTTP, HTTParty, or other HTTP libraries
+- Implement retry logic and error handling as shown in BaseHttpProvider
+
+### Code Style
+- RuboCop compliance required before commits
+- String literals: Use double quotes (enforced by RuboCop)
+- Target Ruby version: 3.0+
+- Frozen string literals: Required at top of all files
+
+### Security
+- NEVER hardcode API keys in code
+- Use environment variables: `ENV['OPENAI_API_KEY']`, `ENV['GEMINI_API_KEY']`
+- VCR cassettes must anonymize API keys automatically
+- Input validation required for all user-provided data (language codes, file paths, text)
+
+## Error Handling
+
+Custom exception hierarchy (all inherit from `BetterTranslate::Error`):
+- `ConfigurationError`: Configuration issues
+- `ValidationError`: Input validation failures
+- `TranslationError`: Translation failures
+- `ProviderError`: Provider-specific errors
+- `ApiError`: API call failures
+- `RateLimitError`: Rate limit exceeded
+- `FileError`: File operation failures
+- `YamlError`: YAML parsing errors
+- `ProviderNotFoundError`: Unknown provider
+
+All errors include detailed messages and context hash for debugging.
+
+## Testing Practices
+
+### Test-Driven Development (TDD) - MANDATORY
+**ALWAYS write tests BEFORE implementing any new feature or bug fix.**
+
+This is a strict requirement for all development in this project. Follow the Red-Green-Refactor cycle:
+
+#### TDD Workflow (REQUIRED)
+1. **RED**: Write failing tests first
+   - Write RSpec tests that describe the desired behavior
+   - Run tests and verify they fail: `bundle exec rspec`
+   - Failing tests prove that the test is valid and catches the missing functionality
+
+2. **GREEN**: Implement minimum code to pass tests
+   - Write the simplest implementation that makes tests pass
+   - Run tests again and verify they pass: `bundle exec rspec`
+   - DO NOT add extra features beyond what tests require
+
+3. **REFACTOR**: Clean up code while keeping tests green
+   - Improve code quality, remove duplication
+   - Run tests after each refactoring to ensure nothing breaks
+   - Update documentation (YARD) as needed
+
+#### Example TDD Workflow
+```bash
+# 1. RED - Write failing test
+# Edit spec/providers/new_provider_spec.rb with test cases
+bundle exec rspec spec/providers/new_provider_spec.rb
+# => Should see failures (RED)
+
+# 2. GREEN - Implement feature
+# Edit lib/better_translate/providers/new_provider.rb
+bundle exec rspec spec/providers/new_provider_spec.rb
+# => Should see passing tests (GREEN)
+
+# 3. REFACTOR - Improve code
+# Refactor implementation while keeping tests green
+bundle exec rspec spec/providers/new_provider_spec.rb
+# => Should still see passing tests (GREEN)
+```
+
+#### Why TDD is Mandatory
+- Ensures all code is testable by design
+- Prevents regression bugs
+- Provides living documentation of expected behavior
+- Catches edge cases early
+- Makes refactoring safer
+
+#### Exceptions
+The ONLY acceptable exception to writing tests first is for critical production hotfixes where immediate deployment is required. In such cases:
+- Document the technical debt in code comments
+- Create a GitHub issue to add tests
+- Add tests within 24 hours of the hotfix
+
+### RSpec Setup
+
+**Test Organization**:
+- `spec/better_translate/`: Unit tests with WebMock stubs (fast, no API calls)
+- `spec/integration/`: Integration tests with VCR cassettes (real API interactions)
+
+**Unit Tests** (WebMock):
+- Fast execution, no API keys required
+- Test code structure, request formatting, and error handling
+- Use `stub_request` to mock HTTP responses
+- Example: `spec/better_translate/providers/chatgpt_provider_spec.rb`
+
+**Integration Tests** (VCR):
+- Test real API interactions
+- Require API keys in `.env` file for first run (to record cassettes)
+- Subsequent runs use recorded cassettes (no API keys needed)
+- Tag with `:integration` and `:vcr`
+- Example: `spec/integration/chatgpt_integration_spec.rb`
+
+**Running Tests**:
+```bash
+# Unit tests only (fast, recommended for TDD)
+bundle exec rspec spec/better_translate/
+
+# Integration tests only (slower, validates API compatibility)
+bundle exec rspec spec/integration/ --tag integration
+
+# All tests
+bundle exec rspec
+```
+
+### VCR Configuration Details
+
+VCR is configured in `spec_helper.rb` with:
+- **Cassette library**: `spec/vcr_cassettes/`
+- **Record mode**: `:once` (replay existing cassettes; record only when none exists)
+- **API key filtering**: Automatically replaces keys with `<OPENAI_API_KEY>`, etc.
+- **Match on**: HTTP method, URI, and request body
+
+**When to Re-record Cassettes**:
+1. API response format changes
+2. Adding new test scenarios
+3. Provider updates model or endpoint
+4. Testing error conditions
+
+**Cassette Workflow**:
+1. First run: Needs real API keys, records responses
+2. Subsequent runs: Uses cassettes, no API calls
+3. CI/CD: Uses committed cassettes, no secrets needed
+
+## Translation Context Feature
+
+The `translation_context` configuration allows providing domain-specific context to improve translation accuracy:
+
+```ruby
+config.translation_context = "Medical terminology for healthcare applications"
+```
+
+This context is included in the AI system prompt, helping with specialized terminology in fields like:
+- Medical/Healthcare
+- Legal
+- Financial
+- E-commerce
+- Technical documentation
+
+## Performance Considerations
+
+- Enable caching for repeated translations to reduce API costs
+- Use incremental mode to preserve manual corrections
+- Monitor API usage through provider dashboards
+- Batch processing automatically used for large files (≥50 strings)
+- Rate limiting prevents API overload (configurable, default 0.5s between requests)
+- Concurrent requests configurable via `max_concurrent_requests` (default: 3)
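The caching behaviour that CLAUDE.md describes (LRU eviction, a default capacity of 1000, optional TTL, Mutex protection, and `"#{text}:#{target_lang_code}"` keys) can be sketched as below. This illustrates the technique and is not the contents of `lib/better_translate/cache.rb`; the class name `LruCache` and its `get`/`set`/`size` methods are hypothetical.

```ruby
# Minimal thread-safe LRU cache with an optional time-to-live (in seconds).
# Ruby hashes keep insertion order, so the first key is always the least
# recently used entry as long as reads re-insert the key they touch.
class LruCache
  def initialize(capacity: 1000, ttl: nil)
    @capacity = capacity
    @ttl = ttl
    @store = {}
    @mutex = Mutex.new
  end

  # Returns the cached value, or nil when the key is absent or expired.
  def get(key)
    @mutex.synchronize do
      entry = @store.delete(key)
      return nil if entry.nil? || expired?(entry) # expired entries stay deleted

      @store[key] = entry # re-insert to mark as most recently used
      entry[:value]
    end
  end

  # Stores a value, evicting the least recently used entry when full.
  def set(key, value)
    @mutex.synchronize do
      @store.delete(key)
      @store.shift if @store.size >= @capacity
      @store[key] = { value: value, at: now }
      value
    end
  end

  def size
    @mutex.synchronize { @store.size }
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  def expired?(entry)
    !@ttl.nil? && (now - entry[:at]) > @ttl
  end
end

cache = LruCache.new(capacity: 2)
cache.set("Hello:it", "Ciao")
cache.set("Goodbye:it", "Arrivederci")
cache.get("Hello:it")            # touch "Hello:it" so it becomes most recent
cache.set("Thanks:it", "Grazie") # evicts "Goodbye:it", the least recently used
```

Using a monotonic clock for TTL checks keeps expiry independent of wall-clock adjustments, which is a choice made for this sketch rather than something the source specifies.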