honyaku 0.1.0
- checksums.yaml +7 -0
- data/LICENSE.txt +21 -0
- data/README.md +169 -0
- data/exe/honyaku +5 -0
- data/lib/honyaku/cli.rb +323 -0
- data/lib/honyaku/translator.rb +241 -0
- data/lib/honyaku/version.rb +5 -0
- data/lib/honyaku.rb +10 -0
- metadata +95 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: c4deb596d588ecdc949ec04c2c139bdd0c8d886c365aa9e5f980f21ffd0b986a
  data.tar.gz: 42749082904b0c7cfd8ce08b038c776a7e3c57cf22b37878a1e3145cf5878c7a
SHA512:
  metadata.gz: 0bab4819f367a8144255b85f6b1c5aa0b23b132d4dc02fe6cb2c7bad6bfbdc09f6904c833ee37849ad739275d4c03f6ee65fa7917de1e95c5c249ef562d82d28
  data.tar.gz: 8178c8117039fcb339476be232b109695d787a64346af4d64eb8bf3375c735494e563f1a44871acd01b534822de3cf9111b93b79a014100d9eab1033dc0ffba0
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2025 Andrew Culver

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,169 @@
# Honyaku 翻訳

A Ruby gem for quickly, reliably, and accurately translating your Rails application using OpenAI. It replaced a $34K/year SaaS contract and streamlined our deploy process.

Honyaku was built using [Cursor Composer](https://docs.cursor.com/composer) with [claude-3.5-sonnet](https://www.anthropic.com/news/claude-35-sonnet), prompted by [Andrew Culver](https://x.com/andrewculver) at [ClickFunnels](https://www.clickfunnels.com).

## Features

- Uses GPT-4 for high-quality translations (GPT-3.5-turbo optional for faster processing)
- Preserves YAML structure, references, and interpolation variables
- Supports translation rules via `.honyakurules` files
- Handles large files through automatic chunking
- Automatically fixes YAML formatting issues introduced by the model
- Supports backup creation before modifications
- Smart file skipping to avoid unnecessary retranslation

## Example Output

```
$ honyaku translate ja --path config/locales/en/affiliates
Found 2 translation rule file(s):
  /Users/andrewculver/Sites/admin/.honyakurules
  /Users/andrewculver/Sites/admin/.honyakurules.ja
Translating from en to ja...
Processing files in config/locales/en/affiliates...
Processing config/locales/en/affiliates/active_referrals_report.en.yml...
📦 Splitting file into 3 chunks...
Translating chunk 1 of 3...
Translating chunk 2 of 3...
Translating chunk 3 of 3...
✨ Created config/locales/ja/affiliates/active_referrals_report.ja.yml
🔧 Checking for YAML issues...
✅ No more YAML errors found
Processing config/locales/en/affiliates/add_tag_actions.en.yml...
✨ Created config/locales/ja/affiliates/add_tag_actions.ja.yml
🔧 Checking for YAML issues...
🔧 Found YAML error on line 5: (<unknown>): found character that cannot start any token while scanning for the next token at line 5 column 13
  zero: %{count}アフィリエイトにコミッションプランアクションを追加する
🔧 Found YAML error on line 6: (<unknown>): found character that cannot start any token while scanning for the next token at line 6 column 12
  one: %{count}アフィリエイトにコミッションプランアクションを追加する
🔧 Found YAML error on line 7: (<unknown>): found character that cannot start any token while scanning for the next token at line 7 column 14
  other: %{count}アフィリエイトにコミッションプランアクションを追加する
✅ No more YAML errors found
✨ Fixed YAML formatting issues
⏭️ Skipping config/locales/en/affiliates/applied_tags.en.yml - translation is up to date
⏭️ Skipping config/locales/en/affiliates/approve_actions.en.yml - translation is up to date
...
```

## Installation

Add to your Gemfile:
```ruby
gem 'honyaku'
```

Or install directly:
```bash
gem install honyaku
```

## Configuration

Set your OpenAI API key:
```bash
export OPENAI_API_KEY=your-api-key
```

If `OPENAI_API_KEY` is already set for another purpose and you want Honyaku to use a different key, set `HONYAKU_OPENAI_API_KEY` and it will be used instead:
```bash
export HONYAKU_OPENAI_API_KEY=your-api-key
```
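
The lookup order described above can be sketched in a few lines. This is a minimal illustration, not the gem's public API: `resolve_api_key` is a hypothetical helper name, and it takes the environment as an argument only so the behavior is easy to try out.

```ruby
# Hypothetical helper: prefer the Honyaku-specific key and fall back to the
# general OpenAI key. Returns nil when neither variable is set.
def resolve_api_key(env = ENV)
  env["HONYAKU_OPENAI_API_KEY"] || env["OPENAI_API_KEY"]
end

key = resolve_api_key
puts(key ? "API key found" : "No API key set")
```

With both variables exported, the Honyaku-specific value wins, so the key used by other tools can stay untouched.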

## Usage

### Basic Translation

```bash
# Translate a file
honyaku translate ja --path config/locales/en.yml

# Translate a directory
honyaku translate es --path config/locales

# Create backups before modifying
honyaku translate ja --backup --path config/locales/en.yml

# Use GPT-3.5-turbo for faster processing
honyaku translate fr --model gpt-3.5-turbo --path config/locales/en.yml

# Force retranslation of files even if they're up to date
honyaku translate ja --force --path config/locales/en.yml
```

### Smart File Skipping

Honyaku tracks file modification times to avoid unnecessary retranslation:

- Checks both git history and filesystem timestamps
- Uses the newer of the two dates for comparison
- Skips translation if target file is newer than source
- Shows "⏭️ Skipping" message for up-to-date files

You can override this behavior with `--force` to retranslate all files regardless of their timestamps.
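
A rough sketch of the skip decision, assuming only filesystem timestamps are consulted (the git-history check is left out for brevity). The helper name `up_to_date?` and its `force:` keyword are illustrative, not part of Honyaku's API.

```ruby
require "tmpdir"

# Hypothetical helper: a target counts as up to date when it exists and is
# strictly newer than its source. Passing force: true always retranslates.
def up_to_date?(source_path, target_path, force: false)
  return false if force || !File.exist?(target_path)

  File.mtime(target_path) > File.mtime(source_path)
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "en.yml")
  target = File.join(dir, "ja.yml")
  File.write(source, "en:\n  hello: Hello\n")
  File.write(target, "ja:\n  hello: Hello\n")

  # Pretend the source was last edited an hour before the target was written.
  an_hour_ago = Time.now - 3600
  File.utime(an_hour_ago, an_hour_ago, source)

  puts up_to_date?(source, target)              # target is newer, so skip
  puts up_to_date?(source, target, force: true) # forced, so retranslate
end
```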

### Translation Rules

Honyaku supports two types of rule files:
- `.honyakurules` - General rules for all translations
- `.honyakurules.{locale}` - Language-specific rules (e.g., `.honyakurules.ja`)

Example `.honyakurules`:
```yaml
Don't translate the term "ClickFunnels", that's our brand name.
```

Example `.honyakurules.ja`:
```yaml
When translating to Japanese, do not insert a space between particles like `%{site_name} に`... that should be `%{site_name}に`
```

Rules can be used for:
- Preserving brand names
- Enforcing locale-specific formatting
- Maintaining consistent terminology
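
To make the two rule types concrete, here is a small sketch of how general and locale-specific rules might be folded into a single prompt, with general rules first. The helper name `append_rules` and the `{ content:, locale_specific: }` hash shape are assumptions made for illustration.

```ruby
# Hypothetical helper: append general rules, then locale-specific rules, to a
# base prompt. Each rule is a hash with :content and an optional
# :locale_specific flag (an assumed shape).
def append_rules(base_prompt, rules)
  general, locale_specific = rules.partition { |rule| !rule[:locale_specific] }
  prompt = base_prompt.dup

  if general.any?
    prompt << "\n\nGeneral translation rules:\n"
    prompt << general.map { |rule| rule[:content] }.join("\n\n")
  end

  if locale_specific.any?
    prompt << "\n\nTarget language specific rules:\n"
    prompt << locale_specific.map { |rule| rule[:content] }.join("\n\n")
  end

  prompt
end

rules = [
  { content: "Do not insert a space between a variable and a particle.", locale_specific: true },
  { content: 'Do not translate the term "ClickFunnels".' }
]
puts append_rules("You are a professional translator.", rules)
```

Putting the locale-specific section last lets a language-specific rule refine a general one rather than the other way around.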

### YAML Fixing

Fix formatting issues in translated files:
```bash
# Fix a single file
honyaku fix config/locales/ja/application.ja.yml

# Fix all files in a directory
honyaku fix config/locales/ja --backup
```
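
One of the most common problems is a value that begins with an interpolation variable: YAML cannot start a plain scalar with `%`, so the value has to be quoted. The sketch below shows that single repair in isolation. `quote_interpolated_value` is a hypothetical helper, and it assumes the value contains no double quotes of its own.

```ruby
require "yaml"

# Hypothetical helper: wrap a value in double quotes when it starts with an
# interpolation variable such as %{name}. Lines that do not match are
# returned unchanged. Assumes the value contains no double quotes.
def quote_interpolated_value(line)
  match = line.chomp.match(/\A(\s*[^:]+:\s*)(%\{.+)\z/)
  return line unless match

  "#{match[1]}\"#{match[2]}\""
end

broken = "title: %{site_name} settings"
fixed = quote_interpolated_value(broken)

puts fixed
puts YAML.safe_load(fixed).inspect
```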

## Technical Details

### Large File Handling

Files over 250 lines are automatically split into chunks for translation. Each chunk maintains proper YAML structure to ensure accurate translations.
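
As a rough illustration of line-based chunking, the sketch below groups lines into chunks of about `limit` lines and only breaks where a line is not nested deeper than the one before it. The method name and the `limit:` keyword are assumptions, and the sketch does not repeat parent keys in later chunks.

```ruby
# Hypothetical sketch: split lines into chunks of roughly `limit` lines,
# breaking only at a line that is not indented deeper than the previous one.
def split_into_chunks(lines, limit: 250)
  chunks = []
  current = []
  previous_indent = 0

  lines.each do |line|
    indent = line[/\A */].length

    if current.size >= limit && indent <= previous_indent
      chunks << current.join
      current = []
    end

    current << line
    previous_indent = indent
  end

  chunks << current.join if current.any?
  chunks
end

lines = (1..600).map { |i| "  key_#{i}: value #{i}\n" }
chunks = split_into_chunks(lines)
puts chunks.map { |chunk| chunk.lines.size }.inspect # => [250, 250, 100]
```

Joining the chunks back together reproduces the original text, which makes it straightforward to reassemble translated chunks in order.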

### Error Recovery

When invalid YAML is detected:
1. Automatic formatting fixes are attempted
2. Translation is retried if necessary
3. Original file is preserved if fixes fail
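
The retry step can be pictured as a bounded loop: try the work, and on failure try again until a maximum number of attempts is reached. The sketch below is a generic pattern under that assumption. `with_retries` and `max_attempts:` are illustrative names, not Honyaku's API.

```ruby
# Hypothetical sketch of a bounded retry: run the block, and on failure run it
# again until max_attempts is reached, then re-raise the last error.
def with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError => e
    retry if attempts < max_attempts
    raise e
  end
end

result = with_retries do |attempt|
  raise "invalid YAML" if attempt < 3

  "ok after #{attempt} attempts"
end
puts result
```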

### Model Selection

- Default: GPT-4 (higher quality, slower)
- Alternative: GPT-3.5-turbo (faster, less accurate)

## Development

After checking out the repo:
1. Run `bin/setup` to install dependencies
2. Run `rake test` to run the tests
3. Run `bin/console` for an interactive prompt

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/andrewculver/honyaku.

## License

Released under the MIT License. See [LICENSE](LICENSE.txt) for details.
data/exe/honyaku
ADDED
data/lib/honyaku/cli.rb
ADDED
@@ -0,0 +1,323 @@
require "fileutils" # FileUtils.mkdir_p and FileUtils.cp are used below
require "thor"
require "time" # Time.parse is used for git timestamps
require "yaml"
require "honyaku/translator"

module Honyaku
  class CLI < Thor
    desc "translate LOCALE", "Translate your application into the specified locale"
    long_desc <<-LONGDESC
      Translates YAML files from one locale to another using OpenAI.

      Examples:
        # Translate a specific file from English to Japanese
        $ honyaku translate ja --path config/locales/en.yml

        # Translate all files in a directory from English to Spanish
        $ honyaku translate es --path config/locales

        # Translate using GPT-4 for higher accuracy
        $ honyaku translate de --model gpt-4 --path config/locales/en.yml
    LONGDESC
    method_option :from, aliases: "-f", desc: "Source locale (defaults to en)"
    method_option :path, aliases: "-p", desc: "Path to YAML file or directory (defaults to config/locales)"
    method_option :model,
      aliases: "-m",
      desc: "Specify which AI model to use (defaults to gpt-4, use gpt-3.5-turbo for faster but less accurate translations)"
    method_option :backup, aliases: "-b", type: :boolean, desc: "Create .bak files before modifying"
    method_option :force, type: :boolean, desc: "Retranslate files even if target is newer than source"
    def translate(locale)
      api_key = ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
      unless api_key
        puts "❌ Please set either HONYAKU_OPENAI_API_KEY or OPENAI_API_KEY environment variable"
        exit 1
      end

      source_locale = options[:from] || "en"
      path = options[:path] || "config/locales"
      model = options[:model] || "gpt-4"

      # Check if the source path exists
      unless File.exist?(path)
        puts "❌ Source path not found: #{path}"
        puts " Please check that the file or directory exists"
        exit 1
      end

      # Find all .honyakurules files from root to current path
      rules = find_translation_rules(path, locale)
      if rules.any?
        puts "Found #{rules.length} translation rule file(s):"
        rules.each do |rule|
          prefix = rule[:locale_specific] ? "π" : "π"
          puts " #{prefix} #{rule[:path]}"
        end
      end

      puts "Translating from #{source_locale} to #{locale}..."
      puts "Processing files in #{path}..."

      translator = Translator.new(model: model, translation_rules: rules)

      if File.file?(path)
        process_file(path, translator, source_locale, locale)
      else
        files = Dir.glob("#{path}/**/*.yml")
        if files.empty?
          puts "❌ No YAML files found in: #{path}"
          puts " Make sure your path contains .yml files"
          exit 1
        end
        files.each do |file|
          process_file(file, translator, source_locale, locale)
        end
      end

      puts "✅ Translation complete!"
    end

    desc "fix PATH", "Fix YAML formatting issues in translated files"
    long_desc <<-LONGDESC
      Fixes common YAML formatting issues in translated files, such as:
      - Adding quotes around values that start with %{variable}
      - Fixing spacing in interpolation variables
      - Preserving YAML references and anchors
      - Maintaining proper indentation

      Examples:
        # Fix a specific file
        $ honyaku fix config/locales/ja/courses.ja.yml

        # Fix all YAML files in a directory
        $ honyaku fix config/locales/ja
    LONGDESC
    method_option :model, aliases: "-m", desc: "Specify which AI model to use (defaults to gpt-3.5-turbo)"
    method_option :backup, aliases: "-b", type: :boolean, desc: "Create .bak files before modifying"
    def fix(path)
      api_key = ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
      unless api_key
        puts "❌ Please set either HONYAKU_OPENAI_API_KEY or OPENAI_API_KEY environment variable"
        exit 1
      end

      model = options[:model] || "gpt-3.5-turbo"

      puts "🔧 Fixing YAML formatting issues..."
      puts "Processing files in #{path}..."

      fixer = Translator.new(model: model)

      if File.file?(path)
        fix_file(path, fixer)
      else
        Dir.glob("#{path}/**/*.yml").each do |file|
          fix_file(file, fixer)
        end
      end

      puts "✅ Fixes complete!"
    end

    private

    def find_translation_rules(start_path, target_locale = nil)
      rules = []

      # Start from the directory containing the YAML file/directory
      current_path = File.expand_path(start_path)

      # First check the current working directory
      if File.exist?('.honyakurules')
        rules << {
          path: File.expand_path('.honyakurules'),
          content: File.read('.honyakurules').strip
        }
      end

      # Check for locale-specific rules in current directory
      if target_locale && File.exist?(".honyakurules.#{target_locale}")
        rules << {
          path: File.expand_path(".honyakurules.#{target_locale}"),
          content: File.read(".honyakurules.#{target_locale}").strip,
          locale_specific: true
        }
      end

      # Walk up the directory tree from the YAML path
      while current_path != '/' && current_path != Dir.pwd
        # Check for general rules
        rules_file = File.join(current_path, '.honyakurules')
        if File.exist?(rules_file)
          rules << {
            path: rules_file,
            content: File.read(rules_file).strip
          }
        end

        # Check for locale-specific rules
        if target_locale
          locale_rules_file = File.join(current_path, ".honyakurules.#{target_locale}")
          if File.exist?(locale_rules_file)
            rules << {
              path: locale_rules_file,
              content: File.read(locale_rules_file).strip,
              locale_specific: true
            }
          end
        end

        current_path = File.dirname(current_path)
      end

      # Reverse to maintain root-to-local order, but ensure locale-specific rules come after general rules
      rules.reverse.partition { |r| !r[:locale_specific] }.flatten
    end

    def process_file(file_path, translator, source_locale, target_locale)
      # Check if this is a source locale file we should translate
      source_pattern = /#{source_locale}(\/|\.yml)/
      return unless file_path =~ source_pattern

      # Generate the target filename
      target_file = file_path.gsub(source_pattern, "#{target_locale}\\1")

      # Only skip if target exists AND is newer (unless --force is used)
      if File.exist?(target_file) && !options[:force]
        source_time = get_last_modified_time(file_path)
        target_time = get_last_modified_time(target_file)

        if target_time && source_time && target_time > source_time
          puts "⏭️ Skipping #{file_path} - translation is up to date"
          return
        end
      end

      puts "Processing #{file_path}..."

      begin
        attempts = 0
        max_attempts = 3

        loop do
          attempts += 1
          begin
            translated_content = translator.translate_hash(file_path, source_locale, target_locale)
          rescue => e
            puts "❌ Translation failed: #{e.message}"
            break
          end

          # Don't proceed if translation failed
          if !translated_content || translated_content.strip.empty?
            puts "❌ Translation failed - no content generated"
            break
          end

          # Create directory and write file only if we have valid content
          FileUtils.mkdir_p(File.dirname(target_file))

          # Backup if requested
          if options[:backup] && File.exist?(target_file)
            backup_path = "#{target_file}.bak"
            FileUtils.cp(target_file, backup_path)
          end

          # Write the translated content
          File.write(target_file, translated_content)
          puts "✨ Created #{target_file}"

          # Automatically fix any YAML issues
          puts "🔧 Checking for YAML issues..."
          begin
            fixed_content = translator.fix_yaml(target_file)
            if fixed_content != translated_content
              if options[:backup] && !File.exist?("#{target_file}.bak")
                FileUtils.cp(target_file, "#{target_file}.bak")
              end

              File.write(target_file, fixed_content)
              puts "✨ Fixed YAML formatting issues"
            end
            break # Success! Exit the loop
          rescue => e
            if e.message.include?("needs retranslation") && attempts < max_attempts
              puts "⚠️ Translation attempt #{attempts} produced invalid YAML, retrying..."
              # Clean up the file before retrying
              File.unlink(target_file) if File.exist?(target_file)
              next
            else
              # Clean up and re-raise
              File.unlink(target_file) if File.exist?(target_file)
              raise e
            end
          end
        end
      rescue => e
        puts "❌ Error processing #{file_path}: #{e.message}"
        # Ensure file is cleaned up if it was created
        File.unlink(target_file) if File.exist?(target_file)
      end
    end

    def fix_file(file_path, fixer)
      puts "🔧 Fixing #{file_path}..."

      begin
        # Backup if requested
        if options[:backup]
          backup_path = "#{file_path}.bak"
          FileUtils.cp(file_path, backup_path)
          puts "Created backup at #{backup_path}"
        end

        fixed_content = fixer.fix_yaml(file_path)
        File.write(file_path, fixed_content)
        puts "✨ Fixed #{file_path}"
      rescue => e
        puts "❌ Error fixing #{file_path}: #{e.message}"
      end
    end

    def get_last_modified_time(file_path)
      times = []

      # Get git timestamp if available
      if git_time = get_git_modified_time(file_path)
        times << git_time
      end

      # Get filesystem timestamp
      if File.exist?(file_path)
        times << File.mtime(file_path)
      end

      # Return the newest timestamp (or nil if no timestamps found)
      times.max
    end

    def get_git_modified_time(file_path)
      return nil unless system("git rev-parse --is-inside-work-tree > /dev/null 2>&1")

      # Pass the arguments as an array so paths containing spaces or shell
      # metacharacters are handed to git unchanged
      git_command = ["git", "log", "-1", "--format=%cd", "--date=iso", "--", file_path]
      time_str = IO.popen(git_command, err: File::NULL, &:read).strip
      return nil if time_str.empty?

      Time.parse(time_str)
    rescue
      nil
    end

    # Thor only registers public methods as commands, so switch back to
    # public visibility before defining `status` and `version`
    public

    desc "status", "Show translation status for all locales"
    def status
      puts "Translation Status:"
      # Status reporting logic will go here
    end

    desc "version", "Show Honyaku version"
    def version
      puts "Honyaku v#{Honyaku::VERSION}"
    end

    def self.exit_on_failure?
      true
    end
  end
end
data/lib/honyaku/translator.rb
ADDED
@@ -0,0 +1,241 @@
require "openai"
require "yaml"

module Honyaku
  class Translator
    LINES_PER_CHUNK = 250

    def initialize(api_key: nil, model: "gpt-4", translation_rules: [])
      api_key ||= ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
      @client = OpenAI::Client.new(access_token: api_key)
      @model = model
      @translation_rules = translation_rules
    end

    def translate_hash(file_path, from_locale, to_locale)
      yaml_content = File.read(file_path)
      lines = yaml_content.lines

      # If the file is small enough, translate it all at once
      if lines.size <= LINES_PER_CHUNK
        result = translate_chunk(yaml_content, from_locale, to_locale)
        raise "Translation failed" unless result
        return result
      end

      # Otherwise, split into chunks and translate each
      chunks = split_into_chunks(lines)
      puts "📦 Splitting file into #{chunks.size} chunks..."

      translated_chunks = []

      chunks.each_with_index do |chunk, i|
        puts "Translating chunk #{i + 1} of #{chunks.size}..."
        result = translate_chunk(chunk, from_locale, to_locale)

        # If any chunk fails, abort the whole translation
        raise "Translation failed for chunk #{i + 1}" unless result
        translated_chunks << result
      end

      translated_chunks.join("\n")
    end

    def fix_yaml(file_path)
      content = File.read(file_path)
      fixed_any = false

      loop do
        begin
          YAML.load(content)
          puts "✅ No more YAML errors found"
          return content
        rescue Psych::SyntaxError => e
          # If OpenAI returned invalid YAML structure, signal that we need to retranslate
          if e.message.include?("did not find expected key while parsing a block mapping")
            raise "Translation resulted in invalid YAML structure - needs retranslation"
          end

          lines = content.lines
          line_number = e.line - 1 # YAML errors are 1-based
          problematic_line = lines[line_number]

          puts "🔧 Found YAML error on line #{e.line}: #{e.message}"
          puts " #{problematic_line.strip}"

          # Only try to fix common syntax issues
          if e.message.include?("cannot start any token")
            fixed = false

            # Fix case 1: Values starting with %{var} need quotes
            if problematic_line.include?("%{") && problematic_line =~ /^(\s*[^:]+:\s*)(?:(&\w+)\s+)?(%\{.+)$/
              prefix, reference, value = $1, $2, $3
              fixed_line = if reference
                "#{prefix}#{reference} \"#{value}\""
              else
                "#{prefix}\"#{value}\""
              end
              fixed = true
            # Fix case 2: Remove the stray space in "% {var}"
            elsif problematic_line.include?("% {")
              # chomp first so the newline appended below is not doubled
              fixed_line = problematic_line.chomp.gsub("% {", "%{")
              fixed = true
            end

            if fixed
              # Update the line
              lines[line_number] = "#{fixed_line}\n"
              content = lines.join
              fixed_any = true
              next # Continue to the next iteration to find more errors
            end
          end

          # If we get here, we couldn't fix this error
          if fixed_any
            puts "❌ Unable to fix remaining YAML errors"
          else
            puts "❌ Unable to fix any YAML errors"
          end
          return content
        end
      end
    end

    private

    def split_into_chunks(lines)
      chunks = []
      current_chunk = []
      current_indent = 0
      line_count = 0

      lines.each do |line|
        # Calculate the indentation level of the current line
        indent = line[/\A */].length

        # Start a new chunk once we hit the line limit, but only at a line
        # that is not nested deeper than the previous line
        if line_count >= LINES_PER_CHUNK && indent <= current_indent
          chunks << current_chunk.join
          current_chunk = []
          line_count = 0
        end

        current_chunk << line
        line_count += 1
        current_indent = indent
      end

      # Add the last chunk if there's anything left
      chunks << current_chunk.join if current_chunk.any?

      chunks
    end

    def translate_chunk(content, from_locale, to_locale)
      max_retries = 3
      attempts = 0

      begin
        attempts += 1
        response = @client.chat(
          parameters: {
            model: @model,
            messages: [
              {
                role: "system",
                content: build_system_prompt
              },
              {
                role: "user",
                content: "Translate this YAML content from #{from_locale} to #{to_locale}. Keep all structure and special characters exactly the same:\n\n#{content}"
              }
            ],
            temperature: 0.7
          }
        )

        # Clean up any markdown code block markers
        response_text = response.dig("choices", 0, "message", "content")
        response_text.gsub(/^```ya?ml\s*\n/, '').gsub(/\n```\s*$/, '')
      rescue => e
        # Don't retry if it's a billing/credits issue
        if e.message.include?("insufficient_quota") || e.message.include?("billing")
          puts "❌ OpenAI API error: #{e.message}"
          raise e
        end

        if attempts < max_retries
          puts "⚠️ OpenAI API error, retrying (attempt #{attempts}/#{max_retries}): #{e.message}"
          sleep(attempts) # Linear backoff: wait 1s, then 2s
          retry
        else
          puts "❌ OpenAI API error after #{max_retries} attempts: #{e.message}"
          raise e
        end
      end
    end

    def fix_line(line, error)
      response = @client.chat(
        parameters: {
          model: @model,
          messages: [
            {
              role: "system",
              content: "You are a YAML expert. Fix the provided line to be valid YAML. Common issues include:
                - Values starting with % need to be quoted
                - Proper escaping of special characters
                Return only the fixed line, no explanation needed."
            },
            {
              role: "user",
              content: "Fix this YAML line that generated this error: #{error}\n\nLine: #{line}"
            }
          ],
          temperature: 0.3
        }
      )

      response.dig("choices", 0, "message", "content").strip
    rescue => e
      puts "⚠️ Error getting fix suggestion: #{e.message}"
      line
    end

    def build_system_prompt
      base_prompt = <<~PROMPT
        You are a professional translator. You will be translating YAML files.

        CRITICAL REQUIREMENTS:
        1. Only translate text values after the colon (:)
        2. Never modify, translate, or remove:
           - YAML keys (text before the colon)
           - Interpolation variables (like %{name})
           - YAML references and anchors
           - Comments
           - Empty lines
        3. Keep all special characters exactly as they appear
        4. Maintain the exact same line count
        5. Never add or remove lines
        6. Never change the structure of the file
      PROMPT

      if @translation_rules.any?
        general_rules, locale_rules = @translation_rules.partition { |r| !r[:locale_specific] }

        if general_rules.any?
          base_prompt += "\n\nGeneral translation rules:\n" +
            general_rules.map { |rule| rule[:content] }.join("\n\n")
        end

        if locale_rules.any?
          base_prompt += "\n\nTarget language specific rules:\n" +
            locale_rules.map { |rule| rule[:content] }.join("\n\n")
        end
      end

      base_prompt
    end
  end
end
data/lib/honyaku.rb
ADDED
metadata
ADDED
@@ -0,0 +1,95 @@
--- !ruby/object:Gem::Specification
name: honyaku
version: !ruby/object:Gem::Version
  version: 0.1.0
platform: ruby
authors:
- Andrew Culver
autorequire:
bindir: exe
cert_chain: []
date: 2025-02-20 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: thor
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.3'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.3'
- !ruby/object:Gem::Dependency
  name: ruby-openai
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '6.3'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '6.3'
- !ruby/object:Gem::Dependency
  name: yaml
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.3.0
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.3.0
description:
email:
- andrew.culver@gmail.com
executables:
- honyaku
extensions: []
extra_rdoc_files: []
files:
- LICENSE.txt
- README.md
- exe/honyaku
- lib/honyaku.rb
- lib/honyaku/cli.rb
- lib/honyaku/translator.rb
- lib/honyaku/version.rb
homepage: https://github.com/andrewculver/honyaku
licenses:
- MIT
metadata:
  homepage_uri: https://github.com/andrewculver/honyaku
  source_code_uri: https://github.com/andrewculver/honyaku
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: 3.0.0
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.2.33
signing_key:
specification_version: 4
summary: Translate your Rails application using OpenAI
test_files: []