honyaku 0.1.0
- checksums.yaml +7 -0
- data/LICENSE.txt +21 -0
- data/README.md +169 -0
- data/exe/honyaku +5 -0
- data/lib/honyaku/cli.rb +323 -0
- data/lib/honyaku/translator.rb +241 -0
- data/lib/honyaku/version.rb +5 -0
- data/lib/honyaku.rb +10 -0
- metadata +95 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: c4deb596d588ecdc949ec04c2c139bdd0c8d886c365aa9e5f980f21ffd0b986a
|
4
|
+
data.tar.gz: 42749082904b0c7cfd8ce08b038c776a7e3c57cf22b37878a1e3145cf5878c7a
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 0bab4819f367a8144255b85f6b1c5aa0b23b132d4dc02fe6cb2c7bad6bfbdc09f6904c833ee37849ad739275d4c03f6ee65fa7917de1e95c5c249ef562d82d28
|
7
|
+
data.tar.gz: 8178c8117039fcb339476be232b109695d787a64346af4d64eb8bf3375c735494e563f1a44871acd01b534822de3cf9111b93b79a014100d9eab1033dc0ffba0
|
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
The MIT License (MIT)
|
2
|
+
|
3
|
+
Copyright (c) 2025 Andrew Culver
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in
|
13
|
+
all copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
21
|
+
THE SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,169 @@
|
|
1
|
+
# Honyaku 翻訳
|
2
|
+
|
3
|
+
A Ruby gem for quickly, reliably, and accurately translating your Rails application using OpenAI. It replaced a $34K/year SaaS contract and streamlined our deploy process.
|
4
|
+
|
5
|
+
Honyaku was built using [Cursor Composer](https://docs.cursor.com/composer) with [claude-3.5-sonnet](https://www.anthropic.com/news/claude-35-sonnet), prompted by [Andrew Culver](https://x.com/andrewculver) at [ClickFunnels](https://www.clickfunnels.com).
|
6
|
+
|
7
|
+
## Features
|
8
|
+
|
9
|
+
- Uses GPT-4 for high-quality translations (GPT-3.5-turbo optional for faster processing)
|
10
|
+
- Preserves YAML structure, references, and interpolation variables
|
11
|
+
- Supports translation rules via `.honyakurules` files
|
12
|
+
- Handles large files through automatic chunking
|
13
|
+
- Automatically fixes YAML formatting issues introduced by the model
|
14
|
+
- Supports backup creation before modifications
|
15
|
+
- Smart file skipping to avoid unnecessary retranslation
|
16
|
+
|
17
|
+
## Example Output
|
18
|
+
|
19
|
+
```
|
20
|
+
$ honyaku translate ja --path config/locales/en/affiliates
|
21
|
+
π Found 2 translation rule file(s):
|
22
|
+
π /Users/andrewculver/Sites/admin/.honyakurules
|
23
|
+
π /Users/andrewculver/Sites/admin/.honyakurules.ja
|
24
|
+
π Translating from en to ja...
|
25
|
+
π Processing files in config/locales/en/affiliates...
|
26
|
+
π Processing config/locales/en/affiliates/active_referrals_report.en.yml...
|
27
|
+
📦 Splitting file into 3 chunks...
|
28
|
+
π Translating chunk 1 of 3...
|
29
|
+
π Translating chunk 2 of 3...
|
30
|
+
π Translating chunk 3 of 3...
|
31
|
+
✨ Created config/locales/ja/affiliates/active_referrals_report.ja.yml
|
32
|
+
🔧 Checking for YAML issues...
|
33
|
+
✅ No more YAML errors found
|
34
|
+
π Processing config/locales/en/affiliates/add_tag_actions.en.yml...
|
35
|
+
✨ Created config/locales/ja/affiliates/add_tag_actions.ja.yml
|
36
|
+
🔧 Checking for YAML issues...
|
37
|
+
🔧 Found YAML error on line 5: (<unknown>): found character that cannot start any token while scanning for the next token at line 5 column 13
|
38
|
+
zero: %{count}アフィリエイトにコミッションプランアクションを追加する
|
39
|
+
🔧 Found YAML error on line 6: (<unknown>): found character that cannot start any token while scanning for the next token at line 6 column 12
|
40
|
+
one: %{count}アフィリエイトにコミッションプランアクションを追加する
|
41
|
+
🔧 Found YAML error on line 7: (<unknown>): found character that cannot start any token while scanning for the next token at line 7 column 14
|
42
|
+
other: %{count}アフィリエイトにコミッションプランアクションを追加する
|
43
|
+
✅ No more YAML errors found
|
44
|
+
✨ Fixed YAML formatting issues
|
45
|
+
⏭️ Skipping config/locales/en/affiliates/applied_tags.en.yml - translation is up to date
|
46
|
+
⏭️ Skipping config/locales/en/affiliates/approve_actions.en.yml - translation is up to date
|
47
|
+
...
|
48
|
+
```
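The target paths in the output above follow from swapping the source locale for the target locale wherever it appears as a directory name or just before `.yml`. A minimal sketch of that mapping, assuming a hypothetical helper named `target_path_for` (it is not a documented part of the CLI):

```ruby
# Hypothetical helper, for illustration only: derive the target file path
# by replacing the source locale where it is followed by "/" or ".yml".
def target_path_for(source_path, from_locale, to_locale)
  pattern = /#{Regexp.escape(from_locale)}(\/|\.yml)/
  source_path.gsub(pattern) { "#{to_locale}#{Regexp.last_match(1)}" }
end

puts target_path_for("config/locales/en/affiliates/add_tag_actions.en.yml", "en", "ja")
# prints config/locales/ja/affiliates/add_tag_actions.ja.yml
```

Because this pattern is not anchored to a whole path segment, a directory such as `screen/` would also match, so projects with such names may need a stricter pattern.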
|
49
|
+
|
50
|
+
## Installation
|
51
|
+
|
52
|
+
Add to your Gemfile:
|
53
|
+
```ruby
|
54
|
+
gem 'honyaku'
|
55
|
+
```
|
56
|
+
|
57
|
+
Or install directly:
|
58
|
+
```bash
|
59
|
+
gem install honyaku
|
60
|
+
```
|
61
|
+
|
62
|
+
## Configuration
|
63
|
+
|
64
|
+
Set your OpenAI API key:
|
65
|
+
```bash
|
66
|
+
export OPENAI_API_KEY=your-api-key
|
67
|
+
```
|
68
|
+
|
69
|
+
If `OPENAI_API_KEY` is already set for another purpose and you want Honyaku to use a different key, set `HONYAKU_OPENAI_API_KEY`; Honyaku checks it first:
|
70
|
+
```bash
|
71
|
+
export HONYAKU_OPENAI_API_KEY=your-api-key
|
72
|
+
```
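The lookup order between the two variables can be sketched as follows. The helper name `resolve_api_key` and the injectable `env` argument are assumptions made for illustration:

```ruby
# Hypothetical helper: prefer the Honyaku-specific key, then fall back
# to the general OpenAI key. Returns nil when neither is set.
def resolve_api_key(env = ENV)
  env["HONYAKU_OPENAI_API_KEY"] || env["OPENAI_API_KEY"]
end

puts resolve_api_key("OPENAI_API_KEY" => "general-key")
# prints general-key
puts resolve_api_key("OPENAI_API_KEY" => "general-key", "HONYAKU_OPENAI_API_KEY" => "honyaku-key")
# prints honyaku-key
```

Note that in this sketch a variable set to an empty string still counts as set, since an empty string is truthy in Ruby.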
|
73
|
+
|
74
|
+
## Usage
|
75
|
+
|
76
|
+
### Basic Translation
|
77
|
+
|
78
|
+
```bash
|
79
|
+
# Translate a file
|
80
|
+
honyaku translate ja --path config/locales/en.yml
|
81
|
+
|
82
|
+
# Translate a directory
|
83
|
+
honyaku translate es --path config/locales
|
84
|
+
|
85
|
+
# Create backups before modifying
|
86
|
+
honyaku translate ja --backup --path config/locales/en.yml
|
87
|
+
|
88
|
+
# Use GPT-3.5-turbo for faster processing
|
89
|
+
honyaku translate fr --model gpt-3.5-turbo --path config/locales/en.yml
|
90
|
+
|
91
|
+
# Force retranslation of files even if they're up to date
|
92
|
+
honyaku translate ja --force --path config/locales/en.yml
|
93
|
+
```
|
94
|
+
|
95
|
+
### Smart File Skipping
|
96
|
+
|
97
|
+
Honyaku tracks file modification times to avoid unnecessary retranslation:
|
98
|
+
|
99
|
+
- Checks both git history and filesystem timestamps
|
100
|
+
- Uses the newer of the two dates for comparison
|
101
|
+
- Skips translation if target file is newer than source
|
102
|
+
- Shows a "⏭️ Skipping" message for up-to-date files
|
103
|
+
|
104
|
+
You can override this behavior with `--force` to retranslate all files regardless of their timestamps.
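The skip rule above can be sketched in a few lines. The helper names `newest_time` and `skip_translation?` are hypothetical, and the timestamps are passed in directly so the example runs without a git repository:

```ruby
# Hypothetical helpers illustrating the skip rule described above.

# Newest of the git commit time (nil when unavailable) and the filesystem mtime.
def newest_time(git_time, fs_time)
  [git_time, fs_time].compact.max
end

# Skip only when both times are known, the target is strictly newer,
# and the caller has not asked to force retranslation.
def skip_translation?(source_time, target_time, force: false)
  return false if force
  return false if source_time.nil? || target_time.nil?

  target_time > source_time
end

now = Time.now
source = newest_time(now - 3600, now - 7200) # last committed an hour ago
target = newest_time(nil, now - 60)          # untracked file written a minute ago
puts skip_translation?(source, target)              # prints true
puts skip_translation?(source, target, force: true) # prints false
```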
|
105
|
+
|
106
|
+
### Translation Rules
|
107
|
+
|
108
|
+
Honyaku supports two types of rule files:
|
109
|
+
- `.honyakurules` - General rules for all translations
|
110
|
+
- `.honyakurules.{locale}` - Language-specific rules (e.g., `.honyakurules.ja`)
|
111
|
+
|
112
|
+
Example `.honyakurules`:
|
113
|
+
```
|
114
|
+
Don't translate the term "ClickFunnels", that's our brand name.
|
115
|
+
```
|
116
|
+
|
117
|
+
Example `.honyakurules.ja`:
|
118
|
+
```
|
119
|
+
When translating to Japanese, do not insert a space between particles like `%{site_name} に`... that should be `%{site_name}に`
|
120
|
+
```
|
121
|
+
|
122
|
+
Rules can be used for:
|
123
|
+
- Preserving brand names
|
124
|
+
- Enforcing locale-specific formatting
|
125
|
+
- Maintaining consistent terminology
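One way rule files can influence a translation is by appending their text to the model's system prompt, general rules first and locale-specific rules after. The sketch below assumes each rule is a hash with a `:content` string and an optional `:locale_specific` flag; the helper name `prompt_with_rules` is hypothetical:

```ruby
# Hypothetical sketch: append rule text to a base prompt, general rules first.
def prompt_with_rules(base_prompt, rules)
  general, locale_specific = rules.partition { |rule| !rule[:locale_specific] }
  prompt = base_prompt.dup

  if general.any?
    prompt += "\n\nGeneral translation rules:\n" +
              general.map { |rule| rule[:content] }.join("\n\n")
  end

  if locale_specific.any?
    prompt += "\n\nTarget language specific rules:\n" +
              locale_specific.map { |rule| rule[:content] }.join("\n\n")
  end

  prompt
end

rules = [
  { content: "Keep particles attached to interpolation variables.", locale_specific: true },
  { content: "Don't translate the term \"ClickFunnels\"." }
]
puts prompt_with_rules("You are a professional translator.", rules)
```

Keeping the locale-specific rules last means they read as refinements of the general ones, whatever order the files were discovered in.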
|
126
|
+
|
127
|
+
### YAML Fixing
|
128
|
+
|
129
|
+
Fix formatting issues in translated files:
|
130
|
+
```bash
|
131
|
+
# Fix a single file
|
132
|
+
honyaku fix config/locales/ja/application.ja.yml
|
133
|
+
|
134
|
+
# Fix all files in a directory
|
135
|
+
honyaku fix config/locales/ja --backup
|
136
|
+
```
|
137
|
+
|
138
|
+
## Technical Details
|
139
|
+
|
140
|
+
### Large File Handling
|
141
|
+
|
142
|
+
Files over 250 lines are automatically split into chunks for translation. Each chunk maintains proper YAML structure to ensure accurate translations.
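A rough sketch of line-based chunking is shown below. It assumes a chunk boundary is allowed only once the line limit has been reached and the next line is not indented deeper than the previous one; the `limit` parameter is added here so the behaviour is easy to try on small inputs:

```ruby
# Illustrative sketch of splitting YAML lines into chunks.
# A new chunk starts only when the current one has reached the limit and
# the incoming line is not indented deeper than the line before it.
def split_into_chunks(lines, limit = 250)
  chunks = []
  current = []
  previous_indent = 0

  lines.each do |line|
    indent = line[/\A */].length

    if current.size >= limit && indent <= previous_indent
      chunks << current.join
      current = []
    end

    current << line
    previous_indent = indent
  end

  chunks << current.join if current.any?
  chunks
end

lines = (1..7).map { |i| "key#{i}: value#{i}\n" }
chunks = split_into_chunks(lines, 3)
puts chunks.size # prints 3
```

Joining the chunks back together reproduces the original text, so no lines are lost. A chunk can still begin in the middle of a nested mapping, which is one reason validating the translated output afterwards is worthwhile.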
|
143
|
+
|
144
|
+
### Error Recovery
|
145
|
+
|
146
|
+
When invalid YAML is detected:
|
147
|
+
1. Automatic formatting fixes are attempted
|
148
|
+
2. Translation is retried if necessary
|
149
|
+
3. Original file is preserved if fixes fail
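The most common automatic fix is quoting values that begin with an interpolation variable, since a plain YAML scalar cannot start with `%`. A minimal sketch, assuming a hypothetical helper named `quote_interpolated_value` that works on one line at a time (without its trailing newline):

```ruby
require "yaml"

# Hypothetical helper: wrap a value that begins with %{...} in double quotes.
# Lines that do not match the pattern are returned unchanged.
def quote_interpolated_value(line)
  match = line.match(/\A(\s*[^:]+:\s*)(%\{.+)\z/)
  return line unless match

  "#{match[1]}\"#{match[2]}\""
end

broken = "zero: %{count} items"
fixed = quote_interpolated_value(broken)

begin
  YAML.load(broken)
rescue Psych::SyntaxError => e
  puts "Invalid YAML: #{e.message}"
end

puts fixed # prints zero: "%{count} items"
YAML.load(fixed) # parses without raising
```

This sketch does not escape double quotes that already appear inside the value, so a real fixer would need to handle that case as well.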
|
150
|
+
|
151
|
+
### Model Selection
|
152
|
+
|
153
|
+
- Default: GPT-4 (higher quality, slower)
|
154
|
+
- Alternative: GPT-3.5-turbo (faster, less accurate)
|
155
|
+
|
156
|
+
## Development
|
157
|
+
|
158
|
+
After checking out the repo:
|
159
|
+
1. Run `bin/setup` to install dependencies
|
160
|
+
2. Run `rake test` to run the tests
|
161
|
+
3. Run `bin/console` for an interactive prompt
|
162
|
+
|
163
|
+
## Contributing
|
164
|
+
|
165
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/andrewculver/honyaku.
|
166
|
+
|
167
|
+
## License
|
168
|
+
|
169
|
+
Released under the MIT License. See [LICENSE](LICENSE.txt) for details.
|
data/exe/honyaku
ADDED
data/lib/honyaku/cli.rb
ADDED
@@ -0,0 +1,323 @@
|
|
1
|
+
require "thor"
|
2
|
+
require "yaml"
|
3
|
+
require "honyaku/translator"
|
4
|
+
|
5
|
+
module Honyaku
|
6
|
+
class CLI < Thor
|
7
|
+
desc "translate LOCALE", "Translate your application into the specified locale"
|
8
|
+
long_desc <<-LONGDESC
|
9
|
+
Translates YAML files from one locale to another using OpenAI.
|
10
|
+
|
11
|
+
Examples:
|
12
|
+
# Translate a specific file from English to Japanese
|
13
|
+
$ honyaku translate ja --path config/locales/en.yml
|
14
|
+
|
15
|
+
# Translate all files in a directory from English to Spanish
|
16
|
+
$ honyaku translate es --path config/locales
|
17
|
+
|
18
|
+
# Translate with an explicitly chosen model (gpt-4 is the default)
|
19
|
+
$ honyaku translate de --model gpt-4 --path config/locales/en.yml
|
20
|
+
LONGDESC
|
21
|
+
method_option :from, aliases: "-f", desc: "Source locale (defaults to en)"
|
22
|
+
method_option :path, aliases: "-p", desc: "Path to YAML file or directory (defaults to config/locales)"
|
23
|
+
method_option :model,
|
24
|
+
aliases: "-m",
|
25
|
+
desc: "Specify which AI model to use (defaults to gpt-4, use gpt-3.5-turbo for faster but less accurate translations)"
|
26
|
+
method_option :backup, aliases: "-b", type: :boolean, desc: "Create .bak files before modifying"
|
27
|
+
method_option :force, type: :boolean, desc: "Retranslate files even if target is newer than source"
|
28
|
+
def translate(locale)
|
29
|
+
api_key = ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
|
30
|
+
unless api_key
|
31
|
+
puts "❌ Please set either HONYAKU_OPENAI_API_KEY or OPENAI_API_KEY environment variable"
|
32
|
+
exit 1
|
33
|
+
end
|
34
|
+
|
35
|
+
source_locale = options[:from] || "en"
|
36
|
+
path = options[:path] || "config/locales"
|
37
|
+
model = options[:model] || "gpt-4"
|
38
|
+
|
39
|
+
# Check if the source path exists
|
40
|
+
unless File.exist?(path)
|
41
|
+
puts "❌ Source path not found: #{path}"
|
42
|
+
puts " Please check that the file or directory exists"
|
43
|
+
exit 1
|
44
|
+
end
|
45
|
+
|
46
|
+
# Find all .honyakurules files from root to current path
|
47
|
+
rules = find_translation_rules(path, locale)
|
48
|
+
if rules.any?
|
49
|
+
puts "π Found #{rules.length} translation rule file(s):"
|
50
|
+
rules.each do |rule|
|
51
|
+
prefix = rule[:locale_specific] ? "π" : "π"
|
52
|
+
puts " #{prefix} #{rule[:path]}"
|
53
|
+
end
|
54
|
+
end
|
55
|
+
|
56
|
+
puts "π Translating from #{source_locale} to #{locale}..."
|
57
|
+
puts "π Processing files in #{path}..."
|
58
|
+
|
59
|
+
translator = Translator.new(model: model, translation_rules: rules)
|
60
|
+
|
61
|
+
if File.file?(path)
|
62
|
+
process_file(path, translator, source_locale, locale)
|
63
|
+
else
|
64
|
+
files = Dir.glob("#{path}/**/*.yml")
|
65
|
+
if files.empty?
|
66
|
+
puts "❌ No YAML files found in: #{path}"
|
67
|
+
puts " Make sure your path contains .yml files"
|
68
|
+
exit 1
|
69
|
+
end
|
70
|
+
files.each do |file|
|
71
|
+
process_file(file, translator, source_locale, locale)
|
72
|
+
end
|
73
|
+
end
|
74
|
+
|
75
|
+
puts "✅ Translation complete!"
|
76
|
+
end
|
77
|
+
|
78
|
+
desc "fix PATH", "Fix YAML formatting issues in translated files"
|
79
|
+
long_desc <<-LONGDESC
|
80
|
+
Fixes common YAML formatting issues in translated files, such as:
|
81
|
+
- Adding quotes around values that start with %{variable}
|
82
|
+
- Fixing spacing in interpolation variables
|
83
|
+
- Preserving YAML references and anchors
|
84
|
+
- Maintaining proper indentation
|
85
|
+
|
86
|
+
Examples:
|
87
|
+
# Fix a specific file
|
88
|
+
$ honyaku fix config/locales/ja/courses.ja.yml
|
89
|
+
|
90
|
+
# Fix all YAML files in a directory
|
91
|
+
$ honyaku fix config/locales/ja
|
92
|
+
LONGDESC
|
93
|
+
method_option :model, aliases: "-m", desc: "Specify which AI model to use (defaults to gpt-3.5-turbo)"
|
94
|
+
method_option :backup, aliases: "-b", type: :boolean, desc: "Create .bak files before modifying"
|
95
|
+
def fix(path)
|
96
|
+
api_key = ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
|
97
|
+
unless api_key
|
98
|
+
puts "❌ Please set either HONYAKU_OPENAI_API_KEY or OPENAI_API_KEY environment variable"
|
99
|
+
exit 1
|
100
|
+
end
|
101
|
+
|
102
|
+
model = options[:model] || "gpt-3.5-turbo"
|
103
|
+
|
104
|
+
puts "🔧 Fixing YAML formatting issues..."
|
105
|
+
puts "π Processing files in #{path}..."
|
106
|
+
|
107
|
+
fixer = Translator.new(model: model)
|
108
|
+
|
109
|
+
if File.file?(path)
|
110
|
+
fix_file(path, fixer)
|
111
|
+
else
|
112
|
+
Dir.glob("#{path}/**/*.yml").each do |file|
|
113
|
+
fix_file(file, fixer)
|
114
|
+
end
|
115
|
+
end
|
116
|
+
|
117
|
+
puts "✅ Fixes complete!"
|
118
|
+
end
|
119
|
+
|
120
|
+
private
|
121
|
+
|
122
|
+
def find_translation_rules(start_path, target_locale = nil)
|
123
|
+
rules = []
|
124
|
+
|
125
|
+
# Start from the directory containing the YAML file/directory
|
126
|
+
current_path = File.expand_path(start_path)
|
127
|
+
|
128
|
+
# First check the current working directory
|
129
|
+
if File.exist?('.honyakurules')
|
130
|
+
rules << {
|
131
|
+
path: File.expand_path('.honyakurules'),
|
132
|
+
content: File.read('.honyakurules').strip
|
133
|
+
}
|
134
|
+
end
|
135
|
+
|
136
|
+
# Check for locale-specific rules in current directory
|
137
|
+
if target_locale && File.exist?(".honyakurules.#{target_locale}")
|
138
|
+
rules << {
|
139
|
+
path: File.expand_path(".honyakurules.#{target_locale}"),
|
140
|
+
content: File.read(".honyakurules.#{target_locale}").strip,
|
141
|
+
locale_specific: true
|
142
|
+
}
|
143
|
+
end
|
144
|
+
|
145
|
+
# Walk up the directory tree from the YAML path
|
146
|
+
while current_path != '/' && current_path != Dir.pwd
|
147
|
+
# Check for general rules
|
148
|
+
rules_file = File.join(current_path, '.honyakurules')
|
149
|
+
if File.exist?(rules_file)
|
150
|
+
rules << {
|
151
|
+
path: rules_file,
|
152
|
+
content: File.read(rules_file).strip
|
153
|
+
}
|
154
|
+
end
|
155
|
+
|
156
|
+
# Check for locale-specific rules
|
157
|
+
if target_locale
|
158
|
+
locale_rules_file = File.join(current_path, ".honyakurules.#{target_locale}")
|
159
|
+
if File.exist?(locale_rules_file)
|
160
|
+
rules << {
|
161
|
+
path: locale_rules_file,
|
162
|
+
content: File.read(locale_rules_file).strip,
|
163
|
+
locale_specific: true
|
164
|
+
}
|
165
|
+
end
|
166
|
+
end
|
167
|
+
|
168
|
+
current_path = File.dirname(current_path)
|
169
|
+
end
|
170
|
+
|
171
|
+
# Reverse to maintain root-to-local order, but ensure locale-specific rules come after general rules
|
172
|
+
rules.reverse.partition { |r| !r[:locale_specific] }.flatten
|
173
|
+
end
|
174
|
+
|
175
|
+
def process_file(file_path, translator, source_locale, target_locale)
|
176
|
+
# Check if this is a source locale file we should translate
|
177
|
+
source_pattern = /#{source_locale}(\/|\.yml)/
|
178
|
+
return unless file_path =~ source_pattern
|
179
|
+
|
180
|
+
# Generate the target filename
|
181
|
+
target_file = file_path.gsub(source_pattern, "#{target_locale}\\1")
|
182
|
+
|
183
|
+
# Only skip if target exists AND is newer (unless --force is used)
|
184
|
+
if File.exist?(target_file) && !options[:force]
|
185
|
+
source_time = get_last_modified_time(file_path)
|
186
|
+
target_time = get_last_modified_time(target_file)
|
187
|
+
|
188
|
+
if target_time && source_time && target_time > source_time
|
189
|
+
puts "⏭️ Skipping #{file_path} - translation is up to date"
|
190
|
+
return
|
191
|
+
end
|
192
|
+
end
|
193
|
+
|
194
|
+
puts "π Processing #{file_path}..."
|
195
|
+
|
196
|
+
begin
|
197
|
+
attempts = 0
|
198
|
+
max_attempts = 3
|
199
|
+
|
200
|
+
loop do
|
201
|
+
attempts += 1
|
202
|
+
begin
|
203
|
+
translated_content = translator.translate_hash(file_path, source_locale, target_locale)
|
204
|
+
rescue => e
|
205
|
+
puts "❌ Translation failed: #{e.message}"
|
206
|
+
break
|
207
|
+
end
|
208
|
+
|
209
|
+
# Don't proceed if translation failed
|
210
|
+
if !translated_content || translated_content.strip.empty?
|
211
|
+
puts "❌ Translation failed - no content generated"
|
212
|
+
break
|
213
|
+
end
|
214
|
+
|
215
|
+
# Create directory and write file only if we have valid content
|
216
|
+
FileUtils.mkdir_p(File.dirname(target_file))
|
217
|
+
|
218
|
+
# Backup if requested
|
219
|
+
if options[:backup] && File.exist?(target_file)
|
220
|
+
backup_path = "#{target_file}.bak"
|
221
|
+
FileUtils.cp(target_file, backup_path)
|
222
|
+
end
|
223
|
+
|
224
|
+
# Write the translated content
|
225
|
+
File.write(target_file, translated_content)
|
226
|
+
puts "✨ Created #{target_file}"
|
227
|
+
|
228
|
+
# Automatically fix any YAML issues
|
229
|
+
puts "🔧 Checking for YAML issues..."
|
230
|
+
begin
|
231
|
+
fixed_content = translator.fix_yaml(target_file)
|
232
|
+
if fixed_content != translated_content
|
233
|
+
if options[:backup] && !File.exist?("#{target_file}.bak")
|
234
|
+
FileUtils.cp(target_file, "#{target_file}.bak")
|
235
|
+
end
|
236
|
+
|
237
|
+
File.write(target_file, fixed_content)
|
238
|
+
puts "✨ Fixed YAML formatting issues"
|
239
|
+
end
|
240
|
+
break # Success! Exit the loop
|
241
|
+
rescue => e
|
242
|
+
if e.message.include?("needs retranslation") && attempts < max_attempts
|
243
|
+
puts "⚠️ Translation attempt #{attempts} produced invalid YAML, retrying..."
|
244
|
+
# Clean up the file before retrying
|
245
|
+
File.unlink(target_file) if File.exist?(target_file)
|
246
|
+
next
|
247
|
+
else
|
248
|
+
# Clean up and re-raise
|
249
|
+
File.unlink(target_file) if File.exist?(target_file)
|
250
|
+
raise e
|
251
|
+
end
|
252
|
+
end
|
253
|
+
end
|
254
|
+
rescue => e
|
255
|
+
puts "❌ Error processing #{file_path}: #{e.message}"
|
256
|
+
# Ensure file is cleaned up if it was created
|
257
|
+
File.unlink(target_file) if File.exist?(target_file)
|
258
|
+
end
|
259
|
+
end
|
260
|
+
|
261
|
+
def fix_file(file_path, fixer)
|
262
|
+
puts "🔧 Fixing #{file_path}..."
|
263
|
+
|
264
|
+
begin
|
265
|
+
# Backup if requested
|
266
|
+
if options[:backup]
|
267
|
+
backup_path = "#{file_path}.bak"
|
268
|
+
FileUtils.cp(file_path, backup_path)
|
269
|
+
puts "π Created backup at #{backup_path}"
|
270
|
+
end
|
271
|
+
|
272
|
+
fixed_content = fixer.fix_yaml(file_path)
|
273
|
+
File.write(file_path, fixed_content)
|
274
|
+
puts "✨ Fixed #{file_path}"
|
275
|
+
rescue => e
|
276
|
+
puts "❌ Error fixing #{file_path}: #{e.message}"
|
277
|
+
end
|
278
|
+
end
|
279
|
+
|
280
|
+
def get_last_modified_time(file_path)
|
281
|
+
times = []
|
282
|
+
|
283
|
+
# Get git timestamp if available
|
284
|
+
if git_time = get_git_modified_time(file_path)
|
285
|
+
times << git_time
|
286
|
+
end
|
287
|
+
|
288
|
+
# Get filesystem timestamp
|
289
|
+
if File.exist?(file_path)
|
290
|
+
times << File.mtime(file_path)
|
291
|
+
end
|
292
|
+
|
293
|
+
# Return the newest timestamp (or nil if no timestamps found)
|
294
|
+
times.max
|
295
|
+
end
|
296
|
+
|
297
|
+
def get_git_modified_time(file_path)
|
298
|
+
return nil unless system("git rev-parse --is-inside-work-tree > /dev/null 2>&1")
|
299
|
+
|
300
|
+
time_str = `git log -1 --format=%cd --date=iso -- #{file_path} 2>/dev/null`.strip
|
301
|
+
return nil if time_str.empty?
|
302
|
+
|
303
|
+
Time.parse(time_str)
|
304
|
+
rescue
|
305
|
+
nil
|
306
|
+
end
|
307
|
+
|
308
|
+
desc "status", "Show translation status for all locales"
|
309
|
+
def status
|
310
|
+
puts "π Translation Status:"
|
311
|
+
# Status reporting logic will go here
|
312
|
+
end
|
313
|
+
|
314
|
+
desc "version", "Show Honyaku version"
|
315
|
+
def version
|
316
|
+
puts "Honyaku v#{Honyaku::VERSION}"
|
317
|
+
end
|
318
|
+
|
319
|
+
def self.exit_on_failure?
|
320
|
+
true
|
321
|
+
end
|
322
|
+
end
|
323
|
+
end
|
data/lib/honyaku/translator.rb
ADDED
@@ -0,0 +1,241 @@
|
|
1
|
+
require "openai"
|
2
|
+
require "yaml"
|
3
|
+
|
4
|
+
module Honyaku
|
5
|
+
class Translator
|
6
|
+
LINES_PER_CHUNK = 250
|
7
|
+
|
8
|
+
def initialize(api_key: nil, model: "gpt-4", translation_rules: [])
|
9
|
+
api_key ||= ENV["HONYAKU_OPENAI_API_KEY"] || ENV["OPENAI_API_KEY"]
|
10
|
+
@client = OpenAI::Client.new(access_token: api_key)
|
11
|
+
@model = model
|
12
|
+
@translation_rules = translation_rules
|
13
|
+
end
|
14
|
+
|
15
|
+
def translate_hash(file_path, from_locale, to_locale)
|
16
|
+
yaml_content = File.read(file_path)
|
17
|
+
lines = yaml_content.lines
|
18
|
+
|
19
|
+
# If the file is small enough, translate it all at once
|
20
|
+
if lines.size <= LINES_PER_CHUNK
|
21
|
+
result = translate_chunk(yaml_content, from_locale, to_locale)
|
22
|
+
raise "Translation failed" unless result
|
23
|
+
return result
|
24
|
+
end
|
25
|
+
|
26
|
+
# Otherwise, split into chunks and translate each
|
27
|
+
chunks = split_into_chunks(lines)
|
28
|
+
puts "📦 Splitting file into #{chunks.size} chunks..."
|
29
|
+
|
30
|
+
translated_chunks = []
|
31
|
+
|
32
|
+
chunks.each_with_index do |chunk, i|
|
33
|
+
puts "π Translating chunk #{i + 1} of #{chunks.size}..."
|
34
|
+
result = translate_chunk(chunk, from_locale, to_locale)
|
35
|
+
|
36
|
+
# If any chunk fails, abort the whole translation
|
37
|
+
raise "Translation failed for chunk #{i + 1}" unless result
|
38
|
+
translated_chunks << result
|
39
|
+
end
|
40
|
+
|
41
|
+
translated_chunks.join("\n")
|
42
|
+
end
|
43
|
+
|
44
|
+
def fix_yaml(file_path)
|
45
|
+
content = File.read(file_path)
|
46
|
+
fixed_any = false
|
47
|
+
|
48
|
+
loop do
|
49
|
+
begin
|
50
|
+
YAML.load(content)
|
51
|
+
puts "✅ No more YAML errors found"
|
52
|
+
return content
|
53
|
+
rescue Psych::SyntaxError => e
|
54
|
+
# If OpenAI returned invalid YAML structure, signal that we need to retranslate
|
55
|
+
if e.message.include?("did not find expected key while parsing a block mapping")
|
56
|
+
raise "Translation resulted in invalid YAML structure - needs retranslation"
|
57
|
+
end
|
58
|
+
|
59
|
+
lines = content.lines
|
60
|
+
line_number = e.line - 1 # YAML errors are 1-based
|
61
|
+
problematic_line = lines[line_number]
|
62
|
+
|
63
|
+
puts "🔧 Found YAML error on line #{e.line}: #{e.message}"
|
64
|
+
puts " #{problematic_line.strip}"
|
65
|
+
|
66
|
+
# Only try to fix common syntax issues
|
67
|
+
if e.message.include?("cannot start any token")
|
68
|
+
fixed = false
|
69
|
+
|
70
|
+
# Fix case 1: Values starting with %{var} need quotes
|
71
|
+
if problematic_line.include?("%{") && problematic_line =~ /^(\s*[^:]+:\s*)(?:(&\w+)\s+)?(%\{.+)$/
|
72
|
+
prefix, reference, value = $1, $2, $3
|
73
|
+
fixed_line = if reference
|
74
|
+
"#{prefix}#{reference} \"#{value}\""
|
75
|
+
else
|
76
|
+
"#{prefix}\"#{value}\""
|
77
|
+
end
|
78
|
+
fixed = true
|
79
|
+
# Fix case 2: Fix incorrect spacing in %{ var }
|
80
|
+
elsif problematic_line.include?("% {")
|
81
|
+
fixed_line = problematic_line.gsub("% {", "%{")
|
82
|
+
fixed = true
|
83
|
+
end
|
84
|
+
|
85
|
+
if fixed
|
86
|
+
# Update the line
|
87
|
+
lines[line_number] = "#{fixed_line}\n"
|
88
|
+
content = lines.join
|
89
|
+
fixed_any = true
|
90
|
+
next # Continue to the next iteration to find more errors
|
91
|
+
end
|
92
|
+
end
|
93
|
+
|
94
|
+
# If we get here, we couldn't fix this error
|
95
|
+
if fixed_any
|
96
|
+
puts "❌ Unable to fix remaining YAML errors"
|
97
|
+
else
|
98
|
+
puts "❌ Unable to fix any YAML errors"
|
99
|
+
end
|
100
|
+
return content
|
101
|
+
end
|
102
|
+
end
|
103
|
+
end
|
104
|
+
|
105
|
+
private
|
106
|
+
|
107
|
+
def split_into_chunks(lines)
|
108
|
+
chunks = []
|
109
|
+
current_chunk = []
|
110
|
+
current_indent = 0
|
111
|
+
line_count = 0
|
112
|
+
|
113
|
+
lines.each do |line|
|
114
|
+
# Calculate the indentation level of the current line
|
115
|
+
indent = line[/\A */].length
|
116
|
+
|
117
|
+
# Start a new chunk once the line limit is reached and this line is not indented deeper than the previous line
|
118
|
+
if line_count >= LINES_PER_CHUNK && indent <= current_indent
|
119
|
+
chunks << current_chunk.join
|
120
|
+
current_chunk = []
|
121
|
+
line_count = 0
|
122
|
+
end
|
123
|
+
|
124
|
+
current_chunk << line
|
125
|
+
line_count += 1
|
126
|
+
current_indent = indent
|
127
|
+
end
|
128
|
+
|
129
|
+
# Add the last chunk if there's anything left
|
130
|
+
chunks << current_chunk.join if current_chunk.any?
|
131
|
+
|
132
|
+
chunks
|
133
|
+
end
|
134
|
+
|
135
|
+
def translate_chunk(content, from_locale, to_locale)
|
136
|
+
max_retries = 3
|
137
|
+
attempts = 0
|
138
|
+
|
139
|
+
begin
|
140
|
+
attempts += 1
|
141
|
+
response = @client.chat(
|
142
|
+
parameters: {
|
143
|
+
model: @model,
|
144
|
+
messages: [
|
145
|
+
{
|
146
|
+
role: "system",
|
147
|
+
content: build_system_prompt
|
148
|
+
},
|
149
|
+
{
|
150
|
+
role: "user",
|
151
|
+
content: "Translate this YAML content from #{from_locale} to #{to_locale}. Keep all structure and special characters exactly the same:\n\n#{content}"
|
152
|
+
}
|
153
|
+
],
|
154
|
+
temperature: 0.7
|
155
|
+
}
|
156
|
+
)
|
157
|
+
|
158
|
+
# Clean up any markdown code block markers
|
159
|
+
response_text = response.dig("choices", 0, "message", "content")
|
160
|
+
response_text.gsub(/^```ya?ml\s*\n/, '').gsub(/\n```\s*$/, '')
|
161
|
+
rescue => e
|
162
|
+
# Don't retry if it's a billing/credits issue
|
163
|
+
if e.message.include?("insufficient_quota") || e.message.include?("billing")
|
164
|
+
puts "❌ OpenAI API error: #{e.message}"
|
165
|
+
raise e
|
166
|
+
end
|
167
|
+
|
168
|
+
if attempts < max_retries
|
169
|
+
puts "⚠️ OpenAI API error, retrying (attempt #{attempts}/#{max_retries}): #{e.message}"
|
170
|
+
sleep(attempts) # Linear backoff: wait 1s after the first failure, 2s after the second
|
171
|
+
retry
|
172
|
+
else
|
173
|
+
puts "❌ OpenAI API error after #{max_retries} attempts: #{e.message}"
|
174
|
+
raise e
|
175
|
+
end
|
176
|
+
end
|
177
|
+
end
|
178
|
+
|
179
|
+
def fix_line(line, error)
|
180
|
+
response = @client.chat(
|
181
|
+
parameters: {
|
182
|
+
model: @model,
|
183
|
+
messages: [
|
184
|
+
{
|
185
|
+
role: "system",
|
186
|
+
content: "You are a YAML expert. Fix the provided line to be valid YAML. Common issues include:
|
187
|
+
- Values starting with % need to be quoted
|
188
|
+
- Proper escaping of special characters
|
189
|
+
Return only the fixed line, no explanation needed."
|
190
|
+
},
|
191
|
+
{
|
192
|
+
role: "user",
|
193
|
+
content: "Fix this YAML line that generated this error: #{error}\n\nLine: #{line}"
|
194
|
+
}
|
195
|
+
],
|
196
|
+
temperature: 0.3
|
197
|
+
}
|
198
|
+
)
|
199
|
+
|
200
|
+
response.dig("choices", 0, "message", "content").strip
|
201
|
+
rescue => e
|
202
|
+
puts "⚠️ Error getting fix suggestion: #{e.message}"
|
203
|
+
line
|
204
|
+
end
|
205
|
+
|
206
|
+
def build_system_prompt
|
207
|
+
base_prompt = <<~PROMPT
|
208
|
+
You are a professional translator. You will be translating YAML files.
|
209
|
+
|
210
|
+
CRITICAL REQUIREMENTS:
|
211
|
+
1. Only translate text values after the colon (:)
|
212
|
+
2. Never modify, translate, or remove:
|
213
|
+
- YAML keys (text before the colon)
|
214
|
+
- Interpolation variables (like %{name})
|
215
|
+
- YAML references and anchors
|
216
|
+
- Comments
|
217
|
+
- Empty lines
|
218
|
+
3. Keep all special characters exactly as they appear
|
219
|
+
4. Maintain the exact same line count
|
220
|
+
5. Never add or remove lines
|
221
|
+
6. Never change the structure of the file
|
222
|
+
PROMPT
|
223
|
+
|
224
|
+
if @translation_rules.any?
|
225
|
+
general_rules, locale_rules = @translation_rules.partition { |r| !r[:locale_specific] }
|
226
|
+
|
227
|
+
if general_rules.any?
|
228
|
+
base_prompt += "\n\nGeneral translation rules:\n" +
|
229
|
+
general_rules.map { |rule| rule[:content] }.join("\n\n")
|
230
|
+
end
|
231
|
+
|
232
|
+
if locale_rules.any?
|
233
|
+
base_prompt += "\n\nTarget language specific rules:\n" +
|
234
|
+
locale_rules.map { |rule| rule[:content] }.join("\n\n")
|
235
|
+
end
|
236
|
+
end
|
237
|
+
|
238
|
+
base_prompt
|
239
|
+
end
|
240
|
+
end
|
241
|
+
end
|
data/lib/honyaku.rb
ADDED
metadata
ADDED
@@ -0,0 +1,95 @@
|
|
1
|
+
--- !ruby/object:Gem::Specification
|
2
|
+
name: honyaku
|
3
|
+
version: !ruby/object:Gem::Version
|
4
|
+
version: 0.1.0
|
5
|
+
platform: ruby
|
6
|
+
authors:
|
7
|
+
- Andrew Culver
|
8
|
+
autorequire:
|
9
|
+
bindir: exe
|
10
|
+
cert_chain: []
|
11
|
+
date: 2025-02-20 00:00:00.000000000 Z
|
12
|
+
dependencies:
|
13
|
+
- !ruby/object:Gem::Dependency
|
14
|
+
name: thor
|
15
|
+
requirement: !ruby/object:Gem::Requirement
|
16
|
+
requirements:
|
17
|
+
- - "~>"
|
18
|
+
- !ruby/object:Gem::Version
|
19
|
+
version: '1.3'
|
20
|
+
type: :runtime
|
21
|
+
prerelease: false
|
22
|
+
version_requirements: !ruby/object:Gem::Requirement
|
23
|
+
requirements:
|
24
|
+
- - "~>"
|
25
|
+
- !ruby/object:Gem::Version
|
26
|
+
version: '1.3'
|
27
|
+
- !ruby/object:Gem::Dependency
|
28
|
+
name: ruby-openai
|
29
|
+
requirement: !ruby/object:Gem::Requirement
|
30
|
+
requirements:
|
31
|
+
- - "~>"
|
32
|
+
- !ruby/object:Gem::Version
|
33
|
+
version: '6.3'
|
34
|
+
type: :runtime
|
35
|
+
prerelease: false
|
36
|
+
version_requirements: !ruby/object:Gem::Requirement
|
37
|
+
requirements:
|
38
|
+
- - "~>"
|
39
|
+
- !ruby/object:Gem::Version
|
40
|
+
version: '6.3'
|
41
|
+
- !ruby/object:Gem::Dependency
|
42
|
+
name: yaml
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
44
|
+
requirements:
|
45
|
+
- - "~>"
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: 0.3.0
|
48
|
+
type: :runtime
|
49
|
+
prerelease: false
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
51
|
+
requirements:
|
52
|
+
- - "~>"
|
53
|
+
- !ruby/object:Gem::Version
|
54
|
+
version: 0.3.0
|
55
|
+
description:
|
56
|
+
email:
|
57
|
+
- andrew.culver@gmail.com
|
58
|
+
executables:
|
59
|
+
- honyaku
|
60
|
+
extensions: []
|
61
|
+
extra_rdoc_files: []
|
62
|
+
files:
|
63
|
+
- LICENSE.txt
|
64
|
+
- README.md
|
65
|
+
- exe/honyaku
|
66
|
+
- lib/honyaku.rb
|
67
|
+
- lib/honyaku/cli.rb
|
68
|
+
- lib/honyaku/translator.rb
|
69
|
+
- lib/honyaku/version.rb
|
70
|
+
homepage: https://github.com/andrewculver/honyaku
|
71
|
+
licenses:
|
72
|
+
- MIT
|
73
|
+
metadata:
|
74
|
+
homepage_uri: https://github.com/andrewculver/honyaku
|
75
|
+
source_code_uri: https://github.com/andrewculver/honyaku
|
76
|
+
post_install_message:
|
77
|
+
rdoc_options: []
|
78
|
+
require_paths:
|
79
|
+
- lib
|
80
|
+
required_ruby_version: !ruby/object:Gem::Requirement
|
81
|
+
requirements:
|
82
|
+
- - ">="
|
83
|
+
- !ruby/object:Gem::Version
|
84
|
+
version: 3.0.0
|
85
|
+
required_rubygems_version: !ruby/object:Gem::Requirement
|
86
|
+
requirements:
|
87
|
+
- - ">="
|
88
|
+
- !ruby/object:Gem::Version
|
89
|
+
version: '0'
|
90
|
+
requirements: []
|
91
|
+
rubygems_version: 3.2.33
|
92
|
+
signing_key:
|
93
|
+
specification_version: 4
|
94
|
+
summary: Translate your Rails application using OpenAI
|
95
|
+
test_files: []
|