llm_translate 0.1.0
- checksums.yaml +7 -0
- data/.rspec_status +14 -0
- data/README.md +301 -0
- data/README.zh.md +209 -0
- data/Rakefile +12 -0
- data/content/changelog-1.md +12 -0
- data/content/changelog-2.md +12 -0
- data/content/llm_translate.yml +189 -0
- data/content/prompt.md +8 -0
- data/content/todo.md +115 -0
- data/exe/llm_translate +6 -0
- data/lib/llm_translate/ai_client.rb +95 -0
- data/lib/llm_translate/cli.rb +205 -0
- data/lib/llm_translate/config.rb +279 -0
- data/lib/llm_translate/file_finder.rb +153 -0
- data/lib/llm_translate/logger.rb +170 -0
- data/lib/llm_translate/translator_engine.rb +233 -0
- data/lib/llm_translate/version.rb +5 -0
- data/lib/llm_translate.rb +16 -0
- data/llm_translate.gemspec +41 -0
- data/llm_translate.yml +189 -0
- data/test_config.yml +52 -0
- data/test_docs/sample.md +22 -0
- data/test_docs_translated/sample.zh.md +22 -0
- data/test_llm_translate.yml +180 -0
- data/test_new_config.yml +189 -0
- metadata +143 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 584007a41c9b59041a1ecd9e42c96231c66fdc606edb19567c58fd342d838a15
+  data.tar.gz: 1b9afce81578be82bcfd4f81a7fa6050073d9eef8848a7ee43ef9110b38ead94
+SHA512:
+  metadata.gz: a959c5dfcf20bfc2bc94f787cf8c3e49ee75417a5be461a0b2650d023ce964670dd566e118e764b3152c3afdc2a22d2ac33605fdfca7909e55848e0761b0515c
+  data.tar.gz: d56df324476f82d6deb7e4b28bba026349661e4b5c389e1068e09dae36ecae7c38a8d38cee8cbfa861c686db3f063f3715cc0c18430e462d86f4ebf69079ce72
data/.rspec_status
ADDED
@@ -0,0 +1,14 @@
+example_id                              | status | run_time        |
+--------------------------------------- | ------ | --------------- |
+./spec/translator/config_spec.rb[1:1:1] | passed | 0.00338 seconds |
+./spec/translator/config_spec.rb[1:1:2] | passed | 0.00113 seconds |
+./spec/translator/config_spec.rb[1:2:1] | passed | 0.00078 seconds |
+./spec/translator/config_spec.rb[1:3:1] | passed | 0.00044 seconds |
+./spec/translator/config_spec.rb[1:3:2] | passed | 0.00077 seconds |
+./spec/translator/config_spec.rb[1:4:1] | passed | 0.00038 seconds |
+./spec/translator/config_spec.rb[1:5:1] | passed | 0.0007 seconds  |
+./spec/translator/config_spec.rb[1:5:2] | passed | 0.00035 seconds |
+./spec/translator/config_spec.rb[1:5:3] | passed | 0.00043 seconds |
+./spec/translator_spec.rb[1:1]          | passed | 0.00057 seconds |
+./spec/translator_spec.rb[1:2]          | passed | 0.00004 seconds |
+./spec/translator_spec.rb[1:3]          | passed | 0.00073 seconds |
data/README.md
ADDED
@@ -0,0 +1,301 @@
+# LlmTranslate
+
+AI-powered Markdown translator that preserves formatting while translating content using various AI providers.
+
+## Features
+
+- 🤖 **AI-Powered Translation**: Support for OpenAI, Anthropic, and Ollama
+- 📝 **Markdown Format Preservation**: Keeps code blocks, links, images, and formatting intact
+- 🔧 **Flexible Configuration**: YAML-based configuration with environment variable support
+- 📁 **Batch Processing**: Recursively processes entire directory structures
+- 🚀 **CLI Interface**: Easy-to-use command-line interface with Thor
+- 📊 **Progress Tracking**: Built-in logging and reporting
+- ⚡ **Error Handling**: Robust error handling with retry mechanisms
+- 🎯 **Customizable**: Custom prompts, file patterns, and output strategies
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'llm_translate'
+```
+
+And then execute:
+
+```bash
+bundle install
+```
+
+Or install it yourself as:
+
+```bash
+gem install llm_translate
+```
+
+## Dependencies
+
+The gem requires the `rubyllm` gem for AI integration:
+
+```bash
+gem install rubyllm
+```
+
+## Quick Start
+
+1. **Initialize a configuration file**:
+   ```bash
+   llm_translate init
+   ```
+
+2. **Set your API key**:
+   ```bash
+   export LLM_TRANSLATE_API_KEY="your-api-key-here"
+   ```
+
+3. **Translate your markdown files**:
+   ```bash
+   llm_translate translate --config ./llm_translate.yml
+   ```
+
+## Configuration
+
+The translator uses a YAML configuration file. Here's a minimal example:
+
+```yaml
+# llm_translate.yml
+ai:
+  api_key: ${LLM_TRANSLATE_API_KEY}
+  provider: "openai"
+  model: "gpt-4"
+  temperature: 0.3
+
+translation:
+  target_language: "zh-CN"
+  default_prompt: |
+    Please translate the following Markdown content to Chinese, keeping all formatting intact:
+    - Preserve code blocks, links, images, and other Markdown syntax
+    - Keep English technical terms and product names
+    - Ensure natural and fluent translation
+
+    Content:
+    {content}
+
+files:
+  input_directory: "./docs"
+  output_directory: "./docs-translated"
+  filename_suffix: ".zh"
+
+logging:
+  level: "info"
+  output: "console"
+```
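With `filename_suffix: ".zh"`, a file such as `guide.md` presumably becomes `guide.zh.md` under `output_directory`; the `test_docs/sample.md` and `test_docs_translated/sample.zh.md` pair shipped in this package is consistent with that. The sketch below shows one way such a path could be computed while mirroring subdirectories. `output_path_for` is a hypothetical helper written for illustration, not the gem's API.

```ruby
require 'pathname'

# Hypothetical helper: map an input file to its translated counterpart by
# inserting the suffix before the extension and keeping the relative layout.
def output_path_for(input_file, input_dir:, output_dir:, suffix:)
  relative = Pathname.new(input_file)
                     .relative_path_from(Pathname.new(input_dir))
                     .to_s                                  # e.g. "api/guide.md"
  ext = File.extname(relative)                              # e.g. ".md"
  translated = relative.delete_suffix(ext) + suffix + ext   # e.g. "api/guide.zh.md"
  File.join(output_dir, translated)
end

puts output_path_for('./docs/api/guide.md',
                     input_dir: './docs',
                     output_dir: './docs-translated',
                     suffix: '.zh')
```

Placing the suffix before the extension keeps the output recognisable as Markdown to editors and static-site tools.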
+
+### AI Providers
+
+#### OpenAI
+```yaml
+ai:
+  provider: "openai"
+  api_key: ${OPENAI_API_KEY}
+  model: "gpt-4"
+```
+
+#### Anthropic
+```yaml
+ai:
+  provider: "anthropic"
+  api_key: ${ANTHROPIC_API_KEY}
+  model: "claude-3-sonnet-20240229"
+```
+
+#### Ollama (Local)
+```yaml
+ai:
+  provider: "ollama"
+  model: "llama2"
+  # Set OLLAMA_HOST environment variable if not using default
+```
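The `${OPENAI_API_KEY}`-style values in these provider blocks imply that environment variables are substituted when the configuration is loaded. The gem's loader (`lib/llm_translate/config.rb`) is not reproduced in this part of the diff, so the following is only a minimal sketch of one way such expansion could work. The method name `expand_env` and the decision to raise on an unset variable are assumptions.

```ruby
require 'yaml'

# Hypothetical helper: replace every ${NAME} in the raw YAML text with the
# value of the environment variable NAME before the text is parsed.
def expand_env(text, env = ENV)
  text.gsub(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/) do
    name = Regexp.last_match(1)
    # Fail loudly instead of silently sending an empty API key.
    env.fetch(name) { raise KeyError, "environment variable #{name} is not set" }
  end
end

raw = <<~YAML
  ai:
    provider: "openai"
    api_key: ${OPENAI_API_KEY}
    model: "gpt-4"
YAML

# A plain Hash stands in for ENV so the example runs without real secrets.
config = YAML.safe_load(expand_env(raw, { 'OPENAI_API_KEY' => 'sk-example' }))
puts config.dig('ai', 'api_key')
```

Substituting on the raw text is the simplest option, but it assumes the variable's value contains no characters that are special to YAML; expanding string values after parsing would be the more defensive variant.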
+
+## Usage
+
+### Basic Translation
+
+#### Directory Mode (Default)
+```bash
+llm_translate translate --config ./llm_translate.yml
+```
+
+#### Single File Mode
+To translate a single file, configure `input_file` and `output_file` in your configuration:
+
+```yaml
+files:
+  # Single file mode
+  input_file: "./README.md"
+  output_file: "./README.zh.md"
+```
+
+When both `input_file` and `output_file` are specified, the translator will operate in single file mode, ignoring directory-related settings.
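The mode-selection rule just described fits in a few lines. This is a sketch only: `translation_mode` is a hypothetical helper, and the keys mirror the `files` section of the configuration as documented in this README rather than any internal API of the gem.

```ruby
# Hypothetical helper: choose between single-file and directory mode from
# the `files` section of a parsed configuration hash.
def translation_mode(files)
  input  = files['input_file'].to_s
  output = files['output_file'].to_s
  # Single-file mode applies only when BOTH paths are present.
  return :single_file unless input.empty? || output.empty?

  :directory
end

puts translation_mode({ 'input_file' => './README.md', 'output_file' => './README.zh.md' })
puts translation_mode({ 'input_directory' => './docs', 'output_directory' => './docs-translated' })
```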
+
+### Command Line Options
+
+```bash
+llm_translate translate [OPTIONS]
+
+Options:
+  -c, --config PATH    Configuration file path (default: ./llm_translate.yml)
+  -i, --input PATH     Input directory (overrides config)
+  -o, --output PATH    Output directory (overrides config)
+  -p, --prompt TEXT    Custom translation prompt (overrides config)
+  -v, --verbose        Enable verbose output
+  -d, --dry-run        Perform a dry run without actual translation
+
+Other Commands:
+  llm_translate init       Initialize a new configuration file
+  llm_translate version    Show version information
+```
+
+### Configuration File Structure
+
+```yaml
+# AI Configuration
+ai:
+  api_key: ${LLM_TRANSLATE_API_KEY}
+  provider: "openai"  # openai, anthropic, ollama
+  model: "gpt-4"
+  temperature: 0.3
+  max_tokens: 4000
+  retry_attempts: 3
+  retry_delay: 2
+  timeout: 60
+
+# Translation Settings
+translation:
+  target_language: "zh-CN"
+  source_language: "auto"
+  default_prompt: "Your custom prompt with {content} placeholder"
+  preserve_formatting: true
+  translate_code_comments: false
+  preserve_patterns:
+    - "```[\\s\\S]*?```"  # Code blocks
+    - "`[^`]+`"  # Inline code
+    - "\\[.*?\\]\\(.*?\\)"  # Links
+    - "!\\[.*?\\]\\(.*?\\)"  # Images
+
+# File Processing
+files:
+  input_directory: "./docs"
+  output_directory: "./docs-translated"
+  filename_strategy: "suffix"  # suffix, replace, directory
+  filename_suffix: ".zh"
+  include_patterns:
+    - "**/*.md"
+    - "**/*.markdown"
+  exclude_patterns:
+    - "**/node_modules/**"
+    - "**/.*"
+  preserve_directory_structure: true
+  overwrite_policy: "ask"  # ask, overwrite, skip, backup
+  backup_directory: "./backups"
+
+# Logging
+logging:
+  level: "info"  # debug, info, warn, error
+  output: "console"  # console, file, both
+  file_path: "./logs/translator.log"
+  verbose_translation: false
+  error_log_path: "./logs/errors.log"
+
+# Error Handling
+error_handling:
+  on_error: "log_and_continue"  # stop, log_and_continue, skip_file
+  max_consecutive_errors: 5
+  retry_on_failure: 2
+  generate_error_report: true
+  error_report_path: "./logs/error_report.md"
+
+# Performance
+performance:
+  concurrent_files: 3
+  batch_size: 5
+  request_interval: 1  # seconds between requests
+  max_memory_mb: 500
+
+# Output
+output:
+  show_progress: true
+  show_statistics: true
+  generate_report: true
+  report_path: "./reports/translation_report.md"
+  format: "markdown"
+  include_metadata: true
+```
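The `preserve_patterns` entries are regular expressions for text that should pass through translation untouched. How the gem applies them is not shown in this README, so the following is only a sketch of one common approach: swap each match for an opaque token before the text goes to the model, then swap the tokens back afterwards. The helper names `mask` and `unmask` and the `@@KEEP_n@@` token format are assumptions made for illustration.

```ruby
# The four documented patterns, written as Ruby regular expressions.
PRESERVE_PATTERNS = [
  /```[\s\S]*?```/,    # code blocks
  /`[^`]+`/,           # inline code
  /!\[.*?\]\(.*?\)/,   # images (tried before links so the leading "!" is kept)
  /\[.*?\]\(.*?\)/     # links
].freeze

# Replace every protected span with a token such as "@@KEEP_3@@".
# The closing "@@" matters: without it the token for span 1 would be a
# prefix of the token for span 10, and a naive restore could corrupt text.
def mask(text, patterns = PRESERVE_PATTERNS)
  kept = []
  masked = text.gsub(Regexp.union(patterns)) do |match|
    kept << match
    "@@KEEP_#{kept.size - 1}@@"
  end
  [masked, kept]
end

# Put the original spans back in a single pass over the translated text.
def unmask(text, kept)
  text.gsub(/@@KEEP_(\d+)@@/) { kept.fetch(Regexp.last_match(1).to_i) }
end

doc = "Run `bundle install`, then see [the docs](https://example.com).\n"
masked, kept = mask(doc)
puts masked
puts unmask(masked, kept) == doc
```

Restoring with one regular expression that captures the whole index, rather than replacing tokens one by one, avoids partial matches once a document contains ten or more protected spans.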
+
+## Examples
+
+### Translate Documentation
+
+```bash
+# Translate all markdown files in ./docs to Chinese
+llm_translate translate --input ./docs --output ./docs-zh
+
+# Use custom prompt
+llm_translate translate --prompt "翻译以下内容为中文,保持技术术语不变: {content}"
+
+# Dry run to see what would be translated
+llm_translate translate --dry-run --verbose
+```
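Both `default_prompt` and `--prompt` carry a `{content}` placeholder that stands for the Markdown being translated. A minimal sketch of that substitution follows. `build_prompt` is a hypothetical name, and appending the content when the placeholder is missing is an assumption, not documented behaviour.

```ruby
PLACEHOLDER = '{content}'

# Hypothetical helper: insert the document text into the prompt template.
def build_prompt(template, content)
  # Assumed fallback: append the content if the template has no placeholder.
  return "#{template}\n\n#{content}" unless template.include?(PLACEHOLDER)

  # The block form stops gsub from interpreting backslash sequences in the
  # content (for example "\1" inside a code sample) as back-references.
  template.gsub(PLACEHOLDER) { content }
end

puts build_prompt('翻译以下内容为中文,保持技术术语不变: {content}', "# Hello\n")
```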
+
+### Batch Translation
+
+```bash
+# Translate multiple language versions
+for lang in zh-CN ja-JP ko-KR; do
+  llm_translate translate --config "./configs/llm_translate-${lang}.yml"
+done
+```
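The loop above expects one configuration file per target language under `./configs/`. Those files can be written by hand; as an illustration, the sketch below derives them from a base configuration. Everything specific here is an assumption: the helper name `write_language_configs`, the suffix mapping, and the idea that only `target_language` and `filename_suffix` differ between languages (a language-specific `default_prompt` would need the same treatment).

```ruby
require 'yaml'
require 'fileutils'
require 'tmpdir'

# Assumed mapping from target language to output filename suffix.
SUFFIXES = { 'zh-CN' => '.zh', 'ja-JP' => '.ja', 'ko-KR' => '.ko' }.freeze

# Hypothetical helper: write one config file per language and return the paths.
def write_language_configs(base, dir: './configs')
  FileUtils.mkdir_p(dir)
  SUFFIXES.map do |lang, suffix|
    # Deep-copy so that each language gets an independent hash.
    config = Marshal.load(Marshal.dump(base))
    config['translation']['target_language'] = lang
    config['files']['filename_suffix'] = suffix
    path = File.join(dir, "llm_translate-#{lang}.yml")
    File.write(path, config.to_yaml)
    path
  end
end

base = {
  'ai' => { 'provider' => 'openai', 'model' => 'gpt-4' },
  'translation' => { 'target_language' => 'zh-CN' },
  'files' => { 'input_directory' => './docs', 'output_directory' => './docs-translated' }
}

# A temporary directory keeps the example free of side effects.
Dir.mktmpdir do |tmp|
  puts write_language_configs(base, dir: tmp)
end
```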
+
+## Development
+
+After checking out the repo, run:
+
+```bash
+bundle install
+```
+
+To run tests:
+
+```bash
+bundle exec rspec
+```
+
+To run linting:
+
+```bash
+bundle exec rubocop
+```
+
+To install this gem onto your local machine:
+
+```bash
+bundle exec rake install
+```
+
+## Contributing
+
+Bug reports and pull requests are welcome on GitHub at https://github.com/llm_translate/llm_translate.
+
+## License
+
+The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+## Changelog
+
+### v0.1.0
+- Initial release
+- Support for OpenAI, Anthropic, and Ollama providers
+- Markdown format preservation
+- Configurable translation prompts
+- Batch file processing
+- Comprehensive error handling and logging
data/README.zh.md
ADDED
@@ -0,0 +1,209 @@
+# LlmTranslate
+
+An AI-powered Markdown translator that preserves formatting while translating content, using various AI providers.
+
+## Features
+
+- 🤖 **AI-Powered Translation**: Supports OpenAI, Anthropic, and Ollama
+- 📝 **Markdown Format Preservation**: Keeps code blocks, links, images, and formatting intact
+- 🔧 **Flexible Configuration**: YAML-based configuration with environment variable support
+- 📁 **Batch Processing**: Recursively processes entire directory structures
+- 🚀 **CLI Interface**: Easy-to-use command-line interface built with Thor
+- 📊 **Progress Tracking**: Built-in logging and reporting
+- ⚡ **Error Handling**: Robust error handling with retry mechanisms
+- 🎯 **Customizable**: Custom prompts, file patterns, and output strategies
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'llm_translate'
+```
+
+Then execute:
+
+```bash
+bundle install
+```
+
+Or install it yourself:
+
+```bash
+gem install llm_translate
+```
+
+## Dependencies
+
+The gem requires the `rubyllm` gem for AI integration:
+
+```bash
+gem install rubyllm
+```
+
+## Quick Start
+
+1. **Initialize a configuration file**:
+   ```bash
+   llm_translate init
+   ```
+
+2. **Set your API key**:
+   ```bash
+   export LLM_TRANSLATE_API_KEY="your-api-key-here"
+   ```
+
+3. **Translate your markdown files**:
+   ```bash
+   llm_translate translate --config ./translator.yml
+   ```
+
+## Configuration
+
+The translator uses a YAML configuration file. Here is a minimal example:
+
+```yaml
+# translator.yml
+ai:
+  api_key: ${LLM_TRANSLATE_API_KEY}
+  provider: "openai"
+  model: "gpt-4"
+  temperature: 0.3
+
+translation:
+  target_language: "zh-CN"
+  default_prompt: |
+    Please translate the following Markdown content to Chinese, keeping all formatting intact:
+    - Preserve code blocks, links, images, and other Markdown syntax
+    - Keep English technical terms and product names
+    - Ensure natural and fluent translation
+
+    Content:
+    {content}
+
+files:
+  input_directory: "./docs"
+  output_directory: "./docs-translated"
+  filename_suffix: ".zh"
+
+logging:
+  level: "info"
+  output: "console"
+```
+
+### AI Providers
+
+#### OpenAI
+```yaml
+ai:
+  provider: "openai"
+  api_key: ${OPENAI_API_KEY}
+  model: "gpt-4"
+```
+
+#### Anthropic
+```yaml
+ai:
+  provider: "anthropic"
+  api_key: ${ANTHROPIC_API_KEY}
+  model: "claude-3-sonnet-20240229"
+```
+
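+#### Ollama (Local)
+```yaml
+ai:
+  provider: "ollama"
+  model: "llama2"
+  # Set OLLAMA_HOST environment variable if not using default
+```
+
+## Usage
+
+### Basic Translation
+
+#### Directory Mode (Default)
+```bash
+llm_translate translate --config ./llm_translate.yml
+```
+
+#### Single File Mode
+To translate a single file, set `input_file` and `output_file` in your configuration:
+
+```yaml
+files:
+  # Single file mode
+  input_file: "./README.md"
+  output_file: "./README.zh.md"
+```
+
+When both `input_file` and `output_file` are specified, the translator runs in single file mode and ignores directory-related settings.
+
+### Command Line Options
+
+```bash
+llm_translate translate [OPTIONS]
+
+Options:
+  -c, --config PATH    Configuration file path (default: ./llm_translate.yml)
+  -i, --input PATH     Input directory (overrides config)
+  -o, --output PATH    Output directory (overrides config)
+  -p, --prompt TEXT    Custom translation prompt (overrides config)
+  -v, --verbose        Enable verbose output
+  -d, --dry-run        Perform a dry run without actual translation
+
+Other Commands:
+  llm_translate init       Initialize a new configuration file
+  llm_translate version    Show version information
+```
+
+### Configuration File Structure
+
+```yaml
+# AI Configuration
+ai:
+  api_key: ${LLM_TRANSLATE_API_KEY}
+  provider: "openai"  # openai, anthropic, ollama
+  model: "gpt-4"
+  temperature: 0.3
+  max_tokens: 4000
+  retry_attempts: 3
+  retry_delay: 2
+  timeout: 60
+
+# Translation Settings
+translation:
+  target_language: "zh-CN"
+  source_language: "auto"
+  default_prompt: "Your custom prompt with {content} placeholder"
+  preserve_formatting: true
+  translate_code_comments: false
+  preserve_patterns:
+    - "```[\\s\\S]*?```"  # Code blocks
+    - "`[^`]+`"  # Inline code
+    - "\\[.*?\\]\\(.*?\\)"  # Links
+    - "!\\[.*?\\]\\(.*?\\)"  # Images
+
+# File Processing
+files:
+  input_directory: "./docs"
+  output_directory: "./docs-translated"
+  filename_strategy: "suffix"  # suffix, replace, directory
+  filename_suffix: ".zh"
+  include_patterns:
+    - "**/*.md"
+    - "**/*.markdown"
+  exclude_patterns:
+    - "**/node_modules/**"
+    - "**/.*"
+  preserve_directory_structure: true
+  overwrite_policy: "ask"  # ask, overwrite, skip, backup
+  backup_directory: "./backups"
+
+# Logging
+logging:
+  level: "info"  # debug, info, warn, error
+  output: "console"  # console, file, both
+  file_path: "./logs/translator.log"
+  verbose_translation: false
+  error_log_path: "./logs/errors.log"
+
+# Error Handling
+error_handling:
+  on_error: "log_and_continue"  # stop, log_and_continue, skip_file
+  max_consecutive_errors: 5
+  retry_on_failure: 2
+  generate_error_report: true
+  error_report_path: "./logs/error_report.md"
+
+# Performance
+performance:
+  concurrent_files: 3
+  batch_size: 5
+  request_interval: 1  # seconds between requests
+  max_memory_mb: 500
+
+# Output
+output:
+  show_progress: true
+  show_statistics: true
+  generate_report: true
+  report_path: "./reports/translation_report.md"
+  format: "markdown"
+  include_metadata: true
+```
+
+## Examples
+
+### Translate Documentation
+
+```bash
+# Translate all markdown files in ./docs to Chinese
+llm_translate translate --input ./docs --output ./docs-zh
+
+# Use custom prompt
+llm_translate translate --prompt "翻译以下内容为中文,保持技术术语不变: {content}"
+
+# Dry run to see what would be translated
+llm_translate translate --dry-run --verbose
+```
+
+### Batch Translation
+
+```bash
+# Translate multiple language versions
+for lang in zh-CN ja-JP ko-KR; do
+  llm_translate translate --config "./configs/llm_translate-${lang}.yml"
+done
+```
+
+## Development
+
+After cloning the repository, run:
+
+```bash
+bundle install
+```
+
+To run tests:
+
+```bash
+bundle exec rspec
+```
+
+To run linting:
+
+```bash
+bundle exec rubocop
+```
+
+To install this gem onto your local machine:
+
+```bash
+bundle exec rake install
+```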
+
+## Contributing
+
+Bug reports and pull requests are welcome on GitHub at https://github.com/translator/translator.
+
+## License
+
+The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+## Changelog
+
+### v0.1.0
+- Initial release
+- Support for OpenAI, Anthropic, and Ollama providers
+- Markdown format preservation
+- Configurable translation prompts
+- Batch file processing
+- Comprehensive error handling and logging
data/Rakefile
ADDED