zhtw-1.5.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- zhtw-1.5.0/.github/workflows/ci.yml +31 -0
- zhtw-1.5.0/.github/workflows/publish.yml +29 -0
- zhtw-1.5.0/.gitignore +72 -0
- zhtw-1.5.0/CLAUDE.md +208 -0
- zhtw-1.5.0/LICENSE +21 -0
- zhtw-1.5.0/PKG-INFO +255 -0
- zhtw-1.5.0/README.md +224 -0
- zhtw-1.5.0/pyproject.toml +61 -0
- zhtw-1.5.0/src/zhtw/__init__.py +20 -0
- zhtw-1.5.0/src/zhtw/__main__.py +6 -0
- zhtw-1.5.0/src/zhtw/cli.py +482 -0
- zhtw-1.5.0/src/zhtw/converter.py +417 -0
- zhtw-1.5.0/src/zhtw/data/terms/cn/base.json +157 -0
- zhtw-1.5.0/src/zhtw/data/terms/cn/business.json +48 -0
- zhtw-1.5.0/src/zhtw/data/terms/cn/it.json +138 -0
- zhtw-1.5.0/src/zhtw/data/terms/hk/base.json +48 -0
- zhtw-1.5.0/src/zhtw/data/terms/hk/tech.json +25 -0
- zhtw-1.5.0/src/zhtw/dictionary.py +118 -0
- zhtw-1.5.0/src/zhtw/matcher.py +182 -0
- zhtw-1.5.0/tests/__init__.py +0 -0
- zhtw-1.5.0/tests/test_converter.py +152 -0
- zhtw-1.5.0/tests/test_dictionary.py +92 -0
- zhtw-1.5.0/tests/test_matcher.py +97 -0
zhtw-1.5.0/.github/workflows/ci.yml
ADDED
@@ -0,0 +1,31 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ['3.9', '3.11', '3.12']
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: pip install -e ".[dev]"
+
+      - name: Lint with ruff
+        run: ruff check .
+
+      - name: Run tests
+        run: pytest
zhtw-1.5.0/.github/workflows/publish.yml
ADDED
@@ -0,0 +1,29 @@
+name: Publish to PyPI
+
+on:
+  release:
+    types: [published]
+  workflow_dispatch:
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install build tools
+        run: pip install build twine
+
+      - name: Build package
+        run: python -m build
+
+      - name: Publish to PyPI
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
+        run: twine upload dist/*
zhtw-1.5.0/.gitignore
ADDED
@@ -0,0 +1,72 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# Virtual environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+*~
+
+# Testing
+.tox/
+.nox/
+.coverage
+.coverage.*
+htmlcov/
+.pytest_cache/
+.hypothesis/
+
+# Mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Ruff
+.ruff_cache/
+
+# Distribution / packaging
+.Python
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Cache
+.zhtw-cache
+*.cache
+
+# macOS
+.DS_Store
+.AppleDouble
+.LSOverride
zhtw-1.5.0/CLAUDE.md
ADDED
@@ -0,0 +1,208 @@
+# ZHTW - Claude AI Development Guide
+
+**Project name**: ZHTW - Simplified-to-Traditional Chinese (Taiwan usage) converter
+**Author**: rajatim
+**Language**: Python 3.9+
+**Last updated**: 2025-12-26
+
+---
+
+## Project overview
+
+A CLI tool that converts Simplified Chinese in code and documentation into Traditional Chinese as used in Taiwan.
+
+### Core design principles
+
+1. **Fully offline** - sends no data to external servers
+2. **Term table first** - converts only explicitly defined terms, avoiding over-conversion
+3. **High performance** - the Aho-Corasick algorithm handles large numbers of terms
+4. **Extensible** - JSON dictionary format, easy to maintain
+
+### Why not OpenCC?
+
+OpenCC over-converts terms that are already correct in Taiwan (for example 權限→許可權). We use a precise term table to avoid this problem.
+
+---
+
+## Project structure
+
+```
+zhtw/
+├── src/zhtw/
+│   ├── __init__.py      # version number, exports
+│   ├── cli.py           # CLI entry point (click)
+│   ├── converter.py     # core conversion logic
+│   ├── dictionary.py    # dictionary loading/management
+│   ├── matcher.py       # Aho-Corasick matcher
+│   └── data/
+│       └── terms/
+│           ├── cn/      # Simplified → Taiwan Traditional
+│           │   ├── base.json
+│           │   ├── it.json
+│           │   └── business.json
+│           └── hk/      # Hong Kong usage → Taiwan Traditional
+│               ├── base.json
+│               └── tech.json
+├── tests/
+├── pyproject.toml
+├── README.md
+└── CLAUDE.md            # this file
+```
+
+---
+
+## Core modules
+
+### cli.py
+- Uses the `click` framework
+- Subcommands: `check`, `fix`
+- Options: `--source`, `--dict`, `--exclude`, `--json`, `--verbose`, `--dry-run`
+
+### converter.py
+- `convert_file(path, matcher, fix)` - processes a single file
+- `convert_directory(path, matcher, fix)` - processes a directory
+- `process_directory()` - main entry point; loads the dictionaries and processes the target
+
+### matcher.py
+- Builds the automaton with `pyahocorasick`
+- `Matcher` class:
+  - `find_matches(text)` - finds all matches
+  - `find_matches_with_lines(text)` - same, with line and column information
+  - `replace_all(text)` - replaces all matches
+
+### dictionary.py
+- `load_builtin(sources)` - loads the built-in cn/hk dictionaries
+- `load_dictionary(sources, custom_path)` - main loading function
+- Supports the simple and extended formats
+
+---
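Reviewer note: the matcher described above can be pictured with a small pure-Python automaton. This is only a sketch under stated assumptions, not the package's `Matcher`: the guide says zhtw builds its automaton with `pyahocorasick`, while `TinyMatcher` below is a hypothetical stand-in written from scratch, and the earliest-then-longest rule used in `replace_all` is an assumed policy for resolving overlaps.

```python
from collections import deque


class TinyMatcher:
    """Minimal Aho-Corasick automaton (illustrative; not zhtw's Matcher)."""

    def __init__(self, terms):
        self.terms = dict(terms)
        self.goto = [{}]  # goto[state][char] -> next state
        self.fail = [0]   # failure link of each state
        self.out = [[]]   # source terms recognised when a state is reached
        for source in self.terms:
            state = 0
            for ch in source:
                if ch not in self.goto[state]:
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[state][ch] = len(self.goto) - 1
                state = self.goto[state][ch]
            self.out[state].append(source)
        # Breadth-first pass that fills in the failure links.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def find_matches(self, text):
        """Return (start, end, source, target) for every occurrence."""
        matches, state = [], 0
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for source in self.out[state]:
                start = i - len(source) + 1
                matches.append((start, i + 1, source, self.terms[source]))
        return matches

    def replace_all(self, text):
        """Replace non-overlapping matches, earliest first, longest on ties."""
        ordered = sorted(self.find_matches(text), key=lambda m: (m[0], m[0] - m[1]))
        result, pos = [], 0
        for start, end, _, target in ordered:
            if start < pos:
                continue  # overlaps a match that was already applied
            result.append(text[pos:start])
            result.append(target)
            pos = end
        result.append(text[pos:])
        return "".join(result)
```

With the two terms `用户 → 使用者` and `网络 → 網路`, `TinyMatcher(...).replace_all("检查用户的网络")` returns `"检查使用者的網路"`. The text is scanned once regardless of how many terms are loaded, which is the property the guide relies on.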
+
+## CLI usage
+
+```bash
+# Check mode (report only)
+zhtw check ./src
+
+# Fix mode (modifies files automatically)
+zhtw fix ./src
+
+# Select the source
+zhtw check ./src --source cn      # Simplified only
+zhtw check ./src --source hk      # Hong Kong usage only
+zhtw check ./src --source cn,hk   # both (default)
+
+# Custom dictionary
+zhtw fix ./src --dict ./custom.json
+
+# JSON output (CI/CD)
+zhtw check ./src --json
+
+# Dry run
+zhtw fix ./src --dry-run
+
+# Verbose output
+zhtw check ./src --verbose
+```
+
+---
+
+## Development commands
+
+```bash
+# Install development dependencies
+pip install -e ".[dev]"
+
+# Run tests
+pytest
+
+# Run lint
+ruff check .
+
+# Test the CLI locally
+python -m zhtw check ./test-files
+```
+
+---
+
+## Dictionary format
+
+### Simple format (v1.0)
+```json
+{
+  "version": "1.0",
+  "description": "description text",
+  "terms": {
+    "简体": "繁體"
+  }
+}
+```
+
+### Extended format (v1.5+, planned for future support)
+```json
+{
+  "version": "1.5",
+  "terms": {
+    "文档": {
+      "target": "文件",
+      "category": "it",
+      "confidence": 1.0,
+      "context": "general context"
+    }
+  }
+}
+```
+
+---
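Reviewer note: one way to read the two dictionary layouts above is that both reduce to a flat source-to-target mapping before matching. The sketch below assumes exactly that; `flatten_terms` is a hypothetical helper, not the package's `load_dictionary`, and it ignores the extra `category`, `confidence`, and `context` fields.

```python
import json


def flatten_terms(document):
    """Return a plain {source: target} mapping from either dictionary layout."""
    flat = {}
    for source, entry in document.get("terms", {}).items():
        if isinstance(entry, str):
            flat[source] = entry            # simple format: "source": "target"
        elif isinstance(entry, dict) and "target" in entry:
            flat[source] = entry["target"]  # extended format: object with "target"
        else:
            raise ValueError(f"unsupported entry for term {source!r}")
    return flat


simple = json.loads('{"version": "1.0", "terms": {"软件": "軟體"}}')
extended = json.loads(
    '{"version": "1.5", "terms": {"文档": {"target": "文件", "category": "it"}}}'
)

# Later dictionaries override earlier ones on key collisions (an assumed policy).
merged = {**flatten_terms(simple), **flatten_terms(extended)}
```

Rejecting malformed entries with an error, rather than skipping them silently, fits the guide's stated preference for converting too little over converting wrongly.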
+
+## Performance considerations
+
+- **Aho-Corasick**: scans in O(n) time, independent of the number of terms
+- **Pre-filtering**: skips files that contain no Chinese characters (roughly 70%+)
+- **Default exclusions**: node_modules, .git, dist, build, etc.
+- **Supported file extensions**: .py, .ts, .tsx, .js, .jsx, .java, .vue, .go, .rs, .json, .yml, .yaml, .md, .txt, .html, .css
+
+---
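Reviewer note: the pre-filtering and exclusion rules listed above could look roughly like the following. This is a sketch under stated assumptions: `contains_chinese` and `is_candidate` are hypothetical names, and treating "Chinese character" as the CJK Unified Ideographs block plus Extension A is my assumption, not something the guide specifies.

```python
import re
from pathlib import PurePosixPath

# Assumed definition of "Chinese character": CJK Unified Ideographs + Extension A.
CJK_PATTERN = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")

# Values copied from the list in the guide.
EXCLUDED_DIRS = {"node_modules", ".git", "dist", "build"}
EXTENSIONS = {
    ".py", ".ts", ".tsx", ".js", ".jsx", ".java", ".vue", ".go", ".rs",
    ".json", ".yml", ".yaml", ".md", ".txt", ".html", ".css",
}


def contains_chinese(text):
    """Cheap check that lets files without any Chinese skip the matcher."""
    return CJK_PATTERN.search(text) is not None


def is_candidate(path):
    """True if the path has a supported extension and is not under an excluded directory."""
    p = PurePosixPath(path)
    return p.suffix in EXTENSIONS and not EXCLUDED_DIRS.intersection(p.parts)
```

Running the path check before opening a file, and the character check before matching, keeps the per-file cost low for the large share of files that contain no Chinese at all.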
+
+## Adding terms
+
+1. Find the matching dictionary file (cn/ or hk/)
+2. Add the key-value pair
+3. Make sure the conversion is correct in Taiwan usage
+4. Submit a PR
+
+Example:
+```json
+{
+  "terms": {
+    "existing_term": "existing_translation",
+    "new_term": "new_translation"
+  }
+}
+```
+
+---
+
+## Future plans
+
+### v2.0 LLM integration
+- Term discovery: use an LLM to find new terms
+- Context awareness: decide conversions based on context
+- Active learning: collect user corrections
+
+### v3.0 Local model
+- Fine-tune a small Chinese model
+- Fully offline with high accuracy
+
+---
+
+## Notes for AI-assisted development
+
+1. **Do not use OpenCC** - it over-converts terms that are already correct in Taiwan
+2. **Do not call external APIs** - keep everything offline
+3. **Edit dictionaries carefully** - confirm each conversion is correct in Taiwan usage
+4. **Test thoroughly** - include edge cases and false-positive tests
+5. **Do not add uncertain terms** - better to convert too little than to convert wrongly
+
+---
+
+*Produced by rajatim*
zhtw-1.5.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 rajatim
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
zhtw-1.5.0/PKG-INFO
ADDED
@@ -0,0 +1,255 @@
+Metadata-Version: 2.4
+Name: zhtw
+Version: 1.5.0
+Summary: Simplified/HK Traditional to Taiwan Traditional Chinese Converter
+Project-URL: Homepage, https://github.com/rajatim/zhtw
+Project-URL: Repository, https://github.com/rajatim/zhtw
+Project-URL: Issues, https://github.com/rajatim/zhtw/issues
+Author-email: rajatim <rajatim@users.noreply.github.com>
+License-Expression: MIT
+License-File: LICENSE
+Keywords: chinese,converter,i18n,simplified,taiwan,traditional
+Classifier: Development Status :: 4 - Beta
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Text Processing :: Linguistic
+Requires-Python: >=3.9
+Requires-Dist: click>=8.0
+Requires-Dist: pyahocorasick>=2.0
+Provides-Extra: dev
+Requires-Dist: pytest-cov>=4.0; extra == 'dev'
+Requires-Dist: pytest>=7.0; extra == 'dev'
+Requires-Dist: ruff>=0.1; extra == 'dev'
+Description-Content-Type: text/markdown
+
+# ZHTW
+
+> Simplified to Traditional Chinese (Taiwan) Converter
+
+[](https://github.com/rajatim/zhtw/actions/workflows/ci.yml)
+[](https://www.python.org/downloads/)
+[](LICENSE)
+
+**Produced by rajatim**
+
+Converts Simplified Chinese in code and documentation into Traditional Chinese as used in Taiwan.
+
+## Features
+
+- **High performance** - Aho-Corasick algorithm; scans against tens of thousands of terms in seconds
+- **Fully offline** - sends no data to external servers
+- **Extensible dictionaries** - JSON format, easy to maintain and contribute to
+- **Precise conversion** - term table first, avoiding over-conversion
+- **CI/CD friendly** - JSON output, easy to integrate
+- **Ignore comments** - supports `zhtw:disable` to skip specific code
+
+## Installation
+
+```bash
+pip install zhtw
+```
+
+Or install from source:
+
+```bash
+git clone https://github.com/rajatim/zhtw.git
+cd zhtw
+pip install -e .
+```
+
+## Quick start
+
+```bash
+# Check mode (report only, no changes)
+zhtw check ./src
+
+# Fix mode (modifies files automatically)
+zhtw fix ./src
+
+# JSON output (CI/CD integration)
+zhtw check ./src --json
+
+# Use a custom dictionary
+zhtw fix ./src --dict ./my-terms.json
+
+# Exclude directories
+zhtw check ./src --exclude node_modules,dist
+
+# Simplified only
+zhtw check ./src --source cn
+
+# Hong Kong usage only
+zhtw check ./src --source hk
+
+# Dry run (no files are actually modified)
+zhtw fix ./src --dry-run
+
+# Show dictionary statistics
+zhtw stats
+
+# Validate dictionary quality
+zhtw validate
+```
+
+## Sample output
+
+```
+📁 掃描 ./src
+
+📄 src/components/Header.tsx
+L12:5: "用户" → "使用者"
+...顯示用户資訊...
+
+📄 src/utils/api.ts
+L8:10: "网络" → "網路"
+...檢查网络連線...
+
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+⚠️ 發現 2 處問題(2 個檔案)
+掃描: 150 個檔案 (跳過 45 個無中文檔案)
+```
+
+## Dictionary format
+
+```json
+{
+  "version": "1.0",
+  "description": "Custom dictionary description",
+  "terms": {
+    "文档": "文件",
+    "代码": "程式碼",
+    "软件": "軟體"
+  }
+}
+```
+
+## Built-in dictionaries
+
+| Source | Category | Terms | Description |
+|--------|----------|-------|-------------|
+| CN | base | 151 | Basic Simplified Chinese vocabulary |
+| CN | it | 132 | IT/programming terms |
+| CN | business | 42 | Business terms |
+| HK | base | 42 | Basic Hong Kong vocabulary |
+| HK | tech | 19 | Hong Kong technology terms |
+
+**Total: 386 terms**
+
+## Ignore comments
+
+Use comments in your code to skip checking for specific content:
+
+```python
+# Ignore the current line
+user_info = "用户信息"  # zhtw:disable-line
+
+# Ignore the next line
+# zhtw:disable-next
+simplified = "软件"
+
+# Ignore a block
+# zhtw:disable
+test_data = [
+    "软件",
+    "硬件",
+    "网络",
+]
+# zhtw:enable
+```
+
+Several comment styles are supported: `#`, `//`, `/* */`, `<!-- -->`
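Reviewer note: the three directives above can be modelled as a single pass that records which line numbers are switched off. The sketch below is an assumption about the semantics, not the package's implementation: `ignored_lines` is a hypothetical helper, it matches the directive text anywhere on a line instead of parsing comment syntax, and it treats the `zhtw:disable` line itself as ignored.

```python
def ignored_lines(text):
    """Return the 1-based line numbers that the directives switch off."""
    ignored = set()
    in_block = False
    skip_this = False
    for number, line in enumerate(text.splitlines(), start=1):
        if skip_this:  # the previous line carried zhtw:disable-next
            ignored.add(number)
            skip_this = False
        # The longer directive names are tested first because
        # "zhtw:disable" is a prefix of both of them.
        if "zhtw:disable-next" in line:
            skip_this = True
        elif "zhtw:disable-line" in line:
            ignored.add(number)
        elif "zhtw:disable" in line:
            in_block = True
        elif "zhtw:enable" in line:
            in_block = False
        if in_block:
            ignored.add(number)
    return ignored


sample = "\n".join([
    'user_info = "用户信息"  # zhtw:disable-line',
    "# zhtw:disable-next",
    'simplified = "软件"',
    'checked = "硬件"',
    "# zhtw:disable",
    'test_data = ["网络"]',
    "# zhtw:enable",
    'also_checked = "文档"',
])
```

For `sample`, the function returns `{1, 3, 5, 6}`: the `disable-line` line, the line after `disable-next`, and the block from `zhtw:disable` up to (not including) `zhtw:enable`. A matcher could then drop any match whose line number is in that set.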
+
+## CI/CD integration
+
+### GitHub Actions
+
+```yaml
+- name: Check Traditional Chinese
+  run: |
+    pip install zhtw
+    zhtw check ./src --json > result.json
+    if [ $? -ne 0 ]; then
+      echo "Simplified Chinese terms found; please fix them"
+      exit 1
+    fi
+```
+
+### Jenkins
+
+```groovy
+stage('Traditional Chinese check') {
+    steps {
+        sh 'pip install zhtw'
+        sh 'zhtw check . --json > terminology-report.json'
+    }
+}
+```
+
+### JSON output format
+
+```json
+{
+  "total_issues": 3,
+  "files_with_issues": 2,
+  "files_checked": 150,
+  "files_modified": 0,
+  "files_skipped": 45,
+  "status": "fail",
+  "issues": [
+    {
+      "file": "src/components/Header.tsx",
+      "line": 12,
+      "column": 5,
+      "source": "用户",
+      "target": "使用者"
+    }
+  ]
+}
+```
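Reviewer note: a CI job that wants more than the exit code could read the report itself. The sketch below relies only on fields that appear in the JSON layout above (`total_issues`, `issues`, and the per-issue keys); `summarize` is a hypothetical helper, and deciding pass or fail from `total_issues` is my assumption, since the only `status` value shown is `"fail"`.

```python
import json


def summarize(report_text):
    """Return (exit_code, lines) for a report in the JSON layout shown above."""
    report = json.loads(report_text)
    lines = [
        f'{issue["file"]}:{issue["line"]}:{issue["column"]}: '
        f'"{issue["source"]}" -> "{issue["target"]}"'
        for issue in report.get("issues", [])
    ]
    # Fail the job whenever at least one issue was counted.
    exit_code = 1 if report.get("total_issues", 0) > 0 else 0
    return exit_code, lines


example = json.dumps({
    "total_issues": 1,
    "issues": [{
        "file": "src/components/Header.tsx",
        "line": 12,
        "column": 5,
        "source": "用户",
        "target": "使用者",
    }],
})
exit_code, lines = summarize(example)
```

In a pipeline, the returned lines could be printed as annotations and the exit code passed to `sys.exit`.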
+
+## Why not OpenCC?
+
+OpenCC over-converts some terms that are already correct in Taiwan:
+
+| Original | OpenCC result | Correct (Taiwan) |
+|----------|---------------|------------------|
+| 權限 | 許可權 | 權限 |
+| 設備 | 裝置 | 設備 |
+| 視頻 | 視訊 | 影片 |
+
+ZHTW uses a precise term table and converts only explicitly defined terms, which avoids this kind of problem.
+
+## Development
+
+```bash
+# Install development dependencies
+pip install -e ".[dev]"
+
+# Run tests
+pytest
+
+# Run lint
+ruff check .
+```
+
+## Roadmap
+
+- [x] v1.0 - Basic CLI + dictionaries
+- [x] v1.5 - Statistics report + dictionary validation + ignore comments
+- [ ] v2.0 - LLM-assisted term discovery
+- [ ] v3.0 - Context awareness with a local model
+
+## License
+
+MIT License
+
+---
+
+**Produced by rajatim**