@yeongjaeyou/claude-code-config 0.21.2 → 0.22.0
This diff shows the changes between publicly released versions of the package as they appear in its public registry, and is provided for informational purposes only.
@@ -0,0 +1,194 @@
---
name: translate-web-article
description: Convert web pages to Korean markdown documents. Fetches page via firecrawl, translates text to Korean, analyzes images with VLM for Korean captions, preserves code/tables with explanations. Use for tech blogs, papers, documentation. Triggers on "translate web page", "blog to Korean", "translate this article".
---

# Web Article Translator

Converts web pages to Korean markdown while analyzing images with VLM to generate context-aware Korean captions.

## Workflow

```
URL Input
 |
 +-- Fetch page via firecrawl (markdown + links)
 |
 +-- Ask user options via AskUserQuestion
 |    +-- Output directory
 |    +-- Download images locally or not
 |
 +-- Process content
 |    +-- Text: Translate to Korean (keep tech terms)
 |    +-- Images: Download -> VLM analysis -> Korean caption
 |    +-- Code/Tables: Keep original + add explanation
 |
 +-- Generate markdown file
```

## Step 1: Fetch Web Page

Use firecrawl MCP:

```
mcp__firecrawl__firecrawl_scrape
- url: target URL
- formats: ["markdown", "links"]
- onlyMainContent: true
```

Return error for inaccessible pages:
- Login required
- Paywall content
- Blocked sites

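For illustration only, the arguments to that scrape call can be written out as a plain Python dict; the parameter names come from the list above, and the URL value is a placeholder:

```python
# Illustrative payload for mcp__firecrawl__firecrawl_scrape.
# Parameter names are taken from the list above; the URL is a placeholder.
scrape_args = {
    "url": "https://example.com/blog/some-article",
    "formats": ["markdown", "links"],  # markdown body plus extracted links
    "onlyMainContent": True,           # drop navigation, footers, and ads
}
```

If the response indicates a login wall, paywall, or blocked page, stop and report the error instead of translating partial content.
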
## Step 2: User Options

Use AskUserQuestion to confirm:

1. **Output directory**: Where to save translated markdown
2. **Download images**: Save locally or keep URL references

## Step 3: Translation Rules

### General Text

Translate to natural Korean.

### Technical Terms

Keep original English. See `references/tech-terms.md`.

```
Transformer, Fine-tuning, API, GPU, CUDA, Tokenizer,
Embedding, Attention, Backbone, Checkpoint, Epoch,
Batch Size, Learning Rate, Loss, Gradient, Weight...
```

### Code Blocks

Keep original + add Korean explanation below:

````markdown
```python
def train(model, data):
    optimizer.zero_grad()
    loss = model(data)
    loss.backward()
    optimizer.step()
```
> 이 코드는 모델 학습의 한 스텝을 수행합니다. gradient 초기화, forward pass, backward pass, weight 업데이트 순으로 진행됩니다.
````

### Tables

Keep original + add Korean explanation below:

```markdown
| Model | Params | Score |
|-------|--------|-------|
| BERT  | 110M   | 89.3  |
| GPT-2 | 1.5B   | 91.2  |

> 이 테이블은 모델별 파라미터 수와 성능 점수를 비교합니다.
```

### Links

Keep URL, translate link text only:

```markdown
자세한 내용은 [공식 문서](https://example.com/docs)를 참고하세요.
```

## Step 4: Image Processing

### Process Flow

1. Extract image URLs from the markdown
2. Download to `/tmp` using `scripts/download_image.sh` (see the sketch after this list)
3. Analyze with the Read tool (VLM is applied automatically)
4. Generate a Korean caption that reflects the surrounding context
5. Add the VLM analysis as a blockquote below the image (alt text is hidden in preview)

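A minimal sketch of steps 1-2, assuming the scraped page markdown is already available as a string; the regex only covers the standard `![alt](url)` image syntax, and failures follow the warn-and-continue rule from Error Handling below:

```python
import re
import subprocess

# Sketch of steps 1-2 only. The path below is a placeholder for wherever the
# scraped markdown was saved; everything else follows the skill's own flow.
markdown_text = open("/tmp/article.md", encoding="utf-8").read()

# Standard ![alt](url) references; other image syntaxes are not handled here.
image_urls = re.findall(r"!\[[^\]]*\]\((\S+?)\)", markdown_text)

local_paths = {}
for url in image_urls:
    # scripts/download_image.sh prints the local path (/tmp/img_<hash>.<ext>) on success.
    result = subprocess.run(
        ["scripts/download_image.sh", url],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        local_paths[url] = result.stdout.strip()
    else:
        # Warn and keep translating, per the Error Handling rule below.
        print(f"[warning] could not download {url}: {result.stderr.strip()}")
```

Steps 3-5 are not scripted: each downloaded file is opened with the Read tool and captioned using the text around the original image reference.
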
### Caption Guidelines

- Around 2 sentences
- Describe image meaning and role
- Reflect surrounding context
- Use blockquote format for visibility in markdown preview

Example:
```markdown

*원문 캡션*

> Transformer 아키텍처의 전체 구조를 보여주는 다이어그램입니다. Encoder와 Decoder가 병렬로 배치되어 있으며, Multi-Head Attention 레이어가 핵심 구성요소입니다.
```

### Error Handling

When an image fails to load:

```markdown

> [경고] 이미지를 불러올 수 없습니다: {error_message}
```

Show a warning and continue the translation.

## Step 5: Output Generation

### File Structure

```
{output_dir}/
├── {article_name}.md    # Translated markdown
└── images/              # Downloaded images (if selected)
    ├── image_001.png
    └── image_002.png
```

### Markdown Header

```markdown
# 번역된 제목

원문: {original_url}
번역일: {YYYY-MM-DD}

---

(Body starts here)
```

## Edge Cases

| Scenario | Handling |
|----------|----------|
| Image URL inaccessible | Show warning, keep original URL, continue |
| Login/Paywall | Return error, stop processing |
| Document > 10,000 chars | Chunk by sections, process sequentially (see the sketch below) |
| No images | Translate text only |
| Non-English source | Translate from that language to Korean |

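The chunking rule in the table above is only a one-liner; here is a rough sketch of it (the 10,000-character threshold comes from the table, while splitting on `##` headings is an assumption about what "sections" means):

```python
import re

MAX_CHARS = 10_000  # threshold from the edge-case table above


def chunk_by_sections(markdown_text: str) -> list[str]:
    """Split long markdown into section-aligned chunks for sequential translation."""
    if len(markdown_text) <= MAX_CHARS:
        return [markdown_text]
    # Split in front of each "## " heading so a heading stays with its section.
    sections = re.split(r"(?m)^(?=## )", markdown_text)
    chunks, current = [], ""
    for section in sections:
        if current and len(current) + len(section) > MAX_CHARS:
            chunks.append(current)
            current = ""
        current += section  # a single oversized section stays as one chunk
    if current:
        chunks.append(current)
    return chunks
```

Each chunk is then translated in order and the results are concatenated, which keeps headings attached to their section text.
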
## Scripts

### download_image.sh

Downloads an image URL to `/tmp`:

```bash
scripts/download_image.sh "https://example.com/image.png"
# Output: /tmp/img_<hash>.png
```

## References

- `references/tech-terms.md` - Technical terms to keep in English

## Limitations

- Cannot process PDFs directly
- Cannot process video content
- Cannot handle dynamic JS-rendered content (if firecrawl fails to render it)

@@ -0,0 +1,176 @@
# Technical Terms (Keep Original)

List of technical terms that should remain in English when translating to Korean.

## Machine Learning / Deep Learning

- Transformer
- Attention
- Multi-Head Attention
- Self-Attention
- Cross-Attention
- Encoder
- Decoder
- Embedding
- Tokenizer
- Fine-tuning
- Pre-training
- Transfer Learning
- Zero-shot
- Few-shot
- In-context Learning
- Prompt
- Prompt Engineering

## Model Architecture

- CNN (Convolutional Neural Network)
- RNN (Recurrent Neural Network)
- LSTM (Long Short-Term Memory)
- GRU (Gated Recurrent Unit)
- ResNet
- BERT
- GPT
- T5
- ViT (Vision Transformer)
- CLIP
- Diffusion
- VAE (Variational Autoencoder)
- GAN (Generative Adversarial Network)

## Training

- Loss
- Gradient
- Backpropagation
- Optimizer
- SGD (Stochastic Gradient Descent)
- Adam
- AdamW
- Learning Rate
- Batch Size
- Epoch
- Iteration
- Checkpoint
- Early Stopping
- Regularization
- Dropout
- Batch Normalization
- Layer Normalization

## Data

- Dataset
- Dataloader
- Preprocessing
- Augmentation
- Normalization
- Train/Val/Test Split
- Cross-validation
- Overfitting
- Underfitting
- Generalization

## Evaluation

- Accuracy
- Precision
- Recall
- F1 Score
- AUC
- ROC
- BLEU
- ROUGE
- Perplexity
- Benchmark

## Infrastructure

- GPU
- CUDA
- TPU
- CPU
- VRAM
- Distributed Training
- Data Parallel
- Model Parallel
- Mixed Precision
- FP16
- BF16
- Quantization

## Frameworks & Libraries

- PyTorch
- TensorFlow
- JAX
- Hugging Face
- Transformers
- Diffusers
- Accelerate
- DeepSpeed
- FSDP
- vLLM
- TensorRT

## APIs & Services

- API
- REST
- gRPC
- SDK
- CLI
- Endpoint
- Inference
- Serving
- Deployment

## LLM Specific

- Context Window
- Token
- BPE (Byte Pair Encoding)
- SentencePiece
- RLHF (Reinforcement Learning from Human Feedback)
- DPO (Direct Preference Optimization)
- RAG (Retrieval Augmented Generation)
- Chain-of-Thought
- Reasoning
- Hallucination
- Grounding

## Computer Vision

- Backbone
- Feature Extraction
- Object Detection
- Segmentation
- Classification
- Bounding Box
- IoU (Intersection over Union)
- mAP (mean Average Precision)
- OCR

## NLP

- NER (Named Entity Recognition)
- POS Tagging
- Dependency Parsing
- Sentiment Analysis
- Text Classification
- Summarization
- Translation
- Question Answering

## Usage Note

Keep these terms in English when translating.

Good example:
- "Transformer 모델을 Fine-tuning하여..." (O)

Bad example:
- "변환기 모델을 미세조정하여..." (X)

When context requires explanation, add Korean in parentheses:
- "Attention(주의 메커니즘)을 통해..."

@@ -0,0 +1,45 @@
#!/bin/bash
# Download image from URL to /tmp directory
# Usage: download_image.sh <image_url> [output_dir]
# Output: Prints the local file path

set -e

IMAGE_URL="$1"
OUTPUT_DIR="${2:-/tmp}"

if [ -z "$IMAGE_URL" ]; then
  echo "Usage: download_image.sh <image_url> [output_dir]" >&2
  exit 1
fi

# Generate hash from URL for unique filename
URL_HASH=$(echo -n "$IMAGE_URL" | md5sum | cut -d' ' -f1 | head -c 12)

# Extract extension from URL (default to png)
EXT=$(echo "$IMAGE_URL" | grep -oE '\.(png|jpg|jpeg|gif|webp|svg)' | tail -1 || echo ".png")
if [ -z "$EXT" ]; then
  EXT=".png"
fi

# Create output directory if needed
mkdir -p "$OUTPUT_DIR"

# Generate output filename
OUTPUT_FILE="${OUTPUT_DIR}/img_${URL_HASH}${EXT}"

# Download image
if curl -sL -o "$OUTPUT_FILE" "$IMAGE_URL"; then
  # Verify file is not empty
  if [ -s "$OUTPUT_FILE" ]; then
    echo "$OUTPUT_FILE"
    exit 0
  else
    echo "Error: Downloaded file is empty" >&2
    rm -f "$OUTPUT_FILE"
    exit 1
  fi
else
  echo "Error: Failed to download image from $IMAGE_URL" >&2
  exit 1
fi