image-to-code 0.1.0

package/README.md ADDED
@@ -0,0 +1,243 @@
+ # image-to-code
+
+ > Extract structured data (colors, layout, OCR text) from images. **No AI vision required.**
+ > Cross-platform: macOS · Linux · Windows
+ > Uses Tesseract OCR + Pillow for fully programmatic analysis.
+
+ ## Quick Install
+
+ ```bash
+ # NPX (easiest — auto-installs Python deps)
+ npx image-to-code screenshot.png
+
+ # Pip
+ pip install image-to-code
+ image-to-code screenshot.png
+ ```
+
+ ## Features
+
+ - **Color Extraction** — dominant colors, semantic role detection (background, text, button, border, surface), WCAG contrast ratio, color harmony classification, gradient detection
+ - **Layout Detection** — horizontal section segmentation, vertical column detection, component labeling with hero-padding awareness
+ - **OCR Text Extraction** — multi-PSM scanning, histogram stretch + adaptive threshold preprocessing, footer/branding crop scans, Thai grapheme cluster merging, intelligent dedup
+ - **Button Detection** — heuristic-based UI button identification from bounding box sizes
+ - **Photo/UI Classification** — classifies images as photo (organic background) vs UI (flat/schematic) using luminance variance + edge ratio heuristics
+ - **CSS Output** — generates CSS custom properties and media query recommendations
+ - **Clipboard Support** — read directly from clipboard (`--clipboard` flag)
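The feature list mentions a WCAG contrast ratio without saying how it is derived. The sketch below follows the WCAG 2.x definition (relative luminance of each color, then `(lighter + 0.05) / (darker + 0.05)`). The function names are hypothetical and are not taken from the package, which may compute or round the value differently.

```python
def _channel_to_linear(value: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = value / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' color, between 0.0 and 1.0."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * _channel_to_linear(r)
            + 0.7152 * _channel_to_linear(g)
            + 0.0722 * _channel_to_linear(b))


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio; the argument order does not matter."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    print(f"{contrast_ratio('#FFFFFF', '#000000'):.1f}:1")
```

WCAG level AA asks for at least 4.5:1 for normal-size text, so a ratio like this can be compared directly against that threshold.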
27
+
28
+ ## Requirements
29
+
30
+ | Dependency | Version | Notes |
31
+ |---|---|---|
32
+ | Python | 3.10+ | Core runtime |
33
+ | [Tesseract OCR](https://github.com/tesseract-ocr/tesseract) | 5.x | OCR engine (must be on PATH) |
34
+ | Node.js (optional) | 18+ | Only needed for `npx image-to-code` |
35
+ | [Pillow](https://python-pillow.org/) | 10.0+ | Image processing |
36
+ | [pytesseract](https://github.com/madmaze/pytesseract) | 0.3.10+ | Python Tesseract wrapper |
37
+
38
+ ### Install Tesseract
39
+
40
+ ```bash
41
+ # macOS
42
+ brew install tesseract tesseract-lang
43
+
44
+ # Linux (Ubuntu/Debian)
45
+ sudo apt install tesseract-ocr tesseract-ocr-tha tesseract-ocr-osd
46
+
47
+ # Linux (Arch)
48
+ sudo pacman -S tesseract tesseract-data-tha
49
+
50
+ # Windows
51
+ winget install -e --id UB-Mannheim.TesseractOCR
52
+ # Or download from https://github.com/UB-Mannheim/tesseract/wiki
53
+ ```
54
+
55
+ ## Installation
56
+
57
+ ### Option 1: NPX (easiest)
58
+
59
+ ```bash
60
+ npx image-to-code screenshot.png
61
+ ```
62
+
63
+ > First run auto-installs the Python package. Requires Python 3.10+.
64
+
65
+ ### Option 2: pip
66
+
67
+ ```bash
68
+ pip install image-to-code
69
+ image-to-code screenshot.png
70
+ ```
71
+
72
+ ### Option 3: From source
73
+
74
+ ```bash
75
+ git clone https://github.com/phumitchreal/image-to-code.git
76
+ cd image-to-code
77
+ pip install -r requirements.txt
78
+ python -m image_to_code.analyze screenshot.png
79
+ ```
80
+
81
+ ## Usage
82
+
83
+ ### CLI
84
+
85
+ ```bash
86
+ # Basic analysis
87
+ python -m image_to_code.analyze screenshot.png
88
+
89
+ # JSON output only
90
+ python -m image_to_code.analyze screenshot.png --json
91
+
92
+ # Full report + JSON
93
+ python -m image_to_code.analyze screenshot.png --full
94
+
95
+ # Read from clipboard
96
+ python -m image_to_code.analyze --clipboard
97
+
98
+ # Specify language and confidence threshold
99
+ python -m image_to_code.analyze screenshot.png --lang eng --min-confidence 80
100
+
101
+ # Custom color sampling
102
+ python -m image_to_code.analyze screenshot.png --sample-count 3000 --quantize-tolerance 20
103
+ ```
104
+
105
+ ### Python Library
106
+
107
+ ```python
108
+ from image_to_code import analyze
109
+
110
+ # Full analysis pipeline
111
+ result = analyze.analyze_image("screenshot.png")
112
+
113
+ # Or use individual modules
114
+ from image_to_code.colors import extract_colors
115
+ from image_to_code.layout import detect_layout
116
+ from image_to_code.ocr import extract_text
117
+
118
+ colors = extract_colors("image.png")
119
+ layout = detect_layout("image.png")
120
+ text = extract_text("image.png", language="tha+eng", min_confidence=70)
121
+
122
+ print(f"Background: {colors['background']}")
123
+ print(f"Text: {colors['text']} (contrast: {colors['contrastRatio']}:1)")
124
+ print(f"Layout: {layout['layoutType']}")
125
+ print(f"OCR: {text['rawText']}")
126
+ ```
+
+ ## Output Example
+
+ ```
+ =======================================================================
+ IMAGE ANALYSIS REPORT
+ =======================================================================
+
+ Image: 1913x995 (landscape/desktop, photo)
+
+ --- Colors ---
+ Background: #0F0F0F
+ Surfaces: #1E1E1E, #000000, #2D2D1E
+ Text: #FFFFFF (contrast: 16.9:1)
+ Button: #5A69F0
+ Border: #2D1E1E
+ Harmony: neutral
+ Palette: 20 unique colors
+
+ --- Layout Components ---
+ hero-padding y= 0% h=45% color=#282828
+ bottom-segment y=45% h=53% color=#141414
+ bottom-segment y=98% h= 2% color=#000000
+
+ --- OCR Text (35 words >=70%) ---
+ DISCORD COMMUNITY HUB
+ ติดตามสมาชิกแก๊งแบบเรียลไทม์และดูพาร์ทเนอร์ที่ร่วมงานกับเรา
+ Gang Partners
+
+ --- UI Buttons (2) ---
+ [button] Gang (z=middle, y=570, c=94.4%)
+ [button] Partners (z=middle, y=580, c=96.6%)
+
+ --- CSS Recommendations ---
+ --bg: #0F0F0F
+ --surface: #1E1E1E
+ --text: #FFFFFF
+ --primary: #5A69F0
+ --border: #2D1E1E
+ --radius: 6px
+
+ =======================================================================
+ ```
+
+ ## JSON Output Structure
+
+ ```json
+ {
+   "imageType": "photo",
+   "image": { "width": 1913, "height": 995 },
+   "colors": {
+     "background": "#0F0F0F",
+     "text": "#FFFFFF",
+     "button": "#5A69F0",
+     "border": "#2D1E1E",
+     "contrastRatio": 16.9,
+     "harmony": "neutral",
+     "palette": [ ... ],
+     "gradient": null
+   },
+   "layout": {
+     "type": "landscape/desktop",
+     "components": [ ... ]
+   },
+   "text": {
+     "words": 35,
+     "boxes": [ ... ],
+     "buttons": [ ... ],
+     "fullText": "...",
+     "byZone": { "top": "...", "middle": "...", "bottom": "..." }
+   },
+   "css": {
+     "customProperties": {
+       "--bg": "#0F0F0F",
+       "--text": "#FFFFFF",
+       "--primary": "#5A69F0"
+     }
+   }
+ }
+ ```
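One way to consume this structure is to turn `css.customProperties` into a `:root` rule. The sketch below assumes the report has the shape shown above and that the JSON text has already been captured (for example from the `--json` flag). `css_from_report` is a hypothetical helper, not part of the package.

```python
import json


def css_from_report(report_json: str) -> str:
    """Render css.customProperties from a report as a CSS :root rule."""
    report = json.loads(report_json)
    # Fall back to an empty mapping if the report has no CSS section.
    props = report.get("css", {}).get("customProperties", {})
    lines = [f"  {name}: {value};" for name, value in props.items()]
    return ":root {\n" + "\n".join(lines) + "\n}"


# A trimmed-down report containing only the part this helper reads.
sample = json.dumps({
    "css": {
        "customProperties": {
            "--bg": "#0F0F0F",
            "--text": "#FFFFFF",
            "--primary": "#5A69F0",
        }
    }
})

if __name__ == "__main__":
    print(css_from_report(sample))
```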
+
+ ## PowerShell Version (Windows)
+
+ The `powershell/` directory contains the original Windows PowerShell scripts. These work on Windows only (require `System.Drawing`). Usage:
+
+ ```powershell
+ # Full analysis
+ .\powershell\analyze-image.ps1 -ImagePath screenshot.png -Full
+
+ # From clipboard
+ .\powershell\analyze-image.ps1 -Clipboard
+
+ # JSON output
+ .\powershell\analyze-image.ps1 -ImagePath screenshot.png -Json
+ ```
+
+ ## How It Works
+
+ ### Photo vs UI Classification
+ Uses three heuristics on a coarse pixel sample:
+ 1. **Distinct color count** — photos have >50 distinct colors (after 4-bit quantization)
+ 2. **Luminance IQR** — photos have narrow interquartile range (<80) with moderate color count
+ 3. **Edge ratio** — photos have low edge ratio (<0.3) on adjacent spatial samples with wide luminance range
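The three heuristics above can be sketched in plain Python over a list of sampled `(r, g, b)` tuples. Only the thresholds 50, 80 and 0.3 come from the list. The luminance weights, the edge step of 32, the "moderate" color count of 20 and the "wide" range of 128 are assumptions made for illustration, and `photo_votes` is a hypothetical name. The sketch stops at one vote per heuristic because the README does not say how the votes are combined.

```python
from statistics import quantiles


def luminance(rgb):
    """Approximate perceived brightness (0-255) of an (r, g, b) tuple."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def photo_votes(pixels, edge_step=32, moderate_colors=20, wide_range=128):
    """Evaluate the three heuristics on at least two pixels in scan order.

    edge_step, moderate_colors and wide_range are assumed values,
    not taken from the package.
    """
    # 1. Distinct colors after keeping the top 4 bits of each channel.
    distinct = len({(r >> 4, g >> 4, b >> 4) for r, g, b in pixels})

    # 2. Interquartile range of luminance.
    lum = [luminance(p) for p in pixels]
    q1, _, q3 = quantiles(lum, n=4)
    iqr = q3 - q1

    # 3. Share of adjacent sample pairs whose luminance jumps sharply.
    jumps = sum(1 for a, b in zip(lum, lum[1:]) if abs(a - b) > edge_step)
    edge_ratio = jumps / (len(lum) - 1)

    return {
        "many_colors": distinct > 50,
        "narrow_iqr": iqr < 80 and distinct > moderate_colors,
        "smooth_edges": edge_ratio < 0.3 and (max(lum) - min(lum)) > wide_range,
    }


if __name__ == "__main__":
    flat_ui = [(15, 15, 15)] * 200
    print(photo_votes(flat_ui))
```

A single flat color produces no photo votes, while a sample that covers many quantized colors triggers the first vote.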
+
+ ### Thai Text Handling
+ Tesseract often splits Thai text into separate grapheme components, which leaves spurious spaces inside words. The `merge_thai_text()` post-processor removes the spaces between characters in the Thai Unicode block (U+0E00–U+0E7F) to reconstruct the words.
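A minimal sketch of that idea, assuming a single regular expression is enough: drop any whitespace that has a Thai-block character on both sides. `merge_thai_spaces` is a hypothetical stand-in, and the package's own `merge_thai_text()` may work differently.

```python
import re

# Whitespace with a Thai-block character (U+0E00-U+0E7F) on both sides.
_THAI_GAP = re.compile(r"(?<=[\u0E00-\u0E7F])\s+(?=[\u0E00-\u0E7F])")


def merge_thai_spaces(text: str) -> str:
    """Remove whitespace between Thai characters; leave other spacing alone."""
    return _THAI_GAP.sub("", text)


if __name__ == "__main__":
    print(merge_thai_spaces("ติด ตาม สมาชิก Gang Partners"))
```

Spaces next to Latin text are kept because only one side of the gap is Thai. The trade-off is that a deliberate space between two Thai phrases is removed as well.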
+
+ ### Adaptive Thresholding
+ For photo backgrounds, two preprocessing passes run:
+ 1. Histogram stretch (full contrast enhancement)
+ 2. Adaptive threshold (hard clip at 100/160 luminance)
+
+ OCR runs on all versions (original + preprocessed) with multiple PSM modes and deduplicates results.
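The preprocessing and deduplication described above can be illustrated on a flat list of luminance values. This sketch reads "100/160" as two separate cutoffs and applies them to the unstretched values, both of which are assumptions. Deduplication here keeps the highest-confidence reading per case-folded word, which is one plausible rule rather than the package's documented one. All function names are hypothetical.

```python
def histogram_stretch(lum):
    """Linearly rescale values so the darkest becomes 0 and the brightest 255."""
    lo, hi = min(lum), max(lum)
    if hi == lo:
        return list(lum)  # Flat input: nothing to stretch.
    return [round((v - lo) * 255 / (hi - lo)) for v in lum]


def hard_threshold(lum, cutoff):
    """Binarize: values at or above the cutoff become white, the rest black."""
    return [255 if v >= cutoff else 0 for v in lum]


def preprocess_variants(lum):
    """Build every version that would be handed to the OCR engine."""
    return {
        "original": list(lum),
        "stretched": histogram_stretch(lum),
        "threshold_100": hard_threshold(lum, 100),  # assumed first cutoff
        "threshold_160": hard_threshold(lum, 160),  # assumed second cutoff
    }


def dedupe_words(results):
    """Keep the highest-confidence (text, confidence) pair per case-folded text."""
    best = {}
    for text, confidence in results:
        key = text.casefold()
        if key not in best or confidence > best[key][1]:
            best[key] = (text, confidence)
    return list(best.values())


if __name__ == "__main__":
    print(preprocess_variants([50, 90, 150]))
    print(dedupe_words([("Gang", 90.1), ("gang", 94.4), ("Partners", 96.6)]))
```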
+
+ ## License
+
+ MIT
package/README.th.md ADDED
@@ -0,0 +1,206 @@
+ # image-to-code
+
+ > Extract structured data (colors, layout, text) from images **without AI vision**.
+ > Supports every platform: macOS · Linux · Windows
+ > Uses Tesseract OCR + Pillow for 100% programmatic image analysis.
+
+ ## Installation
+
+ ### Option 1: NPX (easiest)
+
+ ```bash
+ npx image-to-code รูป.png
+ ```
+
+ > The first run downloads the Python package automatically. Requires Python 3.10+.
+
+ ### Option 2: pip
+
+ ```bash
+ pip install image-to-code
+ image-to-code รูป.png
+ ```
+
+ ### Option 3: From source
+
+ ```bash
+ git clone https://github.com/phumitchreal/image-to-code.git
+ cd image-to-code
+ pip install -r requirements.txt
+ python -m image_to_code.analyze รูป.png
+ ```
+
+ ### Install Tesseract OCR
+
+ ```bash
+ # macOS
+ brew install tesseract tesseract-lang
+
+ # Linux (Ubuntu/Debian)
+ sudo apt install tesseract-ocr tesseract-ocr-tha tesseract-ocr-osd
+
+ # Linux (Arch)
+ sudo pacman -S tesseract tesseract-data-tha
+
+ # Windows
+ winget install -e --id UB-Mannheim.TesseractOCR
+ # Or download from https://github.com/UB-Mannheim/tesseract/wiki
+ ```
+
+ > The Thai language data (`tha.traineddata`) is downloaded automatically the first time OCR runs.
+
+ ## Features
+
+ | Feature | Details |
+ |---|---|
+ | **Color extraction** | Dominant colors, background, text, button, and border colors, WCAG contrast ratio, color harmony type, gradient |
+ | **Layout analysis** | Horizontal sections, vertical columns, component labeling (hero-padding, bottom-segment) |
+ | **OCR text** | Multiple PSM modes, histogram stretch + adaptive threshold, extra footer/branding scans, Thai character grouping |
+ | **UI buttons** | Identifies buttons from bounding box sizes |
+ | **Image classification** | photo (organic background) vs UI (flat/schematic) |
+ | **CSS Output** | Generates CSS custom properties and media queries |
+ | **Clipboard** | Reads an image from the clipboard (`--clipboard`) |
+
+ ## Usage
+
+ ### CLI
+
+ ```bash
+ # Basic analysis
+ image-to-code screenshot.png
+
+ # JSON output only
+ image-to-code screenshot.png --json
+
+ # Full report + JSON
+ image-to-code screenshot.png --full
+
+ # Read from clipboard
+ image-to-code --clipboard
+
+ # Change the OCR language and minimum confidence
+ image-to-code screenshot.png --lang eng --min-confidence 80
+
+ # Set the number of color samples
+ image-to-code screenshot.png --sample-count 3000 --quantize-tolerance 20
+ ```
+
+ ### Use as a Python Library
+
+ ```python
+ from image_to_code.colors import extract_colors
+ from image_to_code.layout import detect_layout
+ from image_to_code.ocr import extract_text
+
+ colors = extract_colors("image.png")
+ layout = detect_layout("image.png")
+ text = extract_text("image.png", language="tha+eng", min_confidence=70)
+
+ print(f"Background: {colors['background']}")
+ print(f"Text: {colors['text']} (contrast: {colors['contrastRatio']}:1)")
+ print(f"Layout: {layout['layoutType']}")
+ print(f"OCR: {text['rawText']}")
+ ```
+
+ ## Output Example
+
+ ```
+ =======================================================================
+ IMAGE ANALYSIS REPORT
+ =======================================================================
+
+ Image: 1913x995 (landscape/desktop, photo)
+
+ --- Colors ---
+ Background: #0F0F0F
+ Surfaces: #1E1E1E, #000000, #2D2D1E
+ Text: #FFFFFF (contrast: 16.9:1)
+ Button: #5A69F0
+ Border: #2D1E1E
+
+ --- Layout Components ---
+ hero-padding y= 0% h=45% color=#282828
+ bottom-segment y=45% h=53% color=#141414
+ bottom-segment y=98% h= 2% color=#000000
+
+ --- OCR (35 words >=70%) ---
+ DISCORD COMMUNITY HUB
+ ติดตามสมาชิกแก๊งแบบเรียลไทม์และดูพาร์ทเนอร์ที่ร่วมงานกับเรา
+ Gang Partners
+
+ --- CSS ---
+ --bg: #0F0F0F
+ --surface: #1E1E1E
+ --text: #FFFFFF
+ --primary: #5A69F0
+ --border: #2D1E1E
+ ```
+
+ ## JSON Output Structure
+
+ ```json
+ {
+   "imageType": "photo",
+   "image": { "width": 1913, "height": 995 },
+   "colors": {
+     "background": "#0F0F0F",
+     "text": "#FFFFFF",
+     "button": "#5A69F0",
+     "border": "#2D1E1E",
+     "contrastRatio": 16.9,
+     "harmony": "neutral",
+     "palette": [ ... ],
+     "gradient": null
+   },
+   "layout": {
+     "type": "landscape/desktop",
+     "components": [ ... ]
+   },
+   "text": {
+     "words": 35,
+     "boxes": [ ... ],
+     "buttons": [ { "text": "Gang", "x": 0, "y": 570, "w": 100, "h": 40 } ],
+     "fullText": "...",
+     "byZone": { "top": "...", "middle": "...", "bottom": "..." }
+   },
+   "css": {
+     "customProperties": {
+       "--bg": "#0F0F0F",
+       "--text": "#FFFFFF",
+       "--primary": "#5A69F0"
+     }
+   }
+ }
+ ```
+
+ ## How It Works
+
+ ### Photo vs UI Classification
+ Uses 3 heuristics on a pixel sample:
+ 1. **Distinct color count** — photos have >50 colors (after 4-bit quantization)
+ 2. **Luminance IQR** — photos have a narrow interquartile range (<80) plus a moderate color count
+ 3. **Edge ratio** — photos have a low edge ratio (<0.3) on adjacent spatial samples
+
+ ### Thai Text Handling
+ Tesseract often splits Thai characters into small grapheme pieces. The `merge_thai_text()` function removes the spaces between Thai characters (U+0E00–U+0E7F) to join them back into correct words.
+
+ ### Adaptive Thresholding
+ For images with a photographic background, two kinds of preprocessing are applied:
+ 1. Histogram stretch (maximum contrast enhancement)
+ 2. Adaptive threshold (cut at 100/160 luminance)
+
+ OCR runs on every version (original + processed) with multiple PSM modes and deduplicates the results.
+
+ ## PowerShell Version (Windows)
+
+ The `powershell/` folder contains the original PowerShell scripts, for Windows only:
+
+ ```powershell
+ .\powershell\analyze-image.ps1 -ImagePath screenshot.png -Full
+ .\powershell\analyze-image.ps1 -Clipboard
+ .\powershell\analyze-image.ps1 -ImagePath screenshot.png -Json
+ ```
+
+ ## License
+
+ MIT
package/bin/cli.js ADDED
@@ -0,0 +1,83 @@
+ #!/usr/bin/env node
+ /**
+  * image-to-code — npm wrapper.
+  * Auto-installs the Python package via pip on first run, then delegates.
+  */
+ const { execSync, spawn } = require("child_process");
+
+ // Import name of the Python package. pip normalizes "_" and "-", so the same
+ // string also works as the distribution name for `pip install`.
+ const PYTHON_MODULE = "image_to_code";
+
+ // Exits 0 only on Python 3.10+, the minimum version the README requires.
+ const VERSION_CHECK =
+   '-c "import sys; sys.exit(0 if sys.version_info >= (3, 10) else 1)"';
+
+ // Return the first interpreter on PATH that is Python 3.10+, or null.
+ function checkPython() {
+   for (const candidate of ["python", "python3"]) {
+     try {
+       execSync(`${candidate} ${VERSION_CHECK}`, {
+         stdio: "pipe",
+         timeout: 10000,
+       });
+       return candidate;
+     } catch {
+       // Missing or too old; try the next candidate.
+     }
+   }
+   return null;
+ }
+
+ // True if the Python package can already be imported.
+ function checkPackage(python) {
+   try {
+     execSync(`${python} -c "import ${PYTHON_MODULE}"`, {
+       stdio: "pipe",
+       timeout: 10000,
+     });
+     return true;
+   } catch {
+     return false;
+   }
+ }
+
+ function installPackage(python) {
+   console.log("→ Installing image-to-code Python package...");
+   execSync(`${python} -m pip install ${PYTHON_MODULE} --upgrade`, {
+     stdio: "inherit",
+     timeout: 120000,
+   });
+ }
+
+ function main() {
+   const python = checkPython();
+   if (!python) {
+     console.error(
+       "✖ Python 3.10+ not found. Install it from https://python.org"
+     );
+     process.exit(1);
+   }
+
+   if (!checkPackage(python)) {
+     try {
+       installPackage(python);
+     } catch {
+       console.error("✖ Could not install the image-to-code Python package.");
+       process.exit(1);
+     }
+   }
+
+   const args = process.argv.slice(2);
+   const child = spawn(python, ["-m", PYTHON_MODULE + ".analyze", ...args], {
+     stdio: "inherit",
+   });
+   // `code` is null when the child was killed by a signal; treat that as failure.
+   child.on("exit", (code) => process.exit(code ?? 1));
+   child.on("error", (err) => {
+     console.error(`✖ Failed to start ${python}: ${err.message}`);
+     process.exit(1);
+   });
+ }
+
+ main();
package/package.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "name": "image-to-code",
+   "version": "0.1.0",
+   "description": "Extract structured data (colors, layout, OCR text) from images. No AI vision required.",
+   "bin": {
+     "image-to-code": "bin/cli.js"
+   },
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/phumitchreal/image-to-code.git"
+   },
+   "keywords": [
+     "ocr",
+     "image-analysis",
+     "color-extraction",
+     "layout-detection",
+     "tesseract",
+     "thai"
+   ],
+   "license": "MIT",
+   "bugs": {
+     "url": "https://github.com/phumitchreal/image-to-code/issues"
+   },
+   "homepage": "https://github.com/phumitchreal/image-to-code#readme",
+   "files": [
+     "bin/",
+     "package.json",
+     "README.md"
+   ]
+ }