eye2byte 0.3.0__tar.gz

@@ -0,0 +1,290 @@
1
+ Metadata-Version: 2.4
2
+ Name: eye2byte
3
+ Version: 0.3.0
4
+ Summary: Screen-context sidecar for coding agents
5
+ Author: wolverin0
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/wolverin0/Eye2byte
8
+ Project-URL: Changelog, https://github.com/wolverin0/Eye2byte/blob/claude/screen-context-sidecar-KDVSF/CHANGELOG.md
9
+ Project-URL: Issues, https://github.com/wolverin0/Eye2byte/issues
10
+ Keywords: screen-capture,mcp,coding-agent,vision,context
11
+ Classifier: Development Status :: 4 - Beta
12
+ Classifier: Environment :: Console
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: Operating System :: OS Independent
15
+ Classifier: Programming Language :: Python :: 3
16
+ Classifier: Programming Language :: Python :: 3.10
17
+ Classifier: Programming Language :: Python :: 3.11
18
+ Classifier: Programming Language :: Python :: 3.12
19
+ Classifier: Programming Language :: Python :: 3.13
20
+ Classifier: Topic :: Software Development :: Quality Assurance
21
+ Requires-Python: >=3.10
22
+ Description-Content-Type: text/markdown
23
+ Requires-Dist: Pillow
24
+ Requires-Dist: fastmcp>=2.10
25
+ Provides-Extra: voice
26
+ Requires-Dist: openai-whisper; extra == "voice"
27
+ Provides-Extra: ui
28
+ Requires-Dist: customtkinter>=5.0; extra == "ui"
29
+ Provides-Extra: all
30
+ Requires-Dist: openai-whisper; extra == "all"
31
+ Requires-Dist: customtkinter>=5.0; extra == "all"
32
+
33
+ <p align="center">
34
+ <h1 align="center">Eye2byte</h1>
35
+ <p align="center">Screen-context sidecar for coding agents</p>
36
+ </p>
37
+
38
+ <p align="center">
39
+ <a href="#setup"><img src="https://img.shields.io/badge/python-3.10+-blue?logo=python&logoColor=white" alt="Python 3.10+"></a>
40
+ <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="MIT License"></a>
41
+ <a href="#platforms"><img src="https://img.shields.io/badge/platform-Windows%20%7C%20macOS%20%7C%20Linux%20%7C%20Android-lightgrey" alt="Cross-platform"></a>
42
+ <a href="CHANGELOG.md"><img src="https://img.shields.io/badge/changelog-CHANGELOG.md-orange" alt="Changelog"></a>
43
+ </p>
44
+
45
+ ---
46
+
47
+ Captures your screen, voice, and annotations, feeds them to any vision model, and produces structured **Context Packs** your coding agent can act on.
48
+
49
+ ```
50
+ Screen / Voice / Annotations --> Vision Model + Whisper --> Context Pack --> Coding Agent
51
+ ```
52
+
53
+ ## Features
54
+
55
+ - **Multi-monitor capture** — active, specific (1/2/3), or all monitors at once
56
+ - **Voice narration** — record, clean (noise removal + normalization), transcribe locally
57
+ - **Annotations** — arrows, circles, rectangles, freehand, multi-line text on a frozen screenshot
58
+ - **Screen clips** — record short videos, extract keyframes, analyze the sequence
59
+ - **Image optimization** — auto resize + compress (~5x smaller, no visible quality loss at the default JPEG quality of 90)
60
+ - **MCP server** — coding agents query your screen directly via Model Context Protocol
61
+ - **Context Packs** — structured output: goal, environment, errors, signals, next steps
62
+
63
+ ## Platforms
64
+
65
+ | Platform | Screenshot | Voice | Annotation | Hotkeys |
66
+ |----------|-----------|-------|------------|---------|
67
+ | Windows | PowerShell .NET | ffmpeg | Pillow | Ctrl+Shift+1-5 |
68
+ | macOS | screencapture | ffmpeg | Pillow | - |
69
+ | Linux | scrot/maim/flameshot | ffmpeg | Pillow | - |
70
+ | Android | ADB (Termux) | Termux:API | - | - |
71
+
72
+ ## Setup
73
+
74
+ ### 1. Install dependencies
75
+
76
+ ```bash
77
+ pip install Pillow fastmcp # Core + MCP server
78
+ pip install openai-whisper # Local voice transcription (optional)
79
+ # ffmpeg is required for voice/clips — install via your package manager
80
+ ```
81
+
82
+ ### 2. Configure a vision provider
83
+
84
+ Eye2byte works with **any vision model** — local or cloud. Set your provider in `~/.eye2byte/config.json` or the Settings UI:
85
+
86
+ | Provider | Setup | Cost |
87
+ |----------|-------|------|
88
+ | **Ollama** (local) | [Install Ollama](https://ollama.com), `ollama pull qwen3-vl:8b` | Free |
89
+ | **Gemini** | Set `GEMINI_API_KEY` in `.env` | Free tier (1000 req/day) |
90
+ | **OpenRouter** | Set `OPENROUTER_API_KEY` in `.env` | Free models available |
91
+ | **Hyperbolic** | Set `HYPERBOLIC_API_KEY` in `.env` | Pay per use |
92
+
93
+ ```bash
94
+ # .env file (project dir, cwd, or ~/.eye2byte/.env)
95
+ GEMINI_API_KEY=your-key-here
96
+ # or OPENROUTER_API_KEY=...
97
+ # or HYPERBOLIC_API_KEY=...
98
+ ```
99
+
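The `.env` lookup mentioned in the comment above can be sketched in Python. This is a minimal illustration, not Eye2byte's actual loader: the function names `parse_env` and `load_first_env` are hypothetical, and the search order (project dir, then cwd, then `~/.eye2byte/.env`) is an assumption taken from the order in which the comment lists the locations.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines, # comments, and lines without '='."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_first_env(candidates: list) -> dict:
    """Return the parsed contents of the first candidate file that exists."""
    for path in candidates:
        if Path(path).is_file():
            return parse_env(Path(path).read_text(encoding="utf-8"))
    return {}


# Assumed search order; the real tool may check these differently.
SEARCH_ORDER = [Path.cwd() / ".env", Path.home() / ".eye2byte" / ".env"]
```

Calling `load_first_env(SEARCH_ORDER)` would then return a dict such as `{"GEMINI_API_KEY": "your-key-here"}` if one of the files exists, or an empty dict otherwise.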
100
+ ### 3. Run
101
+
102
+ ```bash
103
+ python eye2byte.py capture # Screenshot + analysis
104
+ python eye2byte.py capture --voice # + voice narration
105
+ python eye2byte.py capture --mode window # Active window only
106
+ python eye2byte_ui.py # Launch control panel
107
+ ```
108
+
109
+ ## Control Panel
110
+
111
+ ```bash
112
+ python eye2byte_ui.py
113
+ ```
114
+
115
+ A small always-on-top floating panel. Drag it anywhere. Global hotkeys work even when the panel isn't focused.
116
+
117
+ ### Global Hotkeys (Windows)
118
+
119
+ These work system-wide — no need to focus the Eye2byte window:
120
+
121
+ | Hotkey | Action | Notes |
122
+ |--------|--------|-------|
123
+ | `Ctrl+Shift+1` | Capture screenshot | Uses current mode (Full/Window/Region) |
124
+ | `Ctrl+Shift+2` | Annotate | Freezes screen, opens drawing overlay |
125
+ | `Ctrl+Shift+3` | Toggle voice recording | Press once to start, again to stop |
126
+ | `Ctrl+Shift+5` | Grab clipboard image | Analyzes whatever image is on your clipboard |
127
+
128
+ All keyboard shortcuts are customizable from Settings > Keyboard Shortcuts.
129
+
130
+ ### Panel Controls
131
+
132
+ | Control | Action |
133
+ |---------|--------|
134
+ | `Space` (hold) | Push-to-talk — hold to record, release to stop |
135
+ | Mode selector | Cycle between Full Screen / Window / Region |
136
+ | Settings | Configure provider, model, image quality, cleanup |
137
+ | Copy @path | Copy session path to clipboard for `@`-mentioning |
138
+
139
+ ### Annotation Overlay
140
+
141
+ When you press `Ctrl+Shift+2` or click Annotate, the screen freezes and you can draw on it:
142
+
143
+ | Key | Tool | How to use |
144
+ |-----|------|-----------|
145
+ | `X` | Arrow | Click and drag to draw an arrow |
146
+ | `C` | Circle | Click and drag to draw an ellipse |
147
+ | `V` | Rectangle | Click and drag to draw a box |
148
+ | `B` | Freehand | Click and drag to draw freely |
149
+ | `T` | Text | Click to place, type your text |
150
+
151
+ | Action | How |
152
+ |--------|-----|
153
+ | **Save** | `Enter` (commits annotations and sends to vision model) |
154
+ | **Cancel** | `Escape` (discards all annotations) |
155
+ | **Undo** | Right-click near an annotation to remove it |
156
+ | **Newline in text** | `Shift+Enter` (Enter alone commits the text) |
157
+ | **Multi-line text** | Text box auto-grows up to 6 lines |
158
+
159
+ ### Voice Recording
160
+
161
+ Three ways to record voice:
162
+
163
+ 1. **Toggle** — `Ctrl+Shift+3` starts recording, press again to stop
164
+ 2. **Push-to-talk** — Hold `Space` while panel is focused
165
+ 3. **Mouse PTT** — Hold click on the Record button
166
+
167
+ While recording, any captures you take are automatically bundled with the voice note into a single session.
168
+
169
+ ## MCP Server
170
+
171
+ Eye2byte exposes 6 tools via the [Model Context Protocol](https://modelcontextprotocol.io), letting coding agents capture and analyze your screen directly.
172
+
173
+ | Tool | Description |
174
+ |------|-------------|
175
+ | `capture_and_summarize` | Screenshot + vision analysis. Supports monitor selection, delay, window targeting |
176
+ | `capture_with_voice` | Screenshot + voice recording + transcription + analysis |
177
+ | `record_clip_and_summarize` | Screen clip with keyframe extraction and sequence analysis |
178
+ | `summarize_screenshot` | Analyze an existing image file |
179
+ | `transcribe_audio` | Local Whisper transcription of any audio file |
180
+ | `get_recent_context` | Retrieve recent Context Pack summaries |
181
+
182
+ ### Local Setup (stdio)
183
+
184
+ Eye2byte runs on the machine whose screen you want to capture. For local agents like Claude Code on the same machine, use stdio transport:
185
+
186
+ **Claude Code** — add to your project's `.mcp.json`:
187
+
188
+ ```json
189
+ {
190
+ "mcpServers": {
191
+ "eye2byte": {
192
+ "command": "python",
193
+ "args": ["C:/path/to/eye2byte_mcp.py"]
194
+ }
195
+ }
196
+ }
197
+ ```
198
+
199
+ That's it — Claude Code will auto-start the server. Use full absolute paths.
200
+
201
+ ### Remote Setup (SSE)
202
+
203
+ When your coding agent runs on a **different machine** (cloud VM, SSH dev box, CI runner) but needs to see your local screen, use SSE transport:
204
+
205
+ **Step 1 — On your local machine** (the one with the screen):
206
+
207
+ ```bash
208
+ # Install Eye2byte + dependencies
209
+ pip install Pillow fastmcp
210
+ pip install openai-whisper # optional, for voice
211
+
212
+ # Start the SSE server
213
+ python eye2byte_mcp.py --sse # No auth (LAN only)
214
+ python eye2byte_mcp.py --sse --token mysecret123 # Bearer token auth
215
+ python eye2byte_mcp.py --sse --port 9000 --token abc # Custom port + auth
216
+ ```
217
+
218
+ The server stays running and accepts connections from any machine on your network. Use `--token` when the server is reachable beyond your trusted LAN.
219
+
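To make the `--token` behaviour concrete, here is a rough sketch of how a server could validate a bearer token. It is not taken from Eye2byte or fastmcp: the function name `is_authorized` is hypothetical, and the rule that an unset token means "accept everything" is an assumption based on the "No auth" mode shown above.

```python
import hmac
from typing import Optional


def is_authorized(header_value: Optional[str], expected_token: Optional[str]) -> bool:
    """Check an Authorization header against the configured bearer token."""
    if not expected_token:
        # Server started without --token: no authentication is enforced.
        return True
    if not header_value:
        return False
    scheme, _, supplied = header_value.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.strip().encode(), expected_token.encode())
```

A bearer token sent over plain HTTP is readable by anyone on the path, which is one reason to keep the server on a trusted network even with `--token` set.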
220
+ **Step 2 — On the remote machine** (where the coding agent runs):
221
+
222
+ Nothing to install. Just configure the MCP client to point at your local IP:
223
+
224
+ ```json
225
+ {
226
+ "mcpServers": {
227
+ "eye2byte": {
228
+ "url": "http://YOUR_LOCAL_IP:8808/sse",
229
+ "headers": {"Authorization": "Bearer mysecret123"}
230
+ }
231
+ }
232
+ }
233
+ ```
234
+
235
+ Omit the `headers` field if the server was started without `--token`.
236
+
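If you generate the client config from a script, the optional `headers` field can be handled in one place. The helper below is a hypothetical sketch (the name `build_client_config` and the example IP are placeholders); the default port 8808 and the `/sse` path come from the JSON example above.

```python
import json
from typing import Optional


def build_client_config(host: str, port: int = 8808, token: Optional[str] = None) -> dict:
    """Build the MCP client entry for a remote Eye2byte SSE server."""
    entry = {"url": f"http://{host}:{port}/sse"}
    if token:
        # Only include the header when the server was started with --token.
        entry["headers"] = {"Authorization": f"Bearer {token}"}
    return {"mcpServers": {"eye2byte": entry}}


# Placeholder address; substitute the IP of the machine running the server.
print(json.dumps(build_client_config("192.168.1.20", token="mysecret123"), indent=2))
```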
237
+ Find your local IP: `ipconfig` (Windows), `ip addr` (Linux), or `ifconfig` (macOS).
238
+
239
+ **Firewall:** You may need to allow inbound TCP on port 8808. On Windows, run as admin:
240
+
241
+ ```powershell
242
+ netsh advfirewall firewall add rule name="Eye2byte MCP" dir=in action=allow protocol=TCP localport=8808
243
+ ```
244
+
245
+ ### Multi-monitor Examples
246
+
247
+ ```
248
+ capture_and_summarize(monitor=0) # active monitor (default)
249
+ capture_and_summarize(monitor=1) # first monitor
250
+ capture_and_summarize(monitor=2) # second monitor
251
+ capture_and_summarize(monitor=-1) # ALL monitors at once
252
+ ```
253
+
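The `monitor` values above follow a simple convention: `0` for the active monitor, a positive number for a specific monitor, and `-1` for all of them. The sketch below shows one way to interpret that argument. The function `resolve_monitors` is hypothetical and not part of Eye2byte. Returning `[0]` for "active" and rejecting out-of-range numbers are assumptions made for illustration.

```python
def resolve_monitors(monitor: int, count: int) -> list:
    """Map a `monitor` argument to the list of monitor numbers to capture.

    `count` is the number of attached monitors (1-based numbering).
    """
    if monitor == -1:
        # All monitors at once.
        return list(range(1, count + 1))
    if monitor == 0:
        # 0 stands for "whichever monitor is active"; left for the caller to resolve.
        return [0]
    if 1 <= monitor <= count:
        return [monitor]
    raise ValueError(f"monitor must be -1, 0, or 1..{count}, got {monitor}")
```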
254
+ ## Context Pack Format
255
+
256
+ Every analysis produces a structured Context Pack:
257
+
258
+ ```markdown
259
+ ## Goal — what the user appears to be doing
260
+ ## Environment — OS, editor, repo, branch, language
261
+ ## Screen State — visible panels, files, terminal output
262
+ ## Signals — verbatim errors, stack traces, warnings
263
+ ## Likely Situation — what's probably happening
264
+ ## Suggested Next Info — what a coding agent needs next
265
+ ```
266
+
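Because a Context Pack is plain Markdown with `## ` section headings, a consumer can split it into sections with a few lines of Python. This is a sketch under an assumption: that a real pack puts each section's text on the lines following its heading. The function name `parse_context_pack` is hypothetical.

```python
def parse_context_pack(markdown: str) -> dict:
    """Split a Context Pack into {section title: body text}."""
    sections = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            # A new section starts; its title is the text after "## ".
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(body).strip() for title, body in sections.items()}
```

An agent could then read `parse_context_pack(text)["Signals"]` to get the verbatim errors without scanning the rest of the pack.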
267
+ ## Configuration
268
+
269
+ Config: `~/.eye2byte/config.json` (created on first run or via `python eye2byte.py init`)
270
+
271
+ | Setting | Default | Description |
272
+ |---------|---------|-------------|
273
+ | `provider` | `"ollama"` | Vision provider: ollama, gemini, openrouter, hyperbolic |
274
+ | `model` | `"auto"` | Model name or "auto" for auto-detection |
275
+ | `voice_clean` | `true` | Noise removal + pause trimming + volume normalization |
276
+ | `auto_cleanup_days` | `7` | Delete old captures/summaries after N days (0=disabled) |
277
+ | `image_max_size` | `1920` | Max image dimension before LLM processing |
278
+ | `image_quality` | `90` | JPEG quality (1-100) |
279
+
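As an illustration of how these settings fit together, the sketch below merges a user's `config.json` over the defaults from the table and shows one reading of `image_max_size`. It is not Eye2byte's code: `load_config` and `fit_within` are hypothetical names, and treating `image_max_size` as a cap on the longer side of the image is an assumption.

```python
import json
from pathlib import Path

# Defaults copied from the settings table above.
DEFAULTS = {
    "provider": "ollama",
    "model": "auto",
    "voice_clean": True,
    "auto_cleanup_days": 7,
    "image_max_size": 1920,
    "image_quality": 90,
}


def load_config(path) -> dict:
    """Merge user settings from a JSON file over the documented defaults."""
    config = dict(DEFAULTS)
    path = Path(path)
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    # Clamp JPEG quality to the documented 1-100 range.
    config["image_quality"] = max(1, min(100, int(config["image_quality"])))
    return config


def fit_within(width: int, height: int, max_size: int) -> tuple:
    """Scale (width, height) down so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height
    scale = max_size / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With the default of 1920, a 3840x2160 screenshot would be scaled to 1920x1080 before being sent to the model, while an 800x600 capture would be left unchanged.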
280
+ ## Files
281
+
282
+ | File | Purpose |
283
+ |------|---------|
284
+ | `eye2byte.py` | Core engine — capture, voice, clip, summarize, watch |
285
+ | `eye2byte_ui.py` | Control panel with hotkeys and annotation overlay |
286
+ | `eye2byte_mcp.py` | MCP server for coding agent integration |
287
+
288
+ ## License
289
+
290
+ MIT
@@ -0,0 +1,258 @@
1
+ <p align="center">
2
+ <h1 align="center">Eye2byte</h1>
3
+ <p align="center">Screen-context sidecar for coding agents</p>
4
+ </p>
5
+
6
+ <p align="center">
7
+ <a href="#setup"><img src="https://img.shields.io/badge/python-3.10+-blue?logo=python&logoColor=white" alt="Python 3.10+"></a>
8
+ <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="MIT License"></a>
9
+ <a href="#platforms"><img src="https://img.shields.io/badge/platform-Windows%20%7C%20macOS%20%7C%20Linux%20%7C%20Android-lightgrey" alt="Cross-platform"></a>
10
+ <a href="CHANGELOG.md"><img src="https://img.shields.io/badge/changelog-CHANGELOG.md-orange" alt="Changelog"></a>
11
+ </p>
12
+
13
+ ---
14
+
15
+ Captures your screen, voice, and annotations, feeds them to any vision model, and produces structured **Context Packs** your coding agent can act on.
16
+
17
+ ```
18
+ Screen / Voice / Annotations --> Vision Model + Whisper --> Context Pack --> Coding Agent
19
+ ```
20
+
21
+ ## Features
22
+
23
+ - **Multi-monitor capture** — active, specific (1/2/3), or all monitors at once
24
+ - **Voice narration** — record, clean (noise removal + normalization), transcribe locally
25
+ - **Annotations** — arrows, circles, rectangles, freehand, multi-line text on a frozen screenshot
26
+ - **Screen clips** — record short videos, extract keyframes, analyze the sequence
27
+ - **Image optimization** — auto resize + compress (~5x smaller, no visible quality loss at the default JPEG quality of 90)
28
+ - **MCP server** — coding agents query your screen directly via Model Context Protocol
29
+ - **Context Packs** — structured output: goal, environment, errors, signals, next steps
30
+
31
+ ## Platforms
32
+
33
+ | Platform | Screenshot | Voice | Annotation | Hotkeys |
34
+ |----------|-----------|-------|------------|---------|
35
+ | Windows | PowerShell .NET | ffmpeg | Pillow | Ctrl+Shift+1-5 |
36
+ | macOS | screencapture | ffmpeg | Pillow | - |
37
+ | Linux | scrot/maim/flameshot | ffmpeg | Pillow | - |
38
+ | Android | ADB (Termux) | Termux:API | - | - |
39
+
40
+ ## Setup
41
+
42
+ ### 1. Install dependencies
43
+
44
+ ```bash
45
+ pip install Pillow fastmcp # Core + MCP server
46
+ pip install openai-whisper # Local voice transcription (optional)
47
+ # ffmpeg is required for voice/clips — install via your package manager
48
+ ```
49
+
50
+ ### 2. Configure a vision provider
51
+
52
+ Eye2byte works with **any vision model** — local or cloud. Set your provider in `~/.eye2byte/config.json` or the Settings UI:
53
+
54
+ | Provider | Setup | Cost |
55
+ |----------|-------|------|
56
+ | **Ollama** (local) | [Install Ollama](https://ollama.com), `ollama pull qwen3-vl:8b` | Free |
57
+ | **Gemini** | Set `GEMINI_API_KEY` in `.env` | Free tier (1000 req/day) |
58
+ | **OpenRouter** | Set `OPENROUTER_API_KEY` in `.env` | Free models available |
59
+ | **Hyperbolic** | Set `HYPERBOLIC_API_KEY` in `.env` | Pay per use |
60
+
61
+ ```bash
62
+ # .env file (project dir, cwd, or ~/.eye2byte/.env)
63
+ GEMINI_API_KEY=your-key-here
64
+ # or OPENROUTER_API_KEY=...
65
+ # or HYPERBOLIC_API_KEY=...
66
+ ```
67
+
68
+ ### 3. Run
69
+
70
+ ```bash
71
+ python eye2byte.py capture # Screenshot + analysis
72
+ python eye2byte.py capture --voice # + voice narration
73
+ python eye2byte.py capture --mode window # Active window only
74
+ python eye2byte_ui.py # Launch control panel
75
+ ```
76
+
77
+ ## Control Panel
78
+
79
+ ```bash
80
+ python eye2byte_ui.py
81
+ ```
82
+
83
+ A small always-on-top floating panel. Drag it anywhere. Global hotkeys work even when the panel isn't focused.
84
+
85
+ ### Global Hotkeys (Windows)
86
+
87
+ These work system-wide — no need to focus the Eye2byte window:
88
+
89
+ | Hotkey | Action | Notes |
90
+ |--------|--------|-------|
91
+ | `Ctrl+Shift+1` | Capture screenshot | Uses current mode (Full/Window/Region) |
92
+ | `Ctrl+Shift+2` | Annotate | Freezes screen, opens drawing overlay |
93
+ | `Ctrl+Shift+3` | Toggle voice recording | Press once to start, again to stop |
94
+ | `Ctrl+Shift+5` | Grab clipboard image | Analyzes whatever image is on your clipboard |
95
+
96
+ All keyboard shortcuts are customizable from Settings > Keyboard Shortcuts.
97
+
98
+ ### Panel Controls
99
+
100
+ | Control | Action |
101
+ |---------|--------|
102
+ | `Space` (hold) | Push-to-talk — hold to record, release to stop |
103
+ | Mode selector | Cycle between Full Screen / Window / Region |
104
+ | Settings | Configure provider, model, image quality, cleanup |
105
+ | Copy @path | Copy session path to clipboard for `@`-mentioning |
106
+
107
+ ### Annotation Overlay
108
+
109
+ When you press `Ctrl+Shift+2` or click Annotate, the screen freezes and you can draw on it:
110
+
111
+ | Key | Tool | How to use |
112
+ |-----|------|-----------|
113
+ | `X` | Arrow | Click and drag to draw an arrow |
114
+ | `C` | Circle | Click and drag to draw an ellipse |
115
+ | `V` | Rectangle | Click and drag to draw a box |
116
+ | `B` | Freehand | Click and drag to draw freely |
117
+ | `T` | Text | Click to place, type your text |
118
+
119
+ | Action | How |
120
+ |--------|-----|
121
+ | **Save** | `Enter` (commits annotations and sends to vision model) |
122
+ | **Cancel** | `Escape` (discards all annotations) |
123
+ | **Undo** | Right-click near an annotation to remove it |
124
+ | **Newline in text** | `Shift+Enter` (Enter alone commits the text) |
125
+ | **Multi-line text** | Text box auto-grows up to 6 lines |
126
+
127
+ ### Voice Recording
128
+
129
+ Three ways to record voice:
130
+
131
+ 1. **Toggle** — `Ctrl+Shift+3` starts recording, press again to stop
132
+ 2. **Push-to-talk** — Hold `Space` while panel is focused
133
+ 3. **Mouse PTT** — Hold click on the Record button
134
+
135
+ While recording, any captures you take are automatically bundled with the voice note into a single session.
136
+
137
+ ## MCP Server
138
+
139
+ Eye2byte exposes 6 tools via the [Model Context Protocol](https://modelcontextprotocol.io), letting coding agents capture and analyze your screen directly.
140
+
141
+ | Tool | Description |
142
+ |------|-------------|
143
+ | `capture_and_summarize` | Screenshot + vision analysis. Supports monitor selection, delay, window targeting |
144
+ | `capture_with_voice` | Screenshot + voice recording + transcription + analysis |
145
+ | `record_clip_and_summarize` | Screen clip with keyframe extraction and sequence analysis |
146
+ | `summarize_screenshot` | Analyze an existing image file |
147
+ | `transcribe_audio` | Local Whisper transcription of any audio file |
148
+ | `get_recent_context` | Retrieve recent Context Pack summaries |
149
+
150
+ ### Local Setup (stdio)
151
+
152
+ Eye2byte runs on the machine whose screen you want to capture. For local agents like Claude Code on the same machine, use stdio transport:
153
+
154
+ **Claude Code** — add to your project's `.mcp.json`:
155
+
156
+ ```json
157
+ {
158
+ "mcpServers": {
159
+ "eye2byte": {
160
+ "command": "python",
161
+ "args": ["C:/path/to/eye2byte_mcp.py"]
162
+ }
163
+ }
164
+ }
165
+ ```
166
+
167
+ That's it — Claude Code will auto-start the server. Use full absolute paths.
168
+
169
+ ### Remote Setup (SSE)
170
+
171
+ When your coding agent runs on a **different machine** (cloud VM, SSH dev box, CI runner) but needs to see your local screen, use SSE transport:
172
+
173
+ **Step 1 — On your local machine** (the one with the screen):
174
+
175
+ ```bash
176
+ # Install Eye2byte + dependencies
177
+ pip install Pillow fastmcp
178
+ pip install openai-whisper # optional, for voice
179
+
180
+ # Start the SSE server
181
+ python eye2byte_mcp.py --sse # No auth (LAN only)
182
+ python eye2byte_mcp.py --sse --token mysecret123 # Bearer token auth
183
+ python eye2byte_mcp.py --sse --port 9000 --token abc # Custom port + auth
184
+ ```
185
+
186
+ The server stays running and accepts connections from any machine on your network. Use `--token` when the server is reachable beyond your trusted LAN.
187
+
188
+ **Step 2 — On the remote machine** (where the coding agent runs):
189
+
190
+ Nothing to install. Just configure the MCP client to point at your local IP:
191
+
192
+ ```json
193
+ {
194
+ "mcpServers": {
195
+ "eye2byte": {
196
+ "url": "http://YOUR_LOCAL_IP:8808/sse",
197
+ "headers": {"Authorization": "Bearer mysecret123"}
198
+ }
199
+ }
200
+ }
201
+ ```
202
+
203
+ Omit the `headers` field if the server was started without `--token`.
204
+
205
+ Find your local IP: `ipconfig` (Windows), `ip addr` (Linux), or `ifconfig` (macOS).
206
+
207
+ **Firewall:** You may need to allow inbound TCP on port 8808. On Windows, run as admin:
208
+
209
+ ```powershell
210
+ netsh advfirewall firewall add rule name="Eye2byte MCP" dir=in action=allow protocol=TCP localport=8808
211
+ ```
212
+
213
+ ### Multi-monitor Examples
214
+
215
+ ```
216
+ capture_and_summarize(monitor=0) # active monitor (default)
217
+ capture_and_summarize(monitor=1) # first monitor
218
+ capture_and_summarize(monitor=2) # second monitor
219
+ capture_and_summarize(monitor=-1) # ALL monitors at once
220
+ ```
221
+
222
+ ## Context Pack Format
223
+
224
+ Every analysis produces a structured Context Pack:
225
+
226
+ ```markdown
227
+ ## Goal — what the user appears to be doing
228
+ ## Environment — OS, editor, repo, branch, language
229
+ ## Screen State — visible panels, files, terminal output
230
+ ## Signals — verbatim errors, stack traces, warnings
231
+ ## Likely Situation — what's probably happening
232
+ ## Suggested Next Info — what a coding agent needs next
233
+ ```
234
+
235
+ ## Configuration
236
+
237
+ Config: `~/.eye2byte/config.json` (created on first run or via `python eye2byte.py init`)
238
+
239
+ | Setting | Default | Description |
240
+ |---------|---------|-------------|
241
+ | `provider` | `"ollama"` | Vision provider: ollama, gemini, openrouter, hyperbolic |
242
+ | `model` | `"auto"` | Model name or "auto" for auto-detection |
243
+ | `voice_clean` | `true` | Noise removal + pause trimming + volume normalization |
244
+ | `auto_cleanup_days` | `7` | Delete old captures/summaries after N days (0=disabled) |
245
+ | `image_max_size` | `1920` | Max image dimension before LLM processing |
246
+ | `image_quality` | `90` | JPEG quality (1-100) |
247
+
248
+ ## Files
249
+
250
+ | File | Purpose |
251
+ |------|---------|
252
+ | `eye2byte.py` | Core engine — capture, voice, clip, summarize, watch |
253
+ | `eye2byte_ui.py` | Control panel with hotkeys and annotation overlay |
254
+ | `eye2byte_mcp.py` | MCP server for coding agent integration |
255
+
256
+ ## License
257
+
258
+ MIT