talk-to-copilot 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,129 +1,117 @@
  # talk-to-copilot

- A transparent PTY wrapper for [GitHub Copilot CLI](https://github.com/github/copilot-cli) that adds **voice input** and **screenshot attachment** — without changing how you use Copilot at all.
+ > Talk to [GitHub Copilot CLI](https://docs.github.com/copilot/concepts/agents/about-copilot-cli) with your voice and share screenshots — without leaving your terminal.

- Run `ttc` instead of `copilot`. Everything works identically, plus two new hotkeys.
+ `ttc` is a drop-in replacement for the `copilot` command. It wraps Copilot CLI transparently and adds two hotkeys:

- ```
- Ctrl+R → Start / stop voice recording (transcription injected as text)
- Ctrl+P → Interactive screenshot picker (injected as @/path/to/file.png)
- ```
+ | Hotkey | What it does |
+ |--------|-------------|
+ | **Ctrl+R** | Start / stop voice recording → transcription is typed into your prompt |
+ | **Ctrl+P** | Screenshot picker → file path is injected as `@/path/screenshot.png` |
+
+ Everything else — all Copilot features, slash commands, modes — works exactly as normal.

  ---

- ## Installation
+ ## Requirements
+
+ - **macOS** (uses `avfoundation` for mic input and `screencapture` for screenshots)
+ - **[GitHub Copilot CLI](https://docs.github.com/copilot/concepts/agents/about-copilot-cli)** — must be installed and authenticated
+ - **Node.js ≥ 18** — `brew install node`
+ - **ffmpeg** — `brew install ffmpeg`
+ - **whisper.cpp** — `brew install whisper-cpp`

- ### Homebrew (recommended: installs ffmpeg + whisper-cpp automatically)
+ > **Apple Silicon:** The `base.en` model transcribes in ~1–2 s on M1/M2/M3. Use `small.en` for better accuracy at ~3–4 s.
+
+ ---
+
+ ## Installation

  ```bash
- brew tap Errr0rr404/ttc
- brew install ttc
- whisper-cpp-download-ggml-model base.en  # one-time: download speech model
- ttc --setup                              # verify everything is ready
+ npm install -g talk-to-copilot
  ```

- ### npm
+ Then install the speech dependencies if you haven't already:

  ```bash
- npm install -g talk-to-copilot
- # You still need ffmpeg and whisper-cpp:
  brew install ffmpeg whisper-cpp
- whisper-cpp-download-ggml-model base.en
- ttc --setup
  ```

- ---
+ Download a whisper speech model (required for voice input):

- ## How it works
+ ```bash
+ # Option A — using the whisper-cpp helper script (if available)
+ whisper-cpp-download-ggml-model base.en

+ # Option B — direct download (works everywhere)
+ mkdir -p ~/.copilot/models
+ curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" \
+   -o ~/.copilot/models/ggml-base.en.bin
  ```
- ┌──────────────────────────────────────────────────────────┐
- │                    ttc (PTY wrapper)                     │
- │                                                          │
- │  stdin ──► intercept Ctrl+R / Ctrl+P                     │
- │          │                 │                             │
- │          ▼                 ▼                             │
- │   voice recorder    screencapture -i                     │
- │ ffmpeg + whisper-cli  saves PNG to /tmp                  │
- │          │                 │                             │
- │          └───────────────┬─┘                             │
- │                          ▼                               │
- │                 inject text / @path                      │
- │                          │                               │
- │  copilot (PTY child) ◄───┘  (all other keystrokes pass   │
- │                             through unchanged)           │
- └──────────────────────────────────────────────────────────┘
+
+ Verify everything is wired up:
+
+ ```bash
+ ttc --setup
  ```

- Transcriptions are injected as raw text (**no Enter is pressed automatically**) so you can review and edit before sending. Screenshots are injected as `@/tmp/copilot-screenshots/screenshot-<ts>.png` which Copilot CLI's `@` file-mention picks up.
+ You should see all green checkmarks. If anything is missing, the setup output tells you exactly what to fix.
59
59
 
60
60
  ---
61
61
 
62
- ## Prerequisites
62
+ ## Quick Start
63
63
 
64
- | Tool | Install |
65
- |------|---------|
66
- | [GitHub Copilot CLI](https://github.com/github/copilot-cli) | see their docs |
67
- | [ffmpeg](https://ffmpeg.org) | `brew install ffmpeg` |
68
- | [whisper.cpp](https://github.com/ggerganov/whisper.cpp) | `brew install whisper-cpp` |
69
- | A whisper model | `whisper-cpp-download-ggml-model base.en` |
70
- | Node.js ≥ 18 | `brew install node` |
64
+ ```bash
65
+ ttc
66
+ ```
71
67
 
72
- > **Apple Silicon note:** The `base.en` model runs in ~1–2 s on M1/M2/M3. Use `small.en` for better accuracy at ~3–4 s.
68
+ That's it. You're now inside Copilot CLI with voice and screenshot support active.
73
69
 
74
70
  ---
75
71
 
76
- ## Installation
72
+ ## Using Voice Input
77
73
 
78
- ```bash
79
- git clone https://github.com/yourname/talk-to-copilot
80
- cd talk-to-copilot
81
- npm install
82
- npm link # makes `ttc` available system-wide
83
- ```
74
+ 1. **Press `Ctrl+R`** to start recording.
75
+ A macOS notification appears and your terminal title changes to `🎙 Recording…`
84
76
 
85
- Verify everything is wired up:
77
+ 2. **Speak your prompt** naturally — e.g. _"refactor this function to use async await"_
86
78
 
87
- ```bash
88
- talk --setup
89
- ```
79
+ 3. **Press `Ctrl+R` again** to stop.
80
+ Transcription runs locally (`⏳ Transcribing…`) — no audio ever leaves your machine.
81
+
82
+ 4. **Your words appear as text** in the Copilot prompt. Review and edit if needed, then press **Enter** to send.
83
+
84
+ > Press **Ctrl+C** while recording to cancel without transcribing.
90
85
 
91
86
  ---

- ## Usage
+ ## Using Screenshots

- ```bash
- talk          # drop-in replacement for `copilot`
- talk --setup  # check dependencies and show config
- ```
+ 1. **Press `Ctrl+P`** — the macOS screenshot overlay opens (same UI as `⌘⇧4`).

- Any flags you pass are forwarded to `copilot` directly:
+ 2. **Click and drag** to select any area of your screen — a browser error, a UI bug, a diagram, anything.

- ```bash
- talk --experimental
- talk --banner
- ```
+ 3. **The file path is injected** into your prompt as `@/tmp/copilot-screenshots/screenshot-<timestamp>.png`.
+
+ 4. **Add context** if you want (e.g. _"what's wrong with this?"_), then press **Enter**.

- ### Voice recording
+ ---

- 1. Press **Ctrl+R** — the terminal title changes to `🎙 Recording…` and a macOS notification appears.
- 2. Speak your prompt.
- 3. Press **Ctrl+R** again — transcription begins (`⏳ Transcribing…`).
- 4. The transcribed text appears in the Copilot input. Review it, then press **Enter** to send.
- 5. Press **Ctrl+C** while recording to cancel without transcribing.
+ ## Passing Flags to Copilot

- ### Screenshot
+ Any arguments after `ttc` are forwarded directly to `copilot`:

- 1. Press **Ctrl+P** — the macOS screenshot overlay appears (same as ⌘⇧4).
- 2. Draw a selection around the area you want to share.
- 3. The path is injected as `@/tmp/copilot-screenshots/screenshot-<ts>.png`.
- 4. Type any additional context, then press **Enter**.
+ ```bash
+ ttc --experimental
+ ttc --banner
+ ttc --help
+ ```

  ---

  ## Configuration

- Config is stored at `~/.copilot/talk-to-copilot.json`:
+ Settings are stored at `~/.copilot/talk-to-copilot.json` and created automatically on first run.

  ```json
  {
@@ -135,22 +123,96 @@ Config is stored at `~/.copilot/talk-to-copilot.json`:

  | Key | Default | Description |
  |-----|---------|-------------|
- | `modelPath` | auto-detected | Path to your `.bin` whisper model |
- | `audioDevice` | `:0` | ffmpeg avfoundation mic index (run `ffmpeg -f avfoundation -list_devices true -i ""` to list) |
- | `autoSubmit` | `false` | Set to `true` to auto-press Enter after transcription |
+ | `modelPath` | auto-detected | Path to your whisper `.bin` model file |
+ | `audioDevice` | `:0` | ffmpeg avfoundation audio input index |
+ | `autoSubmit` | `false` | `true` = automatically press Enter after transcription |
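For reference, a fully populated config built from the three keys above might look like this (values are illustrative; the model path and device index depend on your machine):

```json
{
  "modelPath": "/Users/you/.copilot/models/ggml-base.en.bin",
  "audioDevice": ":0",
  "autoSubmit": false
}
```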
+
+ ### Finding your microphone index
+
+ ```bash
+ ffmpeg -f avfoundation -list_devices true -i "" 2>&1 | grep AVFoundation
+ ```
+
+ Look for your microphone in the output. The number in brackets (e.g. `[2]`) is the index — set `audioDevice` to `":2"`.
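If you'd rather script this lookup, the bracketed indices can be pulled out of the listing with a few lines of Node. This is a sketch only: `parseDevices` is a hypothetical helper (not part of ttc), and the sample line format follows typical ffmpeg avfoundation output.

```javascript
// Sketch: extract avfoundation device indices from ffmpeg's listing.
// Device lines look like: "[AVFoundation indev @ 0x...] [2] MacBook Pro Microphone"
function parseDevices(listing) {
  const devices = [];
  for (const line of listing.split('\n')) {
    // First "[digits]" group on the line is the device index;
    // header lines ("AVFoundation audio devices:") have no such group.
    const m = line.match(/\[(\d+)\]\s+(.+)$/);
    if (m) devices.push({ index: Number(m[1]), name: m[2].trim() });
  }
  return devices;
}
```

Setting `audioDevice` to `":" + index` then matches the format ffmpeg expects.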
137
+
138
+ ### Available whisper models
139
+
140
+ | Model | Size | Speed (M2) | Accuracy |
141
+ |-------|------|------------|----------|
142
+ | `tiny.en` | 75 MB | ~0.5 s | Good |
143
+ | `base.en` | 142 MB | ~1 s | Better |
144
+ | `small.en` | 466 MB | ~3 s | Best for most |
145
+
146
+ ```bash
147
+ whisper-cpp-download-ggml-model small.en
148
+ ```
149
+
150
+ Then update `modelPath` in `~/.copilot/talk-to-copilot.json`.
151
+
152
+ ---
153
+
154
+ ## How It Works
155
+
156
+ ```
157
+ ┌──────────────────────────────────────────────────────────┐
158
+ │ ttc (PTY wrapper) │
159
+ │ │
160
+ │ Your keystrokes ──► intercept Ctrl+R / Ctrl+P │
161
+ │ │ │ │
162
+ │ ▼ ▼ │
163
+ │ ffmpeg mic screencapture -i │
164
+ │ + whisper-cli saves PNG to /tmp │
165
+ │ │ │ │
166
+ │ └───────┬───────┘ │
167
+ │ ▼ │
168
+ │ inject text / @filepath │
169
+ │ │ │
170
+ │ copilot ◄─────────────────────────┘ │
171
+ │ (all other keystrokes pass through unchanged) │
172
+ └──────────────────────────────────────────────────────────┘
173
+ ```
174
+
175
+ Transcription is 100% local — whisper.cpp runs on your machine, nothing is sent to any server.
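The interception step in the diagram boils down to classifying raw stdin bytes before they reach the child process (Ctrl+R arrives as byte 0x12, Ctrl+P as 0x10). A minimal sketch of that idea follows; the function name and return shape are assumptions for illustration, not ttc's actual source.

```javascript
// Sketch of hotkey interception: decide what to do with each stdin
// chunk before forwarding it to the wrapped copilot process.
// (Illustrative only — not the package's real implementation.)
const CTRL_R = 0x12; // toggle voice recording
const CTRL_P = 0x10; // open screenshot picker

function classifyInput(chunk) {
  if (chunk.length === 1 && chunk[0] === CTRL_R) return { action: 'toggle-recording' };
  if (chunk.length === 1 && chunk[0] === CTRL_P) return { action: 'screenshot' };
  // Everything else passes through to the child PTY unchanged.
  return { action: 'forward', data: chunk };
}
```

In the real wrapper the `forward` case would be written to the child PTY (e.g. via a library like node-pty), while the two hotkey cases trigger the recorder or `screencapture -i`.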

  ---

  ## Troubleshooting

- **`Error: could not open input device`**
- Grant microphone access: *System Settings → Privacy & Security → Microphone → Terminal*.
+ **`posix_spawnp failed` on first run**
+ Run `npm install -g talk-to-copilot` again; the postinstall script will fix the permissions automatically.
+
+ **Microphone not being captured / transcription is always the same word**
+ Your `audioDevice` is pointing to the wrong input (e.g. a virtual audio device).
+ Run the device listing command above and update `audioDevice` in your config.

- **`No whisper model found`**
- Run `whisper-cpp-download-ggml-model base.en`, then `talk --setup` to verify.
+ **`Error: could not open input device`**
+ Grant microphone access to your terminal:
+ *System Settings → Privacy & Security → Microphone → enable your terminal app*
+
+ **`No whisper model found`**
+ ```bash
+ # Option A
+ whisper-cpp-download-ggml-model base.en
+ # Option B (direct download, works if the script is missing)
+ mkdir -p ~/.copilot/models
+ curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" \
+   -o ~/.copilot/models/ggml-base.en.bin
+ ```
+ Then run `ttc --setup` to confirm it's detected.
+
+ **Transcription is inaccurate**
+ Switch to a larger model:
+ ```bash
+ whisper-cpp-download-ggml-model small.en
+ ```
+ Then update `modelPath` in `~/.copilot/talk-to-copilot.json`.
+
+ **Screenshot doesn't attach**
+ Make sure Screen Recording permission is granted:
+ *System Settings → Privacy & Security → Screen Recording → enable your terminal app*
+
+ ---

- **Transcription is empty or garbled**
- Try a larger model: `whisper-cpp-download-ggml-model small.en`, then update `modelPath` in your config.
+ ## License

- **Wrong microphone is used**
- Run `ffmpeg -f avfoundation -list_devices true -i ""` and set `audioDevice` in the config (e.g. `":1"`).
+ MIT © [Errr0rr404](https://github.com/Errr0rr404)
package/bin/ttc CHANGED
@@ -65,7 +65,8 @@ function runSetup() {
      console.log(`✅ model — ${model}`);
    } else {
      console.log('❌ model — no model file found');
-     console.log('   Run: whisper-cpp-download-ggml-model base.en\n');
+     console.log('   Option A: whisper-cpp-download-ggml-model base.en');
+     console.log('   Option B: mkdir -p ~/.copilot/models && curl -L https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -o ~/.copilot/models/ggml-base.en.bin\n');
      allGood = false;
    }
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "talk-to-copilot",
-   "version": "1.0.1",
+   "version": "1.0.3",
    "description": "Voice + screenshot input wrapper for GitHub Copilot CLI — use your mic and screen instead of typing",
    "bin": {
      "ttc": "bin/ttc"
package/src/config.js CHANGED
@@ -8,6 +8,9 @@ const CONFIG_PATH = path.join(os.homedir(), '.copilot', 'talk-to-copilot.json');

  const WHISPER_MODEL_CANDIDATES = [
    path.join(os.homedir(), '.copilot', 'whisper-model.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-base.en.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-small.en.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-tiny.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-base.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-small.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-tiny.en.bin'),