talk-to-copilot 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,129 +1,117 @@
  # talk-to-copilot

- A transparent PTY wrapper for [GitHub Copilot CLI](https://github.com/github/copilot-cli) that adds **voice input** and **screenshot attachment** — without changing how you use Copilot at all.
+ > Talk to [GitHub Copilot CLI](https://docs.github.com/copilot/concepts/agents/about-copilot-cli) with your voice and share screenshots — without leaving your terminal.

- Run `ttc` instead of `copilot`. Everything works identically, plus two new hotkeys.
+ `ttc` is a drop-in replacement for the `copilot` command. It wraps Copilot CLI transparently and adds two hotkeys:

- ```
- Ctrl+R → Start / stop voice recording (transcription injected as text)
- Ctrl+P → Interactive screenshot picker (injected as @/path/to/file.png)
- ```
+ | Hotkey | What it does |
+ |--------|-------------|
+ | **Ctrl+R** | Start / stop voice recording → transcription is typed into your prompt |
+ | **Ctrl+P** | Screenshot picker → file path is injected as `@/path/screenshot.png` |
+
+ Everything else — all Copilot features, slash commands, modes — works exactly as normal.

  ---

- ## Installation
+ ## Requirements
+
+ - **macOS** (uses `avfoundation` for mic input and `screencapture` for screenshots)
+ - **[GitHub Copilot CLI](https://docs.github.com/copilot/concepts/agents/about-copilot-cli)** — must be installed and authenticated
+ - **Node.js ≥ 18** — `brew install node`
+ - **ffmpeg** — `brew install ffmpeg`
+ - **whisper.cpp** — `brew install whisper-cpp`

- ### Homebrew (recommended: installs ffmpeg + whisper-cpp automatically)
+ > **Apple Silicon:** The `base.en` model transcribes in ~1–2 s on M1/M2/M3. Use `small.en` for better accuracy at ~3–4 s.
+
+ ---
+
+ ## Installation

  ```bash
- brew tap Errr0rr404/ttc
- brew install ttc
- whisper-cpp-download-ggml-model base.en  # one-time: download speech model
- ttc --setup                              # verify everything is ready
+ npm install -g talk-to-copilot
  ```

- ### npm
+ Then install the speech dependencies if you haven't already:

  ```bash
- npm install -g talk-to-copilot
- # You still need ffmpeg and whisper-cpp:
  brew install ffmpeg whisper-cpp
- whisper-cpp-download-ggml-model base.en
- ttc --setup
  ```

- ---
+ Download a whisper speech model (required for voice input):

- ## How it works
+ ```bash
+ # Option A — using the whisper-cpp helper script (if available)
+ whisper-cpp-download-ggml-model base.en

+ # Option B — direct download (works everywhere)
+ mkdir -p ~/.copilot/models
+ curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" \
+   -o ~/.copilot/models/ggml-base.en.bin
  ```
- ┌──────────────────────────────────────────────────────────┐
- │                    ttc (PTY wrapper)                     │
- │                                                          │
- │  stdin ──► intercept Ctrl+R / Ctrl+P                     │
- │          │                 │                             │
- │          ▼                 ▼                             │
- │   voice recorder    screencapture -i                     │
- │ ffmpeg + whisper-cli  saves PNG to /tmp                  │
- │          │                 │                             │
- │          └───────────────┬─┘                             │
- │                          ▼                               │
- │                 inject text / @path                      │
- │                          │                               │
- │  copilot (PTY child) ◄───┘  (all other keystrokes pass   │
- │                             through unchanged)           │
- └──────────────────────────────────────────────────────────┘
+
+ Verify everything is wired up:
+
+ ```bash
+ ttc --setup
  ```

- Transcriptions are injected as raw text (**no Enter is pressed automatically**) so you can review and edit before sending. Screenshots are injected as `@/tmp/copilot-screenshots/screenshot-<ts>.png` which Copilot CLI's `@` file-mention picks up.
+ You should see all green checkmarks. If anything is missing, the setup output tells you exactly what to fix.
59
59
 
60
60
  ---
61
61
 
62
- ## Prerequisites
62
+ ## Quick Start
63
63
 
64
- | Tool | Install |
65
- |------|---------|
66
- | [GitHub Copilot CLI](https://github.com/github/copilot-cli) | see their docs |
67
- | [ffmpeg](https://ffmpeg.org) | `brew install ffmpeg` |
68
- | [whisper.cpp](https://github.com/ggerganov/whisper.cpp) | `brew install whisper-cpp` |
69
- | A whisper model | `whisper-cpp-download-ggml-model base.en` |
70
- | Node.js ≥ 18 | `brew install node` |
64
+ ```bash
65
+ ttc
66
+ ```
71
67
 
72
- > **Apple Silicon note:** The `base.en` model runs in ~1–2 s on M1/M2/M3. Use `small.en` for better accuracy at ~3–4 s.
68
+ That's it. You're now inside Copilot CLI with voice and screenshot support active.
73
69
 
74
70
  ---
75
71
 
76
- ## Installation
72
+ ## Using Voice Input
77
73
 
78
- ```bash
79
- git clone https://github.com/yourname/talk-to-copilot
80
- cd talk-to-copilot
81
- npm install
82
- npm link # makes `ttc` available system-wide
83
- ```
74
+ 1. **Press `Ctrl+R`** to start recording.
75
+ A macOS notification appears and your terminal title changes to `🎙 Recording…`
84
76
 
85
- Verify everything is wired up:
77
+ 2. **Speak your prompt** naturally — e.g. _"refactor this function to use async await"_
86
78
 
87
- ```bash
88
- talk --setup
89
- ```
79
+ 3. **Press `Ctrl+R` again** to stop.
80
+ Transcription runs locally (`⏳ Transcribing…`) — no audio ever leaves your machine.
81
+
82
+ 4. **Your words appear as text** in the Copilot prompt. Review and edit if needed, then press **Enter** to send.
83
+
84
+ > Press **Ctrl+C** while recording to cancel without transcribing.
90
85
 
91
86
  ---

- ## Usage
+ ## Using Screenshots

- ```bash
- talk          # drop-in replacement for `copilot`
- talk --setup  # check dependencies and show config
- ```
+ 1. **Press `Ctrl+P`** — the macOS screenshot overlay opens (same UI as `⌘⇧4`).

- Any flags you pass are forwarded to `copilot` directly:
+ 2. **Click and drag** to select any area of your screen — a browser error, a UI bug, a diagram, anything.

- ```bash
- talk --experimental
- talk --banner
- ```
+ 3. **The file path is injected** into your prompt as `@/tmp/copilot-screenshots/screenshot-<timestamp>.png`.
+
+ 4. **Add context** if you want (e.g. _"what's wrong with this?"_), then press **Enter**.

- ### Voice recording
+ ---

- 1. Press **Ctrl+R** — the terminal title changes to `🎙 Recording…` and a macOS notification appears.
- 2. Speak your prompt.
- 3. Press **Ctrl+R** again — transcription begins (`⏳ Transcribing…`).
- 4. The transcribed text appears in the Copilot input. Review it, then press **Enter** to send.
- 5. Press **Ctrl+C** while recording to cancel without transcribing.
+ ## Passing Flags to Copilot

- ### Screenshot
+ Any arguments after `ttc` are forwarded directly to `copilot`:

- 1. Press **Ctrl+P** — the macOS screenshot overlay appears (same as ⌘⇧4).
- 2. Draw a selection around the area you want to share.
- 3. The path is injected as `@/tmp/copilot-screenshots/screenshot-<ts>.png`.
- 4. Type any additional context, then press **Enter**.
+ ```bash
+ ttc --experimental
+ ttc --banner
+ ttc --help
+ ```

  ---

  ## Configuration

- Config is stored at `~/.copilot/talk-to-copilot.json`:
+ Settings are stored at `~/.copilot/talk-to-copilot.json` and created automatically on first run.

  ```json
  {
@@ -135,22 +123,96 @@ Config is stored at `~/.copilot/talk-to-copilot.json`:

  | Key | Default | Description |
  |-----|---------|-------------|
- | `modelPath` | auto-detected | Path to your `.bin` whisper model |
- | `audioDevice` | `:0` | ffmpeg avfoundation mic index (run `ffmpeg -f avfoundation -list_devices true -i ""` to list) |
- | `autoSubmit` | `false` | Set to `true` to auto-press Enter after transcription |
+ | `modelPath` | auto-detected | Path to your whisper `.bin` model file |
+ | `audioDevice` | `:0` | ffmpeg avfoundation audio input index |
+ | `autoSubmit` | `false` | `true` = automatically press Enter after transcription |
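For reference, a fully populated config built from the three keys above might look like this (values are illustrative; the model path and device index depend on your machine):

```json
{
  "modelPath": "/Users/you/.copilot/models/ggml-base.en.bin",
  "audioDevice": ":0",
  "autoSubmit": false
}
```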
+
+ ### Finding your microphone index
+
+ ```bash
+ ffmpeg -f avfoundation -list_devices true -i "" 2>&1 | grep AVFoundation
+ ```
+
+ Look for your microphone in the output. The number in brackets (e.g. `[2]`) is the index — set `audioDevice` to `":2"`.
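If you'd rather script this lookup, the bracketed indices can be pulled out of the listing with a few lines of Node. This is a sketch only: `parseDevices` is a hypothetical helper (not part of ttc), and the sample line format follows typical ffmpeg avfoundation output.

```javascript
// Sketch: extract avfoundation device indices from ffmpeg's listing.
// Device lines look like: "[AVFoundation indev @ 0x...] [2] MacBook Pro Microphone"
function parseDevices(listing) {
  const devices = [];
  for (const line of listing.split('\n')) {
    // First "[digits]" group on the line is the device index;
    // header lines ("AVFoundation audio devices:") have no such group.
    const m = line.match(/\[(\d+)\]\s+(.+)$/);
    if (m) devices.push({ index: Number(m[1]), name: m[2].trim() });
  }
  return devices;
}
```

Setting `audioDevice` to `":" + index` then matches the format ffmpeg expects.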
137
+
138
+ ### Available whisper models
139
+
140
+ | Model | Size | Speed (M2) | Accuracy |
141
+ |-------|------|------------|----------|
142
+ | `tiny.en` | 75 MB | ~0.5 s | Good |
143
+ | `base.en` | 142 MB | ~1 s | Better |
144
+ | `small.en` | 466 MB | ~3 s | Best for most |
145
+
146
+ ```bash
147
+ whisper-cpp-download-ggml-model small.en
148
+ ```
149
+
150
+ Then update `modelPath` in `~/.copilot/talk-to-copilot.json`.
151
+
152
+ ---
153
+
154
+ ## How It Works
155
+
156
+ ```
157
+ ┌──────────────────────────────────────────────────────────┐
158
+ │ ttc (PTY wrapper) │
159
+ │ │
160
+ │ Your keystrokes ──► intercept Ctrl+R / Ctrl+P │
161
+ │ │ │ │
162
+ │ ▼ ▼ │
163
+ │ ffmpeg mic screencapture -i │
164
+ │ + whisper-cli saves PNG to /tmp │
165
+ │ │ │ │
166
+ │ └───────┬───────┘ │
167
+ │ ▼ │
168
+ │ inject text / @filepath │
169
+ │ │ │
170
+ │ copilot ◄─────────────────────────┘ │
171
+ │ (all other keystrokes pass through unchanged) │
172
+ └──────────────────────────────────────────────────────────┘
173
+ ```
174
+
175
+ Transcription is 100% local — whisper.cpp runs on your machine, nothing is sent to any server.
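The interception step in the diagram boils down to classifying raw stdin bytes before they reach the child process (Ctrl+R arrives as byte 0x12, Ctrl+P as 0x10). A minimal sketch of that idea follows; the function name and return shape are assumptions for illustration, not ttc's actual source.

```javascript
// Sketch of hotkey interception: decide what to do with each stdin
// chunk before forwarding it to the wrapped copilot process.
// (Illustrative only — not the package's real implementation.)
const CTRL_R = 0x12; // toggle voice recording
const CTRL_P = 0x10; // open screenshot picker

function classifyInput(chunk) {
  if (chunk.length === 1 && chunk[0] === CTRL_R) return { action: 'toggle-recording' };
  if (chunk.length === 1 && chunk[0] === CTRL_P) return { action: 'screenshot' };
  // Everything else passes through to the child PTY unchanged.
  return { action: 'forward', data: chunk };
}
```

In the real wrapper the `forward` case would be written to the child PTY (e.g. via a library like node-pty), while the two hotkey cases trigger the recorder or `screencapture -i`.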

  ---

  ## Troubleshooting

- **`Error: could not open input device`**
- Grant microphone access: *System Settings → Privacy & Security → Microphone → Terminal*.
+ **`posix_spawnp failed` on first run**
+ Run `npm install -g talk-to-copilot` again; the postinstall script will fix the permissions automatically.
+
+ **Microphone not being captured / transcription is always the same word**
+ Your `audioDevice` is pointing to the wrong input (e.g. a virtual audio device).
+ Run the device listing command above and update `audioDevice` in your config.

- **`No whisper model found`**
- Run `whisper-cpp-download-ggml-model base.en`, then `talk --setup` to verify.
+ **`Error: could not open input device`**
+ Grant microphone access to your terminal:
+ *System Settings → Privacy & Security → Microphone → enable your terminal app*
+
+ **`No whisper model found`**
+ ```bash
+ # Option A
+ whisper-cpp-download-ggml-model base.en
+ # Option B (direct download, works if the script is missing)
+ mkdir -p ~/.copilot/models
+ curl -L "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin" \
+   -o ~/.copilot/models/ggml-base.en.bin
+ ```
+ Then run `ttc --setup` to confirm it's detected.
+
+ **Transcription is inaccurate**
+ Switch to a larger model:
+ ```bash
+ whisper-cpp-download-ggml-model small.en
+ ```
+ Then update `modelPath` in `~/.copilot/talk-to-copilot.json`.
+
+ **Screenshot doesn't attach**
+ Make sure Screen Recording permission is granted:
+ *System Settings → Privacy & Security → Screen Recording → enable your terminal app*
+
+ ---

- **Transcription is empty or garbled**
- Try a larger model: `whisper-cpp-download-ggml-model small.en`, then update `modelPath` in your config.
+ ## License

- **Wrong microphone is used**
- Run `ffmpeg -f avfoundation -list_devices true -i ""` and set `audioDevice` in the config (e.g. `":1"`).
+ MIT © [Errr0rr404](https://github.com/Errr0rr404)
package/bin/ttc CHANGED
@@ -65,7 +65,8 @@ function runSetup() {
      console.log(`✅ model — ${model}`);
    } else {
      console.log('❌ model — no model file found');
-     console.log('   Run: whisper-cpp-download-ggml-model base.en\n');
+     console.log('   Option A: whisper-cpp-download-ggml-model base.en');
+     console.log('   Option B: mkdir -p ~/.copilot/models && curl -L https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -o ~/.copilot/models/ggml-base.en.bin\n');
      allGood = false;
    }
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "talk-to-copilot",
-   "version": "1.0.1",
+   "version": "1.0.3",
    "description": "Voice + screenshot input wrapper for GitHub Copilot CLI — use your mic and screen instead of typing",
    "bin": {
      "ttc": "bin/ttc"
package/src/config.js CHANGED
@@ -8,6 +8,9 @@ const CONFIG_PATH = path.join(os.homedir(), '.copilot', 'talk-to-copilot.json');

  const WHISPER_MODEL_CANDIDATES = [
    path.join(os.homedir(), '.copilot', 'whisper-model.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-base.en.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-small.en.bin'),
+   path.join(os.homedir(), '.copilot', 'models', 'ggml-tiny.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-base.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-small.en.bin'),
    path.join(__dirname, '..', 'models', 'ggml-tiny.en.bin'),