talktocursor 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/INSTALL.md +249 -0
package/README.md +177 -0
package/build/config.js +82 -0
package/build/index.js +166 -0
package/build/settings-server.js +124 -0
package/package.json +54 -0
package/public/index.html +1574 -0
package/scripts/auto-submit.py +394 -0
package/scripts/silence_detector.py +146 -0

package/INSTALL.md ADDED

@@ -0,0 +1,249 @@
+# TalkToCursor - Installation Guide
+A hands-free voice interface for Cursor AI. Your coding assistant speaks progress updates aloud using ElevenLabs TTS and can listen for voice commands through Wispr Flow.
+---
+## Quick Install (via npm)
+```bash
+npm install -g talktocursor
+```
+Then add to your Cursor MCP config (`~/.cursor/mcp.json`):
+```json
+{
+  "mcpServers": {
+    "tts": {
+      "command": "npx",
+      "args": ["-y", "talktocursor"]
+    }
+  }
+}
+```
+Skip to [Step 3: Get your ElevenLabs API Key](#step-3-get-your-elevenlabs-api-key).
+---
+## Manual Install (from source)
+### Step 1: Download and extract
+**Option A** - From tar.gz:
+```bash
+tar -xzf talk-to-cursor.tar.gz
+cd cursor-tts-mcp
+```
+**Option B** - From GitHub:
+```bash
+git clone https://github.com/yourusername/cursor-tts-mcp.git
+cd cursor-tts-mcp
+```
+### Step 2: Install dependencies and build
+```bash
+npm install
+npm run build
+```
+Then add to your Cursor MCP config (`~/.cursor/mcp.json`):
+```json
+{
+  "mcpServers": {
+    "tts": {
+      "command": "node",
+      "args": ["/ABSOLUTE/PATH/TO/cursor-tts-mcp/build/index.js"]
+    }
+  }
+}
+```
+> **Important:** Replace `/ABSOLUTE/PATH/TO/cursor-tts-mcp` with the actual path on your machine.
+>
+> - macOS/Linux: `/Users/yourname/cursor-tts-mcp/build/index.js` (on Linux, home directories are typically under `/home/yourname/`)
+> - Windows: `C:\\Users\\yourname\\cursor-tts-mcp\\build\\index.js`
+---
+## Step 3: Get your ElevenLabs API Key
+1. Go to [elevenlabs.io/app/settings/api-keys](https://elevenlabs.io/app/settings/api-keys)
+2. Sign up or log in (free tier available with 10,000 characters/month)
+3. Create a new API key and copy it
+## Step 4: Configure via Settings UI
+```bash
+npm run settings
+```
+Open **http://localhost:3847** in your browser, then:
+1. Paste your ElevenLabs API key and click **Save API Key**
+2. Click **Test Key** to verify it works
+3. (Optional) Browse and select a voice
+4. (Optional) Adjust voice settings (speed, stability, style)
+5. (Optional) Enable Auto-Listen for hands-free voice loop
+> **Alternatively**, you can set your API key via environment variable:
+> ```json
+> {
+>   "mcpServers": {
+>     "tts": {
+>       "command": "npx",
+>       "args": ["-y", "talktocursor"],
+>       "env": {
+>         "ELEVENLABS_API_KEY": "your-api-key-here"
+>       }
+>     }
+>   }
+> }
+> ```
+## Step 5: Restart Cursor
+**Fully quit Cursor** (Cmd+Q on Mac) and reopen it. The MCP server needs a fresh restart to load.
+## Step 6: Test it
+1. Open a new Cursor chat (Cmd+L)
+2. Check that the `speak` tool appears in "Available Tools"
+3. Type: **"Say hello using the speak tool"**
+4. You should hear the voice through your speakers!
+---
+## Optional: Voice Feedback Rule
+For the best experience, create a Cursor rule so the agent automatically speaks at key moments.
+Create the file `~/.cursor/rules/voice-feedback.mdc`:
+```markdown
+---
+description: MANDATORY voice feedback - agent MUST speak at task start and completion
+alwaysApply: true
+---
+# Voice Feedback Rule
+You MUST use the `speak` tool at these moments:
+- **Task Start**: Briefly announce what you're about to do
+- **Task Completion**: Summarize what was done
+Keep messages concise (1-2 sentences). Always speak at start and end of every task.
+```
+---
+## Optional: Hands-Free Dictation (macOS only)
+For a fully hands-free experience with voice dictation:
+### Auto-Submit Setup
+1. Enable **Auto-Submit** in the settings UI
+2. Set up a Python virtual environment:
+```bash
+cd cursor-tts-mcp
+python3 -m venv .venv
+source .venv/bin/activate
+pip install pynput pyobjc-framework-ApplicationServices
+```
+3. Run in a separate terminal:
+```bash
+npm run auto-submit
+```
+4. Grant Accessibility permissions when prompted:
+   - System Settings > Privacy & Security > Accessibility
+   - Add your terminal app (Terminal.app, iTerm, or Cursor)
+### Wispr Voice Loop Setup (requires Wispr Flow)
+For a full conversational voice loop using [Wispr Flow](https://wispr.com):
+1. Install Wispr Flow and configure its dictation hotkey
+2. Enable **Wispr Voice Loop** in the settings UI
+3. Configure the hotkey to match your Wispr Flow settings
+4. Install the additional dependencies:
+```bash
+source .venv/bin/activate
+pip install sounddevice numpy
+brew install portaudio
+```
+5. Grant Microphone permissions to your terminal app
+6. Run the auto-submit script (handles both auto-submit and voice loop):
+```bash
+npm run auto-submit
+```
+---
+## Configuration
+All settings are stored in `config.json` in the project root (one directory above `build/`). You can edit this file directly or use the settings UI.
+| Setting | Description | Default |
+|---------|-------------|---------|
+| `apiKey` | ElevenLabs API key | (required) |
+| `voiceId` | ElevenLabs voice ID | `21m00Tcm4TlvDq8ikWAM` (Rachel) |
+| `model` | TTS model | `eleven_flash_v2_5` |
+| `voiceSettings.speed` | Speech speed (0.7-1.2) | 1.0 |
+| `voiceSettings.stability` | Voice stability (0-1) | 0.5 |
+| `voiceSettings.similarityBoost` | Voice similarity (0-1) | 0.75 |
+| `voiceSettings.style` | Style exaggeration (0-1) | 0.0 |
+| `autoListen` | Auto-listen after tasks | true |
+| `autoSubmit.enabled` | Auto-press Enter | false |
+| `wisprLoop.enabled` | Voice loop with Wispr | false |
+---
+## Troubleshooting
+### Tool doesn't appear in Cursor
+- Fully quit and restart Cursor (Cmd+Q)
+- Verify `~/.cursor/mcp.json` has the correct path
+- Run `npm run build` to ensure the project is compiled
+### "API key not set" error
+- Open settings: `npm run settings`
+- Enter your API key and save
+- Restart Cursor
+### No audio output
+- Check system volume and speaker output
+- Verify `mpv` is installed with `mpv --version`; if it is missing, install it with `brew install mpv`
+- Test your API key in the settings UI
+### Auto-submit not working
+- Ensure macOS Accessibility permissions are granted
+- Check that Cursor is the frontmost app
+- Try increasing the silence delay in settings
+---
+## Scripts Reference
+| Command | Description |
+|---------|-------------|
+| `npm run build` | Compile TypeScript |
+| `npm run settings` | Open settings UI (port 3847) |
+| `npm run auto-submit` | Start auto-submit + voice loop (macOS) |
+---
+## License
+MIT
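
Note on package/INSTALL.md: the Configuration table lists a numeric range for each voice setting (speed 0.7-1.2, the others 0-1), and `config.json` can be edited by hand. A minimal sketch of how out-of-range values could be clamped before use follows. The helper name `clampVoiceSettings` and both lookup tables are hypothetical and are not part of the package; the ranges and defaults are copied from the table.

```javascript
// Hypothetical helper (not part of talktocursor): keep voice settings inside
// the ranges listed in the INSTALL.md Configuration table.
const VOICE_SETTING_RANGES = {
  speed: [0.7, 1.2],
  stability: [0, 1],
  similarityBoost: [0, 1],
  style: [0, 1],
};

// Defaults as documented in the same table.
const VOICE_SETTING_DEFAULTS = {
  speed: 1.0,
  stability: 0.5,
  similarityBoost: 0.75,
  style: 0.0,
};

function clampVoiceSettings(input = {}) {
  const result = {};
  for (const [key, [min, max]] of Object.entries(VOICE_SETTING_RANGES)) {
    const raw = input[key];
    // Use the documented default when the value is missing or not a finite number.
    result[key] =
      typeof raw === "number" && Number.isFinite(raw)
        ? Math.min(max, Math.max(min, raw))
        : VOICE_SETTING_DEFAULTS[key];
  }
  return result;
}
```

With this sketch, `clampVoiceSettings({ speed: 2 })` yields a `speed` of 1.2 and the documented defaults for the other three settings.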

package/README.md ADDED

@@ -0,0 +1,177 @@
+# Cursor TTS MCP Server
+A Model Context Protocol (MCP) server that adds text-to-speech capabilities to Cursor AI. The agent can speak progress updates, completions, and responses aloud using ElevenLabs TTS, enabling hands-free coding workflows.
+## Features
+- 🔊 **Text-to-Speech** - Agent speaks aloud via ElevenLabs API
+- 🎛️ **Settings UI** - Web interface to configure API key, voice, and speech parameters
+- ⚡ **Auto-Submit** - Optional: automatically press Enter when dictation finishes (hands-free)
+- 🎨 **Voice Presets** - Quick settings for fast, slow, expressive, stable, and dramatic speech
+- 🔧 **Configurable** - Speed, stability, similarity boost, and style exaggeration controls
+## Installation
+### 1. Clone or download this repository
+```bash
+git clone https://github.com/yourusername/cursor-tts-mcp.git
+cd cursor-tts-mcp
+```
+Or download and extract the ZIP.
+### 2. Install dependencies
+```bash
+npm install
+```
+### 3. Build the project
+```bash
+npm run build
+```
+### 4. Configure Cursor to use the MCP server
+Edit (or create) `~/.cursor/mcp.json`:
+```json
+{
+  "mcpServers": {
+    "tts": {
+      "command": "node",
+      "args": ["/ABSOLUTE/PATH/TO/cursor-tts-mcp/build/index.js"]
+    }
+  }
+}
+```
+**Important:** Replace `/ABSOLUTE/PATH/TO/cursor-tts-mcp` with the actual full path to where you cloned/downloaded this project.
+For example:
+- macOS/Linux: `/Users/yourname/cursor-tts-mcp/build/index.js` (on Linux, home directories are typically under `/home/yourname/`)
+- Windows: `C:\\Users\\yourname\\cursor-tts-mcp\\build\\index.js`
+### 5. Get your ElevenLabs API key
+1. Go to [elevenlabs.io/app/settings/api-keys](https://elevenlabs.io/app/settings/api-keys)
+2. Sign up or log in (free tier available)
+3. Create a new API key and copy it
+### 6. Configure the MCP server
+Open the settings UI:
+```bash
+npm run settings
+```
+Then open http://localhost:3847 in your browser and:
+1. Paste your ElevenLabs API key
+2. Click "Test Key" to verify it works
+3. Click "Save API Key"
+4. (Optional) Choose a voice, model, and voice settings
+5. (Optional) Enable Auto-Submit if you want hands-free dictation
+### 7. Restart Cursor
+**Fully quit Cursor** (Cmd+Q on Mac, or close completely on Windows/Linux) and reopen it.
+### 8. Test it
+1. Open a new Cursor chat (Cmd+L)
+2. Check that the `speak` or `user-tts-speak` tool appears in "Available Tools"
+3. Type: **"Say hello using the speak tool"**
+4. You should hear the voice through your speakers!
+## Usage
+Once installed, the Cursor AI agent will automatically speak at key moments:
+- When starting a task
+- When completing a task
+- When encountering errors or needing clarification
+- At major progress milestones
+You can customize when the agent speaks by editing `~/.cursor/rules/voice-feedback.mdc`.
+## Voice Settings
+The settings UI lets you adjust:
+- **Speed** (0.7x - 1.2x) - How fast the speech is delivered
+- **Stability** (0-1) - More consistent vs. more expressive
+- **Similarity Boost** (0-1) - How closely it matches the original voice
+- **Style Exaggeration** (0-1) - Amplifies the speaker's style (V2+ models)
+**Quick Presets:**
+- Default - Balanced settings
+- Fast - Quick and energetic
+- Slow - Clear and measured
+- Expressive - Dynamic and varied
+- Stable - Consistent tone
+- Dramatic - Maximum style
+## Auto-Submit (Optional)
+For completely hands-free dictation:
+1. Enable "Auto-Submit" in the settings UI
+2. Adjust the silence delay (how long to wait after you stop speaking)
+3. Save the settings
+4. Run in a separate terminal:
+```bash
+npm run auto-submit
+```
+**Requirements:**
+- macOS only (uses Accessibility API)
+- Grant Accessibility permissions: System Settings > Privacy & Security > Accessibility > Add your terminal app
+The script monitors the text field and automatically presses Enter when dictation finishes.
+## Configuration Files
+- **`config.json`** - Stores API key, voice settings, and auto-submit preferences
+- **`~/.cursor/mcp.json`** - Registers the MCP server with Cursor
+- **`~/.cursor/rules/voice-feedback.mdc`** - Controls when the agent speaks
+## Troubleshooting
+**Tool doesn't appear in Cursor?**
+- Make sure you fully quit and restarted Cursor (Cmd+Q)
+- Check that `~/.cursor/mcp.json` has the correct absolute path
+- Run `npm run build` to ensure the project is compiled
+**"API key not set" error?**
+- Open the settings UI: `npm run settings`
+- Enter your ElevenLabs API key and click "Save API Key"
+- Restart Cursor
+**No audio?**
+- Check system volume and speaker output
+- Verify `mpv` is installed: `mpv --version` (it is a separate system dependency; the ElevenLabs SDK does not install it)
+- Test your API key in the settings UI
+**Auto-submit not working?**
+- Ensure macOS Accessibility permissions are granted
+- Check that Cursor is the frontmost app when dictating
+- Adjust the "Min Text Length" if short dictations aren't triggering
+- Increase "Silence Delay" if prompts are being submitted too early
+## Scripts
+- `npm run build` - Compile TypeScript to JavaScript
+- `npm run settings` - Open the web settings UI
+- `npm run auto-submit` - Start the auto-submit script (macOS only)
+## License
+MIT
+## Credits
+- [ElevenLabs](https://elevenlabs.io) for TTS API
+- [Model Context Protocol](https://modelcontextprotocol.io) for MCP SDK
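
Note on package/README.md: the Auto-Submit section describes two tunables, "Silence Delay" and "Min Text Length", without showing how they combine. One plausible reading is sketched below: submit only when the dictated text is long enough and has not changed for the full silence delay. The function name and parameter names are hypothetical; the actual logic lives in `scripts/auto-submit.py`, which is not shown in this diff and may differ. The default values (3.0 seconds, 15 characters) are taken from `DEFAULT_AUTO_SUBMIT` in `package/build/config.js`.

```javascript
// Hypothetical decision function illustrating the two Auto-Submit settings.
// This is an assumption about the behavior, not the package's implementation.
function shouldAutoSubmit({
  text,
  lastChangeAtMs, // when the text field last changed
  nowMs, // current time
  silenceDelaySec = 3.0,
  minTextLength = 15,
}) {
  // Ignore very short dictations such as a stray word.
  const longEnough = text.trim().length >= minTextLength;
  // Require the text to have been stable for the whole silence delay.
  const quietLongEnough = nowMs - lastChangeAtMs >= silenceDelaySec * 1000;
  return longEnough && quietLongEnough;
}
```

Under this reading, raising the silence delay makes early submission less likely, and lowering the minimum text length lets short dictations trigger, which matches the two troubleshooting tips in the README.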

package/build/config.js ADDED

@@ -0,0 +1,82 @@
+import { readFileSync, writeFileSync, existsSync } from "fs";
+import { join, dirname } from "path";
+import { fileURLToPath } from "url";
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = dirname(__filename);
+const CONFIG_PATH = join(__dirname, "..", "config.json");
+const DEFAULT_VOICE_SETTINGS = {
+    speed: 1.0,
+    stability: 0.5,
+    similarityBoost: 0.75,
+    style: 0.0,
+};
+const DEFAULT_AUTO_SUBMIT = {
+    enabled: false,
+    silenceDelay: 3.0,
+    minTextLength: 15,
+    targetApp: "Cursor",
+};
+const DEFAULT_WISPR_LOOP = {
+    enabled: false,
+    ttsDelay: 8.0,
+    silenceThreshold: 0.02,
+    silenceDuration: 2.0,
+    wisprHotkey: "shift+ctrl",
+    manualTriggerHotkey: "ctrl+shift+l",
+};
+const DEFAULT_CONFIG = {
+    apiKey: "",
+    voiceId: "21m00Tcm4TlvDq8ikWAM",
+    model: "eleven_flash_v2_5",
+    voiceSettings: { ...DEFAULT_VOICE_SETTINGS },
+    autoSubmit: { ...DEFAULT_AUTO_SUBMIT },
+    wisprLoop: { ...DEFAULT_WISPR_LOOP },
+    autoListen: true,
+};
+export function loadConfig() {
+    try {
+        if (existsSync(CONFIG_PATH)) {
+            const raw = readFileSync(CONFIG_PATH, "utf-8");
+            const parsed = JSON.parse(raw);
+            return {
+                ...DEFAULT_CONFIG,
+                ...parsed,
+                voiceSettings: {
+                    ...DEFAULT_VOICE_SETTINGS,
+                    ...(parsed.voiceSettings || {}),
+                },
+                autoSubmit: {
+                    ...DEFAULT_AUTO_SUBMIT,
+                    ...(parsed.autoSubmit || {}),
+                },
+                wisprLoop: {
+                    ...DEFAULT_WISPR_LOOP,
+                    ...(parsed.wisprLoop || {}),
+                },
+                autoListen: parsed.autoListen !== undefined ? parsed.autoListen : DEFAULT_CONFIG.autoListen,
+            };
+        }
+    }
+    catch (error) {
+        console.error("[Config] Error reading config.json:", error);
+    }
+    return { ...DEFAULT_CONFIG, voiceSettings: { ...DEFAULT_VOICE_SETTINGS }, autoSubmit: { ...DEFAULT_AUTO_SUBMIT }, wisprLoop: { ...DEFAULT_WISPR_LOOP }, autoListen: DEFAULT_CONFIG.autoListen };
+}
+export function saveConfig(config) {
+    const current = loadConfig();
+    const updated = { ...current, ...config };
+    writeFileSync(CONFIG_PATH, JSON.stringify(updated, null, 2), "utf-8");
+    return updated;
+}
+export function getEffectiveConfig() {
+    const fileConfig = loadConfig();
+    return {
+        apiKey: process.env.ELEVENLABS_API_KEY || fileConfig.apiKey,
+        voiceId: process.env.ELEVENLABS_VOICE_ID || fileConfig.voiceId,
+        model: fileConfig.model || DEFAULT_CONFIG.model,
+        voiceSettings: fileConfig.voiceSettings,
+        autoSubmit: fileConfig.autoSubmit,
+        wisprLoop: fileConfig.wisprLoop,
+        autoListen: fileConfig.autoListen,
+    };
+}
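
Note on package/build/config.js: `saveConfig` merges with a shallow spread (`{ ...current, ...config }`), so a caller that passes only part of a nested group, for example `{ voiceSettings: { speed: 1.1 } }`, replaces that whole group on disk. The next `loadConfig` then fills the missing fields from the defaults, which would silently reset any customized sibling values. Whether the settings server ever sends partial groups is not visible in this diff. The sketch below shows the difference in memory only, with no file I/O; `mergeConfig` and `NESTED_KEYS` are hypothetical names, not part of the package.

```javascript
// Sketch only: a per-group merge as one possible alternative to the shallow spread.
const NESTED_KEYS = ["voiceSettings", "autoSubmit", "wisprLoop"];

function mergeConfig(current, patch) {
  const updated = { ...current, ...patch };
  for (const key of NESTED_KEYS) {
    if (patch[key] !== undefined) {
      // Merge field by field so unspecified settings keep their current values.
      updated[key] = { ...current[key], ...patch[key] };
    }
  }
  return updated;
}

// A config in which the user has customized speed and stability.
const current = {
  apiKey: "",
  voiceSettings: { speed: 0.9, stability: 0.8, similarityBoost: 0.75, style: 0.0 },
};
const patch = { voiceSettings: { speed: 1.1 } };

// Shallow spread, as in saveConfig: the customized stability is dropped.
const shallow = { ...current, ...patch };
console.log(shallow.voiceSettings); // { speed: 1.1 }

// Per-group merge: stability stays at 0.8 and only speed changes.
const merged = mergeConfig(current, patch);
console.log(merged.voiceSettings.stability, merged.voiceSettings.speed);
```

The per-group merge mirrors what `loadConfig` already does when it layers file values over the defaults.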

package/build/index.js ADDED

@@ -0,0 +1,166 @@
+#!/usr/bin/env node
+import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
+import { z } from "zod";
+import { ElevenLabsClient, play } from "@elevenlabs/elevenlabs-js";
+import { getEffectiveConfig } from "./config.js";
+import { writeFileSync } from "fs";
+import { join, dirname } from "path";
+import { fileURLToPath } from "url";
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = dirname(__filename);
+// Load config (config.json with env var overrides)
+const config = getEffectiveConfig();
+// Create server instance
+const server = new McpServer({
+    name: "cursor-tts",
+    version: "1.0.0",
+});
+// Initialize ElevenLabs client
+const elevenlabs = new ElevenLabsClient({
+    apiKey: config.apiKey,
+});
+const voiceId = config.voiceId;
+const ttsQueue = [];
+let isProcessingQueue = false;
+async function processTTSQueue() {
+    if (isProcessingQueue || ttsQueue.length === 0) {
+        return;
+    }
+    isProcessingQueue = true;
+    while (ttsQueue.length > 0) {
+        const item = ttsQueue.shift();
+        try {
+            console.error(`[TTS] Speaking: ${item.text}`);
+            // Call ElevenLabs TTS API
+            const audio = await elevenlabs.textToSpeech.convert(voiceId, {
+                text: item.text,
+                modelId: config.model,
+                voiceSettings: {
+                    speed: config.voiceSettings.speed,
+                    stability: config.voiceSettings.stability,
+                    similarityBoost: config.voiceSettings.similarityBoost,
+                    style: config.voiceSettings.style,
+                },
+            });
+            // Play and WAIT for audio to complete
+            await play(audio);
+            // Write TTS completion signal for background script
+            const completionPath = join(__dirname, "..", "tts-complete.json");
+            const completionSignal = {
+                timestamp: new Date().toISOString(),
+                completed: true,
+            };
+            writeFileSync(completionPath, JSON.stringify(completionSignal, null, 2), "utf-8");
+            console.error(`[TTS] Playback complete, signal written: ${completionPath}`);
+            item.resolve({
+                content: [
+                    {
+                        type: "text",
+                        text: `Spoken: "${item.text}"`,
+                    },
+                ],
+            });
+        }
+        catch (error) {
+            const errorMessage = error instanceof Error ? error.message : String(error);
+            console.error(`[TTS] Error: ${errorMessage}`);
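+            item.resolve({ // resolve, not reject, so the handler returns this isError result instead of throwing a plain object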
+                content: [
+                    {
+                        type: "text",
+                        text: `Failed to speak: ${errorMessage}`,
+                    },
+                ],
+                isError: true,
+            });
+        }
+    }
+    isProcessingQueue = false;
+}
+function queueTTS(text) {
+    return new Promise((resolve, reject) => {
+        ttsQueue.push({ text, resolve, reject });
+        processTTSQueue();
+    });
+}
+// Register the speak tool
+server.registerTool("speak", {
+    description: "Speak text aloud using text-to-speech. Use this to announce task progress, completions, and important updates so the user can follow along without looking at the screen.",
+    inputSchema: {
+        text: z
+            .string()
+            .describe("The text to speak aloud. Keep it concise (1-2 sentences max)."),
+    },
+}, async ({ text }) => {
+    // Queue the TTS request to prevent overlapping audio
+    return await queueTTS(text);
+});
+// Register the listen tool
+server.registerTool("listen", {
+    description: "Signal the background script to start listening for user voice input via Wispr Flow. Call this after speaking task completion to enable hands-free conversational loop.",
+    inputSchema: {},
+}, async () => {
+    try {
+        // Check if auto-listen is enabled
+        if (!config.autoListen) {
+            console.error(`[TTS] Auto-listen is disabled, skipping listen signal`);
+            return {
+                content: [
+                    {
+                        type: "text",
+                        text: "Auto-listen is disabled",
+                    },
+                ],
+            };
+        }
+        const signalPath = join(__dirname, "..", "listen-signal.json");
+        const signal = {
+            timestamp: new Date().toISOString(),
+            triggered: true,
+        };
+        writeFileSync(signalPath, JSON.stringify(signal, null, 2), "utf-8");
+        console.error(`[TTS] Listen signal written: ${signalPath}`);
+        return {
+            content: [
+                {
+                    type: "text",
+                    text: "Listening for user input...",
+                },
+            ],
+        };
+    }
+    catch (error) {
+        const errorMessage = error instanceof Error ? error.message : String(error);
+        console.error(`[TTS] Listen error: ${errorMessage}`);
+        return {
+            content: [
+                {
+                    type: "text",
+                    text: `Failed to start listening: ${errorMessage}`,
+                },
+            ],
+            isError: true,
+        };
+    }
+});
+// Main function to start the server
+async function main() {
+    // Validate API key
+    if (!config.apiKey) {
+        console.error("[TTS] ERROR: No API key found! Set ELEVENLABS_API_KEY env var or configure via the settings UI.");
+        console.error("[TTS] Run 'npm run settings' to open the settings UI.");
+        console.error("[TTS] Or get your API key from: https://elevenlabs.io/app/settings/api-keys");
+        process.exit(1);
+    }
+    console.error(`[TTS] Starting Cursor TTS MCP Server...`);
+    console.error(`[TTS] Voice ID: ${voiceId}`);
+    console.error(`[TTS] Model: ${config.model}`);
+    const transport = new StdioServerTransport();
+    await server.connect(transport);
+    console.error("[TTS] Server running on stdio");
+}
+main().catch((error) => {
+    console.error("[TTS] Fatal error in main():", error);
+    process.exit(1);
+});
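
Note on package/build/index.js: the `speak` tool pushes each request onto a queue so that audio clips never overlap. A minimal sketch of that serial-queue pattern, isolated from ElevenLabs, is given below. `createSerialQueue` and `speakFn` are hypothetical names; `speakFn` stands in for the `convert` plus `play` calls. In this sketch a failure is reported by resolving with an `isError` result, so the caller always receives a well-formed tool result.

```javascript
// Sketch: a serial queue whose promises always resolve with a tool-style result.
// `speakFn` is a hypothetical stand-in for "synthesize, then play to completion".
function createSerialQueue(speakFn) {
  const queue = [];
  let processing = false;

  async function drain() {
    if (processing) return; // another drain loop is already running
    processing = true;
    try {
      while (queue.length > 0) {
        const { text, resolve } = queue.shift();
        try {
          await speakFn(text); // wait for this clip before starting the next
          resolve({ content: [{ type: "text", text: `Spoken: "${text}"` }] });
        } catch (error) {
          const message = error instanceof Error ? error.message : String(error);
          resolve({
            content: [{ type: "text", text: `Failed to speak: ${message}` }],
            isError: true,
          });
        }
      }
    } finally {
      processing = false; // reset even if something unexpected throws
    }
  }

  return function enqueue(text) {
    return new Promise((resolve) => {
      queue.push({ text, resolve });
      drain();
    });
  };
}
```

Because the flag is reset in a `finally` block, an unexpected exception cannot leave the queue permanently stuck in the processing state.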