cursor-buddy 0.0.8 → 0.0.9-beta.1
- package/README.md +9 -12
- package/dist/{client-D73KQZf8.mjs → client-CliXcNch.mjs} +296 -389
- package/dist/client-CliXcNch.mjs.map +1 -0
- package/dist/{client-Crn8tW7w.d.mts → client-sjVVGYPU.d.mts} +7 -39
- package/dist/client-sjVVGYPU.d.mts.map +1 -0
- package/dist/index.d.mts +3 -2
- package/dist/index.mjs +3 -2
- package/dist/point-tool-DZJmhD8e.mjs +16 -0
- package/dist/point-tool-DZJmhD8e.mjs.map +1 -0
- package/dist/point-tool-l3FewgM9.d.mts +22 -0
- package/dist/point-tool-l3FewgM9.d.mts.map +1 -0
- package/dist/react/index.d.mts +1 -1
- package/dist/react/index.mjs +1 -1
- package/dist/server/adapters/next.d.mts +2 -3
- package/dist/server/adapters/next.d.mts.map +1 -1
- package/dist/server/adapters/next.mjs +2 -5
- package/dist/server/adapters/next.mjs.map +1 -1
- package/dist/server/index.d.mts +4 -7
- package/dist/server/index.d.mts.map +1 -1
- package/dist/server/index.mjs +127 -39
- package/dist/server/index.mjs.map +1 -1
- package/dist/{types-BxBhjZju.d.mts → types-BJfkApb_.d.mts} +2 -1
- package/dist/types-BJfkApb_.d.mts.map +1 -0
- package/package.json +3 -2
- package/dist/client-Crn8tW7w.d.mts.map +0 -1
- package/dist/client-D73KQZf8.mjs.map +0 -1
- package/dist/types-BxBhjZju.d.mts.map +0 -1
package/README.md
CHANGED
@@ -12,7 +12,7 @@ Customize its prompt, pass custom tools, choose between browser or server-side s
 
 - **Push-to-talk voice input** — Hold a hotkey to speak, release to send
 - **Browser-first live transcription** — Realtime transcript while speaking, with server fallback
-- **
+- **DOM snapshot context** — AI sees a token-efficient representation of your visible page structure
 - **Voice responses** — Browser or server TTS, with optional streaming playback
 - **Cursor pointing** — AI can point at UI elements it references
 - **Voice interruption** — Start talking again to cut off current response
@@ -57,7 +57,7 @@ export const cursorBuddy = createCursorBuddyHandler({
 import { toNextJsHandler } from "cursor-buddy/server/next"
 import { cursorBuddy } from "@/lib/cursor-buddy"
 
-export const {
+export const { POST } = toNextJsHandler(cursorBuddy)
 ```
 
 ### 2. Client Setup
@@ -367,17 +367,15 @@ client.stopListening()
 
 1. User holds the hotkey
 2. Microphone captures audio, waveform shows audio level, and browser speech recognition starts when available
-3.
-4.
+3. At the same time, a screenshot and token-efficient DOM snapshot of the viewport are captured in the background. This runs in parallel with speech capture to minimize latency
+4. User releases hotkey
 5. The client prefers the browser transcript; if it is unavailable or empty in `auto` mode, the recorded audio is transcribed on the server
-6.
-7. AI responds with text
-- Preferred: `[POINT:5:Submit]` for numbered interactive elements
-- Fallback: `[POINT:640,360:Error text]` for arbitrary screen coordinates
+6. The already-captured screenshot + DOM snapshot are sent to the AI model. Each element has an `@ID` (e.g., `@12`) that the AI can reference.
+7. AI responds with text and can optionally call the `point` tool to indicate an element on screen by its `@ID` from the DOM snapshot
 8. Response is spoken in the browser or on the server based on `speech.mode`,
-
-
-9. If
+and can either wait for the full response or stream sentence-by-sentence
+based on `speech.allowStreaming`
+9. If the AI calls the point tool, the cursor animates to the target element's current position (it resolves the element from the snapshot registry and computes its center point)
 10. **If user presses hotkey again at any point, current response is interrupted**
 
 ## Security Best Practices
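Steps 5 and 9 of the updated flow in the hunk above lend themselves to a small sketch. The TypeScript below is illustrative only and is not taken from the cursor-buddy source: `needsServerFallback`, `centerOf`, `resolvePoint`, and the `SnapshotRegistry` shape are hypothetical names, and treating a whitespace-only transcript as empty is an assumption.

```typescript
// Illustrative sketch only; every name here is hypothetical, not cursor-buddy API.

interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Assumed registry shape: snapshot ID (the 12 in "@12") mapped to a function
// that returns the element's current rect, or null if the element is gone.
type SnapshotRegistry = Map<number, () => Rect | null>;

// Step 5 (`auto` mode): fall back to server transcription when the browser
// transcript is unavailable or empty. Whitespace-only counts as empty here,
// which is an assumption of this sketch.
function needsServerFallback(browserTranscript: string | null | undefined): boolean {
  return browserTranscript == null || browserTranscript.trim().length === 0;
}

// Step 9: center point of a bounding box.
function centerOf(rect: Rect): Point {
  return { x: rect.left + rect.width / 2, y: rect.top + rect.height / 2 };
}

// Step 9: resolve an @ID to a point. The rect is read at call time, so the
// result reflects the element's current position, not its snapshot position.
function resolvePoint(registry: SnapshotRegistry, id: number): Point | null {
  const readRect = registry.get(id);
  if (!readRect) return null; // unknown ID
  const rect = readRect();
  return rect ? centerOf(rect) : null; // element no longer present
}
```

In a browser, a registry entry for an element `el` could be `() => el.getBoundingClientRect()`, since `DOMRect` exposes `left`, `top`, `width`, and `height`. Reading the rect when the tool is called, rather than when the snapshot is taken, is one way to land the cursor on the element's current position after scrolling or layout shifts.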
@@ -415,7 +413,6 @@ export const GET = POST
 
 ## TODOs
 
-- [ ] High: Make tool calls first class: Pointing becomes tool call (once per turn) + re-use pointing bubble UI for tool calls
 - [ ] Medium: Proper test structure without relying on `as any` for audio and voice capture
 
 ## License