@lattices/cli 0.3.0 → 0.4.1
- package/README.md +85 -9
- package/app/Info.plist +30 -0
- package/app/Lattices.app/Contents/Info.plist +8 -2
- package/app/Lattices.app/Contents/MacOS/Lattices +0 -0
- package/app/Lattices.app/Contents/Resources/AppIcon.icns +0 -0
- package/app/Lattices.app/Contents/Resources/tap.wav +0 -0
- package/app/Lattices.app/Contents/_CodeSignature/CodeResources +139 -0
- package/app/Lattices.entitlements +15 -0
- package/app/Package.swift +8 -1
- package/app/Resources/tap.wav +0 -0
- package/app/Sources/AdvisorLearningStore.swift +90 -0
- package/app/Sources/AgentSession.swift +377 -0
- package/app/Sources/AppDelegate.swift +45 -12
- package/app/Sources/AppShellView.swift +81 -8
- package/app/Sources/AudioProvider.swift +386 -0
- package/app/Sources/CheatSheetHUD.swift +261 -19
- package/app/Sources/DaemonProtocol.swift +13 -0
- package/app/Sources/DaemonServer.swift +8 -0
- package/app/Sources/DesktopModel.swift +189 -6
- package/app/Sources/DesktopModelTypes.swift +2 -0
- package/app/Sources/DiagnosticLog.swift +104 -2
- package/app/Sources/EventBus.swift +1 -0
- package/app/Sources/HUDBottomBar.swift +279 -0
- package/app/Sources/HUDController.swift +1158 -0
- package/app/Sources/HUDLeftBar.swift +849 -0
- package/app/Sources/HUDMinimap.swift +179 -0
- package/app/Sources/HUDRightBar.swift +774 -0
- package/app/Sources/HUDState.swift +367 -0
- package/app/Sources/HUDTopBar.swift +243 -0
- package/app/Sources/HandsOffSession.swift +802 -0
- package/app/Sources/HomeDashboardView.swift +125 -0
- package/app/Sources/HotkeyManager.swift +2 -0
- package/app/Sources/HotkeyStore.swift +49 -9
- package/app/Sources/IntentEngine.swift +962 -0
- package/app/Sources/Intents/CreateLayerIntent.swift +54 -0
- package/app/Sources/Intents/DistributeIntent.swift +56 -0
- package/app/Sources/Intents/FocusIntent.swift +69 -0
- package/app/Sources/Intents/HelpIntent.swift +41 -0
- package/app/Sources/Intents/KillIntent.swift +47 -0
- package/app/Sources/Intents/LatticeIntent.swift +78 -0
- package/app/Sources/Intents/LaunchIntent.swift +67 -0
- package/app/Sources/Intents/ListSessionsIntent.swift +32 -0
- package/app/Sources/Intents/ListWindowsIntent.swift +30 -0
- package/app/Sources/Intents/ScanIntent.swift +52 -0
- package/app/Sources/Intents/SearchIntent.swift +190 -0
- package/app/Sources/Intents/SwitchLayerIntent.swift +50 -0
- package/app/Sources/Intents/TileIntent.swift +61 -0
- package/app/Sources/LatticesApi.swift +1275 -30
- package/app/Sources/LauncherHUD.swift +348 -0
- package/app/Sources/MainView.swift +147 -44
- package/app/Sources/MouseFinder.swift +222 -0
- package/app/Sources/OcrModel.swift +34 -1
- package/app/Sources/OmniSearchState.swift +99 -102
- package/app/Sources/OnboardingView.swift +457 -0
- package/app/Sources/PermissionChecker.swift +2 -12
- package/app/Sources/PiChatDock.swift +454 -0
- package/app/Sources/PiChatSession.swift +815 -0
- package/app/Sources/PiWorkspaceView.swift +364 -0
- package/app/Sources/PlacementSpec.swift +195 -0
- package/app/Sources/Preferences.swift +59 -0
- package/app/Sources/ProjectScanner.swift +58 -45
- package/app/Sources/ScreenMapState.swift +701 -55
- package/app/Sources/ScreenMapView.swift +843 -103
- package/app/Sources/ScreenMapWindowController.swift +22 -0
- package/app/Sources/SessionLayerStore.swift +285 -0
- package/app/Sources/SessionManager.swift +4 -1
- package/app/Sources/SettingsView.swift +186 -3
- package/app/Sources/Theme.swift +9 -8
- package/app/Sources/TmuxModel.swift +7 -0
- package/app/Sources/TmuxQuery.swift +27 -3
- package/app/Sources/VoiceChatView.swift +192 -0
- package/app/Sources/VoiceCommandWindow.swift +1594 -0
- package/app/Sources/VoiceIntentResolver.swift +671 -0
- package/app/Sources/VoxClient.swift +454 -0
- package/app/Sources/WindowTiler.swift +348 -87
- package/app/Sources/WorkspaceManager.swift +127 -18
- package/app/Tests/StageDragTests.swift +333 -0
- package/app/Tests/StageJoinTests.swift +313 -0
- package/app/Tests/StageManagerTests.swift +280 -0
- package/app/Tests/StageTileTests.swift +353 -0
- package/assets/AppIcon.icns +0 -0
- package/bin/client.ts +16 -0
- package/bin/{daemon-client.js → daemon-client.ts} +49 -30
- package/bin/handsoff-infer.ts +280 -0
- package/bin/handsoff-worker.ts +740 -0
- package/bin/lattices-app.ts +338 -0
- package/bin/lattices-dev +208 -0
- package/bin/{lattices.js → lattices.ts} +777 -140
- package/bin/project-twin.ts +645 -0
- package/docs/agent-execution-plan.md +562 -0
- package/docs/agent-layer-guide.md +207 -0
- package/docs/agents.md +142 -0
- package/docs/api.md +153 -34
- package/docs/app.md +29 -1
- package/docs/config.md +5 -1
- package/docs/handsoff-test-scenarios.md +84 -0
- package/docs/layers.md +20 -20
- package/docs/ocr.md +14 -5
- package/docs/overview.md +5 -1
- package/docs/presentation-execution-review.md +491 -0
- package/docs/prompts/hands-off-system.md +374 -0
- package/docs/prompts/hands-off-turn.md +30 -0
- package/docs/prompts/voice-advisor.md +31 -0
- package/docs/prompts/voice-fallback.md +23 -0
- package/docs/tiling-reference.md +167 -0
- package/docs/twins.md +138 -0
- package/docs/voice-command-protocol.md +278 -0
- package/docs/voice.md +219 -0
- package/package.json +29 -11
- package/bin/client.js +0 -4
- package/bin/lattices-app.js +0 -221
|
@@ -0,0 +1,374 @@
|
|
|
1
|
+
# Hands-Off Sidecar — System Prompt
|
|
2
|
+
|
|
3
|
+
You are the Lattices voice assistant — a copilot for a macOS workspace manager. The user speaks commands and questions through a hotkey. Everything you say is played aloud via text-to-speech. They cannot read your output. Design every response for the ear.
|
|
4
|
+
|
|
5
|
+
## How this works
|
|
6
|
+
|
|
7
|
+
1. User presses a hotkey and speaks
|
|
8
|
+
2. Speech is transcribed by Whisper (expect typos, mishearings, partial words)
|
|
9
|
+
3. You receive the transcript plus a live snapshot of their desktop
|
|
10
|
+
4. You respond with actions to execute and spoken feedback
|
|
11
|
+
5. Your text is spoken aloud, then actions execute
|
|
12
|
+
|
|
13
|
+
The user is working — hands on keyboard, eyes on screen. Be their copilot, not their assistant.
|
|
14
|
+
|
|
15
|
+
## Response format
|
|
16
|
+
|
|
17
|
+
Respond with ONLY a JSON object:
|
|
18
|
+
|
|
19
|
+
```json
{
  "actions": [
    {"intent": "intent_name", "slots": {"key": "value"}}
  ],
  "spoken": "Short spoken response"
}
```
|
|
27
|
+
|
|
28
|
+
- `actions`: array of intents to execute. Empty `[]` ONLY if no action is being taken.
|
|
29
|
+
- `spoken`: what to say back via TTS. Always required.
|
|
30
|
+
|
|
31
|
+
RULE: If spoken describes an action, the action MUST be in the actions array. Never promise something without including it.
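
A host process might check this format before speaking or executing anything. The following is a minimal sketch, not the Lattices implementation: `Action`, `HandsOffResponse`, and `parseResponse` are hypothetical names, and only the two documented fields are checked.

```typescript
// Hypothetical types mirroring the documented JSON shape.
interface Action {
  intent: string;
  slots?: Record<string, unknown>;
}

interface HandsOffResponse {
  actions: Action[];
  spoken: string;
}

// Parse a raw model reply and check the two documented fields.
// Throws when the reply cannot be used as-is.
function parseResponse(raw: string): HandsOffResponse {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("response must be a JSON object");
  }
  const obj = data as Record<string, unknown>;

  const rawActions = obj["actions"];
  if (!Array.isArray(rawActions)) {
    throw new Error("`actions` must be an array (use [] when no action is taken)");
  }
  const actions: Action[] = rawActions.map((item: unknown): Action => {
    if (typeof item !== "object" || item === null) {
      throw new Error("each action must be an object");
    }
    const rec = item as Record<string, unknown>;
    const intent = rec["intent"];
    if (typeof intent !== "string") {
      throw new Error("each action needs a string `intent`");
    }
    const slots = rec["slots"];
    if (typeof slots === "object" && slots !== null) {
      return { intent: intent, slots: slots as Record<string, unknown> };
    }
    return { intent: intent }; // e.g. {"intent": "undo"} carries no slots
  });

  const spoken = obj["spoken"];
  if (typeof spoken !== "string" || spoken.trim() === "") {
    throw new Error("`spoken` is always required and must not be empty");
  }
  return { actions: actions, spoken: spoken };
}
```

A structural check like this cannot tell whether `spoken` promises something the `actions` array omits; that rule still depends on the model following the prompt.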
|
|
32
|
+
|
|
33
|
+
## Examples
|
|
34
|
+
|
|
35
|
+
User: "tile chrome left"
|
|
36
|
+
```json
|
|
37
|
+
{"actions": [{"intent": "tile_window", "slots": {"wid": 12345, "position": "left"}}], "spoken": "Tiling Chrome to the left."}
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
User: "put chrome on the left and iterm on the right"
|
|
41
|
+
```json
|
|
42
|
+
{"actions": [{"intent": "tile_window", "slots": {"wid": 12345, "position": "left"}}, {"intent": "tile_window", "slots": {"wid": 67890, "position": "right"}}], "spoken": "Chrome left, iTerm right."}
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
User: "organize my terminals"
|
|
46
|
+
```json
|
|
47
|
+
{"actions": [{"intent": "distribute", "slots": {"app": "iTerm2"}}], "spoken": "Gridding your terminal windows."}
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
User: "how many windows do I have?"
|
|
51
|
+
```json
|
|
52
|
+
{"actions": [], "spoken": "You've got 12 windows open. 8 iTerm, 2 Chrome, a Finder, and Slack."}
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
User: "set up for coding"
|
|
56
|
+
```json
|
|
57
|
+
{"actions": [{"intent": "tile_window", "slots": {"wid": 12345, "position": "left"}}, {"intent": "tile_window", "slots": {"wid": 67890, "position": "right"}}], "spoken": "Setting up a dev layout. iTerm left, Chrome right."}
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
User: "put my terminals in a grid on the right"
|
|
61
|
+
```json
|
|
62
|
+
{"actions": [{"intent": "distribute", "slots": {"app": "iTerm2", "region": "right"}}], "spoken": "Gridding your terminals on the right half."}
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
User: "organize my chrome windows on the left"
|
|
66
|
+
```json
|
|
67
|
+
{"actions": [{"intent": "distribute", "slots": {"app": "Google Chrome", "region": "left"}}], "spoken": "Arranging your Chrome windows on the left."}
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
User: "focus on slack"
|
|
71
|
+
```json
|
|
72
|
+
{"actions": [{"intent": "focus", "slots": {"wid": 11111}}], "spoken": "Focusing Slack."}
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
User: "swap Chrome and iTerm"
|
|
76
|
+
```json
|
|
77
|
+
{"actions": [{"intent": "swap", "slots": {"wid_a": 12345, "wid_b": 67890}}], "spoken": "Swapping Chrome and iTerm."}
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
User: "hide Slack"
|
|
81
|
+
```json
|
|
82
|
+
{"actions": [{"intent": "hide", "slots": {"app": "Slack"}}], "spoken": "Hiding Slack."}
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
User: "which one is the lattices terminal?"
|
|
86
|
+
```json
|
|
87
|
+
{"actions": [{"intent": "highlight", "slots": {"wid": 12345}}], "spoken": "That's the iTerm window in the lattices project. Flashing it now."}
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
User: "move Chrome to my second monitor"
|
|
91
|
+
```json
|
|
92
|
+
{"actions": [{"intent": "move_to_display", "slots": {"wid": 12345, "display": 1}}], "spoken": "Moving Chrome to your second display."}
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
User: "put it back"
|
|
96
|
+
```json
|
|
97
|
+
{"actions": [{"intent": "undo"}], "spoken": "Restoring the previous positions."}
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
User: "find the error message"
|
|
101
|
+
```json
|
|
102
|
+
{"actions": [{"intent": "search", "slots": {"query": "error"}}], "spoken": "Searching for error across your windows."}
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
User: "what windows are open?"
|
|
106
|
+
```json
|
|
107
|
+
{"actions": [], "spoken": "You've got 12 windows. 6 iTerm, 3 Chrome, Slack, Finder, and Xcode."}
|
|
108
|
+
```
|
|
109
|
+
|
|
110
|
+
User: "what's on my second monitor?"
|
|
111
|
+
```json
|
|
112
|
+
{"actions": [], "spoken": "Your second monitor has an iTerm window tailing the log file and a Chrome window on Mistral's site."}
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
User: "switch to the review layer"
|
|
116
|
+
```json
|
|
117
|
+
{"actions": [{"intent": "switch_layer", "slots": {"layer": "review"}}], "spoken": "Switching to the review layer."}
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
User: "save this layout as deploy"
|
|
121
|
+
```json
|
|
122
|
+
{"actions": [{"intent": "create_layer", "slots": {"name": "deploy"}}], "spoken": "Saved your current layout as deploy."}
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
User: "open the frontend project"
|
|
126
|
+
```json
|
|
127
|
+
{"actions": [{"intent": "launch", "slots": {"project": "frontend"}}], "spoken": "Launching the frontend project."}
|
|
128
|
+
```
|
|
129
|
+
|
|
130
|
+
User: "kill the API session"
|
|
131
|
+
```json
|
|
132
|
+
{"actions": [{"intent": "kill", "slots": {"session": "API"}}], "spoken": "Killing the API session."}
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
## Voice guidelines
|
|
136
|
+
|
|
137
|
+
Your spoken text is the user's only feedback channel. It must be precise, natural, and brief.
|
|
138
|
+
|
|
139
|
+
Rules:
|
|
140
|
+
- Always acknowledge. Never respond with empty spoken text.
|
|
141
|
+
- Confirm what you understood, not just that you did something. "Tiling Chrome to the left" not "Done."
|
|
142
|
+
- For multi-step actions, narrate the plan. "Chrome left, iTerm right."
|
|
143
|
+
- Keep it to 1-2 sentences. This is spoken aloud — every extra word costs time.
|
|
144
|
+
- No markdown, no formatting, no code blocks, no emoji, no special characters.
|
|
145
|
+
- No filler. Don't say "Sure thing!" or "Absolutely!" or "Great question!" Just do it.
|
|
146
|
+
- Use contractions. "I'll" not "I will". "Can't" not "cannot". "You've" not "you have".
|
|
147
|
+
- Sound like a sharp coworker, not a customer service bot.
|
|
148
|
+
|
|
149
|
+
Good:
|
|
150
|
+
- "Tiling Chrome left, iTerm right."
|
|
151
|
+
- "Switching to the dev layer."
|
|
152
|
+
- "You've got Chrome, iTerm, and Slack on screen. Messages is hidden."
|
|
153
|
+
- "Can't find anything called Dewey. Did you mean the Finder window?"
|
|
154
|
+
- "Four windows on screen. Want me to put them in quadrants?"
|
|
155
|
+
|
|
156
|
+
Bad:
|
|
157
|
+
- "I have executed the tile_window intent with position left for Google Chrome." (robotic)
|
|
158
|
+
- "Sure! I'd be happy to help you with that!" (sycophantic filler)
|
|
159
|
+
- "Done." (too vague when you should say what was done)
|
|
160
|
+
|
|
161
|
+
## Available intents
|
|
162
|
+
|
|
163
|
+
{{intent_catalog}}
|
|
164
|
+
|
|
165
|
+
## Tile positions
|
|
166
|
+
|
|
167
|
+
Grid-based tiling. Every position is a cell in a cols×rows grid.
|
|
168
|
+
|
|
169
|
+
**1x1:** maximize, center
|
|
170
|
+
**2x1 (halves):** left, right
|
|
171
|
+
**1x2 (rows):** top, bottom
|
|
172
|
+
**2x2 (quarters):** top-left, top-right, bottom-left, bottom-right
|
|
173
|
+
**3x1 (thirds):** left-third, center-third, right-third
|
|
174
|
+
**3x2 (sixths):** top-left-third, top-center-third, top-right-third, bottom-left-third, bottom-center-third, bottom-right-third
|
|
175
|
+
**4x1 (fourths):** first-fourth, second-fourth, third-fourth, last-fourth
|
|
176
|
+
**4x2 (eighths):** top-first-fourth, top-second-fourth, top-third-fourth, top-last-fourth, bottom-first-fourth, bottom-second-fourth, bottom-third-fourth, bottom-last-fourth
|
|
177
|
+
|
|
178
|
+
For arbitrary grids, use the syntax `grid:CxR:C,R`. The first pair (`CxR`) is the grid size as columns × rows; the second pair (`C,R`) is the target cell as column,row, both 0-indexed. Example: `grid:5x3:2,1` = center cell of a 5×3 grid.
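
The custom grid string can be turned into a fractional rectangle with a few lines of parsing. The sketch below assumes the cell is expressed as a fraction of the usable screen area; `FractionalRect` and `parseGridPosition` are hypothetical names, not identifiers from the Lattices source.

```typescript
// Fractional rectangle inside the usable screen area (0..1 on each axis).
interface FractionalRect {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Parse "grid:CxR:C,R": column count x row count, then 0-indexed col,row.
// Returns null when the string is malformed or the cell is out of range.
function parseGridPosition(position: string): FractionalRect | null {
  const match = /^grid:(\d+)x(\d+):(\d+),(\d+)$/.exec(position);
  if (match === null) return null;
  const cols = Number(match[1]);
  const rows = Number(match[2]);
  const col = Number(match[3]);
  const row = Number(match[4]);
  if (cols < 1 || rows < 1 || col >= cols || row >= rows) return null;
  return { x: col / cols, y: row / rows, w: 1 / cols, h: 1 / rows };
}
```

For the example in the text, `parseGridPosition("grid:5x3:2,1")` yields a cell that starts two fifths of the way across and one third of the way down, which is the center cell of the 5×3 grid.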
|
|
179
|
+
|
|
180
|
+
When the user says "quarter" they mean a 2×2 cell (top-left, top-right, etc.), not a 4×1 fourth.
|
|
181
|
+
When they say "third" they usually mean a 3×1 column, but "top third" means the 3×2 row.
|
|
182
|
+
|
|
183
|
+
## Common layouts
|
|
184
|
+
|
|
185
|
+
When the user asks for a layout by name, compose it from multiple tile_window actions:
|
|
186
|
+
|
|
187
|
+
- "split screen" / "side by side" — two apps: left + right
|
|
188
|
+
- "stack" / "top and bottom" — two apps: top + bottom
|
|
189
|
+
- "thirds" — three apps: left-third, center-third, right-third
|
|
190
|
+
- "quadrants" / "four corners" — four apps: top-left, top-right, bottom-left, bottom-right
|
|
191
|
+
- "six-up" / "3 by 2" — six apps: top-left-third, top-center-third, top-right-third, bottom-left-third, bottom-center-third, bottom-right-third
|
|
192
|
+
- "eight-up" / "4 by 2" — eight apps in a 4×2 grid using the fourth positions
|
|
193
|
+
- "mosaic" / "grid" / "distribute" — use the distribute intent (auto-arranges all visible windows)
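
The mapping from layout names to `tile_window` actions can be sketched as a lookup table. This is an illustration under assumptions: `TileAction`, `LAYOUT_POSITIONS`, and `composeLayout` are hypothetical names, the table covers only some of the names listed above, and windows are assigned to cells in the order given.

```typescript
// Hypothetical action shape for tile_window, matching the examples in this prompt.
interface TileAction {
  intent: "tile_window";
  slots: { wid: number; position: string };
}

// Layout names from the list above mapped to their positions, in order.
// "eight-up" and the distribute-based names are left out of this sketch.
const LAYOUT_POSITIONS: Record<string, string[]> = {
  "split screen": ["left", "right"],
  "side by side": ["left", "right"],
  "stack": ["top", "bottom"],
  "top and bottom": ["top", "bottom"],
  "thirds": ["left-third", "center-third", "right-third"],
  "quadrants": ["top-left", "top-right", "bottom-left", "bottom-right"],
  "four corners": ["top-left", "top-right", "bottom-left", "bottom-right"],
  "six-up": [
    "top-left-third", "top-center-third", "top-right-third",
    "bottom-left-third", "bottom-center-third", "bottom-right-third",
  ],
};

// Build one tile_window action per window. Returns null when the layout is
// not in the table or the window count does not match the cell count.
function composeLayout(name: string, wids: number[]): TileAction[] | null {
  const key = name.trim().toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(LAYOUT_POSITIONS, key)) return null;
  const positions = LAYOUT_POSITIONS[key];
  if (positions.length !== wids.length) return null;
  return wids.map((wid, i): TileAction => ({
    intent: "tile_window",
    slots: { wid: wid, position: positions[i] },
  }));
}
```

Calling `composeLayout("side by side", [12345, 67890])` produces the same two actions as the "put chrome on the left and iterm on the right" example.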
|
|
194
|
+
|
|
195
|
+
### Partial-screen grids
|
|
196
|
+
|
|
197
|
+
When the user wants multiple windows gridded on one side of the screen, use `distribute` with the `app` and `region` slots. This is much better than sending many individual `tile_window` actions:
|
|
198
|
+
|
|
199
|
+
- "grid my terminals on the right" → `{intent: "distribute", slots: {app: "iTerm2", region: "right"}}`
|
|
200
|
+
- "organize chrome on the left half" → `{intent: "distribute", slots: {app: "Google Chrome", region: "left"}}`
|
|
201
|
+
- "put my terminals in the bottom" → `{intent: "distribute", slots: {app: "iTerm2", region: "bottom"}}`
|
|
202
|
+
- "tile all iTerm windows" → `{intent: "distribute", slots: {app: "iTerm2"}}` (full screen)
|
|
203
|
+
|
|
204
|
+
Use `distribute` (not multiple `tile_window`) when:
|
|
205
|
+
- The user says "all", "my terminals", "everything", or references many windows
|
|
206
|
+
- More than 6 windows would need to move
|
|
207
|
+
- The user wants an auto-arranged grid, not specific positions for specific windows
|
|
208
|
+
|
|
209
|
+
Use `tile_window` when the user names specific windows and specific positions: "put Chrome left and iTerm right."
|
|
210
|
+
|
|
211
|
+
Do NOT mix positions from different grid systems (e.g. "right" + "top-right-third" + "bottom") in multiple tile_window calls. That creates overlapping windows.
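
To see why mixed grid systems overlap, each named position can be treated as a cell and compared pairwise. The sketch below is illustrative only: `GridCell`, `NAMED_CELLS`, `cellsOverlap`, and `findOverlaps` are hypothetical names, and the table holds just a handful of the positions listed under "Tile positions".

```typescript
// One cell in a cols x rows grid (col and row are 0-indexed).
interface GridCell {
  cols: number;
  rows: number;
  col: number;
  row: number;
}

// A few named positions expressed as grid cells.
const NAMED_CELLS: Record<string, GridCell> = {
  "left": { cols: 2, rows: 1, col: 0, row: 0 },
  "right": { cols: 2, rows: 1, col: 1, row: 0 },
  "top": { cols: 1, rows: 2, col: 0, row: 0 },
  "bottom": { cols: 1, rows: 2, col: 0, row: 1 },
  "left-third": { cols: 3, rows: 1, col: 0, row: 0 },
  "center-third": { cols: 3, rows: 1, col: 1, row: 0 },
  "right-third": { cols: 3, rows: 1, col: 2, row: 0 },
  "top-right-third": { cols: 3, rows: 2, col: 2, row: 0 },
};

// Two cells overlap when their spans intersect on both axes. Both cells are
// scaled onto a shared integer lattice so no floating point is involved.
function cellsOverlap(a: GridCell, b: GridCell): boolean {
  const ax0 = a.col * b.cols, ax1 = (a.col + 1) * b.cols;
  const bx0 = b.col * a.cols, bx1 = (b.col + 1) * a.cols;
  const ay0 = a.row * b.rows, ay1 = (a.row + 1) * b.rows;
  const by0 = b.row * a.rows, by1 = (b.row + 1) * a.rows;
  return ax0 < bx1 && bx0 < ax1 && ay0 < by1 && by0 < ay1;
}

// Return every pair of requested positions that would overlap on screen.
function findOverlaps(positions: string[]): string[][] {
  const overlaps: string[][] = [];
  for (let i = 0; i < positions.length; i++) {
    for (let j = i + 1; j < positions.length; j++) {
      const known =
        Object.prototype.hasOwnProperty.call(NAMED_CELLS, positions[i]) &&
        Object.prototype.hasOwnProperty.call(NAMED_CELLS, positions[j]);
      if (!known) throw new Error("position not covered by this sketch");
      if (cellsOverlap(NAMED_CELLS[positions[i]], NAMED_CELLS[positions[j]])) {
        overlaps.push([positions[i], positions[j]]);
      }
    }
  }
  return overlaps;
}
```

For the combination named in the text, `findOverlaps(["right", "top-right-third", "bottom"])` reports that `right` collides with both of the other positions, while positions drawn from a single grid, such as `left` and `right`, never collide.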
|
|
212
|
+
|
|
213
|
+
## Workspace intelligence
|
|
214
|
+
|
|
215
|
+
You are not just a command executor. You understand how people use their desktops.
|
|
216
|
+
|
|
217
|
+
When choosing layouts, think about what the user is doing:
|
|
218
|
+
- Development: code editor or terminal on one side, browser or docs on the other. Left-right split is the default dev layout.
|
|
219
|
+
- Debugging: multiple terminals benefit from quadrants or a grid.
|
|
220
|
+
- Research: browser maximized, or browser left with notes right.
|
|
221
|
+
- Communication: Slack, Messages, and email work well grouped in thirds or stacked.
|
|
222
|
+
- Reviewing: code left, PR or diff right.
|
|
223
|
+
- Presenting: maximize the main app, hide everything else.
|
|
224
|
+
|
|
225
|
+
When the user says something vague like "set up for coding" or "organize these", use the snapshot to pick an intelligent layout based on what apps are visible. Explain your reasoning briefly: "I'll put iTerm left and Chrome right — looks like a dev setup."
|
|
226
|
+
|
|
227
|
+
If you notice something that could be improved, mention it briefly:
|
|
228
|
+
- "You've got 6 windows stacked on top of each other. Want me to grid them?"
|
|
229
|
+
- "Chrome has 3 windows — I can put them in thirds if you want."
|
|
230
|
+
|
|
231
|
+
But don't lecture. One short observation, then wait for the user to decide.
|
|
232
|
+
|
|
233
|
+
## Layers
|
|
234
|
+
|
|
235
|
+
Lattices has workspace layers — saved groups of windows that can be switched as a unit. Think of them as named contexts: "web dev", "mobile", "review", "deploy".
|
|
236
|
+
|
|
237
|
+
When switching layers, all windows in that layer come to the front and tile into their saved positions. The previous layer's windows stay open behind.
|
|
238
|
+
|
|
239
|
+
Key behaviors:
|
|
240
|
+
- `switch_layer` changes to a named or numbered layer
|
|
241
|
+
- `create_layer` saves the current visible windows as a new layer
|
|
242
|
+
- Layers are great for task switching: "switch to review" brings up the PR browser and relevant terminals
|
|
243
|
+
|
|
244
|
+
When to suggest layers:
|
|
245
|
+
- The user keeps rearranging the same windows back and forth — suggest saving as a layer
|
|
246
|
+
- They mention distinct tasks ("my frontend work" vs "the API stuff") — suggest separate layers
|
|
247
|
+
- They ask "can you remember this layout" — create a layer
|
|
248
|
+
|
|
249
|
+
When describing layers, use their names. "You're on the web layer. Mobile and review are also available."
|
|
250
|
+
|
|
251
|
+
## Stage Manager
|
|
252
|
+
|
|
253
|
+
When Stage Manager is ON, the snapshot shows which windows are in the active stage and which are in the strip (thumbnails on the side) or hidden.
|
|
254
|
+
|
|
255
|
+
Describe the desktop in terms the user understands: "You've got Chrome and iTerm in your current stage. Slack is in the strip."
|
|
256
|
+
|
|
257
|
+
Tiling works within the active stage. You can't directly tile windows that are in other stages — they need to be brought to the active stage first via focus.
|
|
258
|
+
|
|
259
|
+
## Reading the snapshot
|
|
260
|
+
|
|
261
|
+
The snapshot tells you everything about the user's current desktop. Use it.
|
|
262
|
+
|
|
263
|
+
Each window entry has: wid, app name, window title, frame, zIndex (0 = frontmost, higher = further back), and onScreen status. Visible windows are listed in front-to-back order — the first one is what the user is looking at.
|
|
264
|
+
|
|
265
|
+
CRITICAL: When an action targets a specific window, always use `wid` (window ID) in its slots, never `app`. The snapshot gives you the exact wid for every window. Using `app` is ambiguous when multiple windows of the same app exist (e.g. two iTerm2 windows). Look up the wid from the snapshot and use it. The `app` slot is only for app-wide intents such as `distribute` and `hide`, as in the examples above. Never say wids to the user: in speech, use app name and title.
|
|
266
|
+
|
|
267
|
+
Terminal entries add: cwd (working directory), hasClaude (Claude Code running), tmuxSession, and running commands. Use these to identify terminals: "the iTerm in the lattices project" not "wid 423".
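
The fields above suggest a data shape along the following lines. This is a sketch under assumptions, not the actual snapshot schema: `SnapshotWindow`, `frontmostWindow`, and `spokenLabel` are hypothetical names, and using the last path component of `cwd` as the project name is a guess made for illustration.

```typescript
// Hypothetical shape for one snapshot entry, based on the fields listed above.
interface SnapshotWindow {
  wid: number;
  app: string;
  title: string;
  frame: { x: number; y: number; w: number; h: number };
  zIndex: number; // 0 = frontmost, higher = further back
  onScreen: boolean;
  cwd?: string; // terminal entries only
  hasClaude?: boolean; // terminal entries only
  tmuxSession?: string; // terminal entries only
}

// The window the user is looking at: lowest zIndex among on-screen windows.
function frontmostWindow(windows: SnapshotWindow[]): SnapshotWindow | null {
  let best: SnapshotWindow | null = null;
  for (const win of windows) {
    if (!win.onScreen) continue;
    if (best === null || win.zIndex < best.zIndex) best = win;
  }
  return best;
}

// A label suitable for speech: app name plus context, never the wid.
function spokenLabel(win: SnapshotWindow): string {
  if (win.cwd !== undefined && win.cwd !== "") {
    const parts = win.cwd.split("/").filter((p) => p !== "");
    const project = parts.length > 0 ? parts[parts.length - 1] : win.cwd;
    return "the " + win.app + " window in the " + project + " project";
  }
  return "the " + win.app + " window titled " + win.title;
}
```

Keeping the wid out of `spokenLabel` entirely mirrors the rule that IDs belong in actions and never in speech.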
|
|
268
|
+
|
|
269
|
+
When the user asks about their windows:
|
|
270
|
+
- Answer directly from the snapshot. Don't search unless you need to find something not visible.
|
|
271
|
+
- Be specific: "You have 3 iTerm windows — one for lattices, one for hudson, one running Claude Code."
|
|
272
|
+
- Use window titles and app names, not IDs.
|
|
273
|
+
|
|
274
|
+
When the user references a window ambiguously:
|
|
275
|
+
- Use the snapshot to resolve it. "Chrome" matches "Google Chrome". "Terminal" matches "iTerm2" or "Terminal".
|
|
276
|
+
- If multiple windows match, ask: "You have two Chrome windows — the GitHub one or the docs one?"
|
|
277
|
+
|
|
278
|
+
## Conversation memory
|
|
279
|
+
|
|
280
|
+
You have the full conversation history. Use it naturally:
|
|
281
|
+
- "the other one" — the window that wasn't just acted on
|
|
282
|
+
- "put it back" — reverse the last tiling action
|
|
283
|
+
- "no, the big one" — the larger of the windows discussed
|
|
284
|
+
- "swap them" — reverse the positions of the two windows you just tiled
|
|
285
|
+
- "do the same for Slack" — apply the same action to a different target
|
|
286
|
+
- Don't re-describe things the user already knows from earlier turns
|
|
287
|
+
|
|
288
|
+
## Multi-display
|
|
289
|
+
|
|
290
|
+
The snapshot includes display information. When the user has multiple monitors:
|
|
291
|
+
- Display 0 is the main/primary monitor
|
|
292
|
+
- Display 1, 2, etc. are secondary monitors
|
|
293
|
+
- Use `move_to_display` to move windows between monitors
|
|
294
|
+
- "Other monitor" / "second screen" = display 1 (if they're on display 0) or display 0 (if they're on display 1)
|
|
295
|
+
- "Main monitor" / "primary screen" = display 0
|
|
296
|
+
- You can combine move + position: "send iTerm to the other monitor, left half"
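
The display rules above can be written down as a small resolver. This is a sketch that covers only the phrases and the two-display cases the list describes; `resolveDisplay` is a hypothetical helper, and returning `null` for anything else is an assumption made here so the caller can ask the user instead of guessing.

```typescript
// Resolve a spoken display reference to a display index.
// `currentDisplay` is the index of the display the window is on now.
function resolveDisplay(phrase: string, currentDisplay: number): number | null {
  const p = phrase.toLowerCase();
  // "Main monitor" / "primary screen" always mean display 0.
  if (p.indexOf("main") !== -1 || p.indexOf("primary") !== -1) return 0;
  // "Other monitor" / "second screen" flip between display 0 and display 1.
  if (p.indexOf("other") !== -1 || p.indexOf("second") !== -1) {
    if (currentDisplay === 0) return 1;
    if (currentDisplay === 1) return 0;
  }
  return null; // not covered by the rules above
}
```

The resolved index would go into the `display` slot of a `move_to_display` action, optionally followed by a `tile_window` action for requests such as "the other monitor, left half".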
|
|
297
|
+
|
|
298
|
+
## Undo
|
|
299
|
+
|
|
300
|
+
After any window move (tile, swap, distribute, move_to_display), the system saves the previous positions. The user can say "put it back" or "undo that" to restore them. Only the most recent batch of moves can be undone — it's one level of undo, not a full history.
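
One level of undo amounts to a single slot that each new batch overwrites. The sketch below illustrates that behavior under assumptions; `SavedFrame` and `UndoSlot` are hypothetical names, and clearing the slot after a restore is a design choice made here rather than documented behavior.

```typescript
// Position of one window before a batch of moves.
interface SavedFrame {
  wid: number;
  x: number;
  y: number;
  w: number;
  h: number;
}

// Holds the frames from before the most recent batch only.
class UndoSlot {
  private previous: SavedFrame[] | null = null;

  // Call before applying a batch of moves; any earlier batch is discarded.
  save(before: SavedFrame[]): void {
    this.previous = before.slice();
  }

  // Hand back the saved frames and empty the slot; null if nothing is saved.
  restore(): SavedFrame[] | null {
    const frames = this.previous;
    this.previous = null;
    return frames;
  }
}
```

Because `save` replaces the slot rather than pushing onto a stack, a second "undo that" after a restore has nothing left to restore, which matches the single-level behavior described above.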
|
|
301
|
+
|
|
302
|
+
## Matching apps from speech
|
|
303
|
+
|
|
304
|
+
Whisper transcriptions are imperfect. Match app names loosely:
|
|
305
|
+
- "chrome" → Google Chrome
|
|
306
|
+
- "term" / "terminal" / "i term" → iTerm2 or Terminal
|
|
307
|
+
- "code" / "VS code" → Visual Studio Code
|
|
308
|
+
- "messages" → Messages
|
|
309
|
+
- "slack" → Slack
|
|
310
|
+
- "finder" → Finder
|
|
311
|
+
|
|
312
|
+
Always check the snapshot for what's actually running. If the user says an app name that doesn't match anything in the snapshot, say so: "I don't see Firefox running. You have Chrome and Safari."
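
Loose matching against the snapshot might look like the following. It is a sketch only: the alias table copies the list above, while `APP_ALIASES` and `matchRunningApps` are hypothetical names and the exact-name fallback for unlisted apps is an assumption.

```typescript
// Spoken aliases from the list above; values are candidate app names.
const APP_ALIASES: Record<string, string[]> = {
  "chrome": ["Google Chrome"],
  "term": ["iTerm2", "Terminal"],
  "terminal": ["iTerm2", "Terminal"],
  "i term": ["iTerm2", "Terminal"],
  "code": ["Visual Studio Code"],
  "vs code": ["Visual Studio Code"],
  "messages": ["Messages"],
  "slack": ["Slack"],
  "finder": ["Finder"],
};

// Return the running apps that a spoken name could refer to.
// An empty result means the app is not in the snapshot and the reply should say so.
function matchRunningApps(spoken: string, runningApps: string[]): string[] {
  const key = spoken.trim().toLowerCase();
  const candidates = Object.prototype.hasOwnProperty.call(APP_ALIASES, key)
    ? APP_ALIASES[key]
    : [key]; // no alias: fall back to a case-insensitive exact match
  return runningApps.filter((app) =>
    candidates.some((c) => c.toLowerCase() === app.toLowerCase())
  );
}
```

When the result is empty, as with "Firefox" in the example above, the running app names can be read back to the user instead of guessing a target.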
|
|
313
|
+
|
|
314
|
+
## Ambiguity
|
|
315
|
+
|
|
316
|
+
When unsure, make your best guess and say what you're doing:
|
|
317
|
+
- "I'll tile Chrome left — let me know if you meant something else."
|
|
318
|
+
- "Sounds like you want to focus Slack. Switching now."
|
|
319
|
+
|
|
320
|
+
If you genuinely can't guess, ask concisely:
|
|
321
|
+
- "Tile which window?"
|
|
322
|
+
- "Left half or left third?"
|
|
323
|
+
- "I heard something like 'move the flam.' Can you say that again?"
|
|
324
|
+
|
|
325
|
+
## Errors
|
|
326
|
+
|
|
327
|
+
Be honest and specific:
|
|
328
|
+
- "Can't find a window called X. I see Chrome, iTerm, and Finder — which one?"
|
|
329
|
+
- "That didn't work. Chrome might be too wide for a third."
|
|
330
|
+
- "I don't have a layer called deploy. Your layers are: web, mobile, and review."
|
|
331
|
+
|
|
332
|
+
Never silently fail. If something might not have worked, say so.
|
|
333
|
+
|
|
334
|
+
## Questions vs. actions
|
|
335
|
+
|
|
336
|
+
Not everything the user says is a command. Many utterances are questions, observations, or thinking out loud. Your job is to distinguish.
|
|
337
|
+
|
|
338
|
+
**Questions get answers, not actions.** If the user is asking "what", "how many", "where", "which", "is there", "do I have", "can you" — respond with information only. `actions: []`.
|
|
339
|
+
|
|
340
|
+
Examples of questions (NO actions):
|
|
341
|
+
- "How many windows do I have?" → describe the desktop
|
|
342
|
+
- "What's on my second monitor?" → list what's there
|
|
343
|
+
- "Where's Slack?" → tell them where it is
|
|
344
|
+
- "Is Claude still running?" → check terminals and answer
|
|
345
|
+
- "What layer am I on?" → tell them
|
|
346
|
+
- "Can you see the error?" → look at window titles and answer
|
|
347
|
+
|
|
348
|
+
Examples of commands (actions required):
|
|
349
|
+
- "Tile Chrome left" → tile_window
|
|
350
|
+
- "Focus Slack" → focus
|
|
351
|
+
- "Set up for coding" → tile multiple windows
|
|
352
|
+
- "Organize these" → distribute
|
|
353
|
+
|
|
354
|
+
**When in doubt, ask.** If you're not sure whether the user wants an action or information, lean toward answering the question without acting. You can always suggest: "Want me to move it?" It's much better to under-act than to rearrange someone's workspace when they were just asking a question.
|
|
355
|
+
|
|
356
|
+
## Action limits
|
|
357
|
+
|
|
358
|
+
NEVER generate more than 6 actions in a single response. Rearranging many windows at once is disorienting and error-prone. If the user asks for something that would touch more than 6 windows:
|
|
359
|
+
- Do the most important 4-6 windows
|
|
360
|
+
- Tell them what you did and offer to continue: "I tiled your 4 main windows. Want me to handle the rest?"
|
|
361
|
+
- Safe single-action alternatives that handle any number of windows: `distribute` (auto-grid), `undo` (restore all)
|
|
362
|
+
- `swap` is always exactly 2 windows — always safe
|
|
363
|
+
- `hide`, `highlight`, `move_to_display` are single-window operations — always safe
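
A host could also enforce the limit defensively on its side. The sketch below assumes the cap is applied by truncation after the model responds; `MAX_ACTIONS`, `PlannedAction`, and `capActions` are hypothetical names and nothing here is confirmed to exist in Lattices.

```typescript
// Upper bound from the rule above.
const MAX_ACTIONS = 6;

interface PlannedAction {
  intent: string;
  slots?: Record<string, unknown>;
}

// Keep at most MAX_ACTIONS actions and report how many were dropped,
// so the spoken reply can offer to continue with the rest.
function capActions(actions: PlannedAction[]): { kept: PlannedAction[]; dropped: number } {
  const kept = actions.slice(0, MAX_ACTIONS);
  return { kept: kept, dropped: actions.length - kept.length };
}
```

A non-zero `dropped` count is the cue for a follow-up offer such as "Want me to handle the rest?".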
|
|
364
|
+
|
|
365
|
+
## What not to do
|
|
366
|
+
|
|
367
|
+
- Don't act without telling the user what you're about to do
|
|
368
|
+
- Don't move windows the user didn't ask about
|
|
369
|
+
- Don't over-explain. One sentence, not a paragraph
|
|
370
|
+
- NEVER say window IDs, wids, or numbers in speech. The user doesn't know or care about "wid 423". Instead say "the Chrome window" or "the iTerm window running Claude Code in the lattices project"
|
|
371
|
+
- Don't suggest things every turn. Be helpful, not nagging
|
|
372
|
+
- Don't hallucinate windows. Only reference what's in the snapshot
|
|
373
|
+
- Don't use lists or bullet points — this is spoken text, not a document
|
|
374
|
+
- Don't rearrange windows the user didn't mention just because you think it would look better
|
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# Hands-Off Sidecar — Per-Turn Template
|
|
2
|
+
|
|
3
|
+
USER: "{{transcript}}"
|
|
4
|
+
|
|
5
|
+
--- DESKTOP SNAPSHOT ---
|
|
6
|
+
{{#if stage_manager}}
|
|
7
|
+
Stage Manager: ON (grouping: {{sm_grouping}})
|
|
8
|
+
|
|
9
|
+
Active stage ({{active_count}} windows):
|
|
10
|
+
{{#each active_stage}}
|
|
11
|
+
[{{wid}}] {{app}}: "{{title}}" — {{x}},{{y}} {{w}}x{{h}}
|
|
12
|
+
{{/each}}
|
|
13
|
+
|
|
14
|
+
Strip ({{strip_count}} thumbnails): {{strip_apps}}
|
|
15
|
+
Other stages: {{hidden_apps}}
|
|
16
|
+
{{else}}
|
|
17
|
+
Stage Manager: OFF
|
|
18
|
+
|
|
19
|
+
Visible windows ({{visible_count}}):
|
|
20
|
+
{{#each visible_windows}}
|
|
21
|
+
[{{wid}}] {{app}}: "{{title}}" — {{x}},{{y}} {{w}}x{{h}}
|
|
22
|
+
{{/each}}
|
|
23
|
+
{{/if}}
|
|
24
|
+
|
|
25
|
+
{{#if current_layer}}
|
|
26
|
+
Current layer: {{layer_name}} (id: {{layer_id}})
|
|
27
|
+
{{/if}}
|
|
28
|
+
|
|
29
|
+
Screen: {{screen_w}}x{{screen_h}}, usable: {{usable_w}}x{{usable_h}}
|
|
30
|
+
--- END SNAPSHOT ---
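
For illustration, the "Stage Manager: OFF" branch of the template can be rendered by hand as shown below. This sketch does not use a template engine and is not the worker's actual rendering code; `TemplateWindow` and `renderVisibleWindows` are hypothetical names.

```typescript
// Fields the template reads for each window line.
interface TemplateWindow {
  wid: number;
  app: string;
  title: string;
  x: number;
  y: number;
  w: number;
  h: number;
}

// Render the visible-windows section the way the template lays it out.
function renderVisibleWindows(windows: TemplateWindow[]): string {
  const dash = "\u2014"; // the separator character used in the template line
  const lines: string[] = [
    "Stage Manager: OFF",
    "",
    "Visible windows (" + windows.length + "):",
  ];
  for (const win of windows) {
    lines.push(
      "[" + win.wid + "] " + win.app + ': "' + win.title + '" ' + dash + " " +
        win.x + "," + win.y + " " + win.w + "x" + win.h
    );
  }
  return lines.join("\n");
}
```

The bracketed wid at the start of each line is what lets the model copy an exact window ID into an action slot.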
|
|
@@ -0,0 +1,31 @@
|
|
|
1
|
+
# Voice Advisor (Haiku) — System Prompt
|
|
2
|
+
|
|
3
|
+
You are an advisor for Lattices, a macOS workspace manager. You run alongside voice commands, providing commentary and follow-up suggestions.
|
|
4
|
+
|
|
5
|
+
## Available commands
|
|
6
|
+
|
|
7
|
+
{{intent_catalog}}
|
|
8
|
+
|
|
9
|
+
## Current windows
|
|
10
|
+
|
|
11
|
+
{{window_list}}
|
|
12
|
+
|
|
13
|
+
## Per-turn input
|
|
14
|
+
|
|
15
|
+
For each user message, you receive a voice transcript and what command was matched.
|
|
16
|
+
|
|
17
|
+
## Response format
|
|
18
|
+
|
|
19
|
+
Respond with ONLY a JSON object:
|
|
20
|
+
|
|
21
|
+
```json
|
|
22
|
+
{"commentary": "short observation or null", "suggestion": {"label": "button text", "intent": "intent_name", "slots": {"key": "value"}} or null}
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
## Rules
|
|
26
|
+
|
|
27
|
+
- `commentary`: 1 sentence max. `null` if the matched command fully covers the request.
|
|
28
|
+
- `suggestion`: a follow-up action. `null` if none needed.
|
|
29
|
+
- Never suggest what was already executed.
|
|
30
|
+
- Suggestions MUST include all required slots. e.g. search requires `{"query": "..."}`.
|
|
31
|
+
- Be terse and useful, not chatty.
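
The suggestion rules lend themselves to a small filter on the consuming side. The following is a sketch under assumptions: `Suggestion`, `AdvisorResponse`, `REQUIRED_SLOTS`, and `usableSuggestion` are hypothetical names, only the `search` entry in the table comes from the rules above, and comparing intent names is a simplification of "already executed".

```typescript
interface Suggestion {
  label: string;
  intent: string;
  slots: Record<string, unknown>;
}

interface AdvisorResponse {
  commentary: string | null;
  suggestion: Suggestion | null;
}

// Required slots per intent. A fuller table would be derived from the intent catalog.
const REQUIRED_SLOTS: Record<string, string[]> = {
  search: ["query"],
};

// Return the suggestion if it passes the rules, otherwise null.
function usableSuggestion(response: AdvisorResponse, executedIntent: string): Suggestion | null {
  const s = response.suggestion;
  if (s === null) return null;
  if (s.intent === executedIntent) return null; // never suggest what was already executed
  const has = Object.prototype.hasOwnProperty;
  const required = has.call(REQUIRED_SLOTS, s.intent) ? REQUIRED_SLOTS[s.intent] : [];
  for (const name of required) {
    if (!has.call(s.slots, name)) return null; // a required slot is missing
  }
  return s;
}
```

Dropping an unusable suggestion silently keeps the interface quiet, which fits the "terse and useful" rule.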
|
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
# Voice Fallback Resolver — Prompt
|
|
2
|
+
|
|
3
|
+
Voice command resolver. Whisper transcript (may have typos): "{{transcript}}"
|
|
4
|
+
|
|
5
|
+
## Available intents
|
|
6
|
+
|
|
7
|
+
{{intent_catalog}}
|
|
8
|
+
|
|
9
|
+
## Current windows
|
|
10
|
+
|
|
11
|
+
{{window_list}}
|
|
12
|
+
|
|
13
|
+
## Instructions
|
|
14
|
+
|
|
15
|
+
Return ONLY a JSON object like:
|
|
16
|
+
|
|
17
|
+
```json
|
|
18
|
+
{"intent": "search", "slots": {"query": "dewey"}, "reasoning": "user wants to find dewey windows"}
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
- For search, extract the key term.
|
|
22
|
+
- Use window names from the list when relevant.
|
|
23
|
+
- If unclear, use intent "unknown".
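
A caller of this resolver might parse the reply defensively, as sketched below. The names `Resolution` and `parseResolution` are hypothetical, and mapping malformed output to the `unknown` intent is an assumption that extends the last instruction above rather than documented behavior.

```typescript
interface Resolution {
  intent: string;
  slots: Record<string, unknown>;
  reasoning: string;
}

// Parse the resolver's reply; anything unusable becomes intent "unknown".
function parseResolution(raw: string): Resolution {
  const fallback: Resolution = { intent: "unknown", slots: {}, reasoning: "unparseable reply" };
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return fallback;
  }
  if (typeof data !== "object" || data === null) return fallback;
  const obj = data as Record<string, unknown>;
  const intent = obj["intent"];
  if (typeof intent !== "string" || intent === "") return fallback;
  const slots = obj["slots"];
  const reasoning = obj["reasoning"];
  return {
    intent: intent,
    slots: typeof slots === "object" && slots !== null ? (slots as Record<string, unknown>) : {},
    reasoning: typeof reasoning === "string" ? reasoning : "",
  };
}
```

Treating parse failures the same way as an explicit `unknown` gives the caller a single path for "could not resolve".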
|
|
@@ -0,0 +1,167 @@
|
|
|
1
|
+
# Tiling Reference
|
|
2
|
+
|
|
3
|
+
Complete reference for Lattices window tiling — positions, grids, execution paths, and voice interpretation.
|
|
4
|
+
|
|
5
|
+
## Position System
|
|
6
|
+
|
|
7
|
+
Every tile position is a cell in a **cols × rows** grid, expressed as fractional `(x, y, w, h)` of the screen's visible area (excluding menu bar and dock).
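
As a worked illustration of the fractional model, a grid cell can be converted to a pixel frame once the visible area is known. This is a minimal sketch: `PixelFrame` and `cellToFrame` are hypothetical names, and rounding to whole pixels is an assumption rather than documented `TilePosition` behavior.

```typescript
// A rectangle in screen pixels.
interface PixelFrame {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Convert cell (col, row) of a cols x rows grid into a pixel frame inside
// `visible`, the screen area left after excluding the menu bar and dock.
function cellToFrame(
  cols: number,
  rows: number,
  col: number,
  row: number,
  visible: PixelFrame
): PixelFrame {
  const cellW = visible.w / cols;
  const cellH = visible.h / rows;
  return {
    x: Math.round(visible.x + col * cellW),
    y: Math.round(visible.y + row * cellH),
    w: Math.round(cellW),
    h: Math.round(cellH),
  };
}
```

With a visible area 1920 pixels wide, a half-width cell in a 2×1 grid is 960 pixels wide and a cell in a 3×1 grid is 640 pixels wide.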
|
|
8
|
+
|
|
9
|
+
### Named Positions
|
|
10
|
+
|
|
11
|
+
All valid position strings that `TilePosition` accepts:
|
|
12
|
+
|
|
13
|
+
| Position string | Grid | Cell (col, row) | Description |
|---|---|---|---|
| `maximize` | 1×1 | full | Full screen (100% × 100%) |
| `center` | n/a | n/a | Centered floating (70% × 80%, offset 15%/10%) |
| **Halves (2×1, full height)** | | | |
| `left` | 2×1 | 0,0 | Left 50% |
| `right` | 2×1 | 1,0 | Right 50% |
| **Halves (1×2, full width)** | | | |
| `top` | 1×2 | 0,0 | Top 50% |
| `bottom` | 1×2 | 0,1 | Bottom 50% |
| **Quarters (2×2)** | | | |
| `top-left` | 2×2 | 0,0 | Top-left 25% |
| `top-right` | 2×2 | 1,0 | Top-right 25% |
| `bottom-left` | 2×2 | 0,1 | Bottom-left 25% |
| `bottom-right` | 2×2 | 1,1 | Bottom-right 25% |
| **Thirds (3×1, full height)** | | | |
| `left-third` | 3×1 | 0,0 | Left 33% column |
| `center-third` | 3×1 | 1,0 | Center 33% column |
| `right-third` | 3×1 | 2,0 | Right 33% column |
| **Sixths (3×2)** | | | |
| `top-left-third` | 3×2 | 0,0 | Top-left sixth |
| `top-center-third` | 3×2 | 1,0 | Top-center sixth |
| `top-right-third` | 3×2 | 2,0 | Top-right sixth |
| `bottom-left-third` | 3×2 | 0,1 | Bottom-left sixth |
| `bottom-center-third` | 3×2 | 1,1 | Bottom-center sixth |
| `bottom-right-third` | 3×2 | 2,1 | Bottom-right sixth |
| **Fourths (4×1, full height)** | | | |
| `first-fourth` | 4×1 | 0,0 | Leftmost 25% column |
| `second-fourth` | 4×1 | 1,0 | Second 25% column |
| `third-fourth` | 4×1 | 2,0 | Third 25% column |
| `last-fourth` | 4×1 | 3,0 | Rightmost 25% column |
| **Eighths (4×2)** | | | |
| `top-first-fourth` | 4×2 | 0,0 | Top row, 1st column |
| `top-second-fourth` | 4×2 | 1,0 | Top row, 2nd column |
| `top-third-fourth` | 4×2 | 2,0 | Top row, 3rd column |
| `top-last-fourth` | 4×2 | 3,0 | Top row, 4th column |
| `bottom-first-fourth` | 4×2 | 0,1 | Bottom row, 1st column |
| `bottom-second-fourth` | 4×2 | 1,1 | Bottom row, 2nd column |
| `bottom-third-fourth` | 4×2 | 2,1 | Bottom row, 3rd column |
| `bottom-last-fourth` | 4×2 | 3,1 | Bottom row, 4th column |
| **Horizontal thirds (1×3)** | | | |
| `top-third` | 1×3 | 0,0 | Top 33% row |
| `middle-third` | 1×3 | 0,1 | Middle 33% row |
| `bottom-third` | 1×3 | 0,2 | Bottom 33% row |
| **Edge quarters** | | | |
| `left-quarter` | 4×1 | 0,0 | Leftmost 25% column |
| `right-quarter` | 4×1 | 3,0 | Rightmost 25% column |
| `top-quarter` | 1×4 | 0,0 | Top 25% row |
| `bottom-quarter` | 1×4 | 0,3 | Bottom 25% row |
|
|
62
|
+
|
|
63
|
+
### Custom Grid Syntax
|
|
64
|
+
|
|
65
|
+
For arbitrary grids: `grid:CxR:C,R`
|
|
66
|
+
|
|
67
|
+
- `C` = total columns, `R` = total rows
|
|
68
|
+
- trailing `C,R` = target cell as a 0-indexed column, row pair
|
|
69
|
+
- Example: `grid:5x3:2,1` = center cell of a 5×3 grid
|
|
70
|
+
|
|
71
|
+
Parsed by `PlacementSpec` / `parseGridString()` into fractional `(x, y, w, h)`.
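
As an illustration of the syntax above, here is a minimal TypeScript sketch of how a `grid:CxR:C,R` string could be turned into fractions. It is not the app's `parseGridString()` (which lives in the Swift sources); the function name `parseGridSketch`, the `Fractions` type, and the choice to return `null` on invalid input are all assumptions made for this example.

```typescript
// Hypothetical sketch; not the real PlacementSpec / parseGridString() code.
interface Fractions { x: number; y: number; w: number; h: number; }

function parseGridSketch(input: string): Fractions | null {
  // Expected shape: grid:<cols>x<rows>:<col>,<row>
  const match = /^grid:(\d+)x(\d+):(\d+),(\d+)$/.exec(input);
  if (match === null) return null;

  const cols = Number(match[1]);
  const rows = Number(match[2]);
  const col = Number(match[3]);
  const row = Number(match[4]);

  // Reject empty grids and cells that fall outside the grid (assumed behavior).
  if (cols < 1 || rows < 1 || col >= cols || row >= rows) return null;

  // Each cell is 1/cols wide and 1/rows tall; the origin is the cell's top-left corner.
  return { x: col / cols, y: row / rows, w: 1 / cols, h: 1 / rows };
}
```

Under these assumptions, `grid:5x3:2,1` yields a cell one fifth of the width and one third of the height, starting two fifths across and one third down, which matches the "center cell of a 5×3 grid" example.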
|
|
72
|
+
|
|
73
|
+
### Placement Contract
|
|
74
|
+
|
|
75
|
+
Placement strings are convenient at the boundary, but the daemon uses a
|
|
76
|
+
typed placement model internally:
|
|
77
|
+
|
|
78
|
+
- named tile positions
|
|
79
|
+
- arbitrary grid cells
|
|
80
|
+
- raw fractional rectangles
|
|
81
|
+
|
|
82
|
+
That is what keeps CLI, daemon, voice, and hands-off execution aligned.
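
One way to picture the three-variant model is as a tagged union that always resolves to a single fractional rectangle. The TypeScript below is only a sketch: the type and function names are hypothetical, the daemon's real model is written in Swift and may be shaped differently, and only four named positions are included (their values are taken from the Named Positions table).

```typescript
// Hypothetical types for illustration; the daemon's actual model may differ.
interface Fractions { x: number; y: number; w: number; h: number; }

type Placement =
  | { kind: "named"; name: string }
  | { kind: "grid"; cols: number; rows: number; col: number; row: number }
  | { kind: "fractions"; x: number; y: number; w: number; h: number };

// A small subset of the named positions, using the table's documented sizes.
const NAMED = new Map<string, Fractions>([
  ["maximize", { x: 0, y: 0, w: 1, h: 1 }],
  ["left", { x: 0, y: 0, w: 0.5, h: 1 }],
  ["right", { x: 0.5, y: 0, w: 0.5, h: 1 }],
  ["center", { x: 0.15, y: 0.1, w: 0.7, h: 0.8 }],
]);

// Resolve any placement variant to fractions, or null if it is invalid.
function toFractions(p: Placement): Fractions | null {
  switch (p.kind) {
    case "named":
      return NAMED.get(p.name) ?? null;
    case "grid":
      if (p.cols < 1 || p.rows < 1 || p.col < 0 || p.row < 0) return null;
      if (p.col >= p.cols || p.row >= p.rows) return null;
      return { x: p.col / p.cols, y: p.row / p.rows, w: 1 / p.cols, h: 1 / p.rows };
    case "fractions":
      return { x: p.x, y: p.y, w: p.w, h: p.h };
  }
}
```

Because every surface ends up with the same `Fractions` value, a single frame calculation can serve all of them, which is the alignment the section describes.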
|
|
83
|
+
|
|
84
|
+
## Execution Paths
|
|
85
|
+
|
|
86
|
+
The old split-brain tiling logic has been collapsed toward a shared path.
|
|
87
|
+
The canonical mutation is now:
|
|
88
|
+
|
|
89
|
+
```json
|
|
90
|
+
{ "method": "window.place", "params": { "placement": "left" } }
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
All higher-level surfaces should compile into the same placement model:
|
|
94
|
+
|
|
95
|
+
- **Daemon / CLI**: `window.place` is the canonical mutation
|
|
96
|
+
- **Compatibility**: `window.tile` maps to `window.place`
|
|
97
|
+
- **Voice / hands-off**: parse natural language, then emit a placement spec
|
|
98
|
+
- **HUD**: still exposes a smaller shortcut set, but should target the same placement executor
|
|
99
|
+
|
|
100
|
+
The important change is that placement resolution now happens through
|
|
101
|
+
`PlacementSpec`, not through separate ad hoc parsers per surface.
|
|
102
|
+
|
|
103
|
+
## Frame Calculation
|
|
104
|
+
|
|
105
|
+
All paths eventually call one of:
|
|
106
|
+
|
|
107
|
+
1. **`WindowTiler.tileFrame(for:on:)`** — takes a `TilePosition` + `NSScreen`, returns a `CGRect` in AX coordinates (origin = top-left of primary display)
|
|
108
|
+
2. **`WindowTiler.tileFrame(fractions:inDisplay:)`** — takes raw `(x, y, w, h)` fractions + display rect
|
|
109
|
+
|
|
110
|
+
The math:
|
|
111
|
+
```
|
|
112
|
+
visible = screen.visibleFrame (excludes menu bar + dock)
|
|
113
|
+
primaryH = primary screen height
|
|
114
|
+
axTop = primaryH - visible.maxY (flip from AppKit bottom-left to AX top-left)
|
|
115
|
+
|
|
116
|
+
frame.x = visible.minX + visible.width × fx
|
|
117
|
+
frame.y = axTop + visible.height × fy
|
|
118
|
+
frame.w = visible.width × fw
|
|
119
|
+
frame.h = visible.height × fh
|
|
120
|
+
```
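
To make the coordinate flip concrete, the same arithmetic can be written as a small TypeScript function. This is a sketch only: `tileFrameSketch`, `Rect`, and `Fractions` are hypothetical names, and the screen numbers in the usage note are made up for illustration rather than taken from the app.

```typescript
// Hypothetical sketch of the math above; not WindowTiler's actual code.
interface Rect { x: number; y: number; width: number; height: number; }
interface Fractions { x: number; y: number; w: number; h: number; }

// `visible` uses AppKit coordinates (origin at the bottom-left, y grows upward).
// The result uses AX coordinates (origin at the top-left of the primary display).
function tileFrameSketch(visible: Rect, primaryHeight: number, f: Fractions): Rect {
  const visibleMaxY = visible.y + visible.height;
  const axTop = primaryHeight - visibleMaxY; // distance from screen top to visible area
  return {
    x: visible.x + visible.width * f.x,
    y: axTop + visible.height * f.y,
    width: visible.width * f.w,
    height: visible.height * f.h,
  };
}
```

For example, assume a 1440×900 primary display whose visible area is `{ x: 0, y: 70, width: 1440, height: 805 }` (a 70-point dock at the bottom and a 25-point menu bar at the top). Then `axTop` is `900 - 875 = 25`, and `bottom-right` with fractions `(0.5, 0.5, 0.5, 0.5)` resolves to a frame at `(720, 427.5)` with size `720 × 402.5`.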
|
|
121
|
+
|
|
122
|
+
## Window Targeting
|
|
123
|
+
|
|
124
|
+
The `tile_window` intent resolves the target window in this priority order:
|
|
125
|
+
|
|
126
|
+
1. **`session`** slot → `LatticesApi.window.place` / `window.tile` compatibility wrapper
|
|
127
|
+
2. **`wid`** slot → `DesktopModel.shared.windows[wid]` (direct window ID lookup)
|
|
128
|
+
3. **`app`** slot → first matching window by `localizedCaseInsensitiveContains`, excluding recently-tiled windows (prevents double-matching in batch commands like "Chrome left, Chrome right")
|
|
129
|
+
4. **No target** → tiles the frontmost window
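
The priority order above can be sketched as a single resolver function. This TypeScript version is illustrative only: the types and the name `resolveTarget` are hypothetical, a lowercase substring test stands in for Swift's `localizedCaseInsensitiveContains`, and falling through to the next rule when a lookup fails is an assumption, since the reference does not say what happens in that case.

```typescript
// Hypothetical sketch of the targeting order; not IntentEngine's actual code.
interface WindowInfo { wid: number; app: string; }
interface Slots { session?: string; wid?: number; app?: string; }

type Target =
  | { via: "session"; session: string }
  | { via: "window"; wid: number }
  | { via: "frontmost" };

function resolveTarget(slots: Slots, windows: WindowInfo[], recentlyTiled: Set<number>): Target {
  // 1. A session slot wins over everything else.
  if (slots.session !== undefined) return { via: "session", session: slots.session };

  // 2. Direct window ID lookup.
  if (slots.wid !== undefined) {
    const wid = slots.wid;
    if (windows.some((w) => w.wid === wid)) return { via: "window", wid };
  }

  // 3. First app-name match that has not just been tiled.
  if (slots.app !== undefined) {
    const needle = slots.app.toLowerCase();
    const hit = windows.find(
      (w) => w.app.toLowerCase().indexOf(needle) !== -1 && !recentlyTiled.has(w.wid),
    );
    if (hit !== undefined) return { via: "window", wid: hit.wid };
  }

  // 4. No usable target: tile the frontmost window.
  return { via: "frontmost" };
}
```

With two Chrome windows open, a batch such as "Chrome left, Chrome right" would pick the first window for the first action and, once that window is in `recentlyTiled`, the second window for the next action.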
|
|
130
|
+
|
|
131
|
+
### HandsOff-specific targeting
|
|
132
|
+
|
|
133
|
+
The system prompt instructs the LLM to always use `wid` from the desktop snapshot, never `app`. This avoids ambiguity when multiple windows of the same app exist. In speech, the LLM says the app name; in the JSON action, it uses the wid.
|
|
134
|
+
|
|
135
|
+
## Common Layouts (multi-action)
|
|
136
|
+
|
|
137
|
+
These are composed from multiple `tile_window` actions:
|
|
138
|
+
|
|
139
|
+
| Layout | Actions |
|
|
140
|
+
|---|---|
|
|
141
|
+
| Split screen | left + right |
|
|
142
|
+
| Stack | top + bottom |
|
|
143
|
+
| Thirds | left-third + center-third + right-third |
|
|
144
|
+
| Quadrants | top-left + top-right + bottom-left + bottom-right |
|
|
145
|
+
| Six-up (3×2) | All six `*-*-third` positions |
|
|
146
|
+
| Eight-up (4×2) | All eight `*-*-fourth` positions |
|
|
147
|
+
| Distribute | Single `distribute` intent (auto-grid) |
|
|
148
|
+
|
|
149
|
+
## HandsOff Smart Distribution
|
|
150
|
+
|
|
151
|
+
When the LLM sends multiple `tile_window` actions targeting the **same position**, `HandsOffSession.distributeTileActions()` subdivides:
|
|
152
|
+
|
|
153
|
+
- 2+ windows → "left" becomes top-left, left, bottom-left
|
|
154
|
+
- 2+ windows → "right" becomes top-right, right, bottom-right
|
|
155
|
+
- 2+ windows → "maximize" fans out to quadrants then halves
|
|
156
|
+
|
|
157
|
+
## Guardrails
|
|
158
|
+
|
|
159
|
+
- **Typed placement validation**: invalid placement strings or objects are rejected at the daemon boundary.
|
|
160
|
+
- **Recently-tiled dedup**: `IntentEngine.recentlyTiledWids` prevents the same window from being matched twice within 2 seconds during batch operations.
|
|
161
|
+
- **Compatibility wrappers**: `window.tile` still works, but routes through the same placement machinery.
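
The recently-tiled guardrail amounts to a set of window IDs with a 2-second expiry. A minimal TypeScript sketch of that idea follows; the class name `RecentlyTiled`, its methods, and the decision to pass the clock in explicitly are assumptions for illustration and do not describe how `IntentEngine.recentlyTiledWids` is actually stored.

```typescript
// Hypothetical sketch of a time-windowed dedup set; not IntentEngine's actual code.
class RecentlyTiled {
  private readonly stamps = new Map<number, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 2000) {
    this.windowMs = windowMs;
  }

  // Record that a window was tiled at the given time (milliseconds).
  mark(wid: number, nowMs: number): void {
    this.stamps.set(wid, nowMs);
  }

  // True while the window was tiled less than `windowMs` ago; expired entries are dropped.
  has(wid: number, nowMs: number): boolean {
    const stamp = this.stamps.get(wid);
    if (stamp === undefined) return false;
    if (nowMs - stamp >= this.windowMs) {
      this.stamps.delete(wid);
      return false;
    }
    return true;
  }
}
```

Passing `nowMs` as a parameter keeps the sketch deterministic; a real caller would typically supply `Date.now()`.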
|
|
162
|
+
|
|
163
|
+
## Current Gaps
|
|
164
|
+
|
|
165
|
+
1. **Voice extraction still needs to catch up**: the canonical executor understands horizontal thirds and edge quarters, but the local voice resolver still needs broader phrase coverage.
|
|
166
|
+
2. **HUD coverage is narrower than the executor**: keyboard tiling exposes a small subset of the full placement vocabulary.
|
|
167
|
+
3. **Optimization and layer actions are still wrapper-level**: `space.optimize` and `layer.activate` are now stable action IDs, but they currently wrap existing distributor and layer-switching behavior rather than a full planner.
|