@arach/lattices 0.2.0 → 0.6.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +172 -86
- package/apps/mac/Info.plist +43 -0
- package/apps/mac/Lattices.app/Contents/Info.plist +43 -0
- package/apps/mac/Lattices.app/Contents/MacOS/Lattices +0 -0
- package/apps/mac/Lattices.app/Contents/Resources/AppIcon.icns +0 -0
- package/apps/mac/Lattices.app/Contents/Resources/docs/assistant-knowledge.md +130 -0
- package/apps/mac/Lattices.app/Contents/Resources/tap.wav +0 -0
- package/apps/mac/Lattices.app/Contents/_CodeSignature/CodeResources +150 -0
- package/apps/mac/Lattices.entitlements +21 -0
- package/apps/mac/Resources/Pets/assistant-spark/pet.json +62 -0
- package/apps/mac/Resources/Pets/assistant-spark/spritesheet.webp +0 -0
- package/apps/mac/Resources/Pets/scout-ranger/pet.json +6 -0
- package/apps/mac/Resources/Pets/scout-ranger/spritesheet.webp +0 -0
- package/apps/mac/Resources/tap.wav +0 -0
- package/assets/AppIcon.icns +0 -0
- package/bin/assistant-intelligence.ts +912 -0
- package/bin/cli/capture.ts +252 -0
- package/bin/cli/daemon.ts +22 -0
- package/bin/cli/helpers.ts +105 -0
- package/bin/cli/layer.ts +178 -0
- package/bin/cli/runs.ts +43 -0
- package/bin/cli/search.ts +141 -0
- package/bin/cli/session.ts +32 -0
- package/bin/client.ts +17 -0
- package/bin/cua.ts +26 -0
- package/bin/{daemon-client.js → daemon-client.ts} +49 -30
- package/bin/handsoff-infer.ts +96 -0
- package/bin/handsoff-worker.ts +531 -0
- package/bin/infer.ts +424 -0
- package/bin/keychain.ts +75 -0
- package/bin/lattices-app.ts +655 -0
- package/bin/lattices-build +125 -0
- package/bin/lattices-build-env.ts +77 -0
- package/bin/lattices-dev +362 -0
- package/bin/lattices.ts +3260 -0
- package/bin/project-twin.ts +645 -0
- package/docs/agent-execution-plan.md +562 -0
- package/docs/agent-layer-guide.md +207 -0
- package/docs/agents.md +233 -0
- package/docs/ai-chat-ux-review.md +416 -0
- package/docs/api.md +1041 -47
- package/docs/app.md +96 -13
- package/docs/assistant-knowledge.md +130 -0
- package/docs/companion-deck.md +209 -0
- package/docs/component-extraction-roadmap.md +392 -0
- package/docs/concepts.md +13 -12
- package/docs/config.md +83 -10
- package/docs/gesture-customization-proposal.md +520 -0
- package/docs/handsoff-test-scenarios.md +84 -0
- package/docs/hyperspace-grid-snappiness.md +210 -0
- package/docs/layers.md +176 -28
- package/docs/mouse-gestures.md +244 -0
- package/docs/ocr.md +21 -9
- package/docs/overview.md +42 -23
- package/docs/presentation-execution-review.md +491 -0
- package/docs/prompts/hands-off-system.md +382 -0
- package/docs/prompts/hands-off-turn.md +30 -0
- package/docs/prompts/voice-advisor.md +31 -0
- package/docs/prompts/voice-fallback.md +23 -0
- package/docs/proposals/LAT-001-gesture-visual-customization.md +522 -0
- package/docs/proposals/LAT-002-shared-overlay-canvas.md +353 -0
- package/docs/proposals/LAT-003-menu-bar-controller-architecture.md +291 -0
- package/docs/proposals/LAT-004-interactive-overlay-actors.md +534 -0
- package/docs/proposals/LAT-005-action-runtime-product-spine.md +914 -0
- package/docs/proposals/LAT-006-followup-gaps.md +103 -0
- package/docs/proposals/LAT-006-runs-and-capture-in-lattices.md +566 -0
- package/docs/proposals/LAT-007-unified-app-shell.md +128 -0
- package/docs/quickstart.md +8 -12
- package/docs/reference/dewey.config.ts +74 -0
- package/docs/reference/install-agent.md +79 -0
- package/docs/release.md +172 -0
- package/docs/repo-structure.md +100 -0
- package/docs/terminal-kit.md +87 -0
- package/docs/tiling-reference.md +224 -0
- package/docs/twins.md +138 -0
- package/docs/voice-command-protocol.md +278 -0
- package/docs/voice-error-model.md +73 -0
- package/docs/voice.md +221 -0
- package/package.json +69 -16
- package/packages/npm/sdk/cua.d.mts +1 -0
- package/packages/npm/sdk/cua.d.ts +188 -0
- package/packages/npm/sdk/cua.mjs +376 -0
- package/app/Lattices.app/Contents/Info.plist +0 -24
- package/app/Package.swift +0 -13
- package/app/Sources/ActionRow.swift +0 -61
- package/app/Sources/App.swift +0 -10
- package/app/Sources/AppDelegate.swift +0 -234
- package/app/Sources/AppShellView.swift +0 -62
- package/app/Sources/AppTypeClassifier.swift +0 -70
- package/app/Sources/AppWindowShell.swift +0 -63
- package/app/Sources/CheatSheetHUD.swift +0 -332
- package/app/Sources/CommandModeState.swift +0 -1362
- package/app/Sources/CommandModeView.swift +0 -1405
- package/app/Sources/CommandModeWindow.swift +0 -192
- package/app/Sources/CommandPaletteView.swift +0 -307
- package/app/Sources/CommandPaletteWindow.swift +0 -134
- package/app/Sources/DaemonProtocol.swift +0 -101
- package/app/Sources/DaemonServer.swift +0 -414
- package/app/Sources/DesktopModel.swift +0 -121
- package/app/Sources/DesktopModelTypes.swift +0 -71
- package/app/Sources/DiagnosticLog.swift +0 -271
- package/app/Sources/EventBus.swift +0 -30
- package/app/Sources/HotkeyManager.swift +0 -250
- package/app/Sources/HotkeyStore.swift +0 -338
- package/app/Sources/InventoryManager.swift +0 -35
- package/app/Sources/InventoryPath.swift +0 -43
- package/app/Sources/KeyRecorderView.swift +0 -210
- package/app/Sources/LatticesApi.swift +0 -1125
- package/app/Sources/MainView.swift +0 -467
- package/app/Sources/MainWindow.swift +0 -83
- package/app/Sources/OcrModel.swift +0 -309
- package/app/Sources/OcrStore.swift +0 -295
- package/app/Sources/OmniSearchState.swift +0 -283
- package/app/Sources/OmniSearchView.swift +0 -288
- package/app/Sources/OmniSearchWindow.swift +0 -105
- package/app/Sources/OrphanRow.swift +0 -129
- package/app/Sources/PaletteCommand.swift +0 -419
- package/app/Sources/PermissionChecker.swift +0 -125
- package/app/Sources/Preferences.swift +0 -92
- package/app/Sources/ProcessModel.swift +0 -199
- package/app/Sources/ProcessQuery.swift +0 -151
- package/app/Sources/Project.swift +0 -28
- package/app/Sources/ProjectRow.swift +0 -368
- package/app/Sources/ProjectScanner.swift +0 -121
- package/app/Sources/ScreenMapState.swift +0 -2387
- package/app/Sources/ScreenMapView.swift +0 -2820
- package/app/Sources/ScreenMapWindowController.swift +0 -89
- package/app/Sources/SessionManager.swift +0 -72
- package/app/Sources/SettingsView.swift +0 -1053
- package/app/Sources/SettingsWindow.swift +0 -20
- package/app/Sources/TabGroupRow.swift +0 -178
- package/app/Sources/Terminal.swift +0 -259
- package/app/Sources/TerminalQuery.swift +0 -156
- package/app/Sources/TerminalSynthesizer.swift +0 -200
- package/app/Sources/Theme.swift +0 -163
- package/app/Sources/TilePickerView.swift +0 -209
- package/app/Sources/TmuxModel.swift +0 -53
- package/app/Sources/TmuxQuery.swift +0 -81
- package/app/Sources/WindowTiler.swift +0 -1755
- package/app/Sources/WorkspaceManager.swift +0 -434
- package/bin/lattices-app.js +0 -221
- package/bin/lattices.js +0 -1418
|
@@ -0,0 +1,224 @@
|
|
|
1
|
+
# Tiling Reference
|
|
2
|
+
|
|
3
|
+
Complete reference for Lattices window tiling — positions, grids, execution paths, and voice interpretation.
|
|
4
|
+
|
|
5
|
+
## Position System
|
|
6
|
+
|
|
7
|
+
Every tile position is a cell in a **cols × rows** grid, expressed as fractional `(x, y, w, h)` of the screen's visible area (excluding menu bar and dock).
|
|
8
|
+
|
|
9
|
+
### Named Positions
|
|
10
|
+
|
|
11
|
+
All valid position strings that `TilePosition` accepts:
|
|
12
|
+
|
|
13
|
+
| Position string | Grid | Cell (col, row) | Description |
|---|---|---|---|
| `maximize` | 1×1 | full | Full screen (100% × 100%) |
| `center` | — | — | Centered floating (70% × 80%, offset 15%/10%) |
| **Halves (2×1, full height)** | | | |
| `left` | 2×1 | 0,0 | Left 50% |
| `right` | 2×1 | 1,0 | Right 50% |
| **Halves (1×2, full width)** | | | |
| `top` | 1×2 | 0,0 | Top 50% |
| `bottom` | 1×2 | 0,1 | Bottom 50% |
| **Quarters (2×2)** | | | |
| `top-left` | 2×2 | 0,0 | Top-left 25% |
| `top-right` | 2×2 | 1,0 | Top-right 25% |
| `bottom-left` | 2×2 | 0,1 | Bottom-left 25% |
| `bottom-right` | 2×2 | 1,1 | Bottom-right 25% |
| **Thirds (3×1, full height)** | | | |
| `left-third` | 3×1 | 0,0 | Left 33% column |
| `center-third` | 3×1 | 1,0 | Center 33% column |
| `right-third` | 3×1 | 2,0 | Right 33% column |
| **Sixths (3×2)** | | | |
| `top-left-third` | 3×2 | 0,0 | Top-left sixth |
| `top-center-third` | 3×2 | 1,0 | Top-center sixth |
| `top-right-third` | 3×2 | 2,0 | Top-right sixth |
| `bottom-left-third` | 3×2 | 0,1 | Bottom-left sixth |
| `bottom-center-third` | 3×2 | 1,1 | Bottom-center sixth |
| `bottom-right-third` | 3×2 | 2,1 | Bottom-right sixth |
| **Fourths (4×1, full height)** | | | |
| `first-fourth` | 4×1 | 0,0 | Leftmost 25% column |
| `second-fourth` | 4×1 | 1,0 | Second 25% column |
| `third-fourth` | 4×1 | 2,0 | Third 25% column |
| `last-fourth` | 4×1 | 3,0 | Rightmost 25% column |
| **Eighths (4×2)** | | | |
| `top-first-fourth` | 4×2 | 0,0 | Top row, 1st column |
| `top-second-fourth` | 4×2 | 1,0 | Top row, 2nd column |
| `top-third-fourth` | 4×2 | 2,0 | Top row, 3rd column |
| `top-last-fourth` | 4×2 | 3,0 | Top row, 4th column |
| `bottom-first-fourth` | 4×2 | 0,1 | Bottom row, 1st column |
| `bottom-second-fourth` | 4×2 | 1,1 | Bottom row, 2nd column |
| `bottom-third-fourth` | 4×2 | 2,1 | Bottom row, 3rd column |
| `bottom-last-fourth` | 4×2 | 3,1 | Bottom row, 4th column |
| **Horizontal thirds (1×3)** | | | |
| `top-third` | 1×3 | 0,0 | Top 33% row |
| `middle-third` | 1×3 | 0,1 | Middle 33% row |
| `bottom-third` | 1×3 | 0,2 | Bottom 33% row |
| **Edge quarters** | | | |
| `left-quarter` | 4×1 | 0,0 | Leftmost 25% column |
| `right-quarter` | 4×1 | 3,0 | Rightmost 25% column |
| `top-quarter` | 1×4 | 0,0 | Top 25% row |
| `bottom-quarter` | 1×4 | 0,3 | Bottom 25% row |
|
|
62
|
+
|
|
63
|
+
### Custom Grid Syntax
|
|
64
|
+
|
|
65
|
+
For arbitrary grids, use the compact form `CxR:C,R` or the canonical form `grid:CxR:C,R`.

- In `CxR`, `C` is the total number of columns and `R` the total number of rows; the `C,R` after the colon is the target cell's column and row
- Compact `CxR:C,R` cell indices are 1-based, counted from the top-left
- Canonical `grid:CxR:C,R` cell indices are 0-based, for API/wire compatibility
- Example: `5x3:3,2` or `grid:5x3:2,1` = center cell of a 5×3 grid
- Spans can use two inclusive corners, such as `4x4:1,1-2,2`
|
|
72
|
+
|
|
73
|
+
Parsed by `PlacementSpec` / `parseGridString()` into fractional `(x, y, w, h)`.
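
As a rough illustration of the grid syntax, the sketch below converts a grid string into fractional `(x, y, w, h)`. It is not the real `PlacementSpec` / `parseGridString()` code: the function name `parseGridSketch`, the `Fractions` type, and the choice to accept spans in both the compact and canonical forms are assumptions made for this example.

```typescript
// Hypothetical sketch; not the real PlacementSpec / parseGridString() implementation.
type Fractions = { x: number; y: number; w: number; h: number };

function parseGridSketch(input: string): Fractions | null {
  const canonical = input.startsWith("grid:");
  const body = canonical ? input.slice("grid:".length) : input;

  // CxR:C,R with an optional second inclusive corner: CxR:C,R-C,R
  const m = /^(\d+)x(\d+):(\d+),(\d+)(?:-(\d+),(\d+))?$/.exec(body);
  if (!m) return null;

  const cols = Number(m[1]);
  const rows = Number(m[2]);

  // Compact indices are 1-based; canonical indices are 0-based.
  const base = canonical ? 0 : 1;
  const c0 = Number(m[3]) - base;
  const r0 = Number(m[4]) - base;
  const c1 = m[5] !== undefined ? Number(m[5]) - base : c0;
  const r1 = m[6] !== undefined ? Number(m[6]) - base : r0;

  // Reject empty grids, out-of-range cells, and reversed corners.
  const valid =
    cols > 0 && rows > 0 &&
    c0 >= 0 && r0 >= 0 &&
    c1 >= c0 && r1 >= r0 &&
    c1 < cols && r1 < rows;
  if (!valid) return null;

  return {
    x: c0 / cols,
    y: r0 / rows,
    w: (c1 - c0 + 1) / cols,
    h: (r1 - r0 + 1) / rows,
  };
}
```

Under these assumptions, `5x3:3,2` and `grid:5x3:2,1` resolve to the same rectangle, and `4x4:1,1-2,2` covers the top-left quarter of the visible area.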
|
|
74
|
+
|
|
75
|
+
### Placement Contract
|
|
76
|
+
|
|
77
|
+
Placement strings are convenient at the boundary, but the daemon uses a
|
|
78
|
+
typed placement model internally:
|
|
79
|
+
|
|
80
|
+
- named tile positions
|
|
81
|
+
- arbitrary grid cells
|
|
82
|
+
- raw fractional rectangles
|
|
83
|
+
|
|
84
|
+
That is what keeps CLI, daemon, voice, and hands-off execution aligned.
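
One way to picture the typed model is a discriminated union with one variant per placement kind. The sketch below is illustrative only: the `Placement` type, the `toPlacement` helper, and the short `NAMED` list (a subset of the Named Positions table) are hypothetical and do not mirror the daemon's actual types.

```typescript
// Hypothetical types for illustration; the daemon's real model may differ.
type Placement =
  | { kind: "named"; name: string }
  | { kind: "grid"; spec: string }
  | { kind: "fractions"; x: number; y: number; w: number; h: number };

// Subset of the named positions, kept short for the example.
const NAMED = new Set(["maximize", "center", "left", "right", "top", "bottom"]);

// Convert a boundary value (string or object) into a typed placement,
// returning null for anything that does not fit one of the three kinds.
function toPlacement(input: unknown): Placement | null {
  if (typeof input === "string") {
    if (NAMED.has(input)) return { kind: "named", name: input };
    if (/^(grid:)?\d+x\d+:\d+,\d+(-\d+,\d+)?$/.test(input)) {
      return { kind: "grid", spec: input };
    }
    return null;
  }
  if (typeof input === "object" && input !== null) {
    const { x, y, w, h } = input as Record<string, unknown>;
    if (
      typeof x === "number" && typeof y === "number" &&
      typeof w === "number" && typeof h === "number"
    ) {
      // Small epsilon so sums like 0.68 + 0.32 are not rejected by rounding.
      const eps = 1e-9;
      const inRange =
        x >= 0 && y >= 0 && w > 0 && h > 0 &&
        x + w <= 1 + eps && y + h <= 1 + eps;
      return inRange ? { kind: "fractions", x, y, w, h } : null;
    }
  }
  return null;
}
```

Each surface (CLI, voice, hands-off) would then only need to produce one of these variants, and a single executor could resolve all three to a frame.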
|
|
85
|
+
|
|
86
|
+
## Drag Snap Zones
|
|
87
|
+
|
|
88
|
+
The menu bar app can also use placement specs as drag-to-snap targets.
|
|
89
|
+
When you drag a window, Lattices shows faint landing zones plus a live
|
|
90
|
+
preview of the resulting frame. Releasing over a zone tiles the dragged
|
|
91
|
+
window to that placement. Hold `Command` while dragging to reveal snap
|
|
92
|
+
mode, and release `Command` to drop back to a normal free drag without
|
|
93
|
+
ending the gesture.
|
|
94
|
+
|
|
95
|
+
The recommended agent-owned config lives in `~/.lattices/snap-zones.json`:
|
|
96
|
+
|
|
97
|
+
```json
{
  "enabled": true,
  "modifier": "command",
  "zoneOpacity": 0.08,
  "highlightOpacity": 0.18,
  "previewOpacity": 0.14,
  "rules": [
    {
      "id": "left-edge",
      "label": "Left",
      "placement": "left",
      "trigger": { "x": 0.0, "y": 0.18, "w": 0.12, "h": 0.64 },
      "priority": 10
    },
    {
      "id": "notes-rail",
      "label": "Notes",
      "placement": { "x": 0.68, "y": 0.0, "w": 0.32, "h": 1.0 },
      "trigger": { "x": 0.88, "y": 0.18, "w": 0.12, "h": 0.64 },
      "priority": 30
    }
  ]
}
```
|
|
122
|
+
|
|
123
|
+
Notes:
|
|
124
|
+
|
|
125
|
+
- `rules` is the preferred list key. `zones` is still accepted for backward compatibility.
|
|
126
|
+
- `modifier` accepts `command`, `option`, `control`, or `shift`.
|
|
127
|
+
- `placement` can be a named placement/preset string or raw fractions.
|
|
128
|
+
- `trigger` uses normalized `(x, y, w, h)` fractions of the screen's
|
|
129
|
+
visible area, with `y = 0` at the top.
|
|
130
|
+
- `priority` breaks ties when trigger regions overlap.
|
|
131
|
+
- `trigger` can also be a named placement or preset string if you want
|
|
132
|
+
the trigger region itself to reuse an existing tile definition.
|
|
133
|
+
- The older `~/.lattices/grid.json` `snapZones` section still works, but
|
|
134
|
+
`~/.lattices/snap-zones.json` is the cleaner file for agents to edit.
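
To make the `trigger` and `priority` notes concrete, here is a minimal hit-test sketch. It assumes that a higher `priority` wins when trigger regions overlap (the notes only say that `priority` breaks ties) and that the pointer position is already normalized to the visible area with `y = 0` at the top. The names `SnapRule` and `pickRule` are hypothetical.

```typescript
type Rect = { x: number; y: number; w: number; h: number };
type SnapRule = { id: string; trigger: Rect; priority: number };

// True when the normalized point (px, py) lies inside the rectangle.
function contains(r: Rect, px: number, py: number): boolean {
  return px >= r.x && px <= r.x + r.w && py >= r.y && py <= r.y + r.h;
}

// Assumption: when several triggers contain the pointer, the rule with the
// highest priority wins; the first such rule wins on equal priority.
function pickRule(rules: SnapRule[], px: number, py: number): SnapRule | null {
  let best: SnapRule | null = null;
  for (const rule of rules) {
    if (!contains(rule.trigger, px, py)) continue;
    if (best === null || rule.priority > best.priority) best = rule;
  }
  return best;
}

const exampleRules: SnapRule[] = [
  { id: "left-edge", trigger: { x: 0.0, y: 0.18, w: 0.12, h: 0.64 }, priority: 10 },
  { id: "notes-rail", trigger: { x: 0.88, y: 0.18, w: 0.12, h: 0.64 }, priority: 30 },
];
```

With the two rules from the config above, a pointer near the left edge at mid-height selects `left-edge`, and a pointer in the middle of the screen selects nothing.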
|
|
135
|
+
|
|
136
|
+
## Execution Paths
|
|
137
|
+
|
|
138
|
+
The old split-brain tiling logic has been collapsed toward a shared path.
|
|
139
|
+
The canonical mutation is now:
|
|
140
|
+
|
|
141
|
+
```json
{ "method": "window.place", "params": { "placement": "left" } }
```
|
|
144
|
+
|
|
145
|
+
All higher-level surfaces should compile into the same placement model:
|
|
146
|
+
|
|
147
|
+
- **Daemon / CLI**: `window.place` is the canonical mutation
|
|
148
|
+
- **Compatibility**: `window.tile` maps to `window.place`
|
|
149
|
+
- **Voice / hands-off**: parse natural language, then emit a placement spec
|
|
150
|
+
- **HUD**: still exposes a smaller shortcut set, but should target the same placement executor
|
|
151
|
+
|
|
152
|
+
The important change is that placement resolution now happens through
|
|
153
|
+
`PlacementSpec`, not through separate ad hoc parsers per surface.
|
|
154
|
+
|
|
155
|
+
## Frame Calculation
|
|
156
|
+
|
|
157
|
+
All paths eventually call one of:
|
|
158
|
+
|
|
159
|
+
1. **`WindowTiler.tileFrame(for:on:)`** — takes a `TilePosition` + `NSScreen`, returns a `CGRect` in AX coordinates (origin = top-left of primary display)
|
|
160
|
+
2. **`WindowTiler.tileFrame(fractions:inDisplay:)`** — takes raw `(x, y, w, h)` fractions + display rect
|
|
161
|
+
|
|
162
|
+
The math:
|
|
163
|
+
```
visible  = screen.visibleFrame       (excludes menu bar + dock)
primaryH = primary screen height
axTop    = primaryH - visible.maxY   (flip from AppKit bottom-left to AX top-left)

frame.x = visible.x + visible.width × fx
frame.y = axTop + visible.height × fy
frame.w = visible.width × fw
frame.h = visible.height × fh
```
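
The same arithmetic can be written out as a small function. This is a sketch of the math only, not the Swift `WindowTiler` code; the `tileFrameSketch` name and the plain-object rectangle types are assumptions, and the example screen dimensions are made up.

```typescript
type FrameRect = { x: number; y: number; width: number; height: number };
type TileFractions = { fx: number; fy: number; fw: number; fh: number };

// visible: the screen's visible frame in AppKit coordinates (origin bottom-left).
// primaryHeight: full height of the primary display.
// Returns a frame in AX coordinates (origin top-left of the primary display).
function tileFrameSketch(
  visible: FrameRect,
  primaryHeight: number,
  f: TileFractions,
): FrameRect {
  const visibleMaxY = visible.y + visible.height;
  const axTop = primaryHeight - visibleMaxY; // flip bottom-left to top-left
  return {
    x: visible.x + visible.width * f.fx,
    y: axTop + visible.height * f.fy,
    width: visible.width * f.fw,
    height: visible.height * f.fh,
  };
}

// Example: a 1440×900 primary display with a 25 pt menu bar and a 70 pt dock,
// so the visible frame starts at y = 70 and is 805 pt tall.
const visibleArea: FrameRect = { x: 0, y: 70, width: 1440, height: 805 };
const leftHalf = tileFrameSketch(visibleArea, 900, { fx: 0, fy: 0, fw: 0.5, fh: 1 });
```

Here `axTop` is `900 - 875 = 25`, so the left half starts just below the menu bar at `y = 25` and spans 720 × 805 points.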
|
|
173
|
+
|
|
174
|
+
## Window Targeting
|
|
175
|
+
|
|
176
|
+
The `tile_window` intent resolves the target window in this priority:
|
|
177
|
+
|
|
178
|
+
1. **`session`** slot → `LatticesApi.window.place` / `window.tile` compatibility wrapper
|
|
179
|
+
2. **`wid`** slot → `DesktopModel.shared.windows[wid]` (direct window ID lookup)
|
|
180
|
+
3. **`app`** slot → first matching window by `localizedCaseInsensitiveContains`, excluding recently-tiled windows (prevents double-matching in batch commands like "Chrome left, Chrome right")
|
|
181
|
+
4. **No target** → tiles the frontmost window
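
The priority order above can be sketched as a single resolver. Everything here is hypothetical (`Slots`, `WindowInfo`, `resolveTarget`); in particular, returning `null` when a `wid` lookup or `app` match fails, rather than falling through to the next rule, is an assumption of this example.

```typescript
type WindowInfo = { wid: number; app: string };
type Slots = { session?: string; wid?: number; app?: string };
type Target =
  | { via: "session"; session: string }
  | { via: "wid" | "app" | "frontmost"; window: WindowInfo }
  | null;

function resolveTarget(
  slots: Slots,
  windows: WindowInfo[],
  frontmost: WindowInfo | null,
  recentlyTiled: Set<number>,
): Target {
  // 1. A session slot is handed to the session-based placement path.
  if (slots.session !== undefined) {
    return { via: "session", session: slots.session };
  }
  // 2. Direct window ID lookup.
  if (slots.wid !== undefined) {
    const byId = windows.find((w) => w.wid === slots.wid);
    return byId ? { via: "wid", window: byId } : null;
  }
  // 3. First case-insensitive app-name match, skipping recently tiled windows.
  if (slots.app !== undefined) {
    const needle = slots.app.toLowerCase();
    const byApp = windows.find(
      (w) => !recentlyTiled.has(w.wid) && w.app.toLowerCase().includes(needle),
    );
    return byApp ? { via: "app", window: byApp } : null;
  }
  // 4. No target: fall back to the frontmost window.
  return frontmost ? { via: "frontmost", window: frontmost } : null;
}
```

For a batch such as "Chrome left, Chrome right", the caller would add each matched `wid` to `recentlyTiled` before resolving the next action, so the second "Chrome" picks a different window.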
|
|
182
|
+
|
|
183
|
+
### HandsOff-specific targeting
|
|
184
|
+
|
|
185
|
+
The system prompt instructs the LLM to always use `wid` from the desktop snapshot, never `app`. This avoids ambiguity when multiple windows of the same app exist. In speech, the LLM says the app name; in the JSON action, it uses the wid.
|
|
186
|
+
|
|
187
|
+
## Common Layouts (multi-action)
|
|
188
|
+
|
|
189
|
+
These are composed from multiple `tile_window` actions:
|
|
190
|
+
|
|
191
|
+
| Layout | Actions |
|---|---|
| Split screen | left + right |
| Stack | top + bottom |
| Thirds | left-third + center-third + right-third |
| Quadrants | top-left + top-right + bottom-left + bottom-right |
| Six-up (3×2) | All six `*-*-third` positions |
| Eight-up (4×2) | All eight `*-*-fourth` positions |
| Distribute | Single `distribute` intent (auto-grid) |
|
|
200
|
+
|
|
201
|
+
CLI shortcuts compile into the same distributor:
|
|
202
|
+
|
|
203
|
+
- `lattices tile family` → smart-grid the frontmost app's visible windows
|
|
204
|
+
- `lattices distribute iTerm2 right` → smart-grid visible iTerm windows inside the right half
|
|
205
|
+
|
|
206
|
+
## HandsOff Smart Distribution
|
|
207
|
+
|
|
208
|
+
When the LLM sends multiple `tile_window` actions targeting the **same position**, `HandsOffSession.distributeTileActions()` subdivides:
|
|
209
|
+
|
|
210
|
+
- 2+ windows → "left" becomes top-left, left, bottom-left
|
|
211
|
+
- 2+ windows → "right" becomes top-right, right, bottom-right
|
|
212
|
+
- 2+ windows → "maximize" fans out to quadrants then halves
|
|
213
|
+
|
|
214
|
+
## Guardrails
|
|
215
|
+
|
|
216
|
+
- **Typed placement validation**: invalid placement strings or objects are rejected at the daemon boundary.
|
|
217
|
+
- **Recently-tiled dedup**: `IntentEngine.recentlyTiledWids` prevents the same window from being matched twice within 2 seconds during batch operations.
|
|
218
|
+
- **Compatibility wrappers**: `window.tile` still works, but routes through the same placement machinery.
|
|
219
|
+
|
|
220
|
+
## Current Gaps
|
|
221
|
+
|
|
222
|
+
1. **Voice extraction still needs to catch up**: the canonical executor understands horizontal thirds and edge quarters, but the local voice resolver still needs broader phrase coverage.
|
|
223
|
+
2. **HUD coverage is narrower than the executor**: keyboard tiling exposes a small subset of the full placement vocabulary.
|
|
224
|
+
3. **Optimization and layer actions are still wrapper-level**: `space.optimize` and `layer.activate` are now stable action IDs, but they currently wrap existing distributor and layer-switching behavior rather than a full planner.
|
package/docs/twins.md
ADDED
|
@@ -0,0 +1,138 @@
|
|
|
1
|
+
---
title: Project Twins
description: Pi-backed project twins for mediated, persistent agent execution
order: 3
---
|
|
6
|
+
|
|
7
|
+
A project twin is a persistent software counterpart to a codebase.
|
|
8
|
+
|
|
9
|
+
It is not the primary agent. It is the project-native runtime that sits
|
|
10
|
+
between a general-purpose caller and the project's execution protocol.
|
|
11
|
+
|
|
12
|
+
## Why a twin exists
|
|
13
|
+
|
|
14
|
+
General-purpose agents are interchangeable. Project protocols are not.
|
|
15
|
+
|
|
16
|
+
If every primary agent has to learn the project's tool surface, memory
|
|
17
|
+
policy, protocol semantics, and context conventions from scratch, the
|
|
18
|
+
integration becomes brittle. A twin fixes that by becoming the stable
|
|
19
|
+
project-facing runtime:
|
|
20
|
+
|
|
21
|
+
- The **primary agent** asks for work
|
|
22
|
+
- The **twin** resumes with the right context and memory
|
|
23
|
+
- The **protocol** stays behind the twin boundary
|
|
24
|
+
|
|
25
|
+
```text
primary agent -> project twin -> project protocol / harness
```
|
|
28
|
+
|
|
29
|
+
The twin is the client of record for the project.
|
|
30
|
+
|
|
31
|
+
## Responsibilities
|
|
32
|
+
|
|
33
|
+
A project twin owns:
|
|
34
|
+
|
|
35
|
+
- Project-scoped identity
|
|
36
|
+
- Persistent session continuity
|
|
37
|
+
- Memory compaction and continuation
|
|
38
|
+
- Tool policy and allowed capabilities
|
|
39
|
+
- Protocol knowledge
|
|
40
|
+
- Project context assembly
|
|
41
|
+
- Caller-facing summaries and handoffs
|
|
42
|
+
|
|
43
|
+
A primary agent should not speak the project protocol directly. It should
|
|
44
|
+
invoke the twin.
|
|
45
|
+
|
|
46
|
+
## Pi-backed runtime
|
|
47
|
+
|
|
48
|
+
Pi is a good fit for the twin runtime because it already provides:
|
|
49
|
+
|
|
50
|
+
- Persistent sessions
|
|
51
|
+
- RPC mode for long-running subprocess integration
|
|
52
|
+
- Tool calling with an explicit harness
|
|
53
|
+
- Compaction and summarization hooks
|
|
54
|
+
- Context files, prompt templates, and extension loading
|
|
55
|
+
|
|
56
|
+
That makes the split:
|
|
57
|
+
|
|
58
|
+
- **Twin**: product concept and policy boundary
|
|
59
|
+
- **Pi**: reasoning and session runtime
|
|
60
|
+
- **Host system**: orchestration, durable memory, and protocol adapters
|
|
61
|
+
|
|
62
|
+
Pi powers the twin. It does not define the twin.
|
|
63
|
+
|
|
64
|
+
## Invocation model
|
|
65
|
+
|
|
66
|
+
The primary agent makes a single mediated call into the twin:
|
|
67
|
+
|
|
68
|
+
1. Resume the twin session
|
|
69
|
+
2. Inject caller context, project memory, and protocol state
|
|
70
|
+
3. Let the twin do project-local work inside the harness
|
|
71
|
+
4. Return a concise result to the caller
|
|
72
|
+
|
|
73
|
+
The caller should see a stable capability surface such as:
|
|
74
|
+
|
|
75
|
+
- `status`
|
|
76
|
+
- `inspect`
|
|
77
|
+
- `plan`
|
|
78
|
+
- `execute`
|
|
79
|
+
- `summarize`
|
|
80
|
+
- `handoff`
|
|
81
|
+
|
|
82
|
+
It should not see raw protocol-shaped operations unless that protocol is
|
|
83
|
+
itself the public product surface.
|
|
84
|
+
|
|
85
|
+
## Implementation in this repo
|
|
86
|
+
|
|
87
|
+
This repo now includes a Pi-backed runtime in
|
|
88
|
+
[`bin/project-twin.ts`](../bin/project-twin.ts).
|
|
89
|
+
|
|
90
|
+
The runtime:
|
|
91
|
+
|
|
92
|
+
- Spawns `pi --mode rpc` as a persistent subprocess
|
|
93
|
+
- Stores project-local session state under `.openscout/twins/<name>/sessions`
|
|
94
|
+
- Exposes a stable `invoke()` API for callers
|
|
95
|
+
- Optionally injects OpenScout relay context if `.openscout/relay*` exists
|
|
96
|
+
|
|
97
|
+
The default harness is intentionally narrow:
|
|
98
|
+
|
|
99
|
+
- Built-in Pi tools are explicitly pinned to `read,bash,edit,write`
|
|
100
|
+
- Extension, skill, and prompt-template discovery are disabled by default
|
|
101
|
+
- Project instructions still come from `AGENTS.md` and related context files
|
|
102
|
+
|
|
103
|
+
This keeps the twin deterministic unless the host explicitly widens the
|
|
104
|
+
surface.
|
|
105
|
+
|
|
106
|
+
## Example
|
|
107
|
+
|
|
108
|
+
```ts
import { ProjectTwin } from "@lattices/cli"

const twin = new ProjectTwin({
  cwd: "/Users/you/dev/my-project",
  name: "my-project",
  model: "anthropic/claude-sonnet-4-5",
})

await twin.start()

const result = await twin.invoke({
  caller: "primary-agent",
  protocol: "openscout-relay",
  memory: "The caller is debugging relay enrollment and wants the next safe action.",
  task: "Inspect the available project context and summarize what the caller should do next.",
})

console.log(result.text)

await twin.stop()
```
|
|
130
|
+
|
|
131
|
+
## Design rule
|
|
132
|
+
|
|
133
|
+
All project-specific protocol semantics should live behind the twin
|
|
134
|
+
boundary.
|
|
135
|
+
|
|
136
|
+
The primary agent should invoke the twin as a skill-like capability.
|
|
137
|
+
The twin should own context assembly, protocol interaction, and the final
|
|
138
|
+
handoff back to the caller.
|
|
@@ -0,0 +1,278 @@
|
|
|
1
|
+
# Voice Command Protocol — Lattices ↔ Vox
|
|
2
|
+
|
|
3
|
+
## Overview
|
|
4
|
+
|
|
5
|
+
Lattices delegates all audio capture and transcription to Vox via WebSocket JSON-RPC. Lattices never accesses the microphone directly — it borrows Vox's mic and transcription pipeline, receives English text back, and routes it through its own intent engine.
|
|
6
|
+
|
|
7
|
+
These dictations are **ephemeral** — Vox does not persist them as memos, sync them, or add them to Vox's history. Lattices is just using Vox as a transcription pipe.
|
|
8
|
+
|
|
9
|
+
## Vox Process Model
|
|
10
|
+
|
|
11
|
+
Vox consists of three independent processes:
|
|
12
|
+
|
|
13
|
+
| Process | Role | Relevance to Lattices |
|---|---|---|
| **Vox.app** | Main UI — menu bar, notch visualization, memo history | None |
| **Vox** | Background service — mic access, recording, hotkeys, orchestrates transcription, state notifications | **This is what Lattices connects to** |
| **VoxEngine** | Transcription engine — runs Whisper models, called by Vox internally | Indirect — Vox delegates to it |
|
|
18
|
+
|
|
19
|
+
Vox is the right target because:
|
|
20
|
+
- It owns the mic and recording lifecycle
|
|
21
|
+
- It's the long-running background process (always up when Vox is installed)
|
|
22
|
+
- It already orchestrates the record → transcribe → result pipeline
|
|
23
|
+
- It's easy to discover via its existing DistributedNotification
|
|
24
|
+
|
|
25
|
+
## Service Discovery
|
|
26
|
+
|
|
27
|
+
Lattices never hardcodes ports. Discovery uses two mechanisms, followed by a health check:
|
|
28
|
+
|
|
29
|
+
### 1. Well-known file (at rest)
|
|
30
|
+
|
|
31
|
+
Vox writes its service configuration on startup:
|
|
32
|
+
|
|
33
|
+
```
~/.vox/services.json
```

```json
{
  "agent": {"port": 19823, "pid": 48209},
  "engine": {"port": 19821, "pid": 48210},
  "sync": {"port": 19820, "pid": 48208},
  "inference": {"port": 19822, "pid": 48212}
}
```
|
|
45
|
+
|
|
46
|
+
Lattices reads `agent.port` from this file. If the file doesn't exist, Vox is either not installed or has not written its configuration yet (see "When Vox is not running").
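
A minimal sketch of this lookup is shown below, assuming a Node.js runtime. The helper names `parseAgentPort` and `readAgentPort` are hypothetical, and treating any unreadable or malformed file as "no port" is a choice made for this example, not documented Lattices behavior.

```typescript
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Extract agent.port from the text of a services.json file.
// Returns null when the JSON is malformed or the port is not usable.
function parseAgentPort(text: string): number | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return null;
  }
  const port = (parsed as { agent?: { port?: unknown } } | null)?.agent?.port;
  const usable =
    typeof port === "number" && Number.isInteger(port) && port > 0 && port < 65536;
  return usable ? port : null;
}

// Read the well-known file; a missing or unreadable file yields null.
function readAgentPort(
  path: string = join(homedir(), ".vox", "services.json"),
): number | null {
  try {
    return parseAgentPort(readFileSync(path, "utf8"));
  } catch {
    return null;
  }
}
```

A `null` result on its own does not distinguish "not installed" from "installed but not running"; that still needs the app-bundle check and the ping.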
|
|
47
|
+
|
|
48
|
+
### 2. DistributedNotification (live discovery)
|
|
49
|
+
|
|
50
|
+
Vox posts when it comes online:
|
|
51
|
+
|
|
52
|
+
```
Notification: com.jdi.vox.agent.live.ready
UserInfo: {"agentPort": 19823, "pid": 48209}
```
|
|
56
|
+
|
|
57
|
+
Lattices subscribes to this notification on startup. This covers:
|
|
58
|
+
- **Vox launches after Lattices** — Lattices picks up the port dynamically
|
|
59
|
+
- **Vox restarts** — Lattices reconnects with the new port
|
|
60
|
+
- **Port changes** — no stale config
|
|
61
|
+
|
|
62
|
+
### 3. Health check
|
|
63
|
+
|
|
64
|
+
After discovering a port, Lattices confirms Vox is alive:
|
|
65
|
+
|
|
66
|
+
```json
→ {"id": "hc", "method": "ping"}
← {"id": "hc", "result": {"pong": true}}
```
|
|
70
|
+
|
|
71
|
+
If ping fails, Lattices marks voice as unavailable and retries on the next `live.ready` or after ~30 seconds.
|
|
72
|
+
|
|
73
|
+
### When Vox is not running
|
|
74
|
+
|
|
75
|
+
Three possible states:
|
|
76
|
+
|
|
77
|
+
| State | How detected | Lattices behavior |
|---|---|---|
| **Not installed** | `/Applications/Vox.app` doesn't exist and no `~/.vox/` dir | Footer: `[Space] Voice (unavailable)` — no recovery action |
| **Installed but not running** | App bundle exists, but `services.json` missing/stale or ping fails | Footer: `[Space] Voice (start Vox)` — pressing Space runs `open /Applications/Vox.app`, which brings up Vox as a side effect |
| **Running** | Ping succeeds | Normal operation |
|
|
82
|
+
|
|
83
|
+
Launch-on-demand flow:
|
|
84
|
+
1. User presses Space while Vox is down but Vox is installed
|
|
85
|
+
2. Lattices runs `NSWorkspace.shared.open(URL(fileURLWithPath: "/Applications/Vox.app"))`
|
|
86
|
+
3. Feedback strip shows "Starting Vox..."
|
|
87
|
+
4. Lattices waits for `live.ready` notification (timeout: 10s)
|
|
88
|
+
5. On `live.ready`, connects and proceeds with `startDictation`
|
|
89
|
+
6. On timeout, shows "Couldn't reach Vox — try opening it manually"
|
|
90
|
+
|
|
91
|
+
Passive behavior (no user action):
|
|
92
|
+
- No log spam — just a quiet unavailable state
|
|
93
|
+
- Lattices keeps listening for `live.ready` and re-checks `services.json` periodically (~30s)
|
|
94
|
+
- The moment Vox comes online, voice becomes available — no restart needed
|
|
95
|
+
|
|
96
|
+
## Protocol
|
|
97
|
+
|
|
98
|
+
### Wire Format
|
|
99
|
+
|
|
100
|
+
Uses Vox's JSON-RPC format over WebSocket:
|
|
101
|
+
|
|
102
|
+
```
Request:  {"id": "...", "method": "...", "params": {...}}
Response: {"id": "...", "result": {...}}  or  {"id": "...", "error": "..."}
Event:    {"event": "...", "data": {...}}  (server push, no id)
```
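
On the receiving side, the three shapes can be told apart by which keys are present. The sketch below is a hypothetical classifier (the `VoxMessage` type and `classifyMessage` function are not part of Lattices or Vox); it treats anything that does not match the shapes above as `unknown` rather than throwing.

```typescript
type VoxRequest = { id: string; method: string; params?: Record<string, unknown> };

type VoxMessage =
  | { kind: "result"; id: string; result: unknown }
  | { kind: "error"; id: string; error: string }
  | { kind: "event"; event: string; data: unknown }
  | { kind: "unknown" };

// Classify one incoming WebSocket text frame.
function classifyMessage(raw: string): VoxMessage {
  let msg: unknown;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { kind: "unknown" };
  }
  if (typeof msg !== "object" || msg === null) return { kind: "unknown" };

  const fields = msg as Record<string, unknown>;
  const { id, event, error } = fields;

  // Events are server pushes and carry no id.
  if (typeof event === "string") {
    return { kind: "event", event, data: fields.data };
  }
  if (typeof id === "string") {
    if (typeof error === "string") return { kind: "error", id, error };
    if ("result" in fields) return { kind: "result", id, result: fields.result };
  }
  return { kind: "unknown" };
}

// Example request, matching the health check shown earlier.
const pingRequest: VoxRequest = { id: "hc", method: "ping" };
```

Matching a response to its request is then a matter of comparing `id` values, for example `"hc"` for the ping.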
|
|
107
|
+
|
|
108
|
+
### Methods (Lattices → Vox)
|
|
109
|
+
|
|
110
|
+
**`startDictation`** — Start recording from the mic.
|
|
111
|
+
|
|
112
|
+
```json
{"id": "1", "method": "startDictation", "params": {
  "source": "lattices",
  "persist": false
}}
```
|
|
118
|
+
|
|
119
|
+
- `source` — identifies the caller (for Vox's logging/UI)
|
|
120
|
+
- `persist: false` — do not save as a memo, do not sync, do not show in Vox history
|
|
121
|
+
|
|
122
|
+
Response (immediate ack):
|
|
123
|
+
```json
{"id": "1", "result": {"ok": true}}
```
|
|
126
|
+
|
|
127
|
+
Error responses:
|
|
128
|
+
```json
{"id": "1", "error": "Microphone access denied"}
{"id": "1", "error": "No model loaded"}
{"id": "1", "error": "mic_busy", "owner": "vox"}
```
|
|
133
|
+
|
|
134
|
+
The `mic_busy` error means another consumer (Vox's own memo recording, or another client) already has an active dictation. The `owner` field identifies who holds the mic. Lattices shows: "Mic in use by Vox — finish your memo first".
|
|
135
|
+
|
|
136
|
+
The reverse case (user hits Vox hotkey while Lattices has the mic) is handled on Vox's side — it should reject its own recording with an equivalent busy state. Vox is the single owner of mic arbitration.
|
|
137
|
+
|
|
138
|
+
**`stopDictation`** — Stop recording and return the transcript.
|
|
139
|
+
|
|
140
|
+
```json
{"id": "2", "method": "stopDictation"}
```
|
|
143
|
+
|
|
144
|
+
Response (after transcription completes):
|
|
145
|
+
```json
{"id": "2", "result": {
  "transcript": "tile this left",
  "confidence": 0.94,
  "durationMs": 1820
}}
```
|
|
152
|
+
|
|
153
|
+
**`cancelDictation`** — Abort without transcribing.
|
|
154
|
+
|
|
155
|
+
```json
{"id": "3", "method": "cancelDictation"}
```

```json
{"id": "3", "result": {"ok": true}}
```
|
|
162
|
+
|
|
163
|
+
### Events (Vox → Lattices)
|
|
164
|
+
|
|
165
|
+
Pushed over the WebSocket connection during an active dictation.
|
|
166
|
+
|
|
167
|
+
| Event | When | Data |
|---|---|---|
| `dictation.started` | Mic is hot, recording has begun | `{"source": "lattices"}` |
| `dictation.transcribing` | Recording stopped, model is running | `{}` |
| `dictation.result` | Transcription complete | `{"transcript": "...", "confidence": 0.94, "durationMs": 1820}` |
| `dictation.error` | Something failed during recording or transcription | `{"message": "..."}` |
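
One way a client could track these events is a small reducer over a voice state. This is an illustrative sketch, not Lattices code: the `VoiceState` shape, the `idle` phase name, and the `reduceVoiceState` function are assumptions; only the event names and payload fields come from the table above.

```typescript
type VoiceState =
  | { phase: "idle"; transcript?: string; error?: string }
  | { phase: "listening" }
  | { phase: "transcribing" };

type DictationEvent = {
  event: string;
  data?: { transcript?: string; message?: string };
};

// Advance the client-side state for one pushed event.
// Unrecognized events leave the state unchanged.
function reduceVoiceState(state: VoiceState, e: DictationEvent): VoiceState {
  switch (e.event) {
    case "dictation.started":
      return { phase: "listening" };
    case "dictation.transcribing":
      return { phase: "transcribing" };
    case "dictation.result":
      return { phase: "idle", transcript: e.data?.transcript ?? "" };
    case "dictation.error":
      return { phase: "idle", error: e.data?.message ?? "unknown error" };
    default:
      return state;
  }
}

// A successful dictation, replayed through the reducer.
const finalState = [
  { event: "dictation.started", data: {} },
  { event: "dictation.transcribing", data: {} },
  { event: "dictation.result", data: { transcript: "tile this left" } },
].reduce<VoiceState>(reduceVoiceState, { phase: "idle" });
```

Keeping the transitions in one pure function makes it straightforward to also reset to `idle` when the connection drops.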
|
|
173
|
+
|
|
174
|
+
## Disconnect Contract
|
|
175
|
+
|
|
176
|
+
If the WebSocket connection drops mid-dictation (Lattices crashes, user quits, network hiccup), Vox **must** auto-cancel the in-flight dictation:
|
|
177
|
+
|
|
178
|
+
1. Stop recording immediately
|
|
179
|
+
2. Discard any captured audio — do not transcribe
|
|
180
|
+
3. Release the mic so Vox's own UI or a reconnecting client can use it
|
|
181
|
+
4. Log the orphaned dictation for diagnostics: `[dictation] orphaned session from lattices — connection dropped, auto-cancelled`
|
|
182
|
+
|
|
183
|
+
Vox treats a closed WebSocket as an implicit `cancelDictation`. No grace period, no buffering — if the consumer is gone, the recording is worthless.
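
The arbitration and auto-cancel rules can be sketched as a small single-owner lock. `MicArbiter` and its methods are hypothetical and do not describe Vox's actual implementation. The recorder itself is omitted, and the `onOrphan` callback stands in for the diagnostic log line from step 4.

```typescript
// Hypothetical single-owner mic lock illustrating the contract above.
class MicArbiter {
  private owner: string | null = null;
  private readonly onOrphan: (clientId: string) => void;

  constructor(onOrphan: (clientId: string) => void) {
    this.onOrphan = onOrphan;
  }

  // Returns null on success, or the current owner id when the mic is busy
  // (the value a mic_busy error would carry in its owner field).
  acquire(clientId: string): string | null {
    if (this.owner !== null) return this.owner;
    this.owner = clientId;
    return null;
  }

  // Normal release after stopDictation or cancelDictation.
  release(clientId: string): void {
    if (this.owner === clientId) this.owner = null;
  }

  // Call from the socket's close handler: a closed socket is an implicit cancel.
  onDisconnect(clientId: string): void {
    if (this.owner !== clientId) return; // this client had no in-flight dictation
    // Steps 1-2 (stop recording, discard audio) would happen here in a real recorder.
    this.owner = null; // step 3: release the mic immediately, no grace period
    this.onOrphan(clientId); // step 4: emit the orphaned-session diagnostic
  }
}
```

A usage example: after `acquire("lattices")` succeeds, a call to `onDisconnect("lattices")` frees the mic, so a following `acquire("vox")` returns `null`.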
|
|
184
|
+
|
|
185
|
+
On the Lattices side, if the connection drops while in `listening` or `transcribing` state:
|
|
186
|
+
- Feedback strip: "Connection lost" (red)
|
|
187
|
+
- Attempt reconnect via normal discovery (ping → `services.json` → wait for `live.ready`)
|
|
188
|
+
- Do not auto-retry the dictation — the user needs to press Space again
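
The Lattices-side rules can be expressed as a pure function of the current phase. This is only a sketch: the `Phase` and `DropOutcome` types are hypothetical, and the behaviour outside `listening` and `transcribing` (leave the UI untouched, still reconnect) is an assumption, since the contract only covers a drop mid-dictation.

```typescript
type Phase = "idle" | "listening" | "transcribing" | "result" | "disconnected";

interface DropOutcome {
  phase: Phase;            // phase to show after the drop
  feedback: string | null; // red feedback-strip text, if any
  reconnect: boolean;      // start normal discovery again
  retryDictation: boolean; // never true: the user must press Space again
}

// Decide what the UI does when the WebSocket closes unexpectedly.
function onConnectionDropped(phase: Phase): DropOutcome {
  const midDictation = phase === "listening" || phase === "transcribing";
  return {
    phase: midDictation ? "disconnected" : phase,
    feedback: midDictation ? "Connection lost" : null,
    reconnect: true, // assumption: reconnect regardless of phase
    retryDictation: false,
  };
}
```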
|
|
189
|
+
|
|
190
|
+
## End-to-End Lifecycle
|
|
191
|
+
|
|
192
|
+
```mermaid
|
|
193
|
+
sequenceDiagram
|
|
194
|
+
participant U as User
|
|
195
|
+
participant L as Lattices UI
|
|
196
|
+
participant TA as Vox
|
|
197
|
+
participant IE as Intent Engine
|
|
198
|
+
|
|
199
|
+
U->>L: Press Space (in cheat sheet)
|
|
200
|
+
L->>TA: startDictation (persist: false)
|
|
201
|
+
|
|
202
|
+
alt Error
|
|
203
|
+
TA-->>L: error (mic denied / no model / mic busy)
|
|
204
|
+
L->>U: Red text in feedback strip
|
|
205
|
+
else OK
|
|
206
|
+
TA-->>L: {ok: true}
|
|
207
|
+
TA-->>L: dictation.started
|
|
208
|
+
L->>U: Green dot (pulsing) + "Listening..."
|
|
209
|
+
|
|
210
|
+
Note over U,TA: User speaks...
|
|
211
|
+
|
|
212
|
+
U->>L: Press Space again
|
|
213
|
+
L->>TA: stopDictation
|
|
214
|
+
TA-->>L: dictation.transcribing
|
|
215
|
+
L->>U: "Transcribing..."
|
|
216
|
+
|
|
217
|
+
TA-->>L: {transcript: "tile this left", confidence: 0.94}
|
|
218
|
+
L->>U: Show transcript
|
|
219
|
+
end
|
|
220
|
+
|
|
221
|
+
L->>IE: Classify via NLEmbedding
|
|
222
|
+
IE-->>L: intent: tile_window, slots: {position: left}, confidence: 0.95
|
|
223
|
+
L->>U: Show intent + slots
|
|
224
|
+
|
|
225
|
+
L->>IE: Execute
|
|
226
|
+
IE-->>L: result
|
|
227
|
+
L->>U: "Done" or error
|
|
228
|
+
|
|
229
|
+
Note over L: Log entry written
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
## UI States
|
|
233
|
+
|
|
234
|
+
| State | Feedback strip | Footer |
|
|
235
|
+
|---|---|---|
|
|
236
|
+
| **Idle** | Hidden | `[Space] Voice [ESC] Dismiss` |
|
|
237
|
+
| **Not installed** | Hidden | `[Space] Voice (unavailable) [ESC] Dismiss` |
|
|
238
|
+
| **Installed, not running** | Hidden | `[Space] Voice (start Vox) [ESC] Dismiss` |
|
|
239
|
+
| **Starting** | "Starting Vox..." | `[ESC] Cancel` |
|
|
240
|
+
| **Error** | Red: "Mic access denied" or "Mic in use by Vox" | `[ESC] Dismiss` |
|
|
241
|
+
| **Disconnected** | Red: "Connection lost" | `[ESC] Dismiss` |
|
|
242
|
+
| **Listening** | Green dot + "Listening..." | `[Space] Stop [ESC] Cancel` |
|
|
243
|
+
| **Transcribing** | "Transcribing..." | `[ESC] Cancel` |
|
|
244
|
+
| **Result** | `"tile this left"` → `tile window · position: left` → `Done` | `[Space] New [ESC] Dismiss` |
|
|
245
|
+
|
|
246
|
+
## Logging
|
|
247
|
+
|
|
248
|
+
Every voice command produces a diagnostic log entry:
|
|
249
|
+
|
|
250
|
+
```
|
|
251
|
+
[voice] "tile this left" → tile_window(position: left) → ok (conf=0.95, 1820ms)
|
|
252
|
+
[voice] "organize my stuff" → distribute() → ok (conf=0.79, 2100ms)
|
|
253
|
+
[voice] "do something weird" → (no match, conf=0.41, 900ms)
|
|
254
|
+
[voice] error: Vox not running
|
|
255
|
+
[voice] error: mic_busy (owner: vox)
|
|
256
|
+
[voice] error: connection dropped mid-dictation
|
|
257
|
+
[voice] launched Vox, connected in 2.1s
|
|
258
|
+
```
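
A formatter that reproduces the first three example lines might look like the sketch below. The `VoiceLogEntry` type and `formatVoiceLog` function are hypothetical names. Only the successful and no-match formats are covered, because the examples do not show what a failed execution line looks like.

```typescript
// Hypothetical record of one completed voice command.
interface VoiceLogEntry {
  transcript: string;
  intent?: string;                 // omitted when nothing matched
  slots?: Record<string, string>;
  confidence: number;
  durationMs: number;
}

// Render one entry in the "[voice] ..." format used in the examples above.
function formatVoiceLog(e: VoiceLogEntry): string {
  const timing = `conf=${e.confidence.toFixed(2)}, ${e.durationMs}ms`;
  if (e.intent === undefined) {
    return `[voice] "${e.transcript}" → (no match, ${timing})`;
  }
  const slots = e.slots ?? {};
  const args = Object.keys(slots)
    .map((key) => `${key}: ${slots[key]}`)
    .join(", ");
  return `[voice] "${e.transcript}" → ${e.intent}(${args}) → ok (${timing})`;
}
```

For example, an entry with transcript `tile this left`, intent `tile_window`, slot `position: left`, confidence `0.95`, and duration `1820` renders as the first log line shown above.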
|
|
259
|
+
|
|
260
|
+
## Implementation Scope
|
|
261
|
+
|
|
262
|
+
### Lattices side
|
|
263
|
+
- Use `@vox/client` SDK (`VoxClient` with `service: "agent"`, `clientId: "lattices"`, `capabilities: ["dictation"]`) — see `vox/sdk/SDK.md` for full reference
|
|
264
|
+
- Replace `AVAudioRecorder` in `VoxAudioProvider` with `createDictationSession().start({ persist: false })`
|
|
265
|
+
- Remove mic entitlement and `NSMicrophoneUsageDescription` (Lattices never touches the mic)
|
|
266
|
+
- Service discovery, auto-reconnect, and auth are handled by the SDK
|
|
267
|
+
- Map `DictationSession` events (`stateChange`, `partialTranscript`, `finalTranscript`, `error`) to cheat sheet UI states
|
|
268
|
+
- Handle `MicBusyError` — show `"Mic in use by ${error.owner}"`
|
|
269
|
+
|
|
270
|
+
### Vox side (separate repo)
|
|
271
|
+
- Expose a WebSocket bridge (or add methods to the existing bridge)
|
|
272
|
+
- Add `startDictation`, `stopDictation`, `cancelDictation` handlers
|
|
273
|
+
- Emit `dictation.started`, `dictation.transcribing`, `dictation.result`, `dictation.error` events
|
|
274
|
+
- Honor `persist: false` — skip memo creation and sync
|
|
275
|
+
- Write `~/.vox/services.json` on startup (all service ports)
|
|
276
|
+
- Include `agentPort` in the `live.ready` notification's `userInfo`
|
|
277
|
+
- Return `mic_busy` error with `owner` field when another consumer holds the mic
|
|
278
|
+
- Auto-cancel dictation on WebSocket disconnect (closed socket = implicit cancel)
|