osborn 0.9.48 → 0.9.50

@@ -0,0 +1,247 @@
1
+ # Ground Assumptions
2
+
3
+ ## SKILL IDENTITY
4
+ Name: ground-assumptions
5
+ Install path: ~/.claude/skills/ground-assumptions/SKILL.md
6
+ Portable: yes — drops into any agent's skills dir (Claude Code, osborn on Fly, other Claude Agent SDK hosts)
7
+
8
+ ## WHEN THIS SKILL ACTIVATES
9
+ This skill applies whenever the conversation enters a **planning / design / architecture phase**.
10
+ Specifically, if the user says or asks any of:
11
+
12
+ - "let's plan / design / architect..."
13
+ - "how should we..."
14
+ - "what's the best way to..."
15
+ - "I'm thinking we..."
16
+ - "what do you recommend..."
17
+ - "approach", "architecture", "design", "should we"
18
+ - Any time I am about to recommend an implementation strategy, performance characteristic,
19
+ behavioral guarantee, or comparative judgment ("X is faster than Y", "this propagates", "this scales")
20
+
21
+ Also activates explicitly with:
22
+ - "ground assumptions"
23
+ - "verify before planning"
24
+ - "check that hypothesis"
25
+
26
+ ## CORE PRINCIPLE
27
+ When you're **planning new work — a new feature, a new integration, or fitting
28
+ a change into the existing architecture** — every load-bearing assumption is a
29
+ **hypothesis until verified against real evidence.** Training-data intuition and
30
+ "it should work" do not count.
31
+
32
+ The canonical situation this skill is for: *we're about to implement something
33
+ new (e.g. add OpenAI Codex as an agent option alongside Claude Code), and we
34
+ need to know it actually fits our architecture one-to-one — does Codex's session
35
+ model, SDK, and data storage map to what we already do — BEFORE we build on that
36
+ assumption.* That's the shape: a new piece must slot into the existing system,
37
+ and we confirm the fit with evidence, not hope.
38
+
39
+ **What counts as verification** (in priority order):
40
+ 1. **Existing tests / previous test runs** — has this already been proven? Check first.
41
+ 2. **Authoritative documentation / source code** — does the doc or the actual code confirm the behavior?
42
+ 3. **A newly-created, targeted test** — if nothing above answers it, build a
43
+ specific test (or several) that exercises exactly the assumption, run it on
44
+ real infrastructure, and read the result.
45
+
46
+ **The main agent does NOT do the verification work itself.** Spawn subagents
47
+ (Agent tool) to run the tests / read the docs / check prior results, so the main
48
+ agent stays free to keep the planning conversation moving and react to results
49
+ as they land. **Delegation rule (hard):** when verification is needed, ALWAYS
50
+ delegate to a subagent — don't call Bash/WebSearch/Grep inline yourself. Spawn
51
+ multiple subagents in parallel when there are several independent assumptions to
52
+ check. (Exception: the user explicitly asks you to run something inline — then
53
+ say you're breaking the delegation pattern.)
54
+
55
+ Why this matters — past sessions shipped plans on unverified assumptions and
56
+ paid for it: a "small" change broke an unrelated subsystem; a behavior we
57
+ *assumed* ("this propagates", "this auto-updates") silently failed; an
58
+ integration we *assumed* was 1:1 wasn't. The expensive surprises live in
59
+ **architectural fit and integration**, not in micro-benchmarks. (Timing claims
60
+ matter too — don't say "fast"/"Xs" without measuring — but they are the *least*
61
+ of it. Lead with "does this fit / does this actually work", not "how fast".)
62
+
63
+ The discipline: **verify against evidence — existing tests, docs, or a new
64
+ delegated test — before the assumption is surfaced as fact or built upon.**
65
+
66
+ ## ASSUMPTION PRIORITY ORDER (verify highest first)
67
+
68
+ When verification budget is limited, ALWAYS verify in this order:
69
+
70
+ 1. **ARCHITECTURAL IMPACT** — does this change break or alter existing flows / subsystems?
71
+ - "Does the new entrypoint affect the OAuth flow?"
72
+ - "If we move osborn off the volume, does session resume still work?"
73
+ - "Does the bind-mount conflict with Fly's shutdown umount?"
74
+ 2. **INTEGRATION** — does this work with all the connected pieces (auth, network, sessions, data, persistence, MCP, recording, etc.)?
75
+ - "Does Claude Code's `setup-token` pty work inside chroot?"
76
+ - "Does the frontend's `/api/sandbox` fetch-log still read the right path?"
77
+ 3. **BEHAVIORAL** — does the system actually do what we claim it does?
78
+ - "Does image-swap actually replace the running osborn binary?"
79
+ 4. **TIMING** — is the speed claim true under real conditions?
80
+ - "Is the seed tarball really 5s to extract?"
81
+ 5. **COSMETIC** — minor polish items that don't gate the architecture.
82
+
83
+ Timing claims are the LOWEST priority. We've burned multiple cycles on timing measurements
84
+ while missing that the architecture itself had subtle bugs that broke other parts of the system.
85
+ Architectural and integration assumptions are where the expensive surprises live.
86
+
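The priority order above can be read as a sort key for spending a limited verification budget. A minimal sketch, assuming a hypothetical `Assumption` record and `Tier` enum (neither name appears in the skill file):

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Verification tiers from the skill text; lower value means verify first."""
    ARCHITECTURAL = 1
    INTEGRATION = 2
    BEHAVIORAL = 3
    TIMING = 4
    COSMETIC = 5


@dataclass(frozen=True)
class Assumption:
    claim: str
    tier: Tier


def verification_order(assumptions, budget):
    """Return at most `budget` assumptions, highest-priority tier first.

    sorted() is stable, so assumptions within one tier keep their listed order.
    """
    return sorted(assumptions, key=lambda a: a.tier)[:budget]


if __name__ == "__main__":
    pending = [
        Assumption("Is the seed tarball really 5s to extract?", Tier.TIMING),
        Assumption("Does the new entrypoint affect the OAuth flow?", Tier.ARCHITECTURAL),
        Assumption("Does setup-token pty work inside chroot?", Tier.INTEGRATION),
    ]
    for a in verification_order(pending, budget=2):
        print(a.tier.name, "-", a.claim)
```

With a budget of two, the timing claim is the one left unverified, which matches the rule that timing is checked last.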
87
+ ## THE WORKFLOW (followed strictly during planning)
88
+
89
+ ### 1. PLAN DRAFT
90
+ State the proposed plan as usual — fully, with intent and reasoning.
91
+
92
+ ### 2. ASSUMPTION EXTRACTION
93
+ Before presenting the plan as a recommendation, **list every load-bearing assumption**.
94
+ A load-bearing assumption is anything where, if it's wrong, the plan stops working.
95
+
96
+ For each assumption, **also identify its second/third-order implications** —
97
+ what else in the system depends on it being true?
98
+
99
+ Format:
100
+ ```
101
+ ASSUMPTIONS (must be verified before plan ships):
102
+ 1. <claim that the plan depends on>
103
+ → implications: <what else breaks if 1 is false>
104
+ 2. <claim that the plan depends on>
105
+ → implications: <what else breaks if 2 is false>
106
+ 3. ...
107
+ ```
108
+
109
+ If an assumption can't be stated cleanly in one sentence, it isn't ready to be tested.
110
+ Break it down further.
111
+
112
+ **Ripple-effect check** (do this once for every plan that touches existing architecture):
113
+
114
+ Ask explicitly:
115
+ - What existing flows touch the system we're changing?
116
+ (auth, network, sessions, MCP, recording, persistence, log-fetch, dashboard, voice loop)
117
+ - For each connected flow, can the change break it in a non-obvious way?
118
+ - Is there a code path that USED to work without our knowledge that depends on the old behavior?
119
+
120
+ If yes to any of those, add the affected flow as a new assumption that needs verification.
121
+ This is where the expensive surprises hide.
122
+
123
+ ### 3. ASYNC VERIFIER SPAWN (parallel, non-blocking)
124
+ For EACH assumption, spawn an Agent subagent **immediately**, in a SINGLE message
125
+ with multiple Agent tool calls so they run concurrently.
126
+
127
+ Choose verifier type by the nature of the assumption. **Architectural and integration verifiers come first** — they catch the expensive surprises.
128
+
129
+ | Assumption type | Verifier type | What it does |
130
+ |---|---|---|
131
+ | **Architectural impact** (`does X break flow Y?`) | **Ripple agent** | Traces all callers/consumers of the changed component, checks each for breakage |
132
+ | **Integration** (`does X work with subsystem Y?`) | **Integration agent** | Spawns end-to-end test exercising the connection between subsystems |
133
+ | Behavioral (`does X`, `propagates`, `survives Y`) | Test agent | Triggers the behavior, observes outcome |
134
+ | Documented (`API supports X`, `library does Y`) | Research agent | Fetches docs/code/sources, returns citation with quote |
135
+ | Derivable (`X+Y → Z`) | Reasoning agent | Derives from established facts, returns chain |
136
+ | Timing (`Xs`, `fast`, `slow`) | Test agent | Runs the actual operation on real infra, measures under stated conditions |
137
+ | Empirical (`users typically do X`) | Research agent | Cites surveys/data/observations |
138
+
139
+ Each verifier returns one of:
140
+ - **MEASURED**: empirical observation with conditions documented
141
+ - **SOURCED**: cited from authoritative source with quote
142
+ - **DERIVED**: chain from established facts
143
+ - **CONTRADICTED**: evidence that the assumption is false
144
+ - **UNVERIFIABLE**: cannot be determined in available time/resources
145
+
146
+ ### 4. CONTINUE PLANNING (don't block on verifiers)
147
+ Keep talking with the user through design tradeoffs, edge cases, etc.
148
+ **Main agent is NOT in the test loop.** Verifiers run in the background.
149
+ DO NOT commit to a recommendation until verifiers report.
150
+
151
+ **Main agent's role while verifiers run:**
152
+ - Stay in conversation with the user
153
+ - Sketch more of the plan / explore tradeoffs / answer questions
154
+ - Track which verifiers are still in flight, which returned, which contradicted
155
+ - React to verifier results as they arrive — don't poll, don't wait silently
156
+
157
+ **Things the main agent should NOT do while verifiers are in flight:**
158
+ - Run a Bash command that performs the same test (defeats delegation)
159
+ - Read files the verifier is already reading (duplicative)
160
+ - "Just check one quick thing myself" — that's how delegation collapses
161
+ - Block the conversation until results come back
162
+
163
+ ### 5. INTEGRATE RESULTS
164
+ When a verifier returns:
165
+ - **MEASURED / SOURCED / DERIVED** → mark assumption ✓, keep going
166
+ - **CONTRADICTED** → STOP, mark assumption ✗, announce: "Assumption N contradicted by <evidence>. Replanning." → restart at step 1 with revised approach
167
+ - **UNVERIFIABLE** → mark ⚠️, ask user: "Cannot verify <assumption>. Proceed with explicit risk, or pivot to a verifiable approach?"
168
+
169
+ ### 6. COMMITTED PLAN
170
+ Only present a plan as the recommended approach when every assumption is
171
+ ✓ MEASURED, ✓ SOURCED, ✓ DERIVED, or explicitly accepted as ⚠️ UNVERIFIABLE.
172
+
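Steps 4 through 6 amount to a small decision rule over verifier statuses. The sketch below is one possible encoding; the function name `next_action`, the action strings, and the extra `PENDING` status for verifiers still in flight are illustrative assumptions, not part of the skill.

```python
from enum import Enum


class Status(Enum):
    PENDING = "PENDING"            # verifier still in flight (assumed extra state)
    MEASURED = "MEASURED"
    SOURCED = "SOURCED"
    DERIVED = "DERIVED"
    CONTRADICTED = "CONTRADICTED"
    UNVERIFIABLE = "UNVERIFIABLE"


def next_action(statuses, accepted_risks=frozenset()):
    """Map verifier statuses (assumption id -> Status) to the next workflow step.

    accepted_risks holds ids of UNVERIFIABLE assumptions the user explicitly accepted.
    """
    # CONTRADICTED stops everything, regardless of the other results.
    if any(s is Status.CONTRADICTED for s in statuses.values()):
        return "replan"
    unaccepted = [
        aid for aid, s in statuses.items()
        if s is Status.UNVERIFIABLE and aid not in accepted_risks
    ]
    if unaccepted:
        return "ask_user"
    # Nothing contradicted, but some verifiers have not reported yet.
    if any(s is Status.PENDING for s in statuses.values()):
        return "keep_planning"
    # Every assumption is verified or explicitly accepted as a risk.
    return "commit"
```

Checking `CONTRADICTED` first mirrors hard rule 4: one contradiction outranks any number of confirmations.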
173
+ ## OUTPUT FORMAT
174
+
175
+ While verifiers are in flight:
176
+ ```
177
+ PLAN: <draft summary>
178
+
179
+ ASSUMPTIONS (verifiers running in parallel):
180
+ ☐ A1: <assumption>
181
+ ☐ A2: <assumption>
182
+ ☐ A3: <assumption>
183
+ ```
184
+
185
+ As verifiers return:
186
+ ```
187
+ ✓ A1 MEASURED: <result> (conditions: <where/when/setup>)
188
+ ✓ A2 SOURCED: <URL> — "<quote>"
189
+ ✗ A3 CONTRADICTED: <evidence>
190
+ → STOPPING. Replanning around A3.
191
+ ```
192
+
193
+ Final state:
194
+ ```
195
+ VERIFIED PLAN:
196
+ <plan with every assumption marked ✓ or explicitly ⚠️>
197
+ ```
198
+
199
+ ## HARD RULES (no exceptions)
200
+
201
+ 1. **No naked "it fits / it works" claims.** Never assert that a new piece integrates with the existing architecture — "Codex maps 1:1 to our session model", "this slots into the existing flow", "the SDK stores data the same way" — without backing from an existing test, the actual docs/source, or a new delegated test.
202
+ 2. **No naked behavioral claims.** Never write "auto-updates", "propagates", "survives", "rolls back", "X just works" without MEASURED or SOURCED backing.
203
+ 3. **Check for existing evidence FIRST.** Before commissioning a new test, have a subagent check whether a previous test run, doc, or the source already answers it. Don't re-test what's already proven.
204
+ 4. **CONTRADICTED stops everything.** When a verifier contradicts an assumption, NO new content is written about the plan until the plan is revised and the verifier rerun.
205
+ 5. **UNVERIFIABLE is loud.** Mark it ⚠️ in the output AND ask the user for explicit acceptance. Don't hide unverified parts in prose.
206
+ 6. **Training-data intuition is forbidden as evidence.** "X typically works this way" is not a citation. Verify against a test, doc, or source — or skip.
207
+ 7. **Timing/comparative claims are the least of it, but still bound:** don't write "fast"/"slow"/"X seconds"/"faster than Y" without a measurement + conditions. Just don't let speed-benchmarking crowd out the architectural-fit and integration checks, which are where the expensive surprises actually live.
208
+
209
+ ## SUBAGENT SPAWNING PATTERNS
210
+
211
+ For timing/behavioral tests on real infra:
212
+ > Spawn Agent subagent with prompt: "Run <specific command> on <specific target>. Measure <specific metric>. Report back the measurement and the conditions (machine type, memory, network state, cold/warm cache). Do not attempt the broader task — only verify this one assumption."
213
+
214
+ For documentation lookups:
215
+ > Spawn Agent subagent with prompt: "Find authoritative source for <specific claim>. Return URL + verbatim quote. If multiple sources, prefer official docs > vendor blogs > Stack Overflow. If no source exists, return UNVERIFIABLE with reasoning."
216
+
217
+ For derivation:
218
+ > Spawn Agent subagent with prompt: "Given these established facts: <list>, can we derive <claim>? Return either the derivation chain OR 'cannot derive — gap at: <step>'."
219
+
220
+ For ripple-effect / architectural impact (HIGHEST priority):
221
+ > Spawn Agent subagent with prompt: "Trace all consumers / callers / dependencies of `<component being changed>` in the codebase. For each consumer, check whether the proposed change would break it. Report each potential break with file:line and the specific failure mode. Do not propose fixes — only enumerate breaks."
222
+
223
+ For integration testing (HIGH priority):
224
+ > Spawn Agent subagent with prompt: "End-to-end test: after the proposed change, exercise the connection between `<subsystem A>` and `<subsystem B>` on real infra. Specifically, verify `<concrete cross-system flow>`. Report MEASURED behavior and any divergence from the expected flow."
225
+
226
+ **Pattern**: invoke all Agent tools in a single response message so they run concurrently rather than sequentially. The Agent tool's `subagent_type` should be `general-purpose` or `Explore` (read-only) depending on what the verifier needs.
227
+
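The "single message, multiple Agent calls" pattern is a fan-out: start every verifier, then collect. As an analogy only (the Agent tool is not an asyncio API), the same shape in Python looks like this, with `run_verifier` a hypothetical stand-in that sleeps instead of doing real work:

```python
import asyncio


async def run_verifier(assumption_id, delay_s):
    """Hypothetical stand-in for one subagent; sleeps instead of verifying."""
    await asyncio.sleep(delay_s)
    return assumption_id, "MEASURED"


async def verify_all(jobs):
    """Start every verifier at once and collect results in submission order."""
    tasks = [run_verifier(aid, delay) for aid, delay in jobs]
    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(verify_all([("A1", 0.03), ("A2", 0.01), ("A3", 0.02)]))
    print(results)
```

Total wall time is roughly that of the slowest verifier rather than the sum of all of them, which is the reason for spawning them together.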
228
+ ## PORTABILITY NOTES
229
+
230
+ This skill works in any Claude Agent SDK environment because:
231
+ - It only requires the Agent tool (standard SDK feature)
232
+ - The trigger logic is prose, not code
233
+ - No host-specific paths, IDs, or APIs
234
+
235
+ To deploy on another agent (e.g. osborn on a Fly machine), copy this SKILL.md to that agent's skills dir:
236
+ - Claude Code: `~/.claude/skills/ground-assumptions/SKILL.md`
237
+ - osborn on Fly: `/workspace/root-chroot/root/.claude/skills/ground-assumptions/SKILL.md`
238
+ - Other Claude Agent SDK hosts: their configured skills path
239
+
240
+ ## EXIT CRITERIA
241
+
242
+ The skill releases its grip on a conversation when:
243
+ - All assumptions are verified and a committed plan exists, OR
244
+ - The user explicitly asks to skip verification ("just give me your best guess"), OR
245
+ - The conversation shifts away from planning to execution of an already-verified plan
246
+
247
+ In the second case, mark the response with "WARNING: Skipping verification at user request. The following is unverified intuition." so the lack of grounding is visible.
@@ -0,0 +1,229 @@
1
+ # Voice Native Sync Skill
2
+
3
+ ## SKILL IDENTITY
4
+ Name: voice-native-sync
5
+ Install path: ~/.claude/skills/voice-native-sync/SKILL.md
6
+
7
+ ## TRIGGER PHRASES
8
+ This skill activates when the user says any of:
9
+ - "upload to voice native"
10
+ - "download from voice native"
11
+ - "sync to voice native"
12
+ - "push sessions to voice native"
13
+ - "pull sessions from voice native"
14
+ - "sync my sessions"
15
+
16
+ When triggered, follow the instructions below immediately without asking for clarification,
17
+ unless credentials are missing (see CREDENTIALS section).
18
+
19
+ ---
20
+
21
+ ## SELF-INSTALL
22
+ If the user asks you to install this skill or you fetched this file from a URL,
23
+ save it to: ~/.claude/skills/voice-native-sync/SKILL.md
24
+
25
+ ```bash
26
+ mkdir -p ~/.claude/skills/voice-native-sync
27
+ # then write this file's content to ~/.claude/skills/voice-native-sync/SKILL.md
28
+ ```
29
+
30
+ After saving, confirm: "Voice Native sync skill installed. Use 'upload to voice native' or 'download from voice native' any time."
31
+
32
+ ---
33
+
34
+ ## CREDENTIALS
35
+ Credentials are stored at: ~/.claude/skills/voice-native-sync/config
36
+
37
+ Config file format:
38
+ ```
39
+ CLOUD_URL=https://osborn-XXXX.fly.dev
40
+ TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
41
+ ```
42
+
43
+ If the config file doesn't exist:
44
+ 1. Tell the user: "I need your Voice Native sync credentials. Go to voice-native.com → Settings → Copy sync info, then paste it here."
45
+ 2. Parse the pasted block for CLOUD_URL (the "Server:" line) and TOKEN (the "Token:" line)
46
+ 3. Save to ~/.claude/skills/voice-native-sync/config
47
+ 4. Proceed with the requested operation
48
+
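Step 2 above can be sketched as a small parser. This assumes the pasted block contains one `Server:` line and one `Token:` line, as the step describes; any other detail of the pasted format is unknown, and the function names are illustrative.

```python
def parse_sync_info(pasted):
    """Extract CLOUD_URL and TOKEN from a pasted block with 'Server:' and 'Token:' lines."""
    values = {}
    for line in pasted.splitlines():
        # Split on the first colon only, so the colon inside "https://" survives.
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        key = label.strip().lower()
        if key == "server":
            # Drop a trailing slash; the sync scripts append "/sessions/..." themselves.
            values["CLOUD_URL"] = rest.strip().rstrip("/")
        elif key == "token":
            values["TOKEN"] = rest.strip()
    missing = {"CLOUD_URL", "TOKEN"} - values.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return values


def to_config_text(values):
    """Render the KEY=value format that the sync scripts load with `source`."""
    return f"CLOUD_URL={values['CLOUD_URL']}\nTOKEN={values['TOKEN']}\n"
```

Raising on a missing field keeps a half-written config file from being saved and later sourced by the upload or download script.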
49
+ ---
50
+
51
+ ## UPLOAD (Local → Voice Native Cloud)
52
+
53
+ Uploads all local Claude session files to the Voice Native Fly machine.
54
+ Uses chunked upload + finalize. Safe to re-run — mtime-newer-wins per file.
55
+
56
+ ### Execute as a single script (one permission prompt):
57
+
58
+ ```bash
+ set -e
+
+ # Load credentials (defines CLOUD_URL and TOKEN)
+ source ~/.claude/skills/voice-native-sync/config
+
+ # Informational only: the cloud work dir. Intentionally NOT sent to the server
+ # (see the finalize note below).
+ TARGET_PATH="/workspace"
+
+ rm -f /tmp/vn-sync.tar.gz /tmp/vn-chunk-*
+
+ # Archive local Claude projects (exclude macOS AppleDouble files)
+ tar -czf /tmp/vn-sync.tar.gz \
+   $(uname | grep -qi darwin && echo '--exclude=._*') \
+   -C "$HOME/.claude" projects
+
+ echo "archive: $(du -sh /tmp/vn-sync.tar.gz | cut -f1)"
+
+ # Split into 50MB chunks
+ split -b 50m /tmp/vn-sync.tar.gz /tmp/vn-chunk-
+ CHUNKS=(/tmp/vn-chunk-*)
+ TOTAL=${#CHUNKS[@]}
+ echo "chunks: $TOTAL"
+
+ # Generate upload ID (works on Linux and macOS)
+ if command -v uuidgen &>/dev/null; then
+   UPLOAD_ID=$(uuidgen | tr '[:upper:]' '[:lower:]')
+ else
+   UPLOAD_ID=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || python3 -c "import uuid; print(uuid.uuid4())")
+ fi
+ echo "upload id: $UPLOAD_ID"
+
+ # Upload chunks; abort on the first non-2xx response so finalize never runs
+ # against an incomplete upload.
+ idx=0
+ for chunk in "${CHUNKS[@]}"; do
+   echo "uploading chunk $idx / $((TOTAL-1))..."
+   STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST \
+     -H "Authorization: Bearer $TOKEN" \
+     -H "Content-Type: application/octet-stream" \
+     --data-binary "@${chunk}" \
+     "${CLOUD_URL}/sessions/import-chunk?uploadId=${UPLOAD_ID}&chunk=${idx}")
+   echo " chunk $idx → HTTP $STATUS"
+   case "$STATUS" in
+     2??) ;;
+     *) echo " chunk $idx failed, aborting" >&2; exit 1 ;;
+   esac
+   idx=$((idx+1))
+ done
+
+ # Finalize: merges chunks and extracts WITHOUT slug remapping.
+ # IMPORTANT: do NOT pass `targetWorkDir`. The server-side remap collapses every
+ # source slug into the target work dir's slug, which silently breaks session
+ # resume when sessions are uploaded from different hosts (Mac, Codespace,
+ # Sprite). They all end up in -workspace while the JSONLs still reference their
+ # original cwd, so slug and cwd no longer match and Claude Code's resume can't
+ # find the file. Confirmed 2026-05-27: a codespace upload remapped
+ # -workspaces-codespaces-blank → -workspace and every codespace session went
+ # silent on resume. The fix is to preserve each upload's original slug structure.
+ echo "finalizing..."
+ RESULT=$(curl -s -X POST \
+   -H "Authorization: Bearer $TOKEN" \
+   "${CLOUD_URL}/sessions/import-finalize?uploadId=${UPLOAD_ID}&total=${TOTAL}")
+ echo "finalize result: $RESULT"
+
+ # Cleanup
+ rm -f /tmp/vn-sync.tar.gz /tmp/vn-chunk-*
+
+ # Verify: print per-slug file counts from the cloud manifest
+ echo "verifying manifest..."
+ curl -s -H "Authorization: Bearer $TOKEN" "${CLOUD_URL}/sessions/manifest" | \
+   python3 -c "
+ import json,sys
+ d=json.load(sys.stdin)
+ slugs=d.get('slugs',{})
+ total=sum(len(v.get('files',{})) for v in slugs.values())
+ print(f' cloud now has {len(slugs)} slug(s), {total} total files')
+ for slug,info in slugs.items():
+     files=info.get('files',{})
+     print(f' {slug}: {len(files)} files')
+ "
+ ```
134
+
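The chunk and finalize requests in the upload script follow a simple numbering scheme: chunks are indexed from 0, and finalize receives the total count. A minimal sketch of that arithmetic, reusing the endpoint paths and query parameters from the script; the helper names and the example host are illustrative, and nothing here contacts a server.

```python
import math
from urllib.parse import urlencode

CHUNK_BYTES = 50 * 1024 * 1024  # matches `split -b 50m`


def chunk_count(archive_bytes, chunk_bytes=CHUNK_BYTES):
    """Number of chunks `split` produces for a non-empty archive."""
    return math.ceil(archive_bytes / chunk_bytes)


def upload_urls(cloud_url, upload_id, archive_bytes):
    """Return the per-chunk URLs followed by the finalize URL."""
    total = chunk_count(archive_bytes)
    urls = [
        f"{cloud_url}/sessions/import-chunk?"
        + urlencode({"uploadId": upload_id, "chunk": idx})
        for idx in range(total)
    ]
    # Finalize carries the chunk count and, deliberately, no targetWorkDir.
    urls.append(
        f"{cloud_url}/sessions/import-finalize?"
        + urlencode({"uploadId": upload_id, "total": total})
    )
    return urls
```

For a 120 MiB archive this yields chunks 0, 1 and 2 plus one finalize call with `total=3`.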
135
+ ---
136
+
137
+ ## DOWNLOAD (Voice Native Cloud → Local)
138
+
139
+ Downloads all sessions from the Voice Native Fly machine and merges them into local ~/.claude/projects/.
140
+ Mtime-newer-wins — local files newer than cloud are preserved.
141
+
142
+ ### Execute as a single script:
143
+
182
+ ```bash
+ set -e
+ source ~/.claude/skills/voice-native-sync/config
+
+ LOCAL_CWD="$(pwd)"
+ rm -f /tmp/vn-download.tar.gz
+
+ echo "downloading export from $CLOUD_URL..."
+ curl -f -H "Authorization: Bearer $TOKEN" \
+   "${CLOUD_URL}/sessions/export" \
+   -o /tmp/vn-download.tar.gz
+ echo "downloaded: $(du -sh /tmp/vn-download.tar.gz | cut -f1)"
+
+ # Extract archive
+ mkdir -p /tmp/vn-extract
+ tar -xzf /tmp/vn-download.tar.gz -C /tmp/vn-extract
+
+ # Remap and merge into local ~/.claude/projects/
+ LOCAL_SLUG=$(echo "$LOCAL_CWD" | sed 's|/|-|g')
+ PROJECTS_DIR="$HOME/.claude/projects"
+ mkdir -p "${PROJECTS_DIR}/${LOCAL_SLUG}"
+
+ echo "merging into ${PROJECTS_DIR}/${LOCAL_SLUG}..."
+ for slug_dir in /tmp/vn-extract/projects/*/; do
+   slug=$(basename "$slug_dir")
+   # A redirection is not valid in a `for` word list. Unmatched globs stay
+   # literal and are skipped by the -f test below.
+   for f in "${slug_dir}"*.jsonl "${slug_dir}"*.jsonl.*; do
+     [ -f "$f" ] || continue
+     fname=$(basename "$f")
+     dest="${PROJECTS_DIR}/${LOCAL_SLUG}/${fname}"
+     # Mtime-newer-wins: copy only if missing locally or the cloud copy is newer
+     if [ ! -f "$dest" ] || [ "$f" -nt "$dest" ]; then
+       cp "$f" "$dest"
+       echo " wrote $fname"
+     fi
+   done
+ done
+
+ rm -rf /tmp/vn-download.tar.gz /tmp/vn-extract
+ echo "done — sessions available at ${PROJECTS_DIR}/${LOCAL_SLUG}/"
+ ```
221
+
222
+ ---
223
+
224
+ ## TECHNICAL NOTES
225
+ - Cloud target path is always `/workspace` (Fly.io machines)
226
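+ - Slugs are preserved on upload: never pass `targetWorkDir`, because remapping every source slug into the `/workspace` slug breaks session resume. Download merges every cloud slug into the local cwd's slug.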
227
+ - Mtime-newer-wins: re-syncing is always safe, newer file wins per-file
228
+ - gzip only — never use zstd (server doesn't support it)
229
+ - macOS: always pass `--exclude='._*'` to tar (BSD tar emits AppleDouble files)
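The mtime-newer-wins rule in the notes above is the same test the download script performs with `[ "$f" -nt "$dest" ]`. A minimal Python sketch of that rule for a single flat directory; `merge_newer` is an illustrative name, and unlike the script's plain `cp` it uses `shutil.copy2`, which keeps the source mtime.

```python
import os
import shutil


def merge_newer(src_dir, dest_dir):
    """Copy files from src_dir into dest_dir unless dest already has a copy at least as new.

    Returns the names written, sorted for stable output.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        dest = os.path.join(dest_dir, name)
        if not os.path.isfile(src):
            continue
        # Write when the file is missing locally or the incoming copy is strictly newer.
        if not os.path.exists(dest) or os.path.getmtime(src) > os.path.getmtime(dest):
            shutil.copy2(src, dest)  # copy2 keeps the source mtime, unlike plain `cp`
            written.append(name)
    return written
```

Because a local file that is newer than the incoming one is never overwritten, running the merge twice in a row writes nothing the second time.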
@@ -31,7 +31,7 @@
31
31
  # - Path layout: chroot /workspace/root-chroot/root/... vs /workspace/home/...
32
32
  # Subagent research confirmed chroot in our codebase solves only HOME-and-persistence
33
33
  # layout, NOT library access (Linux dynamic linker is mount-agnostic per ld.so(8)).
34
- # Since HOME=/workspace/home achieves the same persistence with ~100 fewer LOC and
34
+ # Since HOME=/workspace achieves the same persistence with ~100 fewer LOC and
35
35
  # no bind-mount complexity, the chroot version was retired.
36
36
  # Archive: docs/archive/Dockerfile.sandbox.chroot-2026-05-28.md
37
37
 
@@ -63,14 +63,45 @@ ENV OSBORN_API_PORT=8741
63
63
  ENV NODE_ENV=production
64
64
  ENV OSBORN_IMAGE_VERSION=${OSBORN_VERSION}
65
65
 
66
- # THE KEY DIFFERENCE FROM CHROOT VARIANT:
67
- # HOME and OSBORN_CWD are set as IMAGE-LEVEL ENV here. The entrypoint reads
68
- # them, ensures the dirs exist on the volume, and execs osborn. No bind-mount
69
- # trickery: HOME just IS on the volume because the path resolves to a volume
70
- # subdir.
71
- ENV HOME=/workspace/home
66
+ # HOME=/workspace: the volume mount point itself is the home directory.
67
+ # This means ~/.claude resolves to /workspace/.claude, which is EXACTLY where
68
+ # the legacy symlink architecture (/root/.claude -> /workspace/.claude) already
69
+ # put credentials and sessions. So there is ZERO migration: an existing machine
70
+ # updating to this image finds its data already at ~/.claude with no file moves.
71
+ #
72
+ # Why not /workspace/home (the earlier Option D choice)? That required MOVING
73
+ # legacy data from /workspace/.claude into /workspace/home/.claude — an `mv`
74
+ # loop that is destructive (deletes source as it goes), non-atomic across
75
+ # multiple files (interruption = split state), and catastrophic if HOME ever
76
+ # resolved off-volume (mv would send data to the ephemeral overlay). Pointing
77
+ # HOME at /workspace eliminates the migration entirely: nothing moves, so
78
+ # nothing can be lost in a move. The only cosmetic cost is dotfiles sitting at
79
+ # the volume root — identical to what the legacy symlink effectively did.
80
+ #
81
+ # NOTE on overrides: these are Dockerfile ENV *defaults*. A Fly machine-config
82
+ # `env.HOME` (or app secret) OVERRIDES them at runtime. updateOsborn strips
83
+ # HOME/OSBORN_CWD from existing machine configs during image-swap so this
84
+ # default actually takes effect on migrated machines — without that, a stale
85
+ # HOME=/root from an older provisioning would silently win. See
86
+ # frontend/src/lib/machines.ts updateOsbornImpl.
87
+ ENV HOME=/workspace
72
88
  ENV OSBORN_CWD=/workspace
73
89
 
90
+ # HYBRID: user-installed global npm packages persist on the volume.
91
+ # osborn itself was already installed above into the DEFAULT prefix (/usr/local,
92
+ # image layer) — that RUN happened BEFORE this ENV, so osborn stays in the image
93
+ # and updates via image-swap (atomic, no runtime OOM, toolchain present at build).
94
+ # Setting NPM_CONFIG_PREFIX here only affects RUNTIME `npm install -g <x>` the
95
+ # user/agent runs: those land in /workspace/.npm-global on the persistent volume
96
+ # and survive restarts + image-swaps. PATH puts that bin dir first so installed
97
+ # CLIs are immediately runnable. Verified end-to-end on real Fly 2026-06-01:
98
+ # pure-JS (cowsay) AND native-compiled (node-pty via node-gyp) both install at
99
+ # runtime, persist across restart, no OOM (toolchain is in the image).
100
+ # Caveat: native user modules are tied to the image's Node ABI (currently 22) —
101
+ # a future Node-major image bump would need an `npm rebuild` of volume globals.
102
+ ENV NPM_CONFIG_PREFIX=/workspace/.npm-global
103
+ ENV PATH=/workspace/.npm-global/bin:$PATH
104
+
74
105
  WORKDIR /workspace
75
106
  EXPOSE 8741
76
107
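The "NOTE on overrides" in the hunk above says that updateOsborn strips `HOME` and `OSBORN_CWD` from existing machine configs so the image defaults take effect. The real implementation lives in frontend/src/lib/machines.ts and is not part of this diff; the sketch below only illustrates the idea, assuming a machine config shaped like `{"env": {...}}` and a hypothetical helper name.

```python
OVERRIDE_KEYS = ("HOME", "OSBORN_CWD")


def strip_image_defaults(machine_config, keys=OVERRIDE_KEYS):
    """Return a copy of the config whose env no longer pins the given keys.

    With the keys removed, the Dockerfile ENV defaults apply at runtime.
    The input config is left unmodified.
    """
    env = {k: v for k, v in machine_config.get("env", {}).items() if k not in keys}
    return {**machine_config, "env": env}
```

A stale `HOME=/root` left in the config would otherwise win over `ENV HOME=/workspace`, which is the failure mode the comment warns about.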
 
@@ -95,34 +126,23 @@ exec > >(tee -a "$LOGFILE") 2>&1
95
126
  ONBOARDING_JSON='{"numStartups":10,"installMethod":"npm","autoUpdates":false,"hasCompletedOnboarding":true,"hasTrustDialogAccepted":true,"hasTrustDialogHooksAccepted":true,"hasCompletedProjectOnboarding":true,"hasAcknowledgedCostThreshold":true,"effortCalloutV2Dismissed":true,"theme":"dark","projects":{"/workspace":{"hasTrustDialogAccepted":true,"hasTrustDialogHooksAccepted":true,"hasCompletedProjectOnboarding":true}}}'
96
127
 
97
128
  # ============================================================
98
- # === HOME-on-volume seed ===
129
+ # === HOME-on-volume setup (HOME=/workspace) ===
99
130
  # ============================================================
100
- # HOME=/workspace/home, set via ENV in Dockerfile. Ensure the dir exists on
101
- # the volume and seed Claude/onboarding/skills on first boot. Idempotent.
102
- echo "[sandbox-d] HOME=$HOME OSBORN_CWD=$OSBORN_CWD"
131
+ # HOME=/workspace, set via ENV in Dockerfile (and enforced by updateOsborn
132
+ # stripping any stale HOME from existing machine configs). Because HOME is the
133
+ # volume mount itself, ~/.claude == /workspace/.claude — which is exactly where
134
+ # the legacy symlink architecture already stored credentials + sessions.
135
+ #
136
+ # THEREFORE: NO MIGRATION. An existing machine's data is already at ~/.claude
137
+ # the instant HOME points at /workspace. We removed the old `mv` migration
138
+ # block entirely (it was destructive + non-atomic + catastrophic if HOME ever
139
+ # resolved off-volume). Nothing moves, so nothing can be lost in a move.
140
+ echo "[sandbox-d] HOME=$HOME OSBORN_CWD=$OSBORN_CWD NPM_CONFIG_PREFIX=$NPM_CONFIG_PREFIX"
103
141
  mkdir -p "$HOME" "$HOME/.claude" "$HOME/.osborn"
104
-
105
- # === Legacy migration ===
106
- # Existing production sandboxes have credentials at /workspace/.claude/
107
- # (the pre-D legacy symlink architecture). Migrate to HOME=/workspace/home/.claude/
108
- # on first boot of the D image. Atomic mv — safe.
109
- if [ -d /workspace/.claude ] && [ ! -d "$HOME/.claude/projects" ] && [ ! -f "$HOME/.claude/.credentials.json" ]; then
110
- echo "[sandbox-d] migrating legacy /workspace/.claude → \$HOME/.claude"
111
- # Move CONTENTS, not the dir itself (target may already exist with seeded skills)
112
- for item in /workspace/.claude/.* /workspace/.claude/*; do
113
- [ -e "$item" ] || continue
114
- BASENAME=$(basename "$item")
115
- [ "$BASENAME" = "." ] && continue
116
- [ "$BASENAME" = ".." ] && continue
117
- [ -e "$HOME/.claude/$BASENAME" ] && continue
118
- mv "$item" "$HOME/.claude/$BASENAME" 2>/dev/null || true
119
- done
120
- rmdir /workspace/.claude 2>/dev/null || true
121
- fi
122
- if [ -f /workspace/.claude.json ] && [ ! -f "$HOME/.claude.json" ]; then
123
- echo "[sandbox-d] migrating legacy /workspace/.claude.json → \$HOME/.claude.json"
124
- mv /workspace/.claude.json "$HOME/.claude.json"
125
- fi
142
+ # HYBRID: ensure the volume-backed npm global prefix exists so user
143
+ # `npm install -g <x>` has a target on first use (npm would create it anyway,
144
+ # but pre-making it keeps perms predictable + visible in the boot log).
145
+ mkdir -p /workspace/.npm-global
126
146
 
127
147
  # Onboarding config (overwrites every boot — intentional, deterministic state)
128
148
  echo "$ONBOARDING_JSON" > "$HOME/.claude.json"
package/dist/index.js CHANGED
@@ -2527,6 +2527,30 @@ async function main() {
2527
2527
  currentLLM = null;
2528
2528
  clearFastBrainSession();
2529
2529
  clearPipelineFastBrainSession();
2530
+ // ── Ghost-agent fix (2026-06-01) ──
2531
+ // When LiveKit Cloud evicts our WebSocket (idle, network blip, or quota window),
2532
+ // the previous code stopped here — agent process kept running but no longer in
2533
+ // any room. /health continued returning "livekit.status:connected" because the
2534
+ // status was never written back. Frontend's checkOsbornHealth only validates
2535
+ // HTTP 200, so the ghost state was invisible. Users got stuck in "Connecting..."
2536
+ // forever because their LiveKit-token-minted room had no agent in it.
2537
+ //
2538
+ // Fix: re-arm the retry loop. connectWithRetry() will try to reconnect with
2539
+ // the same room name (so the room code stays stable for any in-flight frontend
2540
+ // token requests), backing off 5s → 60s. If the disconnect was permanent
2541
+ // (e.g. JWT expired — they're 24h), the retry will fail and surface
2542
+ // livekit.status=failed, which the (also-fixed) frontend health check will
2543
+ // see and trigger restartService.
2544
+ //
2545
+ // Note: we mark status='retrying' immediately so /health reflects the real
2546
+ // state — closing the lie window between Disconnected and the next attempt.
2547
+ livekitState.status = 'retrying';
2548
+ livekitState.error = 'LiveKit room disconnected; attempting to rejoin';
2549
+ livekitState.errorCode = 'disconnected';
2550
+ console.log('🔄 Rejoining LiveKit room after disconnect...');
2551
+ connectWithRetry().catch(err => {
2552
+ console.error('❌ Reconnect attempt threw (should not happen — connectWithRetry loops):', err);
2553
+ });
2530
2554
  });
2531
2555
  room.on(RoomEvent.ParticipantConnected, async (participant) => {
2532
2556
  console.log(`\n👤 User joined: ${participant.identity}`);
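The comment in the hunk above describes two behaviors: a retry delay that backs off from 5s to 60s, and a health check that must look at LiveKit state rather than HTTP 200 alone. A minimal sketch of both under stated assumptions: the doubling factor is a guess (the source gives only the endpoints of the range), and the `/health` body is assumed to be shaped like `{"livekit": {"status": ...}}`.

```python
def backoff_delays(first_s=5.0, cap_s=60.0, factor=2.0):
    """Yield retry delays starting at first_s and growing by factor up to cap_s."""
    delay = first_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * factor, cap_s)


def is_healthy(http_status, body):
    """Health check that inspects LiveKit state instead of trusting HTTP 200 alone."""
    livekit = body.get("livekit", {})
    return http_status == 200 and livekit.get("status") == "connected"
```

Under these assumptions a ghost agent reporting `retrying` or `failed` is treated as unhealthy even though the endpoint still answers with 200.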
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "osborn",
3
- "version": "0.9.48",
3
+ "version": "0.9.50",
4
4
  "description": "Voice AI coding assistant - local agent that connects to Osborn frontend",
5
5
  "type": "module",
6
6
  "bin": {