@synsci/cli-darwin-x64 1.1.83 → 1.1.85

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,153 @@
1
+ ---
2
+ name: colab-finetuning
3
+ description: Fine-tune LLMs on Google Colab GPUs directly from the synsc CLI. Connects to Colab runtimes via a WebSocket bridge for remote training with Unsloth. Supports SFT, GRPO, DPO, vision, and TTS workflows on free T4 up to Pro A100 GPUs.
4
+ version: 1.0.0
5
+ author: Synthetic Sciences
6
+ license: MIT
7
+ tags: [Fine-Tuning, Google Colab, GPU, Remote Training, Unsloth, LoRA, GRPO, Cloud GPU]
8
+ dependencies: [unsloth, torch, transformers, trl, datasets]
9
+ ---
10
+
11
+ # Google Colab Fine-Tuning
12
+
13
+ Fine-tune LLMs using Google Colab GPUs directly from the synsc CLI. Connect to free or paid Colab runtimes and run Unsloth training workflows remotely.
14
+
15
+ ## When to Use Colab Fine-Tuning
16
+
17
+ **Use Colab when:**
18
+ - You don't have a local GPU but need to fine-tune a model
19
+ - You want free GPU access (T4 with 15GB VRAM on Colab Free)
20
+ - Training models up to ~14B parameters (4-bit QLoRA)
21
+ - Quick experiments and prototyping before scaling to cloud
22
+ - Colab Pro/Pro+ for A100 (40-80GB) access
23
+
24
+ **Don't use Colab when:**
25
+ - You need persistent long-running jobs (>12h) — use Tinker or cloud providers
26
+ - Training 70B+ models — use Lambda, RunPod, or multi-GPU cloud
27
+ - You need guaranteed uptime — Colab may disconnect idle sessions
28
+ - Production training pipelines — use managed services
29
+
30
+ **Colab vs Alternatives:**
31
+
32
+ | Need | Use |
33
+ |------|-----|
34
+ | Free GPU, quick experiments | **Google Colab** |
35
+ | Managed cloud training (any size) | Tinker |
36
+ | Persistent multi-GPU training | Lambda / RunPod |
37
+ | Local GPU available | Unsloth directly |
38
+ | Enterprise with SLA | Colab Enterprise (Vertex AI) |
39
+
40
+ ## Quick Start
41
+
42
+ ### Step 1: Generate Bridge Notebook
43
+
44
+ ```
45
+ Use colab_notebook tool with workflow="bridge"
46
+ ```
47
+
48
+ This creates a `synsc-bridge.ipynb` file that establishes a WebSocket tunnel between synsc and the Colab GPU.
49
+
50
+ ### Step 2: Open in Colab
51
+
52
+ 1. Go to [colab.research.google.com](https://colab.research.google.com)
53
+ 2. Upload the bridge notebook (File → Upload notebook)
54
+ 3. Select GPU runtime (Runtime → Change runtime type → T4 GPU)
55
+ 4. Run all cells
56
+ 5. Copy the WebSocket URL that appears
57
+
58
+ ### Step 3: Connect from synsc
59
+
60
+ ```
61
+ Use colab_connect tool with connection_url="wss://..."
62
+ ```
63
+
64
+ ### Step 4: Run Training
65
+
66
+ ```
67
+ Use colab_finetune tool with:
68
+ workflow: "sft"
69
+ model: "unsloth/Qwen3-4B-unsloth-bnb-4bit"
70
+ dataset: "mlabonne/FineTome-100k"
71
+ ```
72
+
73
+ Or execute individual cells:
74
+ ```
75
+ Use colab_execute tool with code="import torch; print(torch.cuda.get_device_name(0))"
76
+ ```
77
+
78
+ ## GPU Tiers
79
+
80
+ | Tier | GPU | VRAM | Max Model (QLoRA) | Session Limit |
81
+ |------|-----|------|-------------------|---------------|
82
+ | Free | T4 | 15 GB | ~14B | 12h, may disconnect |
83
+ | Pro ($10/mo) | T4/V100/L4/A100 | 15-40 GB | ~32B | 24h, priority |
84
+ | Pro+ ($50/mo) | A100 (80GB) | 80 GB | ~72B | 24h, guaranteed |
85
+ | Enterprise | Configurable | Any | Any | No limit |
86
+
87
+ See [references/gpu-tiers.md](references/gpu-tiers.md) for detailed VRAM requirements.
88
+
89
+ ## Available Tools
90
+
91
+ | Tool | Purpose |
92
+ |------|---------|
93
+ | `colab_connect` | Connect to a Colab runtime (standard bridge or enterprise) |
94
+ | `colab_execute` | Run arbitrary Python code on the connected GPU |
95
+ | `colab_status` | Check GPU, memory, disk, connection status |
96
+ | `colab_finetune` | Run complete Unsloth training workflow (SFT/GRPO/DPO/vision/TTS) |
97
+ | `colab_notebook` | Generate .ipynb notebooks for Colab |
98
+
99
+ ## Training Workflows
100
+
101
+ ### SFT (Supervised Fine-Tuning)
102
+ Standard instruction tuning. Best for: chat models, domain adaptation, format learning.
103
+
104
+ ```
105
+ colab_finetune workflow=sft model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="mlabonne/FineTome-100k"
106
+ ```
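SFT expects chat-formatted records. As a rough illustration (field names vary by dataset — some use `messages`, others `conversations` — so treat this shape as an assumption rather than FineTome's exact schema), a record and a quick validity check might look like:

```python
# Illustrative chat-format record for SFT. The field names here are an
# assumption; check your dataset's actual schema before training.
record = {
    "conversations": [
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA adds small trainable low-rank adapters to a frozen base model."},
    ]
}

def is_valid_chat_record(rec: dict) -> bool:
    """Check that every turn has a known role and string content."""
    turns = rec.get("conversations", [])
    return bool(turns) and all(
        t.get("role") in {"system", "user", "assistant"} and isinstance(t.get("content"), str)
        for t in turns
    )
```

Running a check like this over a sample of rows before launching a remote job saves a wasted Colab session.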
107
+
108
+ ### GRPO (Reinforcement Learning)
109
+ Train reasoning models with reward functions. Best for: math, coding, structured output.
110
+
111
+ ```
112
+ colab_finetune workflow=grpo model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="your-dataset"
113
+ ```
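GRPO optimizes against reward functions you supply. As a hedged sketch (the exact signature the trainer expects may differ, and `answers` here is a hypothetical per-prompt ground truth), a correctness reward for math-style tasks could look like:

```python
import re

def correctness_reward(completions: list[str], answers: list[str]) -> list[float]:
    """Score 1.0 when the last number in a completion matches the known answer.

    A toy reward for math-style GRPO; real setups usually combine several
    rewards (format, correctness, length). The signature is illustrative,
    not the exact trainer API.
    """
    rewards = []
    for completion, answer in zip(completions, answers):
        nums = re.findall(r"-?\d+(?:\.\d+)?", completion)
        rewards.append(1.0 if nums and nums[-1] == answer else 0.0)
    return rewards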
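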
114
+
115
+ ### DPO (Direct Preference Optimization)
116
+ Align models with human preferences. Requires chosen/rejected pairs.
117
+
118
+ ```
119
+ colab_finetune workflow=dpo model="unsloth/Llama-3.1-8B-unsloth-bnb-4bit" dataset="HuggingFaceH4/ultrafeedback_binarized"
120
+ ```
121
+
122
+ ### Vision Fine-Tuning
123
+ Fine-tune vision-language models (Qwen3-VL, Gemma 3, Llama 3.2 Vision).
124
+
125
+ ```
126
+ colab_finetune workflow=vision model="unsloth/Qwen3-VL-2B-unsloth-bnb-4bit" dataset="your-vision-dataset"
127
+ ```
128
+
129
+ ### TTS Fine-Tuning
130
+ Fine-tune text-to-speech models (Orpheus, Sesame-CSM).
131
+
132
+ ```
133
+ colab_finetune workflow=tts model="unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit" dataset="your-tts-dataset"
134
+ ```
135
+
136
+ ## Troubleshooting
137
+
138
+ ### Connection Issues
139
+ - **"WebSocket connection failed"**: Ensure the bridge notebook is still running in Colab. Re-run the tunnel cell if the URL expired.
140
+ - **"Execution timeout"**: Colab may have disconnected due to inactivity. Run the keep-alive cell.
141
+ - **"Connection closed during execution"**: Colab Free disconnects after idle time. Upgrade to Pro for more stability.
142
+
143
+ ### Training Issues
144
+ - **CUDA OOM**: Reduce batch_size, max_seq_length, or lora_rank. Use 4-bit quantization.
145
+ - **Slow training**: Ensure GPU runtime is selected. Check with `colab_status detail=gpu`.
146
+ - **Package not found**: Unsloth is installed automatically by `colab_finetune`. For custom packages, use `colab_execute` with pip install.
147
+
148
+ ### Colab Enterprise
149
+ - Requires GCP project with Vertex AI API enabled
150
+ - Use `colab_connect mode=enterprise project_id="your-project"`
151
+ - GPU quotas apply — check GCP console for availability
152
+
153
+ See [references/troubleshooting.md](references/troubleshooting.md) for more solutions.
@@ -0,0 +1,68 @@
1
+ # Bridge Notebook Setup
2
+
3
+ ## How the Bridge Works
4
+
5
+ The bridge notebook creates a WebSocket tunnel from a Google Colab runtime to your local machine:
6
+
7
+ ```
8
+ synsc CLI ←→ WebSocket ←→ Cloudflare Tunnel ←→ Jupyter Server (on Colab GPU)
9
+ ```
10
+
11
+ 1. A Jupyter notebook server starts on the Colab VM (port 8888)
12
+ 2. `jupyter_http_over_ws` extension enables WebSocket connections
13
+ 3. `cloudflared` creates a public tunnel URL
14
+ 4. synsc connects via WebSocket and sends Jupyter kernel messages
15
+
16
+ ## Step-by-Step Setup
17
+
18
+ ### 1. Generate the Bridge Notebook
19
+ ```
20
+ Use colab_notebook tool with workflow="bridge"
21
+ ```
22
+ This creates `synsc-bridge.ipynb` in your project directory.
23
+
24
+ ### 2. Upload to Google Colab
25
+ - Go to [colab.research.google.com](https://colab.research.google.com)
26
+ - File → Upload notebook → select `synsc-bridge.ipynb`
27
+
28
+ ### 3. Select GPU Runtime
29
+ - Runtime → Change runtime type
30
+ - Hardware accelerator: GPU
31
+ - GPU type: T4 (free) or A100 (pro)
32
+
33
+ ### 4. Run All Cells
34
+ Run cells in order:
35
+ 1. **GPU check** — confirms GPU is available
36
+ 2. **Install dependencies** — installs `jupyter_http_over_ws`
37
+ 3. **Install cloudflared** — downloads the tunnel binary
38
+ 4. **Start bridge** — starts Jupyter + tunnel, prints WebSocket URL
39
+
40
+ ### 5. Copy the URL
41
+ The output will show:
42
+ ```
43
+ ============================================================
44
+ SYNSC BRIDGE READY
45
+ ============================================================
46
+
47
+ Paste this URL into synsc:
48
+
49
+ wss://random-name.trycloudflare.com/api/kernels/default/channels?token=...
50
+
51
+ ============================================================
52
+ ```
53
+
54
+ ### 6. Connect from synsc
55
+ ```
56
+ Use colab_connect tool with connection_url="wss://..."
57
+ ```
58
+
59
+ ### 7. Keep Alive
60
+ Run the keep-alive cell to prevent Colab from disconnecting due to inactivity.
61
+
62
+ ## Security
63
+
64
+ - The bridge uses a random token for authentication
65
+ - The Cloudflare tunnel URL is unique and expires when the notebook stops
66
+ - No Google account credentials are transmitted through the bridge
67
+ - The tunnel is read/write — anyone with the URL can execute code on the Colab VM
68
+ - **Do not share the WebSocket URL**
@@ -0,0 +1,54 @@
1
+ # GPU Tiers & VRAM Requirements
2
+
3
+ ## Colab GPU Options
4
+
5
+ ### Free Tier
6
+ - **T4**: 15 GB VRAM, Compute Capability 7.5
7
+ - **P100**: 16 GB VRAM, Compute Capability 6.0 (less common)
8
+ - Session limit: ~12 hours, may disconnect during idle
9
+
10
+ ### Pro Tier ($10/month)
11
+ - **T4**: 15 GB VRAM (priority access)
12
+ - **V100**: 16 GB VRAM, Compute Capability 7.0
13
+ - **L4**: 24 GB VRAM, Compute Capability 8.9
14
+ - **A100**: 40 GB VRAM, Compute Capability 8.0
15
+ - Session limit: ~24 hours, priority GPU allocation
16
+
17
+ ### Pro+ Tier ($50/month)
18
+ - **A100 (80GB)**: 80 GB VRAM
19
+ - Session limit: ~24 hours, guaranteed GPU access
20
+ - Background execution available
21
+
22
+ ### Enterprise (Vertex AI)
23
+ - Any GPU type configured in GCP
24
+ - No session limits
25
+ - Full API control
26
+ - Requires GCP project and billing
27
+
28
+ ## Model VRAM Requirements (4-bit QLoRA)
29
+
30
+ | Model Size | VRAM Required | Fits on |
31
+ |-----------|---------------|---------|
32
+ | 1B | ~3 GB | T4 (free) |
33
+ | 3B | ~5 GB | T4 (free) |
34
+ | 4B | ~6 GB | T4 (free) |
35
+ | 7-8B | ~9 GB | T4 (free) |
36
+ | 13-14B | ~15 GB | T4 (free, tight) |
37
+ | 32B | ~22 GB | L4, A100 (Pro) |
38
+ | 70-72B | ~44 GB | A100 80GB (Pro+) |
39
+
40
+ ## Recommended Configurations
41
+
42
+ ### Free Tier (T4, 15 GB)
43
+ - **SFT**: Qwen3-4B, Llama-3.2-3B, Gemma-3-4B — batch_size=2, max_seq=2048
44
+ - **GRPO**: Qwen3-4B — batch_size=1, max_seq=1024, num_generations=4
45
+ - **Vision**: Qwen3-VL-2B — batch_size=1
46
+
47
+ ### Pro Tier (A100, 40 GB)
48
+ - **SFT**: Qwen3-14B, Llama-3.1-8B — batch_size=4, max_seq=4096
49
+ - **GRPO**: 8B models — batch_size=2, num_generations=8
50
+ - **DPO**: 8B models — batch_size=4
51
+
52
+ ### Pro+ Tier (A100 80GB)
53
+ - **SFT**: Qwen3-32B, Llama-3.3-70B (tight) — batch_size=1-2
54
+ - **GRPO**: 32B models — batch_size=1
@@ -0,0 +1,79 @@
1
+ # Troubleshooting
2
+
3
+ ## Connection Problems
4
+
5
+ ### "WebSocket connection failed"
6
+ - **Cause**: Bridge notebook stopped or tunnel URL expired
7
+ - **Fix**: Re-run the bridge cells in Colab to get a fresh URL
8
+
9
+ ### "Connection timeout after 30 seconds"
10
+ - **Cause**: Firewall blocking WebSocket connections, or Colab VM is unresponsive
11
+ - **Fix**: Try a different network. Check if Colab runtime is still active.
12
+
13
+ ### "Connection closed during execution"
14
+ - **Cause**: Colab disconnected the session (idle timeout or resource limits)
15
+ - **Fix**: Run the keep-alive cell. Upgrade to Colab Pro for longer sessions.
16
+
17
+ ### Bridge notebook won't start
18
+ - **Cause**: Colab Free tier quota exhausted
19
+ - **Fix**: Wait for quota reset (resets daily) or upgrade to Pro.
20
+
21
+ ## Training Problems
22
+
23
+ ### CUDA Out of Memory (OOM)
24
+ - Reduce `batch_size` to 1
25
+ - Reduce `max_seq_length` (try 1024 or 512)
26
+ - Reduce `lora_rank` (try 8 instead of 16)
27
+ - Use `quantization="4bit"` (default)
28
+ - For GRPO: reduce `num_generations` to 2
29
+
30
+ ### Training is very slow
31
+ - Check GPU: `colab_status detail=gpu` — ensure a GPU is connected
32
+ - If using T4, training 8B+ models will be slower than A100
33
+ - Ensure `packing=True` for SFT (default in templates)
34
+ - Check for CPU bottleneck: `colab_status detail=full`
35
+
36
+ ### "ModuleNotFoundError: No module named 'unsloth'"
37
+ - The `colab_finetune` tool installs Unsloth automatically
38
+ - If running manual code, first run: `!pip install unsloth`
39
+
40
+ ### Dataset loading fails
41
+ - Verify the dataset ID exists on HuggingFace
42
+ - For gated datasets, authenticate: `!huggingface-cli login`
43
+ - For local datasets, upload to Colab first with `colab_execute`
44
+
45
+ ### Loss is NaN or not decreasing
46
+ - Reduce learning rate (try 1e-5 for GRPO, 5e-5 for SFT)
47
+ - Check dataset format matches the template expectations
48
+ - Ensure the dataset isn't empty: `colab_execute code="print(len(dataset))"`
49
+
50
+ ## Colab-Specific Issues
51
+
52
+ ### "You have exhausted your GPU quota"
53
+ - Colab Free limits GPU usage per day/week
54
+ - **Fix**: Wait for reset, or use Colab Pro
55
+
56
+ ### Runtime disconnects after ~30 minutes
57
+ - Run the keep-alive cell in the bridge notebook
58
+ - Interact with the Colab tab occasionally (Colab detects browser activity)
59
+
60
+ ### "Cannot connect to GPU backend"
61
+ - Colab may be overloaded — try again later
62
+ - Change runtime type and change back (Runtime → Change runtime type)
63
+
64
+ ### Files disappear after disconnect
65
+ - Colab VMs are ephemeral — files are deleted when the runtime stops
66
+ - Save important files to Google Drive or push to HuggingFace Hub
67
+ - Use `push_to_hub` parameter in `colab_finetune` to auto-upload
68
+
69
+ ## Enterprise (Vertex AI) Issues
70
+
71
+ ### "Not authenticated"
72
+ - Run `/connect google-colab` to authenticate with Google OAuth
73
+
74
+ ### "Insufficient quota"
75
+ - Check GCP quotas in the Cloud Console
76
+ - Request quota increase for the desired GPU type and region
77
+
78
+ ### "Vertex AI API not enabled"
79
+ - Enable the API: `gcloud services enable aiplatform.googleapis.com`
@@ -392,6 +392,43 @@ Orpheus supports emotional tags: `<laugh>`, `<sigh>`, `<cough>`, `<gasp>`, `<yaw
392
392
 
393
393
  ---
394
394
 
395
+ ## Workflow 5: Colab Fine-Tuning (Remote GPU)
396
+
397
+ Use this to run any Unsloth workflow on a Google Colab GPU directly from synsc — no local GPU required.
398
+
399
+ ### Setup
400
+ 1. Generate the bridge notebook: `colab_notebook workflow=bridge`
401
+ 2. Upload to Google Colab, select GPU runtime, run all cells
402
+ 3. Copy the WebSocket URL → `colab_connect connection_url="wss://..."`
403
+
404
+ ### Run Training Remotely
405
+ ```
406
+ colab_finetune workflow=sft model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="mlabonne/FineTome-100k"
407
+ ```
408
+
409
+ All SFT/GRPO/DPO/vision/TTS workflows work identically on Colab. The plugin handles:
410
+ - Unsloth installation on the Colab VM
411
+ - Model loading, LoRA setup, dataset preparation
412
+ - Training execution with streaming output
413
+ - Model saving and optional HuggingFace Hub push
414
+
415
+ ### GPU Recommendations
416
+
417
+ | Colab Tier | GPU | VRAM | Max Model (QLoRA) |
418
+ |-----------|-----|------|-------------------|
419
+ | Free | T4 | 15 GB | ~14B |
420
+ | Pro | A100 | 40 GB | ~32B |
421
+ | Pro+ | A100 80GB | 80 GB | ~72B |
422
+
423
+ ### Key Differences from Local Training
424
+ - Files are ephemeral — save to HuggingFace Hub with `push_to_hub` parameter
425
+ - Session may disconnect — use keep-alive cell in bridge notebook
426
+ - Package installation happens on each new session
427
+
428
+ See the **colab-finetuning** skill for detailed Colab-specific guidance.
429
+
430
+ ---
431
+
395
432
  ## Model Selection
396
433
 
397
434
  ### Instruct vs Base Model
package/bin/synsc CHANGED
Binary file
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@synsci/cli-darwin-x64",
3
- "version": "1.1.83",
3
+ "version": "1.1.85",
4
4
  "os": [
5
5
  "darwin"
6
6
  ],