@synsci/cli-darwin-x64 1.1.83 → 1.1.85

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,153 @@
1
+ ---
2
+ name: colab-finetuning
3
+ description: Fine-tune LLMs on Google Colab GPUs directly from the synsc CLI. Connects to Colab runtimes via a WebSocket bridge for remote training with Unsloth. Supports SFT, GRPO, DPO, vision, and TTS workflows on free T4 up to Pro A100 GPUs.
4
+ version: 1.0.0
5
+ author: Synthetic Sciences
6
+ license: MIT
7
+ tags: [Fine-Tuning, Google Colab, GPU, Remote Training, Unsloth, LoRA, GRPO, Cloud GPU]
8
+ dependencies: [unsloth, torch, transformers, trl, datasets]
9
+ ---
10
+
11
+ # Google Colab Fine-Tuning
12
+
13
+ Fine-tune LLMs using Google Colab GPUs directly from the synsc CLI. Connect to free or paid Colab runtimes and run Unsloth training workflows remotely.
14
+
15
+ ## When to Use Colab Fine-Tuning
16
+
17
+ **Use Colab when:**
18
+ - You don't have a local GPU but need to fine-tune a model
19
+ - You want free GPU access (T4 with 15GB VRAM on Colab Free)
20
+ - Training models up to ~14B parameters (4-bit QLoRA)
21
+ - Quick experiments and prototyping before scaling to cloud
22
+ - Colab Pro/Pro+ for A100 (40-80GB) access
23
+
24
+ **Don't use Colab when:**
25
+ - You need persistent long-running jobs (>12h) — use Tinker or cloud providers
26
+ - Training 70B+ models — use Lambda, RunPod, or multi-GPU cloud
27
+ - You need guaranteed uptime — Colab may disconnect idle sessions
28
+ - Production training pipelines — use managed services
29
+
30
+ **Colab vs Alternatives:**
31
+
32
+ | Need | Use |
33
+ |------|-----|
34
+ | Free GPU, quick experiments | **Google Colab** |
35
+ | Managed cloud training (any size) | Tinker |
36
+ | Persistent multi-GPU training | Lambda / RunPod |
37
+ | Local GPU available | Unsloth directly |
38
+ | Enterprise with SLA | Colab Enterprise (Vertex AI) |
39
+
40
+ ## Quick Start
41
+
42
+ ### Step 1: Generate Bridge Notebook
43
+
44
+ ```
45
+ Use colab_notebook tool with workflow="bridge"
46
+ ```
47
+
48
+ This creates a `synsc-bridge.ipynb` file that establishes a WebSocket tunnel between synsc and the Colab GPU.
49
+
50
+ ### Step 2: Open in Colab
51
+
52
+ 1. Go to [colab.research.google.com](https://colab.research.google.com)
53
+ 2. Upload the bridge notebook (File → Upload notebook)
54
+ 3. Select GPU runtime (Runtime → Change runtime type → T4 GPU)
55
+ 4. Run all cells
56
+ 5. Copy the WebSocket URL that appears
57
+
58
+ ### Step 3: Connect from synsc
59
+
60
+ ```
61
+ Use colab_connect tool with connection_url="wss://..."
62
+ ```
63
+
64
+ ### Step 4: Run Training
65
+
66
+ ```
67
+ Use colab_finetune tool with:
68
+ workflow: "sft"
69
+ model: "unsloth/Qwen3-4B-unsloth-bnb-4bit"
70
+ dataset: "mlabonne/FineTome-100k"
71
+ ```
72
+
73
+ Or execute individual cells:
74
+ ```
75
+ Use colab_execute tool with code="import torch; print(torch.cuda.get_device_name(0))"
76
+ ```
77
+
78
+ ## GPU Tiers
79
+
80
+ | Tier | GPU | VRAM | Max Model (QLoRA) | Session Limit |
81
+ |------|-----|------|-------------------|---------------|
82
+ | Free | T4 | 15 GB | ~14B | 12h, may disconnect |
83
+ | Pro ($10/mo) | T4/V100/L4/A100 | 15-40 GB | ~32B | 24h, priority |
84
+ | Pro+ ($50/mo) | A100 (80GB) | 80 GB | ~72B | 24h, guaranteed |
85
+ | Enterprise | Configurable | Any | Any | No limit |
86
+
87
+ See [references/gpu-tiers.md](references/gpu-tiers.md) for detailed VRAM requirements.
88
+
89
+ ## Available Tools
90
+
91
+ | Tool | Purpose |
92
+ |------|---------|
93
+ | `colab_connect` | Connect to a Colab runtime (standard bridge or enterprise) |
94
+ | `colab_execute` | Run arbitrary Python code on the connected GPU |
95
+ | `colab_status` | Check GPU, memory, disk, connection status |
96
+ | `colab_finetune` | Run complete Unsloth training workflow (SFT/GRPO/DPO/vision/TTS) |
97
+ | `colab_notebook` | Generate .ipynb notebooks for Colab |
98
+
99
+ ## Training Workflows
100
+
101
+ ### SFT (Supervised Fine-Tuning)
102
+ Standard instruction tuning. Best for: chat models, domain adaptation, format learning.
103
+
104
+ ```
105
+ colab_finetune workflow=sft model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="mlabonne/FineTome-100k"
106
+ ```
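SFT expects chat-formatted records. As a rough illustration (field names vary by dataset — some use `messages`, others `conversations` — so treat this shape as an assumption rather than FineTome's exact schema), a record and a quick validity check might look like:

```python
# Illustrative chat-format record for SFT. The field names here are an
# assumption; check your dataset's actual schema before training.
record = {
    "conversations": [
        {"role": "user", "content": "What is LoRA?"},
        {"role": "assistant", "content": "LoRA adds small trainable low-rank adapters to a frozen base model."},
    ]
}

def is_valid_chat_record(rec: dict) -> bool:
    """Check that every turn has a known role and string content."""
    turns = rec.get("conversations", [])
    return bool(turns) and all(
        t.get("role") in {"system", "user", "assistant"} and isinstance(t.get("content"), str)
        for t in turns
    )
```

Running a check like this over a sample of rows before launching a remote job saves a wasted Colab session.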
107
+
108
+ ### GRPO (Reinforcement Learning)
109
+ Train reasoning models with reward functions. Best for: math, coding, structured output.
110
+
111
+ ```
112
+ colab_finetune workflow=grpo model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="your-dataset"
113
+ ```
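GRPO optimizes against reward functions you supply. As a hedged sketch (the exact signature the trainer expects may differ, and `answers` here is a hypothetical per-prompt ground truth), a correctness reward for math-style tasks could look like:

```python
import re

def correctness_reward(completions: list[str], answers: list[str]) -> list[float]:
    """Score 1.0 when the last number in a completion matches the known answer.

    A toy reward for math-style GRPO; real setups usually combine several
    rewards (format, correctness, length). The signature is illustrative,
    not the exact trainer API.
    """
    rewards = []
    for completion, answer in zip(completions, answers):
        nums = re.findall(r"-?\d+(?:\.\d+)?", completion)
        rewards.append(1.0 if nums and nums[-1] == answer else 0.0)
    return rewards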
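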
114
+
115
+ ### DPO (Direct Preference Optimization)
116
+ Align models with human preferences. Requires chosen/rejected pairs.
117
+
118
+ ```
119
+ colab_finetune workflow=dpo model="unsloth/Llama-3.1-8B-unsloth-bnb-4bit" dataset="HuggingFaceH4/ultrafeedback_binarized"
120
+ ```
121
+
122
+ ### Vision Fine-Tuning
123
+ Fine-tune vision-language models (Qwen3-VL, Gemma 3, Llama 3.2 Vision).
124
+
125
+ ```
126
+ colab_finetune workflow=vision model="unsloth/Qwen3-VL-2B-unsloth-bnb-4bit" dataset="your-vision-dataset"
127
+ ```
128
+
129
+ ### TTS Fine-Tuning
130
+ Fine-tune text-to-speech models (Orpheus, Sesame-CSM).
131
+
132
+ ```
133
+ colab_finetune workflow=tts model="unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit" dataset="your-tts-dataset"
134
+ ```
135
+
136
+ ## Troubleshooting
137
+
138
+ ### Connection Issues
139
+ - **"WebSocket connection failed"**: Ensure the bridge notebook is still running in Colab. Re-run the tunnel cell if the URL expired.
140
+ - **"Execution timeout"**: Colab may have disconnected due to inactivity. Run the keep-alive cell.
141
+ - **"Connection closed during execution"**: Colab Free disconnects after idle time. Upgrade to Pro for more stability.
142
+
143
+ ### Training Issues
144
+ - **CUDA OOM**: Reduce batch_size, max_seq_length, or lora_rank. Use 4-bit quantization.
145
+ - **Slow training**: Ensure GPU runtime is selected. Check with `colab_status detail=gpu`.
146
+ - **Package not found**: Unsloth is installed automatically by `colab_finetune`. For custom packages, use `colab_execute` with pip install.
147
+
148
+ ### Colab Enterprise
149
+ - Requires GCP project with Vertex AI API enabled
150
+ - Use `colab_connect mode=enterprise project_id="your-project"`
151
+ - GPU quotas apply — check GCP console for availability
152
+
153
+ See [references/troubleshooting.md](references/troubleshooting.md) for more solutions.
@@ -0,0 +1,68 @@
1
+ # Bridge Notebook Setup
2
+
3
+ ## How the Bridge Works
4
+
5
+ The bridge notebook creates a WebSocket tunnel from a Google Colab runtime to your local machine:
6
+
7
+ ```
8
+ synsc CLI ←→ WebSocket ←→ Cloudflare Tunnel ←→ Jupyter Server (on Colab GPU)
9
+ ```
10
+
11
+ 1. A Jupyter notebook server starts on the Colab VM (port 8888)
12
+ 2. `jupyter_http_over_ws` extension enables WebSocket connections
13
+ 3. `cloudflared` creates a public tunnel URL
14
+ 4. synsc connects via WebSocket and sends Jupyter kernel messages
15
+
16
+ ## Step-by-Step Setup
17
+
18
+ ### 1. Generate the Bridge Notebook
19
+ ```
20
+ Use colab_notebook tool with workflow="bridge"
21
+ ```
22
+ This creates `synsc-bridge.ipynb` in your project directory.
23
+
24
+ ### 2. Upload to Google Colab
25
+ - Go to [colab.research.google.com](https://colab.research.google.com)
26
+ - File → Upload notebook → select `synsc-bridge.ipynb`
27
+
28
+ ### 3. Select GPU Runtime
29
+ - Runtime → Change runtime type
30
+ - Hardware accelerator: GPU
31
+ - GPU type: T4 (free) or A100 (pro)
32
+
33
+ ### 4. Run All Cells
34
+ Run cells in order:
35
+ 1. **GPU check** — confirms GPU is available
36
+ 2. **Install dependencies** — installs `jupyter_http_over_ws`
37
+ 3. **Install cloudflared** — downloads the tunnel binary
38
+ 4. **Start bridge** — starts Jupyter + tunnel, prints WebSocket URL
39
+
40
+ ### 5. Copy the URL
41
+ The output will show:
42
+ ```
43
+ ============================================================
44
+ SYNSC BRIDGE READY
45
+ ============================================================
46
+
47
+ Paste this URL into synsc:
48
+
49
+ wss://random-name.trycloudflare.com/api/kernels/default/channels?token=...
50
+
51
+ ============================================================
52
+ ```
53
+
54
+ ### 6. Connect from synsc
55
+ ```
56
+ Use colab_connect tool with connection_url="wss://..."
57
+ ```
58
+
59
+ ### 7. Keep Alive
60
+ Run the keep-alive cell to prevent Colab from disconnecting due to inactivity.
61
+
62
+ ## Security
63
+
64
+ - The bridge uses a random token for authentication
65
+ - The Cloudflare tunnel URL is unique and expires when the notebook stops
66
+ - No Google account credentials are transmitted through the bridge
67
+ - The tunnel is read/write — anyone with the URL can execute code on the Colab VM
68
+ - **Do not share the WebSocket URL**
@@ -0,0 +1,54 @@
1
+ # GPU Tiers & VRAM Requirements
2
+
3
+ ## Colab GPU Options
4
+
5
+ ### Free Tier
6
+ - **T4**: 15 GB VRAM, Compute Capability 7.5
7
+ - **P100**: 16 GB VRAM, Compute Capability 6.0 (less common)
8
+ - Session limit: ~12 hours, may disconnect during idle
9
+
10
+ ### Pro Tier ($10/month)
11
+ - **T4**: 15 GB VRAM (priority access)
12
+ - **V100**: 16 GB VRAM, Compute Capability 7.0
13
+ - **L4**: 24 GB VRAM, Compute Capability 8.9
14
+ - **A100**: 40 GB VRAM, Compute Capability 8.0
15
+ - Session limit: ~24 hours, priority GPU allocation
16
+
17
+ ### Pro+ Tier ($50/month)
18
+ - **A100 (80GB)**: 80 GB VRAM
19
+ - Session limit: ~24 hours, guaranteed GPU access
20
+ - Background execution available
21
+
22
+ ### Enterprise (Vertex AI)
23
+ - Any GPU type configured in GCP
24
+ - No session limits
25
+ - Full API control
26
+ - Requires GCP project and billing
27
+
28
+ ## Model VRAM Requirements (4-bit QLoRA)
29
+
30
+ | Model Size | VRAM Required | Fits on |
31
+ |-----------|---------------|---------|
32
+ | 1B | ~3 GB | T4 (free) |
33
+ | 3B | ~5 GB | T4 (free) |
34
+ | 4B | ~6 GB | T4 (free) |
35
+ | 7-8B | ~9 GB | T4 (free) |
36
+ | 13-14B | ~15 GB | T4 (free, tight) |
37
+ | 32B | ~22 GB | L4, A100 (Pro) |
38
+ | 70-72B | ~44 GB | A100 80GB (Pro+) |
39
+
40
+ ## Recommended Configurations
41
+
42
+ ### Free Tier (T4, 15 GB)
43
+ - **SFT**: Qwen3-4B, Llama-3.2-3B, Gemma-3-4B — batch_size=2, max_seq=2048
44
+ - **GRPO**: Qwen3-4B — batch_size=1, max_seq=1024, num_generations=4
45
+ - **Vision**: Qwen3-VL-2B — batch_size=1
46
+
47
+ ### Pro Tier (A100, 40 GB)
48
+ - **SFT**: Qwen3-14B, Llama-3.1-8B — batch_size=4, max_seq=4096
49
+ - **GRPO**: 8B models — batch_size=2, num_generations=8
50
+ - **DPO**: 8B models — batch_size=4
51
+
52
+ ### Pro+ Tier (A100 80GB)
53
+ - **SFT**: Qwen3-32B, Llama-3.3-70B (tight) — batch_size=1-2
54
+ - **GRPO**: 32B models — batch_size=1
@@ -0,0 +1,79 @@
1
+ # Troubleshooting
2
+
3
+ ## Connection Problems
4
+
5
+ ### "WebSocket connection failed"
6
+ - **Cause**: Bridge notebook stopped or tunnel URL expired
7
+ - **Fix**: Re-run the bridge cells in Colab to get a fresh URL
8
+
9
+ ### "Connection timeout after 30 seconds"
10
+ - **Cause**: Firewall blocking WebSocket connections, or Colab VM is unresponsive
11
+ - **Fix**: Try a different network. Check if Colab runtime is still active.
12
+
13
+ ### "Connection closed during execution"
14
+ - **Cause**: Colab disconnected the session (idle timeout or resource limits)
15
+ - **Fix**: Run the keep-alive cell. Upgrade to Colab Pro for longer sessions.
16
+
17
+ ### Bridge notebook won't start
18
+ - **Cause**: Colab Free tier quota exhausted
19
+ - **Fix**: Wait for quota reset (resets daily) or upgrade to Pro.
20
+
21
+ ## Training Problems
22
+
23
+ ### CUDA Out of Memory (OOM)
24
+ - Reduce `batch_size` to 1
25
+ - Reduce `max_seq_length` (try 1024 or 512)
26
+ - Reduce `lora_rank` (try 8 instead of 16)
27
+ - Use `quantization="4bit"` (default)
28
+ - For GRPO: reduce `num_generations` to 2
29
+
30
+ ### Training is very slow
31
+ - Check GPU: `colab_status detail=gpu` — ensure a GPU is connected
32
+ - If using T4, training 8B+ models will be slower than A100
33
+ - Ensure `packing=True` for SFT (default in templates)
34
+ - Check for CPU bottleneck: `colab_status detail=full`
35
+
36
+ ### "ModuleNotFoundError: No module named 'unsloth'"
37
+ - The `colab_finetune` tool installs Unsloth automatically
38
+ - If running manual code, first run: `!pip install unsloth`
39
+
40
+ ### Dataset loading fails
41
+ - Verify the dataset ID exists on HuggingFace
42
+ - For gated datasets, authenticate: `!huggingface-cli login`
43
+ - For local datasets, upload to Colab first with `colab_execute`
44
+
45
+ ### Loss is NaN or not decreasing
46
+ - Reduce learning rate (try 1e-5 for GRPO, 5e-5 for SFT)
47
+ - Check dataset format matches the template expectations
48
+ - Ensure the dataset isn't empty: `colab_execute code="print(len(dataset))"`
49
+
50
+ ## Colab-Specific Issues
51
+
52
+ ### "You have exhausted your GPU quota"
53
+ - Colab Free limits GPU usage per day/week
54
+ - **Fix**: Wait for reset, or use Colab Pro
55
+
56
+ ### Runtime disconnects after ~30 minutes
57
+ - Run the keep-alive cell in the bridge notebook
58
+ - Interact with the Colab tab occasionally (Colab detects browser activity)
59
+
60
+ ### "Cannot connect to GPU backend"
61
+ - Colab may be overloaded — try again later
62
+ - Change runtime type and change back (Runtime → Change runtime type)
63
+
64
+ ### Files disappear after disconnect
65
+ - Colab VMs are ephemeral — files are deleted when the runtime stops
66
+ - Save important files to Google Drive or push to HuggingFace Hub
67
+ - Use `push_to_hub` parameter in `colab_finetune` to auto-upload
68
+
69
+ ## Enterprise (Vertex AI) Issues
70
+
71
+ ### "Not authenticated"
72
+ - Run `/connect google-colab` to authenticate with Google OAuth
73
+
74
+ ### "Insufficient quota"
75
+ - Check GCP quotas in the Cloud Console
76
+ - Request quota increase for the desired GPU type and region
77
+
78
+ ### "Vertex AI API not enabled"
79
+ - Enable the API: `gcloud services enable aiplatform.googleapis.com`
@@ -392,6 +392,43 @@ Orpheus supports emotional tags: `<laugh>`, `<sigh>`, `<cough>`, `<gasp>`, `<yaw
392
392
 
393
393
  ---
394
394
 
395
+ ## Workflow 5: Colab Fine-Tuning (Remote GPU)
396
+
397
+ Use this to run any Unsloth workflow on a Google Colab GPU directly from synsc — no local GPU required.
398
+
399
+ ### Setup
400
+ 1. Generate the bridge notebook: `colab_notebook workflow=bridge`
401
+ 2. Upload to Google Colab, select GPU runtime, run all cells
402
+ 3. Copy the WebSocket URL → `colab_connect connection_url="wss://..."`
403
+
404
+ ### Run Training Remotely
405
+ ```
406
+ colab_finetune workflow=sft model="unsloth/Qwen3-4B-unsloth-bnb-4bit" dataset="mlabonne/FineTome-100k"
407
+ ```
408
+
409
+ All SFT/GRPO/DPO/vision/TTS workflows work identically on Colab. The plugin handles:
410
+ - Unsloth installation on the Colab VM
411
+ - Model loading, LoRA setup, dataset preparation
412
+ - Training execution with streaming output
413
+ - Model saving and optional HuggingFace Hub push
414
+
415
+ ### GPU Recommendations
416
+
417
+ | Colab Tier | GPU | VRAM | Max Model (QLoRA) |
418
+ |-----------|-----|------|-------------------|
419
+ | Free | T4 | 15 GB | ~14B |
420
+ | Pro | A100 | 40 GB | ~32B |
421
+ | Pro+ | A100 80GB | 80 GB | ~72B |
422
+
423
+ ### Key Differences from Local Training
424
+ - Files are ephemeral — save to HuggingFace Hub with `push_to_hub` parameter
425
+ - Session may disconnect — use keep-alive cell in bridge notebook
426
+ - Package installation happens on each new session
427
+
428
+ See the **colab-finetuning** skill for detailed Colab-specific guidance.
429
+
430
+ ---
431
+
395
432
  ## Model Selection
396
433
 
397
434
  ### Instruct vs Base Model
package/bin/synsc CHANGED
Binary file
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@synsci/cli-darwin-x64",
3
- "version": "1.1.83",
3
+ "version": "1.1.85",
4
4
  "os": [
5
5
  "darwin"
6
6
  ],