labgate 0.5.27 → 0.5.28

package/README.md CHANGED
@@ -1,13 +1,23 @@
1
1
  # LabGate
2
2
 
3
- Policy-controlled sandboxes for AI coding agents. Built for HPC clusters.
3
+ Secure wrapper around LLM coding agents for HPC clusters.
4
+
5
+ LabGate lets institutions adopt AI coding tools without giving agents unrestricted host access. It is designed for shared research environments where HPC admins need policy and audit controls, while researchers need a practical day-to-day workflow for coding, data analysis, and SLURM jobs.
6
+
7
+ ## Product Goal
8
+
9
+ - Give HPC admins a deployable control layer for agent sessions.
10
+ - Make Claude-assisted work practical for researchers on real cluster infrastructure.
11
+ - Keep the default path simple and reliable: `labgate claude` + Apptainer + SLURM.
4
12
 
5
13
  ## Current Product Focus
6
14
 
7
15
  - Primary workflow: Claude (`labgate claude`)
8
16
  - Primary runtime: Apptainer on HPC
9
- - macOS runtime: Podman (best-effort fallback path)
10
- - Secondary targets (best-effort): other agents
17
+ - SLURM integration: enabled by default (`slurm.enabled = true`)
18
+ - Secondary targets: other agents/runtimes are best-effort only
19
+
20
+ LabGate still contains Podman runtime code for local/non-HPC scenarios, but that is not the primary supported path.
11
21
 
12
22
  ## Install
13
23
 
@@ -15,46 +25,107 @@ Policy-controlled sandboxes for AI coding agents. Built for HPC clusters.
15
25
  npm i -g labgate
16
26
  ```
17
27
 
18
- Note: LabGate uses `node-pty` only for the optional sticky footer. On minimal Linux installs, that dependency may fail to build without a compiler toolchain. If it fails, the install still works and LabGate falls back to non-sticky output.
28
+ Note: LabGate uses `node-pty` only for the optional sticky footer. On minimal Linux installs, that dependency may fail to build without a compiler toolchain. If it fails, install still succeeds and LabGate falls back to non-sticky output.
29
+
30
+ ## Quick Start (Researcher)
31
+
32
+ ```bash
33
+ labgate init
34
+ labgate claude
35
+ ```
36
+
37
+ Typical HPC flow:
38
+
39
+ 1. Login node: run `labgate ui`
40
+ 2. Compute allocation: `srun --pty bash`, then `labgate claude` in your project directory
41
+
42
+ Useful follow-ups for data-heavy work:
43
+
44
+ ```bash
45
+ labgate dataset list
46
+ labgate slurm status
47
+ labgate logs --follow
48
+ ```
49
+
50
+ Example (life science / data analysis workflow):
51
+
52
+ ```bash
53
+ # 1) register datasets in ~/.labgate/config.json (via UI or config edit)
54
+ # 2) initialize dataset stats for discoverability
55
+ labgate dataset init rnaseq-cohort
56
+
57
+ # 3) start agent and run analysis in project directory
58
+ labgate claude
19
59
 
20
- LabGate prefers Apptainer for sandbox runtime and supports Podman as a fallback (especially on macOS).
60
+ # 4) submit SLURM job with host-valid output paths (relative preferred)
61
+ sbatch --output slurm-%j.out --error slurm-%j.err run_qc.sh
62
+
63
+ # 5) inspect tracked jobs and output
64
+ labgate slurm status
65
+ labgate slurm output <job-id> --tail 100
66
+ ```
21
67
 
22
- ## Quick start
68
+ ## Quick Start (HPC Admin)
23
69
 
24
70
  ```bash
25
- labgate init # create ~/.labgate/config.json
26
- labgate claude # launch Claude Code in current dir
27
- labgate codex /projects/my-analysis # launch Codex in a specific dir
71
+ # Install license (enterprise mode)
72
+ labgate license install <key-or-file> --system
73
+
74
+ # Create baseline policy
75
+ labgate policy init --path /etc/labgate/policy.json --admin <hpc-admin-username>
76
+ labgate policy validate
77
+
78
+ # Validate default runtime behavior
79
+ labgate config get runtime
28
80
  ```
29
81
 
30
- ## What it does
82
+ Admin controls can force/lock runtime, network mode, audit settings, and SLURM behavior through policy.
83
+
84
+ ## Why HPC Admins Deploy It
85
+
86
+ - Scoped filesystem mounts instead of full host exposure
87
+ - Default blocking of common credential and key material paths (`.ssh`, `.aws`, `.env`, `.gnupg`, key files)
88
+ - Network policy modes (`host`, `filtered`, `none`)
89
+ - Command blacklist inside sandbox (`ssh`, `curl`, `wget`, etc.)
90
+ - Session/audit logging for operational traceability
91
+ - Enterprise policy and lock semantics for institution-level governance
92
+ - SLURM-aware behavior designed for shared cluster operations
31
93
 
32
- LabGate runs your AI coding agent inside a sandboxed container with:
94
+ ## Why Researchers Keep Using It (Life Science / Data Analysis)
33
95
 
34
- - **Scoped filesystem** only your working directory and configured paths are visible
35
- - **Credential blocking** `.ssh`, `.aws`, `.env`, `.gnupg`, and other sensitive paths are hidden by default
36
- - **Network policy** configurable network modes (`host`, `filtered`, `none`)
37
- - **Command blocking** `ssh`, `curl`, `wget`, and other commands are blocked by default
38
- - **Audit logging** session start/stop and mount configuration logged to `~/.labgate/logs/`
39
- - **Dashboard instructions editor** — view and update per-session `AGENTS.md` / `CLAUDE.md` from the UI
40
- - **Session context injection** — LabGate prepends a temporary sandbox-mapping instruction block during active sessions
41
- - **HPC ready** — first-class Apptainer support for shared clusters
96
+ - Works in existing project folders and scheduler workflows
97
+ - Named dataset mounts under `/datasets/<name>` reduce path confusion in collaborative analysis
98
+ - Auto-injected session context gives the agent correct path + cluster constraints
99
+ - SLURM tracking + MCP tools help inspect jobs and output without leaving the coding workflow
100
+ - Results registry MCP lets teams record findings, artifacts, and summaries across sessions
101
+
102
+ ## What LabGate Enforces
103
+
104
+ LabGate runs AI coding agents in a sandboxed container with:
105
+
106
+ - **Scoped filesystem**: only workdir + configured mounts are visible
107
+ - **Credential blocking**: sensitive paths hidden by default
108
+ - **Network policy**: configurable network mode
109
+ - **Command blocking**: risky commands blocked by default
110
+ - **Audit logging**: session lifecycle + key security events in `~/.labgate/logs/`
111
+ - **Instruction management**: temporary LabGate context blocks in `CLAUDE.md` / `AGENTS.md`
112
+ - **HPC integration**: Apptainer-first runtime behavior and SLURM support
42
113
 
43
114
  ## Configuration
44
115
 
45
- Edit `~/.labgate/config.json` to customize:
116
+ Edit config:
46
117
 
47
118
  ```bash
48
119
  $EDITOR ~/.labgate/config.json
49
120
  ```
50
121
 
51
- Or start fresh:
122
+ Reset full config:
52
123
 
53
124
  ```bash
54
125
  labgate init --force
55
126
  ```
56
127
 
57
- Or reset a single setting back to defaults:
128
+ Reset a single setting to defaults:
58
129
 
59
130
  ```bash
60
131
  labgate config reset image
@@ -64,73 +135,101 @@ labgate config reset image
64
135
 
65
136
  | Setting | Default | What it does |
66
137
  |---------|---------|-------------|
67
- | `runtime` | `auto` | `auto`, `apptainer`, or `podman` |
138
+ | `runtime` | `auto` | Runtime preference (`auto`, `apptainer`, `podman`) |
68
139
  | `image` | `docker.io/library/node:20-bookworm` | Container image |
69
140
  | `session_timeout_hours` | `8` | Max session length |
70
141
  | `filesystem.blocked_patterns` | `.ssh, .aws, .env, ...` | Hidden from sandbox |
71
142
  | `filesystem.extra_paths` | `[]` | Additional mounts |
143
+ | `datasets` | `[]` | Named dataset mounts under `/datasets/*` |
72
144
  | `network.mode` | `host` | `none`, `filtered`, or `host` |
73
145
  | `commands.blacklist` | `ssh, curl, wget, ...` | Blocked commands |
74
- | `slurm.enabled` | `true` | Enable SLURM CLI passthrough (`sbatch`, `squeue`, etc.) and job tracking |
146
+ | `slurm.enabled` | `true` | Enable SLURM tracking + passthrough |
147
+ | `slurm.mcp_server` | `true` | Enable SLURM MCP server integration |
148
+ | `audit.enabled` | `true` | Enable audit logging |
75
149
 
76
150
  ## Commands
77
151
 
78
152
  ```bash
79
- labgate claude [workdir] # launch Claude Code
80
- labgate codex [workdir] # launch Codex
81
- labgate feedback # submit feedback (interactive or piped)
82
- labgate status # list running sessions
83
- labgate stop <id> # stop a session
84
- labgate ui # start dashboard server on localhost:7700 (auth token required)
85
- labgate register <activation-key> [--server <url>] # activate + install enterprise license
86
- labgate license # show enterprise license status
87
- labgate license install <key-or-file> [--system|--user|--path] # install enterprise license key
88
- labgate policy init [--institution ... --admin ...] # create policy template
89
- labgate policy validate [file] # validate policy JSON
90
- labgate logs [-n 20] # view recent audit events
91
- labgate logs --follow # stream new audit events
92
- labgate init [--force] # create/reset config
153
+ # Agent sessions
154
+ labgate claude [workdir]
155
+ labgate codex [workdir] # secondary/best-effort path
156
+
157
+ # Session lifecycle
158
+ labgate status
159
+ labgate stop <id>
160
+ labgate restart <id>
161
+ labgate continue [web-terminal-id] [--latest]
162
+
163
+ # UI + logs
164
+ labgate ui
165
+ labgate logs [-n 20]
166
+ labgate logs --follow
167
+ labgate doctor
168
+
169
+ # Config + setup
170
+ labgate init [--force]
171
+ labgate config get <key>
172
+ labgate config set <key> <value>
173
+ labgate config reset <key>
174
+
175
+ # Dataset workflow
176
+ labgate dataset list
177
+ labgate dataset init <name>
178
+
179
+ # SLURM workflow
180
+ labgate slurm status
181
+ labgate slurm job <id>
182
+ labgate slurm output <id> [--stderr] [--tail <lines>]
183
+ labgate slurm cancel <id>
184
+ labgate slurm mcp
185
+
186
+ # Enterprise
187
+ labgate license
188
+ labgate license install <key-or-file> [--system|--user|--path]
189
+ labgate register <activation-key> [--server <url>]
190
+ labgate policy init [--institution ... --admin ...]
191
+ labgate policy validate [file]
93
192
  ```
94
193
 
95
- ### Options
194
+ ### Common options
96
195
 
97
196
  ```bash
98
- labgate claude --dry-run # print the sandbox command without running
99
- labgate claude --image my-image:tag # use a different container image
100
- labgate claude --no-footer # disable the status footer line
101
- labgate ui # localhost UI on 7700, logs full token URL + short /s/<code> quick link
102
- labgate ui --socket ~/.labgate/ui.sock # custom Unix socket path
103
- labgate logs --lines 50 --follow # tail last 50 lines and keep following
197
+ labgate claude --dry-run
198
+ labgate claude --image my-image:tag
199
+ labgate claude --no-footer
200
+ labgate claude --api-key "$ANTHROPIC_API_KEY"
201
+ labgate ui --socket ~/.labgate/ui.sock
202
+ labgate logs --lines 50 --follow
104
203
  ```
105
204
 
106
205
  `labgate claude` auto-starts `labgate ui` when missing in local (non-SSH/non-SLURM) shells.
107
206
 
108
207
  ### SLURM inside sandboxes (`sbatch` / `squeue`)
109
208
 
110
- For Apptainer sessions, LabGate now attempts SLURM CLI passthrough automatically.
111
- If host `sbatch`/`squeue` are available, they are staged into the sandbox, so
112
- `labgate claude` should work without extra config in the common HPC path.
209
+ For Apptainer sessions, LabGate attempts SLURM CLI passthrough automatically.
210
+ If host `sbatch`/`squeue` are available, they are staged into the sandbox so
211
+ `labgate claude` works in common HPC setups without extra config.
113
212
 
114
- SLURM tracking and MCP tools are enabled by default (`slurm.enabled=true`).
213
+ SLURM tracking and MCP tools are enabled by default (`slurm.enabled = true`).
115
214
  If native SQLite (`better-sqlite3`) is unavailable on a host, LabGate falls back
116
215
  to a JSON tracking store automatically.
117
216
 
118
217
  Requirements for automatic `sbatch` in sandbox:
119
218
 
120
219
  1. Runtime is Apptainer
121
- 2. The host can resolve SLURM CLI tools when launching LabGate
220
+ 2. Host shell can resolve SLURM CLI tools before launching LabGate
122
221
 
123
- If `sbatch` is missing inside the sandbox, run:
222
+ If `sbatch` is missing inside the sandbox:
124
223
 
125
224
  ```bash
126
- which sbatch # on host, before launching labgate
225
+ which sbatch
127
226
  labgate claude
128
227
  ```
129
228
 
130
- If your cluster uses environment modules, load SLURM first (host shell), then launch LabGate:
229
+ If your cluster uses environment modules, load SLURM first:
131
230
 
132
231
  ```bash
133
- module load slurm # or your site-specific module name
232
+ module load slurm
134
233
  labgate claude
135
234
  ```
136
235
 
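The README hunk above says automatic `sbatch` passthrough needs the host shell to resolve the SLURM CLI tools before LabGate starts. Below is a minimal Node.js sketch of a `which`-style lookup that mirrors the manual `which sbatch` step. `findOnPath` and `checkSlurmTools` are hypothetical helpers for illustration, not LabGate functions, and how LabGate itself stages the binaries is not shown in this diff.

```javascript
'use strict';
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of LabGate): return the first executable
// file named `tool` found in the given PATH string, or null if none exists.
function findOnPath(tool, pathVar = process.env.PATH || '') {
  for (const dir of pathVar.split(path.delimiter)) {
    if (!dir) continue;
    const candidate = path.join(dir, tool);
    try {
      // accessSync throws if the file is missing or not executable.
      fs.accessSync(candidate, fs.constants.X_OK);
      if (fs.statSync(candidate).isFile()) return candidate;
    } catch {
      // Not usable in this directory; keep scanning.
    }
  }
  return null;
}

// Report which SLURM tools the current shell environment can resolve.
function checkSlurmTools(tools = ['sbatch', 'squeue'], pathVar = process.env.PATH || '') {
  const found = {};
  for (const tool of tools) found[tool] = findOnPath(tool, pathVar);
  return found;
}

console.log(checkSlurmTools());
```

A `null` entry for `sbatch` would suggest loading the site's SLURM module (or fixing `PATH`) in the host shell before running `labgate claude`.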
@@ -244,20 +343,20 @@ Coverage:
244
343
  3. Verifies host browser-open hook is triggered
245
344
  4. Optional override: `LABGATE_REAL_E2E_IMAGE`
246
345
 
247
- ## How it works
346
+ ## How It Works
248
347
 
249
348
  LabGate builds a sandboxed container from your config:
250
349
 
251
- 1. Detects Apptainer first, then Podman (or uses explicit runtime)
350
+ 1. Detects Apptainer first (primary HPC path), with secondary fallback runtimes when configured
252
351
  2. Mounts your working directory at `/work`
253
352
  3. Mounts persistent sandbox HOME at `/home/sandbox` (for npm cache, agent config)
254
353
  4. Overlays blocked paths (`.ssh`, `.aws`, etc.) with empty mounts
255
- 5. Applies network isolation and capability restrictions
354
+ 5. Applies network isolation and command controls
256
355
  6. Installs the agent (if not cached) and runs it interactively
257
356
 
258
- On macOS, LabGate syncs your Claude credentials from the system keychain so the agent can authenticate automatically.
357
+ On macOS, LabGate can sync Claude credentials from the system keychain so the agent can authenticate automatically.
259
358
 
260
- ## Audit logs
359
+ ## Audit Logs
261
360
 
262
361
  Session events are logged to `~/.labgate/logs/YYYY-MM-DD.jsonl`:
263
362
 
@@ -265,14 +364,6 @@ Session events are logged to `~/.labgate/logs/YYYY-MM-DD.jsonl`:
265
364
  cat ~/.labgate/logs/2025-02-05.jsonl | jq .
266
365
  ```
267
366
 
268
- ## Roadmap
269
-
270
- - **M0** CLI + sandbox engine + config + audit (this release)
271
- - **M1** Mount allowlists, network filtering, project-level config
272
- - **M2** SLURM proxy (submit/status/cancel from inside sandbox)
273
- - **M3** Web UI for config + audit viewer
274
- - **M4** Institutional mode (/etc/labgate/ policies, admin locks)
275
-
276
367
  ## License
277
368
 
278
369
  MIT
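The README's Audit Logs section states that session events are written to `~/.labgate/logs/YYYY-MM-DD.jsonl`. Below is a minimal sketch of locating and parsing such a file from Node.js instead of `jq`. The event fields are not documented in this diff, so the code makes no assumption about them. Using the UTC date for the file name is an assumption, and `auditLogPath` and `parseJsonl` are hypothetical helper names.

```javascript
'use strict';
const os = require('os');
const path = require('path');

// Build the log path for a date, following the documented
// ~/.labgate/logs/YYYY-MM-DD.jsonl layout.
// Assumption: the day is taken in UTC; LabGate may use local time instead.
function auditLogPath(date = new Date(), home = os.homedir()) {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return path.join(home, '.labgate', 'logs', `${day}.jsonl`);
}

// Parse JSONL text: one JSON value per non-empty line.
// Malformed lines are skipped so one bad record does not abort the read.
function parseJsonl(text) {
  const events = [];
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      events.push(JSON.parse(trimmed));
    } catch {
      // Ignore lines that are not valid JSON.
    }
  }
  return events;
}
```

To read a log, pass the file contents to the parser, for example `parseJsonl(require('fs').readFileSync(auditLogPath(), 'utf8'))`.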
package/dist/cli.js CHANGED
@@ -39,6 +39,7 @@ const fs_1 = require("fs");
39
39
  const os_1 = require("os");
40
40
  const net_1 = require("net");
41
41
  const readline_1 = require("readline");
42
+ const child_process_1 = require("child_process");
42
43
  const config_js_1 = require("./lib/config.js");
43
44
  const init_js_1 = require("./lib/init.js");
44
45
  const container_js_1 = require("./lib/container.js");
@@ -339,6 +340,24 @@ program
339
340
  });
340
341
  }
341
342
  });
343
+ // ── labgate doctor ───────────────────────────────────────
344
+ program
345
+ .command('doctor')
346
+ .description('Run preflight checks for LabGate HPC usage')
347
+ .option('--json', 'Print full report as JSON')
348
+ .action(async (opts) => {
349
+ const { runDoctor, renderDoctorReport } = await import('./lib/doctor.js');
350
+ const report = runDoctor();
351
+ if (opts.json) {
352
+ console.log(JSON.stringify(report, null, 2));
353
+ }
354
+ else {
355
+ console.log(renderDoctorReport(report));
356
+ }
357
+ if (!report.success) {
358
+ process.exit(1);
359
+ }
360
+ });
342
361
  // ── labgate ui ───────────────────────────────────────────
343
362
  program
344
363
  .command('ui')
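The `doctor` hunk above shows only the CLI wiring: `runDoctor()` returns a report with a `success` flag, which is printed as JSON or passed to `renderDoctorReport`. The contents of `./lib/doctor.js` are not part of this diff. The sketch below shows one way a report of that shape could be produced and rendered. `runChecks`, `renderReport`, and the `results` array are assumptions for illustration only.

```javascript
'use strict';

// Hypothetical: run a list of named checks and collect pass/fail results.
// A check passes if its `run` function returns without throwing.
function runChecks(checks) {
  const results = checks.map(({ name, run }) => {
    try {
      const detail = run();
      return { name, ok: true, detail: String(detail ?? '') };
    } catch (err) {
      return { name, ok: false, detail: err?.message ?? String(err) };
    }
  });
  // `success` mirrors the flag the CLI uses to decide the exit code.
  return { success: results.every((r) => r.ok), results };
}

// Hypothetical: render the report as plain text, one line per check.
function renderReport(report) {
  const lines = report.results.map(
    (r) => `${r.ok ? 'PASS' : 'FAIL'}  ${r.name}${r.detail ? `: ${r.detail}` : ''}`
  );
  lines.push(report.success ? 'All checks passed.' : 'Some checks failed.');
  return lines.join('\n');
}
```

A plain-data report lets the same object feed both `--json` output and the human-readable rendering, which matches the branching in the action handler above.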
@@ -396,6 +415,129 @@ program
396
415
  const { restartSession } = await import('./lib/container.js');
397
416
  await restartSession(id, { dryRun: opts.dryRun ?? false });
398
417
  });
418
+ // ── labgate continue <id> ─────────────────────────────────
419
+ program
420
+ .command('continue')
421
+ .description('Attach to a tmux-backed web terminal session')
422
+ .argument('[id]', 'Web terminal session ID/prefix (e.g. wt-abc123...)')
423
+ .option('--latest', 'Attach to the newest runnable local web-terminal session')
424
+ .action(async (id, opts) => {
425
+ const web = await import('./lib/web-terminal.js');
426
+ if (opts.latest && id && id.trim()) {
427
+ console.error('Use either an ID/prefix or --latest, not both.');
428
+ process.exit(1);
429
+ }
430
+ const localHost = (0, os_1.hostname)();
431
+ const all = web.listWebTerminalRecords();
432
+ const ensureTmux = async () => {
433
+ const tmux = await web.ensureTmuxAvailable();
434
+ if (!tmux.ok) {
435
+ console.error(`Error: ${tmux.error}`);
436
+ process.exit(1);
437
+ }
438
+ };
439
+ const pickLatestRunnableLocal = async () => {
440
+ await ensureTmux();
441
+ for (const item of all) {
442
+ if (item.node !== localHost)
443
+ continue;
444
+ if (await web.hasTmuxSession(item.tmuxSession))
445
+ return item;
446
+ }
447
+ return null;
448
+ };
449
+ const pickInteractive = async () => {
450
+ const candidates = all.slice(0, 20);
451
+ if (candidates.length === 0)
452
+ return null;
453
+ if (!process.stdin.isTTY || !process.stdout.isTTY) {
454
+ console.error('No session id provided in non-interactive mode. Use `labgate continue <id>` or `--latest`.');
455
+ process.exit(1);
456
+ }
457
+ await ensureTmux();
458
+ console.error('Select a web terminal session to continue:');
459
+ for (let i = 0; i < candidates.length; i++) {
460
+ const item = candidates[i];
461
+ const alive = item.node === localHost ? await web.hasTmuxSession(item.tmuxSession) : false;
462
+ const availability = item.node === localHost ? (alive ? 'attachable' : 'not running') : `remote:${item.node}`;
463
+ console.error(` ${i + 1}. ${item.id} ${item.agent} ${item.status} ${availability} ${item.workdir}`);
464
+ }
465
+ const rl = (0, readline_1.createInterface)({ input: process.stdin, output: process.stderr });
466
+ const answer = await new Promise((resolve) => {
467
+ rl.question('Enter number (or q to cancel): ', (value) => {
468
+ rl.close();
469
+ resolve((value || '').trim());
470
+ });
471
+ });
472
+ if (!answer || answer.toLowerCase() === 'q') {
473
+ console.error('Cancelled.');
474
+ process.exit(1);
475
+ }
476
+ const idx = parseInt(answer, 10);
477
+ if (!Number.isFinite(idx) || idx < 1 || idx > candidates.length) {
478
+ console.error(`Invalid selection: ${answer}`);
479
+ process.exit(1);
480
+ }
481
+ return candidates[idx - 1];
482
+ };
483
+ let record = null;
484
+ if (opts.latest) {
485
+ record = await pickLatestRunnableLocal();
486
+ if (!record) {
487
+ console.error('No runnable local web terminal session found.');
488
+ process.exit(1);
489
+ }
490
+ }
491
+ else if (id && id.trim()) {
492
+ const resolved = web.resolveWebTerminalRecord(id);
493
+ if (!resolved.record) {
494
+ if (resolved.matches.length > 1) {
495
+ console.error(`Ambiguous session prefix "${id}". Matches:`);
496
+ for (const item of resolved.matches.slice(0, 20)) {
497
+ console.error(` - ${item.id} (${item.agent}, ${item.workdir})`);
498
+ }
499
+ process.exit(1);
500
+ }
501
+ console.error(`Session not found: ${id}`);
502
+ process.exit(1);
503
+ }
504
+ record = resolved.record;
505
+ }
506
+ else {
507
+ if (!process.stdin.isTTY || !process.stdout.isTTY) {
508
+ console.error('No session id provided in non-interactive mode. Use `labgate continue <id>` or `--latest`.');
509
+ process.exit(1);
510
+ }
511
+ record = await pickInteractive();
512
+ if (!record) {
513
+ console.error('No web terminal sessions found.');
514
+ process.exit(1);
515
+ }
516
+ }
517
+ if (record.node !== localHost) {
518
+ console.error(`Session "${record.id}" is running on node "${record.node}", not "${localHost}".`);
519
+ console.error(`Attach there: ssh ${record.node} "labgate continue ${record.id}"`);
520
+ process.exit(1);
521
+ }
522
+ await ensureTmux();
523
+ const alive = await web.hasTmuxSession(record.tmuxSession);
524
+ if (!alive) {
525
+ console.error(`Session "${record.id}" is not running anymore (tmux session missing).`);
526
+ process.exit(1);
527
+ }
528
+ let tmuxBin = 'tmux';
529
+ try {
530
+ tmuxBin = await web.getTmuxBinary();
531
+ }
532
+ catch (err) {
533
+ console.error(`Error resolving tmux binary: ${err?.message ?? String(err)}`);
534
+ process.exit(1);
535
+ }
536
+ const child = (0, child_process_1.spawn)(tmuxBin, ['attach-session', '-t', record.tmuxSession], { stdio: 'inherit' });
537
+ child.on('exit', (code) => {
538
+ process.exit(code ?? 0);
539
+ });
540
+ });
399
541
  // ── labgate slurm ────────────────────────────────────────
400
542
  const slurmCmd = program
401
543
  .command('slurm')
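In the `continue` hunk above, `web.resolveWebTerminalRecord(id)` returns an object with `record` and `matches`. The caller reports an ambiguous prefix when `record` is empty and more than one match exists, and "not found" otherwise. The implementation in `./lib/web-terminal.js` is not in this diff. The sketch below shows prefix resolution that would satisfy that contract. `resolveByPrefix` is a hypothetical stand-in, and preferring an exact ID over prefix matches is an assumption.

```javascript
'use strict';

// Hypothetical stand-in for resolveWebTerminalRecord.
// `records` is an array of objects with a string `id` (e.g. "wt-abc123").
function resolveByPrefix(records, query) {
  const q = String(query || '').trim();
  if (!q) return { record: null, matches: [] };

  // Assumption: an exact ID match wins even if it is also a prefix of others.
  const exact = records.find((r) => r.id === q);
  if (exact) return { record: exact, matches: [exact] };

  // Otherwise resolve only when the prefix identifies exactly one record.
  const matches = records.filter((r) => r.id.startsWith(q));
  return { record: matches.length === 1 ? matches[0] : null, matches };
}
```

With records `wt-abc123` and `wt-abd456`, the prefix `wt-ab` yields no `record` and two `matches`, which corresponds to the "Ambiguous session prefix" branch in the handler.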