@tekyzinc/gsd-t 3.18.17 → 3.19.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +69 -0
- package/bin/gsd-t-parallel.cjs +184 -4
- package/bin/gsd-t-unattended.cjs +501 -297
- package/bin/gsd-t-worker-dispatch.cjs +211 -0
- package/bin/headless-auto-spawn.cjs +28 -1
- package/bin/m46-iter-proof.cjs +149 -0
- package/bin/m46-worker-proof.cjs +201 -0
- package/bin/spawn-plan-writer.cjs +1 -1
- package/commands/gsd-t-resume.md +32 -0
- package/commands/gsd-t-status.md +10 -0
- package/docs/architecture.md +16 -0
- package/docs/requirements.md +20 -0
- package/package.json +1 -1
- package/scripts/gsd-t-update-check.js +13 -4
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,75 @@
 
 All notable changes to GSD-T are documented here. Updated with each release.
 
+## [3.19.0] - 2026-04-23
+
+### Added — M46 Milestone: Unattended Iter-Parallel + Worker Fan-Out Completion
+
+Closes the two biggest gaps from the 2026-04-23 five-surface parallelism audit: (2A) unattended multi-iteration parallelism and (2B) worker-side sub-fan-out.
+
+**D1 — Iteration-parallel supervisor scaffold (`bin/gsd-t-unattended.cjs`):**
+- `_runOneIter(state, opts) → IterResult` — extracted from the while-loop body at line 969 (68-line delta, zero behavior change when called sequentially).
+- `_computeIterBatchSize(state, opts) → number` — mode-safety gates: `verify-needed → 1`, `milestone-boundary → 1`, `complete-milestone → 1`. The production default returns 1 (serial) unless the caller passes `opts.maxIterParallel` — iter-parallel execution stays opt-in pending the full concurrent-state-safety rewrite (backlog #24).
+- `_runIterParallel(state, opts, iterFn, batchSize) → Promise<IterResult[]>` — uses `Promise.allSettled` for per-iter error isolation; one rejection does not cancel siblings.
+- `_reconcile(state, results)` — deduped union on `completedTasks`, OR on `verifyNeeded`, append-only `artifacts`, last-writer-wins `status`, writes `lastBatch` metadata. **Does NOT advance `state.iter`** — that invariant is owned by the main while loop via `_runOneIter`.
+- `module.exports.__test__` exposes all four helpers to unit tests.
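The batch-and-reconcile semantics above can be sketched as follows. This is a simplified illustration with hypothetical helper names (`runIterBatch`, `reconcile`), not the package's private implementations:

```javascript
// Sketch of the D1 batch semantics: Promise.allSettled isolates per-iteration
// failures, so one rejected iteration does not cancel its siblings; only
// fulfilled results reach reconciliation.
async function runIterBatch(state, iterFn, batchSize) {
  const runs = Array.from({ length: batchSize }, () => iterFn(state));
  const settled = await Promise.allSettled(runs);
  return settled
    .filter((r) => r.status === "fulfilled")
    .map((r) => r.value);
}

// Merge batch results into shared state. Note: state.iter is NOT advanced
// here; per the contract above, the main loop owns that invariant.
function reconcile(state, results) {
  for (const r of results) {
    for (const t of r.completedTasks || []) {
      if (!state.completedTasks.includes(t)) state.completedTasks.push(t); // deduped union
    }
    state.verifyNeeded = state.verifyNeeded || !!r.verifyNeeded; // OR-merge
    state.artifacts.push(...(r.artifacts || []));                // append-only
    if (r.status) state.status = r.status;                       // last writer wins
  }
  state.lastBatch = { size: results.length, ts: Date.now() };
  return state;
}
```

A rejected iteration simply drops out of the `reconcile` input, which is the error-isolation property the unit tests below exercise.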
+
+**D2 — Worker sub-dispatch production path (`bin/gsd-t-worker-dispatch.cjs`):**
+- `dispatchWorkerTasks({projectDir, parentSessionId, tasks, maxParallel=4}) → Promise<{parallel, taskResults, wallClockMs, reason}>` — deterministic probe + delegation to `bin/gsd-t-parallel.cjs::runDispatch` when the task graph is file-disjoint and `tasks.length > 1`.
+- Short-circuits with reason strings: `no-tasks`, `single-task`, `file-overlap`, `dispatch-error:*`, `dispatched`.
+- CLI entry: `node bin/gsd-t-worker-dispatch.cjs --parent-session <id> --tasks <path> --max-parallel <n>` — emits JSON result to stdout.
+- `commands/gsd-t-resume.md` Step 0 `GSD_T_UNATTENDED_WORKER=1` branch gains an additive sub-dispatch block (no deletion from existing prose).
+- `bin/spawn-plan-writer.cjs` kind enum extended with `unattended-worker-sub`.
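The short-circuit ladder can be sketched in isolation. Illustrative only: the task shape with a `files` array is an assumption, and the sketch omits the actual delegation to `runDispatch` and the `dispatch-error:*` path:

```javascript
// Sketch of the probe-then-delegate decision described above. Any path
// shared between two tasks forces the serial fallback.
function planDispatch(tasks) {
  if (!tasks || tasks.length === 0) return { parallel: false, reason: "no-tasks" };
  if (tasks.length === 1) return { parallel: false, reason: "single-task" };
  const seen = new Set();
  for (const task of tasks) {
    for (const f of task.files || []) {
      if (seen.has(f)) return { parallel: false, reason: "file-overlap" };
      seen.add(f);
    }
  }
  // File-disjoint and >1 task: the real module would now hand the task
  // subsets to runDispatch for fan-out.
  return { parallel: true, reason: "dispatched" };
}
```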
+
+### Contracts
+
+- **`.gsd-t/contracts/iter-parallel-contract.md`** — NEW v1.0.0. Batch semantics, `IterResult` shape, reconciliation rule, stop-check batch-boundary invariant (stop-check is never polled mid-batch).
+- **`.gsd-t/contracts/headless-default-contract.md`** — v2.0.0 → v2.1.0 (additive §Worker Sub-Dispatch documenting the `unattended-worker-sub` kind and the resume-path hand-off).
+
+### Measurements — both proofs passed thresholds
+
+- **D1 iter-proof (`bin/m46-iter-proof.cjs`)**: 10-iter synthetic workload, 200ms work per iter, batch=4 vs serial. Result: `T_serial=2022.6ms`, `T_par=602.9ms`, **speedup=3.35×**, `T_par/T_serial=0.298` — passes thresholds `speedup ≥ 3.0` and `T_par/T_serial ≤ 0.35`.
+- **D2 worker-proof (`bin/m46-worker-proof.cjs`)**: 6-task file-disjoint synthetic workload, serial vs fan-out via `runDispatch`. Result: `T_serial=12134ms`, `T_par=2034ms`, **speedup=5.96×** — passes threshold `speedup ≥ 2.5`.
+- Reports: `.gsd-t/metrics/m46-iter-proof.json` + `.gsd-t/metrics/m46-worker-proof.json`.
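As a sanity check, the reported figures recompute cleanly against the stated thresholds:

```javascript
// Recomputing the proof thresholds from the numbers reported above.
const d1 = { serialMs: 2022.6, parMs: 602.9 }; // D1 iter-proof
const d2 = { serialMs: 12134, parMs: 2034 };   // D2 worker-proof

const d1Speedup = d1.serialMs / d1.parMs; // ≈ 3.35, threshold ≥ 3.0
const d1Ratio = d1.parMs / d1.serialMs;   // ≈ 0.298, threshold ≤ 0.35
const d2Speedup = d2.serialMs / d2.parMs; // ≈ 5.97, threshold ≥ 2.5
```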
+
+### Tests
+
+- **`test/m46-d1-iter-parallel.test.js`** — 12 unit tests covering serial fallback, 3-way parallel batch concurrency (<200ms for three 100ms iters), mode-safety gates (verify-needed / complete-milestone / milestoneBoundary), error isolation (one rejection, two siblings succeed), stop-check batch-boundary invariant, and `_reconcile` semantics.
+- **`test/m46-d2-worker-subdispatch.test.js`** — 6 unit tests covering disjoint fan-out, single-task short-circuit, file-overlap detection, dispatcher error surfacing, and CLI JSON-stdout contract.
+- Full suite: **1946/1946 pass**, zero regressions (M43 heartbeat-watchdog + M44 planner-wire + M45 suites all green).
+
+### Regression caught and fixed mid-milestone
+
+- **Double-increment of `state.iter`** between `_reconcile` and `_runOneIter` tripped 4 tests (m43-heartbeat-watchdog `staleHeartbeat res → exitCode 125`, m44-wire-unattended-to-planner iter-count / fallback / sequential). Root cause: `_reconcile` was advancing `state.iter` by `results.length` while `_runOneIter` already advanced it by 1. Fix: `_reconcile` leaves `state.iter` untouched (the main loop owns the invariant); two m46-d1 tests updated to match the new contract.
+
+### Follow-up backlog
+
+- **#24 — Dynamic work-stealing rewrite** (full concurrent-safe state isolation for `_runIterParallel` >1 batch). Covers the `state.iter` / heartbeat / writeState shared-mutation issue that keeps iter-parallel execution opt-in rather than default-on.
+- **D2-T11 integration smoke** deferred — unit tests + proof harness cover the surface.
+
+## [3.18.18] - 2026-04-23
+
+### Added — Model-aware worker spawn in `runDispatch`
+
+- **`bin/gsd-t-parallel.cjs::runDispatch`**: fan-out workers now default to `claude-sonnet-4-6` via the new constant `DEFAULT_WORKER_MODEL` (was: inherit the orchestrator's `ANTHROPIC_MODEL`, which is `claude-opus-4-7` in this user's global settings). Caller overrides via `opts.workerModel`: alias strings (`"opus"` / `"sonnet"` / `"haiku"`) resolve to full model IDs; explicit full IDs pass through; `workerModel: false` opts out of the override entirely and inherits the parent's model. **Why**: the 2026-04-23 M46 Wave 1 dispatch rate-limited all 8 Opus workers (Max 20x subscription concurrent-session throttle). Sonnet lives in a separate rate bucket, so orchestrator Opus + worker Sonnet lifts the concurrency ceiling. Per-task Opus opt-in via an `[opus]` marker on tasks.md lines is future work (surfaces in planner metadata).
+- **`bin/headless-auto-spawn.cjs::autoSpawnHeadless`**: accepts `workerModel?: string` and sets `ANTHROPIC_MODEL` in the child env after the caller's `envOverride` merge (so the caller always wins if they explicitly set `ANTHROPIC_MODEL` in `env`).
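The override semantics reduce to a small resolver. This sketch mirrors the alias table and fallback chain shown in the `runDispatch` diff further down; the standalone function name `resolveWorkerModel` is illustrative:

```javascript
// Sketch of the workerModel resolution rules described above.
const MODEL_ALIAS = {
  opus: "claude-opus-4-7",
  sonnet: "claude-sonnet-4-6",
  haiku: "claude-haiku-4-5-20251001",
};
const DEFAULT_WORKER_MODEL = "claude-sonnet-4-6";

function resolveWorkerModel(workerModel) {
  // `false` is the explicit opt-out: inherit the parent's ANTHROPIC_MODEL.
  if (workerModel === false) return null;
  // Alias resolves to a full ID; an unknown string passes through as an
  // explicit full ID; absent means the Sonnet default.
  return MODEL_ALIAS[workerModel] || workerModel || DEFAULT_WORKER_MODEL;
}
```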
+
+### Added — Spawn stagger to avoid burst spikes
+
+- **`bin/gsd-t-parallel.cjs::runDispatch`**: new `opts.spawnStaggerMs` (default **3000** ms) delays each spawn after the first. Implemented via `Atomics.wait` on a `SharedArrayBuffer` so the blocking wait releases the CPU (no spin loop). 2026-04-23 observation: 8 concurrent `claude -p` spawned within 700 ms → all 429 rate-limited; a 3 s stagger avoids the burst. Set `spawnStaggerMs: 0` for pre-v3.18.18 behavior.
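The blocking-wait trick can be shown in isolation (a minimal sketch; note that Node allows `Atomics.wait` on the main thread, unlike browsers):

```javascript
// Synchronous sleep that yields the CPU: Atomics.wait blocks the calling
// thread until the timeout expires, without a spin loop. Assumes
// SharedArrayBuffer is available (true in stock Node).
function sleepSync(ms) {
  const view = new Int32Array(new SharedArrayBuffer(4));
  // The value at index 0 is 0 and never changes, so the wait always times out.
  Atomics.wait(view, 0, 0, ms);
}
```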
+
+### Added — Cache-warming probe (opt-in)
+
+- **`bin/gsd-t-parallel.cjs::_runCacheWarmProbe`** + opt-in flag `opts.cacheWarm` / env `GSD_T_CACHE_WARM=1`. Before fan-out, fires a short `claude -p` that reads CLAUDE.md + progress.md + top-level contracts using the **same model** the workers will run, so Anthropic's 5-minute prompt cache is populated and the workers' identical initial reads return cache-read tokens (free for ITPM budget, lower rate-limit pressure). The probe is synchronous (workers land inside the warm window, not racing it), has a 60 s timeout, and failure does not block fan-out. Dependency-injection hook `opts.cacheWarmProbeImpl` for tests. Gated behind opt-in until backlog #23 (mitmproxy header instrumentation) measures the actual delta; flips to default-on if measurement confirms the ITPM savings are real.
+
+### Tests
+
+- **`test/m44-run-dispatch.test.js`**: 4 new tests for model selection (default Sonnet / alias resolution / explicit opt-out / stagger timing) + 3 new tests for cache-warming (opt-in gating / probe model matches workers / probe failure does not block fan-out). Full suite **2023/2023** pass (baseline 2016 + 7 new).
+
+### Incident — 2026-04-23 M46 Wave 1 rate-limit
+
+- Root cause: all 8 headless workers inherited `ANTHROPIC_MODEL=claude-opus-4-7` from `~/.claude/settings.json` (this user runs Max 20x on Opus globally) and spawned in a 700 ms burst. The Max subscription's concurrent-session throttle fired ~1 s into each worker's first tool call. Anthropic Console dashboards showed flat-zero API usage — confirming the throttle is subscription-side, not API-key-side. Mitigations shipped in this release (model-mix + stagger + opt-in cache-warm) plus scoped backlog items #22 (coord-gate runtime coordination) and #23 (mitmproxy header instrumentation for calibration).
+
 ## [3.18.17] - 2026-04-23
 
 ### Fixed — `npm test` picks up `worker-sim.js` fixture
package/bin/gsd-t-parallel.cjs
CHANGED
@@ -79,7 +79,7 @@ function detectMode(opts, env) {
 // ─── CLI arg parsing ──────────────────────────────────────────────────────
 
 function parseArgv(argv) {
-  const out = { help: false, dryRun: false, mode: null, milestone: null, domain: null, command: null };
+  const out = { help: false, dryRun: false, mode: null, milestone: null, domain: null, command: null, maxWorkers: null, stagger: null };
   for (let i = 0; i < argv.length; i++) {
     const a = argv[i];
     if (a === "--help" || a === "-h") out.help = true;

@@ -92,6 +92,10 @@ function parseArgv(argv) {
     else if (a.startsWith("--domain=")) out.domain = a.slice("--domain=".length);
     else if (a === "--command") out.command = argv[++i] || null;
     else if (a.startsWith("--command=")) out.command = a.slice("--command=".length);
+    else if (a === "--max-workers") out.maxWorkers = parseInt(argv[++i], 10);
+    else if (a.startsWith("--max-workers=")) out.maxWorkers = parseInt(a.slice("--max-workers=".length), 10);
+    else if (a === "--stagger") out.stagger = parseInt(argv[++i], 10);
+    else if (a.startsWith("--stagger=")) out.stagger = parseInt(a.slice("--stagger=".length), 10);
   }
   return out;
 }
@@ -325,6 +329,69 @@ function _partitionTaskIds(taskIds, workerCount) {
   return buckets.filter((b) => b.length > 0);
 }
 
+/**
+ * _runCacheWarmProbe — fire a single short `claude -p` before fan-out so the
+ * Anthropic prompt cache (5-min TTL) is pre-populated with the files every
+ * worker will read. When workers spawn within the warm window, their initial
+ * Read(CLAUDE.md), Read(progress.md), Read(contracts/*.md) return cache-read
+ * tokens (free for ITPM budget, lower rate-limit pressure).
+ *
+ * Returns `{ok, filesRead, error}`. Best-effort; failures do not block fan-out.
+ *
+ * The probe reads the existing files (skips missing ones silently) and asks
+ * the child to print the literal string "warm" — cheap, deterministic, fast.
+ * Cache key matches exactly when `model` equals the workers' model and the
+ * workers use the same tool-call shape (Read on the same paths) within TTL.
+ */
+function _runCacheWarmProbe(opts) {
+  const projectDir = (opts && opts.projectDir) || process.cwd();
+  const model = opts && opts.model;
+  const timeoutMs = Number.isFinite(opts && opts.timeoutMs) ? opts.timeoutMs : 60000;
+
+  const candidates = [
+    "CLAUDE.md",
+    ".gsd-t/progress.md",
+    ".gsd-t/contracts/headless-default-contract.md",
+    ".gsd-t/contracts/wave-join-contract.md",
+  ];
+  const filesRead = candidates.filter((rel) => {
+    try {
+      return fs.statSync(path.join(projectDir, rel)).isFile();
+    } catch {
+      return false;
+    }
+  });
+  if (filesRead.length === 0) return { ok: false, filesRead: [], error: "no_warm_files" };
+
+  const { spawnSync } = require("node:child_process");
+  const prompt =
+    "Read the following files so they enter the prompt cache for subsequent workers, " +
+    "then reply with the single word `warm` and nothing else:\n" +
+    filesRead.map((f) => `- ${f}`).join("\n");
+
+  const env = Object.assign({}, process.env);
+  if (model) env.ANTHROPIC_MODEL = model;
+
+  try {
+    const r = spawnSync(
+      "claude",
+      ["-p", prompt, "--dangerously-skip-permissions"],
+      {
+        cwd: projectDir,
+        env,
+        encoding: "utf8",
+        timeout: timeoutMs,
+        stdio: ["ignore", "pipe", "pipe"],
+      },
+    );
+    if (r.error) return { ok: false, filesRead, error: r.error.message };
+    if (r.status !== 0) return { ok: false, filesRead, error: `exit_${r.status}` };
+    return { ok: true, filesRead };
+  } catch (e) {
+    return { ok: false, filesRead, error: (e && e.message) || "spawn_error" };
+  }
+}
+
 /**
  * runDispatch — the single instrument every command delegates to.
  *
@@ -389,8 +456,30 @@ function runDispatch(opts) {
     };
   }
 
-  const
+  const plannerWorkerCount = Number(result.workerCount) || 0;
   const parallelTasks = Array.isArray(result.parallelTasks) ? result.parallelTasks : [];
+
+  // Concurrency cap (v3.18.19) — caller may clamp the planner-selected worker
+  // count via `opts.maxWorkers`. Motivated by 2026-04-23 incident: the Max
+  // subscription concurrent-session throttle rate-limits `claude -p` bursts
+  // regardless of model choice (since all spawns inherit the parent's Max
+  // OAuth, not an API key — see feedback_anthropic_key_measurement_only). The
+  // planner has no knowledge of this throttle; callers who know they're near
+  // the ceiling need a direct cap.
+  const cap = Number.isFinite(opts && opts.maxWorkers) && opts.maxWorkers > 0
+    ? Math.floor(opts.maxWorkers)
+    : 2;
+  const workerCount = Math.min(plannerWorkerCount, cap);
+  if (workerCount < plannerWorkerCount) {
+    appendEvent(projectDir, {
+      type: "parallelism_reduced",
+      source: "dispatch_max_workers_cap",
+      original_count: plannerWorkerCount,
+      reduced_count: workerCount,
+      reason: `max_workers_cap:${cap}`,
+      ts: new Date().toISOString(),
+    });
+  }
   const subsets = workerCount >= 2 ? _partitionTaskIds(parallelTasks, workerCount) : [];
 
   if (subsets.length < 2) {
@@ -433,8 +522,90 @@ function runDispatch(opts) {
     ts: new Date().toISOString(),
   });
 
+  // Worker model selection (v3.18.18) — mechanical fan-out defaults to Sonnet
+  // so the orchestrator's Opus bucket isn't the bottleneck. Caller may
+  // override via `opts.workerModel` ("opus" | "sonnet" | "haiku" | full ID).
+  // A task can opt back to Opus by declaring "[opus]" in its tasks.md line;
+  // the planner surfaces this via per-task metadata (future; today the per-
+  // subset opt-in is an all-or-nothing knob passed by the caller).
+  const DEFAULT_WORKER_MODEL = "claude-sonnet-4-6";
+  const modelAlias = {
+    opus: "claude-opus-4-7",
+    sonnet: "claude-sonnet-4-6",
+    haiku: "claude-haiku-4-5-20251001",
+  };
+  const callerModel = opts && opts.workerModel;
+  const workerModel = callerModel === false
+    ? null // explicit opt-out: inherit parent's ANTHROPIC_MODEL
+    : (modelAlias[callerModel] || callerModel || DEFAULT_WORKER_MODEL);
+
+  // Stagger between spawns — 10s default empirically validated against the
+  // Max-subscription concurrent-session throttle (2026-04-23 M46 probe: two
+  // 10s-staggered 2-parallel rounds of real work, both exit 0, no 429; prior
+  // 3s default burst at >2 workers hit rate limits). Caller may override via
+  // `opts.spawnStaggerMs` (0 = no delay, previous burst behavior).
+  const staggerMs = Number.isFinite(opts && opts.spawnStaggerMs)
+    ? Math.max(0, opts.spawnStaggerMs)
+    : 10000;
+  const busyWait = (ms) => {
+    if (!ms) return;
+    // Synchronous sleep that releases the CPU (Atomics.wait on a dummy
+    // SharedArrayBuffer — pattern used in Node REPL/sync-sleep helpers).
+    // Keeps runDispatch's sync return contract without pegging a core.
+    // Total wall-clock added to startup: (subsets-1) * staggerMs.
+    try {
+      const sab = new SharedArrayBuffer(4);
+      const view = new Int32Array(sab);
+      Atomics.wait(view, 0, 0, ms);
+    } catch (_) {
+      // Atomics unavailable — fall back to a coarse spin.
+      const until = Date.now() + ms;
+      while (Date.now() < until) { /* spin */ }
+    }
+  };
+
+  // Cache-warming probe (v3.18.19) — opt-in via GSD_T_CACHE_WARM=1 or
+  // opts.cacheWarm. Anthropic's prompt cache has a 5-minute TTL keyed on the
+  // exact system-prompt + tool-call prefix. One leader probe that reads the
+  // same foundational files every worker will read (CLAUDE.md, progress.md,
+  // top-level contracts) populates the cache so the first N seconds of every
+  // subsequent worker hit cache-read tokens (free for ITPM budget, lower
+  // rate-limit pressure). Probe runs synchronously so workers land inside
+  // the warm window rather than racing it. Gated behind opt-in until
+  // backlog #23 (mitmproxy instrumentation) measures the actual delta.
+  const warmEnv = (opts && opts.env) || process.env;
+  const cacheWarmEnabled =
+    (opts && opts.cacheWarm === true) ||
+    (!(opts && opts.cacheWarm === false) && warmEnv.GSD_T_CACHE_WARM === "1");
+  if (cacheWarmEnabled) {
+    const warmStart = Date.now();
+    let warmResult = { ok: false, error: "not_run" };
+    try {
+      const probeImpl = (opts && opts.cacheWarmProbeImpl) || _runCacheWarmProbe;
+      warmResult = probeImpl({
+        projectDir,
+        model: workerModel, // same model as workers so cache key matches
+        timeoutMs: (opts && Number.isFinite(opts.cacheWarmTimeoutMs))
+          ? opts.cacheWarmTimeoutMs
+          : 60000,
+      });
+    } catch (e) {
+      warmResult = { ok: false, error: (e && e.message) || "unknown" };
+    }
+    appendEvent(projectDir, {
+      type: "cache_warm_probe",
+      source: "dispatch",
+      ok: !!warmResult.ok,
+      duration_ms: Date.now() - warmStart,
+      error: warmResult.error,
+      files_read: warmResult.filesRead,
+      ts: new Date().toISOString(),
+    });
+  }
+
   const workerResults = [];
   for (let i = 0; i < subsets.length; i++) {
+    if (i > 0) busyWait(staggerMs);
     const subset = subsets[i];
     const workerEnv = {
       GSD_T_WORKER_TASK_IDS: subset.join(","),

@@ -450,6 +621,7 @@ function runDispatch(opts) {
       projectDir,
       env: workerEnv,
       spawnType: "primary",
+      workerModel,
     });
   } catch (e) {
     spawnError = (e && e.message) || "unknown";
@@ -549,14 +721,21 @@ function runCli(argv, env) {
   // --command: live dispatch. The single instrument that command files
   // delegate to instead of re-implementing probe-and-branch logic.
   if (args.command) {
-    const
+    const dispatchOpts = {
       projectDir: process.cwd(),
       mode,
       milestone: args.milestone,
       domain: args.domain,
       command: args.command,
       env,
-    }
+    };
+    if (Number.isFinite(args.maxWorkers) && args.maxWorkers > 0) {
+      dispatchOpts.maxWorkers = args.maxWorkers;
+    }
+    if (Number.isFinite(args.stagger) && args.stagger >= 0) {
+      dispatchOpts.spawnStaggerMs = args.stagger * 1000;
+    }
+    const dispatch = runDispatch(dispatchOpts);
     if (dispatch.decision === "fan_out") {
       process.stdout.write(
         `gsd-t parallel — fan_out command=${args.command} mode=${dispatch.mode} workers=${dispatch.fanOutCount}\n`,

@@ -606,6 +785,7 @@ module.exports = {
   _detectMode: detectMode,
   _appendEvent: appendEvent,
   _partitionTaskIds,
+  _runCacheWarmProbe,
   _HELP_TEXT: HELP_TEXT,
 };