gm-skill 2.0.1616 → 2.0.1617

@@ -23,12 +23,15 @@ session close <id>
  url=<url>\n<expression>
  timeout=<ms>\n<expression>
  capture\n<expression>
+ profile\n<expression>
  ```
 
  **Open on the page you want to test, not a blank one.** A bare `https://...` URL body navigates the session straight to that page and returns `{url, title}` -- the simplest "show me this page." `url=<url>\n<expression>` navigates first, then runs your expression on the loaded page, so the global/DOM you assert is already there in one dispatch instead of a blank surface you must `page.goto` yourself. `url=` composes with `timeout=` and `capture` -- stack the prefix lines in order `timeout=`, then `url=`, then `capture`, the expression last; the prepended `page.goto` rides inside the capture so its navigation console/network is captured too. A bare expression with no URL prefix and no live session opens against `about:blank`; with a live session it reuses it. `session new` returns the id you carry; with more than one open, target it via `session=<id>\n<expr>`. (`session close` and `session kill` are aliases.) Default per-eval timeout 120000ms; operations that legitimately exceed it prefix `timeout=<ms>\n` (wrapper clamps to 120000ms). The response carries `timeout_ms_used`; `browser.runner-timeout` fires at the cap -- read `stderr`, narrow or raise, never retry blind at the same budget.
 
  **`capture\n<expression>` is the zero-boilerplate debug path -- prefer it.** Prefix your script with `capture` (or `profile`) on its own line and the wrapper auto-attaches `page.on('console'|'pageerror'|'requestfinished')` before your code runs, runs your script in an async wrapper (your top-level `await`/`return` work unchanged), and returns `{result: <your return>, debug: {console, pageErrors, network, performance}}` -- page console logs, uncaught errors, per-request network timing, and navigation performance, captured for free. Combine with timeout via `timeout=<ms>\ncapture\n<expr>`. Use the bare expression only when you do not want the capture overhead.
 
+ **`profile\n<expression>` is the bottom-up CPU profiler -- worst-20 culprits by file location across init and code-execution.** Prefix your script with `profile` on its own line: the wrapper opens a CDP `Profiler` (`newCDPSession` + `Profiler.start` BEFORE the prepended `page.goto`, so navigation, script-parse, and init are sampled, not only steady-state), runs your script, `Profiler.stop`s, and aggregates the v8 CPU profile into `{result, profile: {timeframe: {start_us, end_us, total_us, sample_count}, culprits: [{location, function, self_us, self_pct, hits}]}, profile_error, debug: {...}}`. `culprits` is the bottom-up self-time ranking capped at the worst 20 `url:line` locations; `timeframe` is the capture window in microseconds. Composes with `url=`/`timeout=` in the same prefix order. Page scripts loaded from `.js` files carry real `file:line`; `page.evaluate` anonymous frames bucket to `(program)`/`(native)`. On a CDP failure `profile` is `null` with `profile_error` set and your `result` still returns. The identical `{timeframe, culprits}` shape comes back from `exec_js` with `opts.profile:true`, so the cli and browser bottom-up views read the same.
+
  ## Envelope
 
  `{ok, stdout, stderr, exit_code, session_id?}`. `stdout` = stringified eval result; `stderr` = page errors + launch diagnostics; `exit_code` non-zero = the dispatch did not land -- read `stderr` and re-dispatch, never blind.
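The prefix ordering described in this hunk (`timeout=`, then `url=`, then `capture` or `profile`, expression last) can be sketched as a small body builder. This is only an illustration: `buildBrowserBody` and its option names are hypothetical and do not appear in the package, and the URL is a placeholder.

```javascript
// Hypothetical helper (not part of gm-skill): stacks the prefix lines in the
// order the skill text documents, with the expression as the final line.
function buildBrowserBody({ expression, timeoutMs, url, mode } = {}) {
  if (typeof expression !== 'string' || expression.length === 0) {
    throw new TypeError('expression must be a non-empty string');
  }
  if (mode !== undefined && mode !== 'capture' && mode !== 'profile') {
    throw new RangeError(`unknown mode: ${mode}`);
  }
  const lines = [];
  if (timeoutMs !== undefined) lines.push(`timeout=${Math.floor(timeoutMs)}`);
  if (url !== undefined) lines.push(`url=${url}`);
  if (mode !== undefined) lines.push(mode);
  lines.push(expression);
  return lines.join('\n');
}

// Placeholder URL and expression, chosen only for the example.
const body = buildBrowserBody({
  timeoutMs: 60000,
  url: 'https://example.com/',
  mode: 'profile',
  expression: 'return await page.title();',
});
console.log(body);
```

Keeping the ordering in one function means a caller cannot accidentally place `capture` before `url=`.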
@@ -36,7 +36,7 @@ First emit = closure of the transform; scaffold + IOU externalizes residual cost
 
  Data first -- get the structures and their invariants right and the code writes itself; convoluted control flow means the data model is wrong, so fix the model. Make invalid state unrepresentable -- pass parameters over hidden globals, encode the constraint in the type/shape so the bad combination cannot be constructed. Reason from physical constraints (latency, bandwidth, memory, coordination, the worst node) before designing within them. Keep the spine flat, each unit single-focus and understandable at its call site. Make misuse structurally impossible, not documented-against. Optimize the worst case, not the average; design every failure path explicitly (full -> degraded -> safe-fail -> explicit-error), never a silent catastrophic mode. Measure, do not assume -- profile before optimizing, implement both and compare on real input when in genuine dispute. When a change regresses something that worked, revert first and investigate second: restore green, then diagnose from a known-good base. Fail fast and loud over limping on bad state.
 
- **Process of elimination is the debugging paradigm on every surface, and manual labour against real services is how you witness.** Never guess-and-restart, a/b-test, or shotgun variants: enumerate the candidate causes as mutables, then eliminate each by a witness read against REAL input -- `exec_js` against the real service, `codesearch`/`Read` against the real source, the `browser` verb's `page.evaluate` against a `window.*` global on the live page. Each elimination reveals the next mutable; record it and keep going until one cause survives every other's refutation. Reading the live runtime once observes more than a hundred blind restarts. Profile on the real surface, not from intuition: wrap the suspect node and read the live numbers. In node, `exec_js` carries `duration_ms` for free, surfaces your own timing and `process.memoryUsage()` on stdout, and lands the thrown-error `stack` on stderr -- read both channels (numbers on stdout, stack on stderr). In the browser, a body prefixed `capture\n<script>` auto-returns `{result, debug:{console, pageErrors, network, performance}}` with zero boilerplate. Profile to LOCATE the slow/broken node, then eliminate hypotheses by live measurement. Verification is the same labour: run the real thing and witness the real output (the single mock-free `test.js`, the live page, the real service), never an automated unit/mock harness standing in for the real-services witness. Apparent tooling failure is part of this -- it is your mechanical self-recovery by elimination, never a question for the user.
+ **Process of elimination is the debugging paradigm on every surface, and manual labour against real services is how you witness.** Never guess-and-restart, a/b-test, or shotgun variants: enumerate the candidate causes as mutables, then eliminate each by a witness read against REAL input -- `exec_js` against the real service, `codesearch`/`Read` against the real source, the `browser` verb's `page.evaluate` against a `window.*` global on the live page. Each elimination reveals the next mutable; record it and keep going until one cause survives every other's refutation. Reading the live runtime once observes more than a hundred blind restarts. Profile on the real surface, not from intuition: wrap the suspect node and read the live numbers. In node, `exec_js` carries `duration_ms` for free, surfaces your own timing and `process.memoryUsage()` on stdout, and lands the thrown-error `stack` on stderr -- read both channels (numbers on stdout, stack on stderr). In the browser, a body prefixed `capture\n<script>` auto-returns `{result, debug:{console, pageErrors, network, performance}}` with zero boilerplate. When the slow node is not obvious, sample it bottom-up: `exec_js` with `opts.profile:true` and the browser `profile\n<script>` prefix both return `{result, profile:{timeframe:{start_us,end_us,total_us,sample_count}, culprits:[{location,function,self_us,self_pct,hits}]}}` -- the worst-20 `file:line` by self-time across init and code-execution, identical shape on both surfaces, so the culprit ranking points straight at the line to fix. Profile to LOCATE the slow/broken node, then eliminate hypotheses by live measurement. Verification is the same labour: run the real thing and witness the real output (the single mock-free `test.js`, the live page, the real service), never an automated unit/mock harness standing in for the real-services witness. Apparent tooling failure is part of this -- it is your mechanical self-recovery by elimination, never a question for the user.
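A consumer of the `{result, profile, profile_error}` shape named in the added sentence might look like the sketch below. The sample response is invented for illustration, and `firstSourceCulprit` is a hypothetical helper. Skipping the parenthesised `(program)`/`(native)` buckets is a design choice of this sketch, not something the package prescribes. The guard accepts both a `null` profile and an empty `{timeframe: null, culprits: []}` object.

```javascript
// Invented sample in the documented shape; none of these numbers are real measurements.
const sample = {
  result: 'ok',
  profile: {
    timeframe: { start_us: 0, end_us: 5000, total_us: 5000, sample_count: 50 },
    culprits: [
      { location: '(native):0', function: '(program)', self_us: 3000, self_pct: 60, hits: 30 },
      { location: 'https://example.com/app.js:120', function: 'render', self_us: 2000, self_pct: 40, hits: 20 },
    ],
  },
  profile_error: null,
};

// Hypothetical helper: returns the highest-ranked culprit that points at a
// real source location, or null when no usable profile came back.
function firstSourceCulprit(response) {
  const profile = response && response.profile;
  if (!profile || !profile.timeframe || !Array.isArray(profile.culprits)) return null;
  return profile.culprits.find(c => !c.location.startsWith('(')) || null;
}

const hit = firstSourceCulprit(sample);
console.log(hit ? `${hit.location} (${hit.self_pct}% self time)` : 'no source-level culprit');
```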
 
  ## Memorize
 
@@ -1,6 +1,6 @@
  {
    "name": "gm-plugkit",
-   "version": "2.0.1616",
+   "version": "2.0.1617",
    "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
    "main": "index.js",
    "bin": {
@@ -621,6 +621,56 @@ function writeJsonFile(fp, value) {
    try { atomicWriteJson(fp, value); } catch (_) {}
  }

+ const AGGREGATE_CPU_PROFILE_SRC = `function aggregateCpuProfile(profile, topN) {
+   const N = topN || 20;
+   if (!profile || !Array.isArray(profile.nodes) || !Array.isArray(profile.samples)) {
+     return { timeframe: null, culprits: [] };
+   }
+   const byId = new Map();
+   for (const node of profile.nodes) byId.set(node.id, node);
+   const deltas = Array.isArray(profile.timeDeltas) ? profile.timeDeltas : [];
+   const acc = new Map();
+   let total = 0;
+   const sampleCount = profile.samples.length;
+   for (let i = 0; i < profile.samples.length; i++) {
+     const node = byId.get(profile.samples[i]);
+     const dt = deltas[i] || 0;
+     total += dt;
+     if (!node) continue;
+     const cf = node.callFrame || {};
+     let url = cf.url || '';
+     if (!url) url = cf.functionName ? '(native)' : '(program)';
+     const line = (typeof cf.lineNumber === 'number' && cf.lineNumber >= 0) ? cf.lineNumber + 1 : 0;
+     const loc = url + ':' + line;
+     let e = acc.get(loc);
+     if (!e) { e = { location: loc, function: cf.functionName || '(anonymous)', self_us: 0, hits: 0 }; acc.set(loc, e); }
+     e.self_us += dt;
+     e.hits += 1;
+   }
+   const culprits = Array.from(acc.values())
+     .sort((a, b) => b.self_us - a.self_us)
+     .slice(0, N)
+     .map(c => ({ location: c.location, function: c.function, self_us: c.self_us, self_pct: total ? Math.round((c.self_us / total) * 1000) / 10 : 0, hits: c.hits }));
+   return {
+     timeframe: {
+       start_us: typeof profile.startTime === 'number' ? profile.startTime : 0,
+       end_us: typeof profile.endTime === 'number' ? profile.endTime : 0,
+       total_us: total,
+       sample_count: sampleCount,
+     },
+     culprits,
+   };
+ }`;
+
+ let execProfileSeq = 0;
+ let _aggregateCpuProfileFn = null;
+ function aggregateCpuProfile(profile, topN) {
+   if (!_aggregateCpuProfileFn) {
+     _aggregateCpuProfileFn = new Function(AGGREGATE_CPU_PROFILE_SRC + '\nreturn aggregateCpuProfile;')();
+   }
+   return _aggregateCpuProfileFn(profile, topN);
+ }
+
  const BROWSER_RUNNER_BIN = process.env.GM_BROWSER_RUNNER_BIN || 'playwriter';

  function findBrowserRunner() {
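To make the bottom-up ranking in this hunk easier to follow, here is a simplified restatement run against a tiny hand-written profile. `selfTimeByLocation` is a hypothetical name, the profile values are invented, and the sketch omits the input validation and the `(native)`/`(program)` fallback that the shipped function performs.

```javascript
// Simplified sketch of bottom-up self-time aggregation (assumes a well-formed
// profile with nodes, samples and timeDeltas; not the package's own function).
function selfTimeByLocation(profile, topN = 20) {
  const byId = new Map(profile.nodes.map(n => [n.id, n]));
  const acc = new Map();
  let total = 0;
  profile.samples.forEach((nodeId, i) => {
    const dt = profile.timeDeltas[i] || 0;
    total += dt; // every sample counts toward the total, even if its node is unknown
    const node = byId.get(nodeId);
    if (!node) return;
    const cf = node.callFrame;
    const loc = `${cf.url}:${cf.lineNumber + 1}`; // callFrame lines are 0-based
    const entry = acc.get(loc) || { location: loc, function: cf.functionName, self_us: 0, hits: 0 };
    entry.self_us += dt;
    entry.hits += 1;
    acc.set(loc, entry);
  });
  return [...acc.values()]
    .sort((a, b) => b.self_us - a.self_us)
    .slice(0, topN)
    .map(e => ({ ...e, self_pct: total ? Math.round((e.self_us / total) * 1000) / 10 : 0 }));
}

// Invented profile: two functions in one file, four samples.
const tinyProfile = {
  nodes: [
    { id: 2, callFrame: { functionName: 'parse', url: 'file:///app.js', lineNumber: 9 } },
    { id: 3, callFrame: { functionName: 'render', url: 'file:///app.js', lineNumber: 41 } },
  ],
  samples: [2, 3, 3, 2],
  timeDeltas: [100, 300, 300, 100],
};

const ranked = selfTimeByLocation(tinyProfile);
console.log(ranked);
```

With these inputs `render` accumulates 600 of the 800 sampled microseconds and ranks first at 75% self time; `parse` follows at 25%.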
@@ -1870,14 +1920,62 @@ function makeHostFunctions(instanceRef) {
    });
  }
  const timeoutMs = rawTimeout;
+ const wantProfile = opts.profile === true && (lang === 'nodejs' || lang === 'js' || lang === undefined);
+ let profileUserFile = null;
  let cmd, args;
- if (lang === 'nodejs' || lang === 'js') { cmd = process.execPath; args = ['-e', code]; }
+ if (lang === 'nodejs' || lang === 'js') {
+   if (wantProfile) {
+     profileUserFile = path.join(os.tmpdir(), `gm-prof-${process.pid}-${execProfileSeq++}.js`);
+     fs.writeFileSync(profileUserFile, `module.exports = (async () => {\n${code}\n});`, 'utf-8');
+     const runnerCode = `${AGGREGATE_CPU_PROFILE_SRC}\n`
+       + `const __inspector = require('inspector');\n`
+       + `const __session = new __inspector.Session();\n`
+       + `__session.connect();\n`
+       + `const __post = (m, p) => new Promise((res, rej) => __session.post(m, p || {}, (e, r) => e ? rej(e) : res(r)));\n`
+       + `(async () => {\n`
+       + ` let __profile = null, __profileError = null, __userResult = null, __userError = null;\n`
+       + ` try {\n`
+       + ` await __post('Profiler.enable');\n`
+       + ` await __post('Profiler.setSamplingInterval', { interval: ${Number.isFinite(opts.sampleIntervalUs) && opts.sampleIntervalUs > 0 ? Math.floor(opts.sampleIntervalUs) : 100} });\n`
+       + ` await __post('Profiler.start');\n`
+       + ` try { __userResult = await require(${JSON.stringify(profileUserFile)})(); } catch (ue) { __userError = String(ue && ue.stack || ue); }\n`
+       + ` const __r = await __post('Profiler.stop');\n`
+       + ` __profile = __r && __r.profile || null;\n`
+       + ` } catch (pe) { __profileError = String(pe && pe.message || pe); }\n`
+       + ` const __agg = __profile ? aggregateCpuProfile(__profile) : { timeframe: null, culprits: [] };\n`
+       + ` process.stdout.write('__GM_PROFILE__' + JSON.stringify({ result: __userResult, user_error: __userError, profile: __agg, profile_error: __profileError }));\n`
+       + ` __session.disconnect();\n`
+       + `})();\n`;
+     cmd = process.execPath; args = ['-e', runnerCode];
+   } else {
+     cmd = process.execPath; args = ['-e', code];
+   }
+ }
  else if (lang === 'python') { cmd = 'python'; args = ['-c', code]; }
  else if (lang === 'bash') { cmd = 'bash'; args = ['-c', code]; }
  else if (lang === 'deno') { cmd = 'deno'; args = ['eval', code]; }
  else { return writeWasmJson(instanceRef.value, { ok: false, error: `unsupported lang: ${lang}` }); }
  const __execT0 = Date.now();
  const result = spawnSync(cmd, args, { encoding: 'utf-8', timeout: timeoutMs, cwd, env: process.env });
+ if (profileUserFile) { try { fs.unlinkSync(profileUserFile); } catch (_) {} }
+ if (wantProfile) {
+   const raw = result.stdout || '';
+   const idx = raw.indexOf('__GM_PROFILE__');
+   let parsed = null;
+   if (idx >= 0) { try { parsed = JSON.parse(raw.slice(idx + '__GM_PROFILE__'.length)); } catch (_) {} }
+   return writeWasmJson(instanceRef.value, {
+     ok: result.status === 0 && parsed !== null && !parsed.user_error,
+     stdout: idx >= 0 ? raw.slice(0, idx) : raw,
+     stderr: result.stderr || '',
+     exit_code: result.status === null ? -1 : result.status,
+     timed_out: result.signal === 'SIGTERM',
+     duration_ms: Date.now() - __execT0,
+     result: parsed ? parsed.result : null,
+     profile: parsed ? parsed.profile : { timeframe: null, culprits: [] },
+     profile_error: parsed ? parsed.profile_error : 'profile sentinel not found in stdout',
+     user_error: parsed ? parsed.user_error : null,
+   });
+ }
  return writeWasmJson(instanceRef.value, {
    ok: result.status === 0,
    stdout: result.stdout || '',
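The sentinel handling added in this hunk separates the script's own output from the JSON payload written after `__GM_PROFILE__`. A minimal standalone sketch of that split follows; `splitProfileStdout` is a hypothetical name and the input strings are made up.

```javascript
// Marker string taken from the diff; the helper around it is illustrative only.
const SENTINEL = '__GM_PROFILE__';

// Splits raw stdout at the first sentinel: text before it is user output,
// text after it is expected to be a JSON payload.
function splitProfileStdout(raw) {
  const idx = raw.indexOf(SENTINEL);
  if (idx < 0) return { stdout: raw, payload: null };
  let payload = null;
  try {
    payload = JSON.parse(raw.slice(idx + SENTINEL.length));
  } catch (_) {
    // Malformed payload: keep null so the caller can report a profile error.
  }
  return { stdout: raw.slice(0, idx), payload };
}

const demo = splitProfileStdout('hello\n__GM_PROFILE__{"result":42,"user_error":null}');
console.log(demo.stdout, demo.payload);
```

Because the search is first-match, user output that itself contains the marker would be treated as the start of the payload in this sketch.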
@@ -2016,15 +2114,30 @@ function makeHostFunctions(instanceRef) {
  const gotoPrefix = startUrl
    ? `await page.goto(${JSON.stringify(startUrl)},{waitUntil:'load',timeout:${navTimeout}});\n`
    : '';
- const captureMatch = evalBody.match(/^(?:capture|profile)[ \t]*\n([\s\S]*)$/);
- if (captureMatch) {
-   const userScript = captureMatch[1];
-   evalBody = `const __logs=[],__errs=[],__net=[];\n`
-     + `try{page.on('console',m=>{try{__logs.push({type:m.type(),text:m.text()});}catch(_){}});`
-     + `page.on('pageerror',e=>{try{__errs.push(String(e&&e.message||e));}catch(_){}});`
-     + `page.on('requestfinished',r=>{try{const t=r.timing();__net.push({url:String(r.url()).slice(0,120),dur_ms:Math.round(t.responseEnd),ttfb_ms:Math.round(t.responseStart)});}catch(_){}});}catch(_){}\n`
+ const modeMatch = evalBody.match(/^(capture|profile)[ \t]*\n([\s\S]*)$/);
+ const debugSetup = `const __logs=[],__errs=[],__net=[];\n`
+   + `try{page.on('console',m=>{try{__logs.push({type:m.type(),text:m.text()});}catch(_){}});`
+   + `page.on('pageerror',e=>{try{__errs.push(String(e&&e.message||e));}catch(_){}});`
+   + `page.on('requestfinished',r=>{try{const t=r.timing();__net.push({url:String(r.url()).slice(0,120),dur_ms:Math.round(t.responseEnd),ttfb_ms:Math.round(t.responseStart)});}catch(_){}});}catch(_){}\n`;
+ const perfRead = `let __perf=null;try{__perf=await page.evaluate(()=>{const n=performance.getEntriesByType('navigation')[0];return n?{load_ms:Math.round(n.loadEventEnd||0),dcl_ms:Math.round(n.domContentLoadedEventEnd||0),resources:performance.getEntriesByType('resource').length,now:Math.round(performance.now())}:null;});}catch(_){}\n`;
+ if (modeMatch && modeMatch[1] === 'profile') {
+   const userScript = modeMatch[2];
+   const intervalUs = 100;
+   evalBody = debugSetup
+     + `let __profile=null,__profileError=null;\n`
+     + `let __cdp=null;\n`
+     + `try{__cdp=await page.context().newCDPSession(page);await __cdp.send('Profiler.enable');await __cdp.send('Profiler.setSamplingInterval',{interval:${intervalUs}});await __cdp.send('Profiler.start');}catch(e){__profileError=String(e&&e.message||e);__cdp=null;}\n`
+     + `const __result = await (async () => {\n${gotoPrefix}${userScript}\n})();\n`
+     + `if(__cdp){try{const __r=await __cdp.send('Profiler.stop');__profile=__r&&__r.profile||null;}catch(e){__profileError=String(e&&e.message||e);}}\n`
+     + perfRead
+     + AGGREGATE_CPU_PROFILE_SRC + `\n`
+     + `const __agg = __profile ? aggregateCpuProfile(__profile) : {timeframe:null,culprits:[]};\n`
+     + `return {result:__result,profile:__agg,profile_error:__profileError,debug:{console:__logs,pageErrors:__errs,network:__net.slice(0,30),performance:__perf}};`;
+ } else if (modeMatch && modeMatch[1] === 'capture') {
+   const userScript = modeMatch[2];
+   evalBody = debugSetup
      + `const __result = await (async () => {\n${gotoPrefix}${userScript}\n})();\n`
-     + `let __perf=null;try{__perf=await page.evaluate(()=>{const n=performance.getEntriesByType('navigation')[0];return n?{load_ms:Math.round(n.loadEventEnd||0),dcl_ms:Math.round(n.domContentLoadedEventEnd||0),resources:performance.getEntriesByType('resource').length,now:Math.round(performance.now())}:null;});}catch(_){}\n`
+     + perfRead
      + `return {result:__result,debug:{console:__logs,pageErrors:__errs,network:__net.slice(0,30),performance:__perf}};`;
  } else if (startUrl) {
    evalBody = `${gotoPrefix}${evalBody}`;
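The change from a non-capturing to a capturing group is what lets the wrapper tell `capture` from `profile`. The regex below is copied from the added line; `parseMode` and the sample bodies are hypothetical and show only which inputs the pattern accepts.

```javascript
// Regex copied from the added line in this hunk; the helper is illustrative only.
const MODE_RE = /^(capture|profile)[ \t]*\n([\s\S]*)$/;

// Returns the detected mode and the remaining script, or mode null when the
// body does not start with a mode line.
function parseMode(evalBody) {
  const m = evalBody.match(MODE_RE);
  return m ? { mode: m[1], script: m[2] } : { mode: null, script: evalBody };
}

console.log(parseMode('profile\nreturn 1'));      // mode line recognised
console.log(parseMode('capture  \nreturn 1'));    // trailing spaces or tabs are tolerated
console.log(parseMode('profiler\nreturn 1'));     // not a mode line: the word must end the line
console.log(parseMode('return "capture"'));       // keyword elsewhere in the body is ignored
```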
package/gm.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "gm",
-   "version": "2.0.1616",
+   "version": "2.0.1617",
    "description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
    "author": "AnEntrypoint",
    "license": "MIT",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "gm-skill",
-   "version": "2.0.1616",
+   "version": "2.0.1617",
    "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
    "author": "AnEntrypoint",
    "license": "MIT",