gm-skill 2.0.1615 → 2.0.1617
@@ -12,21 +12,26 @@ YOU drive the browser through the spool: plugkit holds the Chromium handle, per-
 
 ## Body shapes
 
-The body is a string,
+The body is a string, these shapes:
 
 ```
 session new
 session list
 session close <id>
 <arbitrary JS expression evaluated in page context>
+<https://... bare URL>
+url=<url>\n<expression>
 timeout=<ms>\n<expression>
 capture\n<expression>
+profile\n<expression>
 ```
 
-A bare expression with no live session opens
+**Open on the page you want to test, not a blank one.** A bare `https://...` URL body navigates the session straight to that page and returns `{url, title}` -- the simplest "show me this page." `url=<url>\n<expression>` navigates first, then runs your expression on the loaded page, so the global/DOM you assert is already there in one dispatch instead of a blank surface you must `page.goto` yourself. `url=` composes with `timeout=` and `capture` -- stack the prefix lines in order `timeout=`, then `url=`, then `capture`, the expression last; the prepended `page.goto` rides inside the capture so its navigation console/network is captured too. A bare expression with no URL prefix and no live session opens against `about:blank`; with a live session it reuses it. `session new` returns the id you carry; with more than one open, target it via `session=<id>\n<expr>`. (`session close` and `session kill` are aliases.) Default per-eval timeout 120000ms; operations that legitimately exceed it prefix `timeout=<ms>\n` (wrapper clamps to 120000ms). The response carries `timeout_ms_used`; `browser.runner-timeout` fires at the cap -- read `stderr`, narrow or raise, never retry blind at the same budget.
 
 **`capture\n<expression>` is the zero-boilerplate debug path -- prefer it.** Prefix your script with `capture` (or `profile`) on its own line and the wrapper auto-attaches `page.on('console'|'pageerror'|'requestfinished')` before your code runs, runs your script in an async wrapper (your top-level `await`/`return` work unchanged), and returns `{result: <your return>, debug: {console, pageErrors, network, performance}}` -- page console logs, uncaught errors, per-request network timing, and navigation performance, captured for free. Combine with timeout via `timeout=<ms>\ncapture\n<expr>`. Use the bare expression only when you do not want the capture overhead.
 
+**`profile\n<expression>` is the bottom-up CPU profiler -- worst-20 culprits by file location across init and code-execution.** Prefix your script with `profile` on its own line: the wrapper opens a CDP `Profiler` (`newCDPSession` + `Profiler.start` BEFORE the prepended `page.goto`, so navigation, script-parse, and init are sampled, not only steady-state), runs your script, `Profiler.stop`s, and aggregates the v8 CPU profile into `{result, profile: {timeframe: {start_us, end_us, total_us, sample_count}, culprits: [{location, function, self_us, self_pct, hits}]}, profile_error, debug: {...}}`. `culprits` is the bottom-up self-time ranking capped at the worst 20 `url:line` locations; `timeframe` is the capture window in microseconds. Composes with `url=`/`timeout=` in the same prefix order. Page scripts loaded from `.js` files carry real `file:line`; `page.evaluate` anonymous frames bucket to `(program)`/`(native)`. On a CDP failure `profile` is `null` with `profile_error` set and your `result` still returns. The identical `{timeframe, culprits}` shape comes back from `exec_js` with `opts.profile:true`, so the cli and browser bottom-up views read the same.
+
 ## Envelope
 
 `{ok, stdout, stderr, exit_code, session_id?}`. `stdout` = stringified eval result; `stderr` = page errors + launch diagnostics; `exit_code` non-zero = the dispatch did not land -- read `stderr` and re-dispatch, never blind.
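Reviewer sketch: the prefix ordering described in the added text (`timeout=`, then `url=`, then `capture` or `profile`, expression last) can be illustrated with a small body builder. `buildBrowserBody` and its option names are hypothetical and not part of the package; the sketch only assembles the string and does not dispatch anything.

```javascript
// Hypothetical helper (not in the package): assembles a browser-verb body in
// the documented prefix order: timeout=, then url=, then capture|profile,
// with the expression as the final part.
function buildBrowserBody({ timeoutMs, url, mode, expression }) {
  const lines = [];
  if (timeoutMs !== undefined) lines.push(`timeout=${timeoutMs}`);
  if (url !== undefined) lines.push(`url=${url}`);
  if (mode !== undefined) {
    if (mode !== 'capture' && mode !== 'profile') {
      throw new Error(`unknown mode: ${mode}`);
    }
    lines.push(mode);
  }
  lines.push(expression);
  return lines.join('\n');
}

const body = buildBrowserBody({
  timeoutMs: 30000,
  url: 'https://example.com/',
  mode: 'capture',
  expression: 'return await page.title();',
});
console.log(body);
```

Keeping the order in one place avoids hand-stacking the prefix lines differently at each call site.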
@@ -36,7 +36,7 @@ First emit = closure of the transform; scaffold + IOU externalizes residual cost
 
 Data first -- get the structures and their invariants right and the code writes itself; convoluted control flow means the data model is wrong, so fix the model. Make invalid state unrepresentable -- pass parameters over hidden globals, encode the constraint in the type/shape so the bad combination cannot be constructed. Reason from physical constraints (latency, bandwidth, memory, coordination, the worst node) before designing within them. Keep the spine flat, each unit single-focus and understandable at its call site. Make misuse structurally impossible, not documented-against. Optimize the worst case, not the average; design every failure path explicitly (full -> degraded -> safe-fail -> explicit-error), never a silent catastrophic mode. Measure, do not assume -- profile before optimizing, implement both and compare on real input when in genuine dispute. When a change regresses something that worked, revert first and investigate second: restore green, then diagnose from a known-good base. Fail fast and loud over limping on bad state.
 
-**Process of elimination is the debugging paradigm on every surface, and manual labour against real services is how you witness.** Never guess-and-restart, a/b-test, or shotgun variants: enumerate the candidate causes as mutables, then eliminate each by a witness read against REAL input -- `exec_js` against the real service, `codesearch`/`Read` against the real source, the `browser` verb's `page.evaluate` against a `window.*` global on the live page. Each elimination reveals the next mutable; record it and keep going until one cause survives every other's refutation. Reading the live runtime once observes more than a hundred blind restarts. Profile on the real surface, not from intuition: wrap the suspect node and read the live numbers. In node, `exec_js` carries `duration_ms` for free, surfaces your own timing and `process.memoryUsage()` on stdout, and lands the thrown-error `stack` on stderr -- read both channels (numbers on stdout, stack on stderr). In the browser, a body prefixed `capture\n<script>` auto-returns `{result, debug:{console, pageErrors, network, performance}}` with zero boilerplate. Profile to LOCATE the slow/broken node, then eliminate hypotheses by live measurement. Verification is the same labour: run the real thing and witness the real output (the single mock-free `test.js`, the live page, the real service), never an automated unit/mock harness standing in for the real-services witness. Apparent tooling failure is part of this -- it is your mechanical self-recovery by elimination, never a question for the user.
+**Process of elimination is the debugging paradigm on every surface, and manual labour against real services is how you witness.** Never guess-and-restart, a/b-test, or shotgun variants: enumerate the candidate causes as mutables, then eliminate each by a witness read against REAL input -- `exec_js` against the real service, `codesearch`/`Read` against the real source, the `browser` verb's `page.evaluate` against a `window.*` global on the live page. Each elimination reveals the next mutable; record it and keep going until one cause survives every other's refutation. Reading the live runtime once observes more than a hundred blind restarts. Profile on the real surface, not from intuition: wrap the suspect node and read the live numbers. In node, `exec_js` carries `duration_ms` for free, surfaces your own timing and `process.memoryUsage()` on stdout, and lands the thrown-error `stack` on stderr -- read both channels (numbers on stdout, stack on stderr). In the browser, a body prefixed `capture\n<script>` auto-returns `{result, debug:{console, pageErrors, network, performance}}` with zero boilerplate. When the slow node is not obvious, sample it bottom-up: `exec_js` with `opts.profile:true` and the browser `profile\n<script>` prefix both return `{result, profile:{timeframe:{start_us,end_us,total_us,sample_count}, culprits:[{location,function,self_us,self_pct,hits}]}}` -- the worst-20 `file:line` by self-time across init and code-execution, identical shape on both surfaces, so the culprit ranking points straight at the line to fix. Profile to LOCATE the slow/broken node, then eliminate hypotheses by live measurement. Verification is the same labour: run the real thing and witness the real output (the single mock-free `test.js`, the live page, the real service), never an automated unit/mock harness standing in for the real-services witness. Apparent tooling failure is part of this -- it is your mechanical self-recovery by elimination, never a question for the user.
 
 ## Memorize
 
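Reviewer sketch: one way a caller might read the `{timeframe, culprits}` shape described in the added sentence is to skip the synthetic `(program)`/`(native)` buckets and take the first culprit that maps to a real file. `firstFileCulprit` and the sample response below are hypothetical illustrations, not package code or real measurements.

```javascript
// Hypothetical helper: returns the highest-ranked culprit whose location is a
// real url:line, skipping the synthetic "(program)" / "(native)" buckets.
// Assumes `culprits` is already sorted by self time, as the text describes.
function firstFileCulprit(profile) {
  if (!profile || !Array.isArray(profile.culprits)) return null;
  const hit = profile.culprits.find((c) => !c.location.startsWith('('));
  return hit || null;
}

// Made-up response in the documented shape, for illustration only.
const sampleProfile = {
  timeframe: { start_us: 0, end_us: 1000, total_us: 1000, sample_count: 10 },
  culprits: [
    { location: '(program):0', function: '(anonymous)', self_us: 600, self_pct: 60, hits: 6 },
    { location: 'https://example.com/app.js:42', function: 'render', self_us: 300, self_pct: 30, hits: 3 },
  ],
};

const culprit = firstFileCulprit(sampleProfile);
console.log(culprit ? `${culprit.location} (${culprit.self_pct}%)` : 'no file-backed culprit');
```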
package/gm-plugkit/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "gm-plugkit",
-  "version": "2.0.
+  "version": "2.0.1617",
   "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
   "main": "index.js",
   "bin": {
@@ -621,6 +621,56 @@ function writeJsonFile(fp, value) {
   try { atomicWriteJson(fp, value); } catch (_) {}
 }
 
+const AGGREGATE_CPU_PROFILE_SRC = `function aggregateCpuProfile(profile, topN) {
+  const N = topN || 20;
+  if (!profile || !Array.isArray(profile.nodes) || !Array.isArray(profile.samples)) {
+    return { timeframe: null, culprits: [] };
+  }
+  const byId = new Map();
+  for (const node of profile.nodes) byId.set(node.id, node);
+  const deltas = Array.isArray(profile.timeDeltas) ? profile.timeDeltas : [];
+  const acc = new Map();
+  let total = 0;
+  const sampleCount = profile.samples.length;
+  for (let i = 0; i < profile.samples.length; i++) {
+    const node = byId.get(profile.samples[i]);
+    const dt = deltas[i] || 0;
+    total += dt;
+    if (!node) continue;
+    const cf = node.callFrame || {};
+    let url = cf.url || '';
+    if (!url) url = cf.functionName ? '(native)' : '(program)';
+    const line = (typeof cf.lineNumber === 'number' && cf.lineNumber >= 0) ? cf.lineNumber + 1 : 0;
+    const loc = url + ':' + line;
+    let e = acc.get(loc);
+    if (!e) { e = { location: loc, function: cf.functionName || '(anonymous)', self_us: 0, hits: 0 }; acc.set(loc, e); }
+    e.self_us += dt;
+    e.hits += 1;
+  }
+  const culprits = Array.from(acc.values())
+    .sort((a, b) => b.self_us - a.self_us)
+    .slice(0, N)
+    .map(c => ({ location: c.location, function: c.function, self_us: c.self_us, self_pct: total ? Math.round((c.self_us / total) * 1000) / 10 : 0, hits: c.hits }));
+  return {
+    timeframe: {
+      start_us: typeof profile.startTime === 'number' ? profile.startTime : 0,
+      end_us: typeof profile.endTime === 'number' ? profile.endTime : 0,
+      total_us: total,
+      sample_count: sampleCount,
+    },
+    culprits,
+  };
+}`;
+
+let execProfileSeq = 0;
+let _aggregateCpuProfileFn = null;
+function aggregateCpuProfile(profile, topN) {
+  if (!_aggregateCpuProfileFn) {
+    _aggregateCpuProfileFn = new Function(AGGREGATE_CPU_PROFILE_SRC + '\nreturn aggregateCpuProfile;')();
+  }
+  return _aggregateCpuProfileFn(profile, topN);
+}
+
 const BROWSER_RUNNER_BIN = process.env.GM_BROWSER_RUNNER_BIN || 'playwriter';
 
 function findBrowserRunner() {
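Reviewer sketch: to see what the aggregation in this hunk produces, the function body can be run against a tiny hand-made profile. The function below is copied from the template string above; the profile object (node ids, URLs, deltas) is invented for illustration and is not real V8 output.

```javascript
// Copied from AGGREGATE_CPU_PROFILE_SRC in the hunk above.
function aggregateCpuProfile(profile, topN) {
  const N = topN || 20;
  if (!profile || !Array.isArray(profile.nodes) || !Array.isArray(profile.samples)) {
    return { timeframe: null, culprits: [] };
  }
  const byId = new Map();
  for (const node of profile.nodes) byId.set(node.id, node);
  const deltas = Array.isArray(profile.timeDeltas) ? profile.timeDeltas : [];
  const acc = new Map();
  let total = 0;
  const sampleCount = profile.samples.length;
  for (let i = 0; i < profile.samples.length; i++) {
    const node = byId.get(profile.samples[i]);
    const dt = deltas[i] || 0;
    total += dt;
    if (!node) continue;
    const cf = node.callFrame || {};
    let url = cf.url || '';
    if (!url) url = cf.functionName ? '(native)' : '(program)';
    // CDP line numbers are 0-based; the location is reported 1-based.
    const line = (typeof cf.lineNumber === 'number' && cf.lineNumber >= 0) ? cf.lineNumber + 1 : 0;
    const loc = url + ':' + line;
    let e = acc.get(loc);
    if (!e) { e = { location: loc, function: cf.functionName || '(anonymous)', self_us: 0, hits: 0 }; acc.set(loc, e); }
    e.self_us += dt;
    e.hits += 1;
  }
  const culprits = Array.from(acc.values())
    .sort((a, b) => b.self_us - a.self_us)
    .slice(0, N)
    .map(c => ({ location: c.location, function: c.function, self_us: c.self_us, self_pct: total ? Math.round((c.self_us / total) * 1000) / 10 : 0, hits: c.hits }));
  return {
    timeframe: {
      start_us: typeof profile.startTime === 'number' ? profile.startTime : 0,
      end_us: typeof profile.endTime === 'number' ? profile.endTime : 0,
      total_us: total,
      sample_count: sampleCount,
    },
    culprits,
  };
}

// Invented profile: four samples, 500 microseconds in total.
const syntheticProfile = {
  startTime: 1000,
  endTime: 1500,
  nodes: [
    { id: 1, callFrame: { functionName: '', url: '', lineNumber: -1 } },
    { id: 2, callFrame: { functionName: 'hot', url: 'file:///app.js', lineNumber: 9 } },
    { id: 3, callFrame: { functionName: 'cold', url: 'file:///app.js', lineNumber: 41 } },
  ],
  samples: [2, 2, 3, 1],
  timeDeltas: [100, 300, 50, 50],
};

const summary = aggregateCpuProfile(syntheticProfile);
console.log(summary.culprits[0]);
// → { location: 'file:///app.js:10', function: 'hot', self_us: 400, self_pct: 80, hits: 2 }
```

Two details are worth noting: samples are bucketed by `url:line`, so the `function` field is whichever function name was seen first at that location, and a frame with neither URL nor function name lands in `(program):0`.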
@@ -1870,14 +1920,62 @@ function makeHostFunctions(instanceRef) {
     });
   }
   const timeoutMs = rawTimeout;
+  const wantProfile = opts.profile === true && (lang === 'nodejs' || lang === 'js' || lang === undefined);
+  let profileUserFile = null;
   let cmd, args;
-  if (lang === 'nodejs' || lang === 'js') {
+  if (lang === 'nodejs' || lang === 'js') {
+    if (wantProfile) {
+      profileUserFile = path.join(os.tmpdir(), `gm-prof-${process.pid}-${execProfileSeq++}.js`);
+      fs.writeFileSync(profileUserFile, `module.exports = (async () => {\n${code}\n});`, 'utf-8');
+      const runnerCode = `${AGGREGATE_CPU_PROFILE_SRC}\n`
+        + `const __inspector = require('inspector');\n`
+        + `const __session = new __inspector.Session();\n`
+        + `__session.connect();\n`
+        + `const __post = (m, p) => new Promise((res, rej) => __session.post(m, p || {}, (e, r) => e ? rej(e) : res(r)));\n`
+        + `(async () => {\n`
+        + ` let __profile = null, __profileError = null, __userResult = null, __userError = null;\n`
+        + ` try {\n`
+        + ` await __post('Profiler.enable');\n`
+        + ` await __post('Profiler.setSamplingInterval', { interval: ${Number.isFinite(opts.sampleIntervalUs) && opts.sampleIntervalUs > 0 ? Math.floor(opts.sampleIntervalUs) : 100} });\n`
+        + ` await __post('Profiler.start');\n`
+        + ` try { __userResult = await require(${JSON.stringify(profileUserFile)})(); } catch (ue) { __userError = String(ue && ue.stack || ue); }\n`
+        + ` const __r = await __post('Profiler.stop');\n`
+        + ` __profile = __r && __r.profile || null;\n`
+        + ` } catch (pe) { __profileError = String(pe && pe.message || pe); }\n`
+        + ` const __agg = __profile ? aggregateCpuProfile(__profile) : { timeframe: null, culprits: [] };\n`
+        + ` process.stdout.write('__GM_PROFILE__' + JSON.stringify({ result: __userResult, user_error: __userError, profile: __agg, profile_error: __profileError }));\n`
+        + ` __session.disconnect();\n`
+        + `})();\n`;
+      cmd = process.execPath; args = ['-e', runnerCode];
+    } else {
+      cmd = process.execPath; args = ['-e', code];
+    }
+  }
   else if (lang === 'python') { cmd = 'python'; args = ['-c', code]; }
   else if (lang === 'bash') { cmd = 'bash'; args = ['-c', code]; }
   else if (lang === 'deno') { cmd = 'deno'; args = ['eval', code]; }
   else { return writeWasmJson(instanceRef.value, { ok: false, error: `unsupported lang: ${lang}` }); }
   const __execT0 = Date.now();
   const result = spawnSync(cmd, args, { encoding: 'utf-8', timeout: timeoutMs, cwd, env: process.env });
+  if (profileUserFile) { try { fs.unlinkSync(profileUserFile); } catch (_) {} }
+  if (wantProfile) {
+    const raw = result.stdout || '';
+    const idx = raw.indexOf('__GM_PROFILE__');
+    let parsed = null;
+    if (idx >= 0) { try { parsed = JSON.parse(raw.slice(idx + '__GM_PROFILE__'.length)); } catch (_) {} }
+    return writeWasmJson(instanceRef.value, {
+      ok: result.status === 0 && parsed !== null && !parsed.user_error,
+      stdout: idx >= 0 ? raw.slice(0, idx) : raw,
+      stderr: result.stderr || '',
+      exit_code: result.status === null ? -1 : result.status,
+      timed_out: result.signal === 'SIGTERM',
+      duration_ms: Date.now() - __execT0,
+      result: parsed ? parsed.result : null,
+      profile: parsed ? parsed.profile : { timeframe: null, culprits: [] },
+      profile_error: parsed ? parsed.profile_error : 'profile sentinel not found in stdout',
+      user_error: parsed ? parsed.user_error : null,
+    });
+  }
   return writeWasmJson(instanceRef.value, {
     ok: result.status === 0,
     stdout: result.stdout || '',
@@ -2000,16 +2098,49 @@ function makeHostFunctions(instanceRef) {
     evalBody = timeoutMatch[2];
   }
 }
-
-
-
-
-
-
-
-
-
+let startUrl = null;
+const urlMatch = evalBody.match(/^url=(\S+)[ \t]*\n([\s\S]*)$/);
+if (urlMatch) {
+  startUrl = urlMatch[1];
+  evalBody = urlMatch[2];
+} else {
+  const bare = evalBody.trim();
+  if (/^https?:\/\/\S+$/.test(bare)) {
+    startUrl = bare;
+    evalBody = 'return {url: page.url(), title: await page.title()};';
+  }
+}
+const navTimeout = Math.min(timeoutMs, 60000);
+const gotoPrefix = startUrl
+  ? `await page.goto(${JSON.stringify(startUrl)},{waitUntil:'load',timeout:${navTimeout}});\n`
+  : '';
+const modeMatch = evalBody.match(/^(capture|profile)[ \t]*\n([\s\S]*)$/);
+const debugSetup = `const __logs=[],__errs=[],__net=[];\n`
+  + `try{page.on('console',m=>{try{__logs.push({type:m.type(),text:m.text()});}catch(_){}});`
+  + `page.on('pageerror',e=>{try{__errs.push(String(e&&e.message||e));}catch(_){}});`
+  + `page.on('requestfinished',r=>{try{const t=r.timing();__net.push({url:String(r.url()).slice(0,120),dur_ms:Math.round(t.responseEnd),ttfb_ms:Math.round(t.responseStart)});}catch(_){}});}catch(_){}\n`;
+const perfRead = `let __perf=null;try{__perf=await page.evaluate(()=>{const n=performance.getEntriesByType('navigation')[0];return n?{load_ms:Math.round(n.loadEventEnd||0),dcl_ms:Math.round(n.domContentLoadedEventEnd||0),resources:performance.getEntriesByType('resource').length,now:Math.round(performance.now())}:null;});}catch(_){}\n`;
+if (modeMatch && modeMatch[1] === 'profile') {
+  const userScript = modeMatch[2];
+  const intervalUs = 100;
+  evalBody = debugSetup
+    + `let __profile=null,__profileError=null;\n`
+    + `let __cdp=null;\n`
+    + `try{__cdp=await page.context().newCDPSession(page);await __cdp.send('Profiler.enable');await __cdp.send('Profiler.setSamplingInterval',{interval:${intervalUs}});await __cdp.send('Profiler.start');}catch(e){__profileError=String(e&&e.message||e);__cdp=null;}\n`
+    + `const __result = await (async () => {\n${gotoPrefix}${userScript}\n})();\n`
+    + `if(__cdp){try{const __r=await __cdp.send('Profiler.stop');__profile=__r&&__r.profile||null;}catch(e){__profileError=String(e&&e.message||e);}}\n`
+    + perfRead
+    + AGGREGATE_CPU_PROFILE_SRC + `\n`
+    + `const __agg = __profile ? aggregateCpuProfile(__profile) : {timeframe:null,culprits:[]};\n`
+    + `return {result:__result,profile:__agg,profile_error:__profileError,debug:{console:__logs,pageErrors:__errs,network:__net.slice(0,30),performance:__perf}};`;
+} else if (modeMatch && modeMatch[1] === 'capture') {
+  const userScript = modeMatch[2];
+  evalBody = debugSetup
+    + `const __result = await (async () => {\n${gotoPrefix}${userScript}\n})();\n`
+    + perfRead
     + `return {result:__result,debug:{console:__logs,pageErrors:__errs,network:__net.slice(0,30),performance:__perf}};`;
+} else if (startUrl) {
+  evalBody = `${gotoPrefix}${evalBody}`;
 }
 const outerTimeoutMs = Math.min(timeoutMs + 6000, 126000);
 const r = runBrowserRunner(pw, ['-s', pwSessionId, '--timeout', String(timeoutMs), '-e', evalBody], outerTimeoutMs, cwd, sessionId);
package/gm.json
CHANGED
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "gm-skill",
-  "version": "2.0.
+  "version": "2.0.1617",
   "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
   "author": "AnEntrypoint",
   "license": "MIT",