@humanjs/mcp 0.1.0 → 0.3.0
- package/README.md +9 -2
- package/dist/index.cjs +161 -15
- package/dist/index.cjs.map +1 -1
- package/dist/index.js +162 -16
- package/dist/index.js.map +1 -1
- package/package.json +10 -5
package/README.md
CHANGED
@@ -48,10 +48,15 @@ Some clients can register the server for you, no manual JSON:
 **Claude Code:**

 ```bash
+# this project only (default scope: local)
 claude mcp add humanjs --env HUMANJS_PERSONALITY=careful -- npx -y @humanjs/mcp
-
+
+# all your projects (global): add --scope user (-s user)
+claude mcp add humanjs --scope user --env HUMANJS_PERSONALITY=careful -- npx -y @humanjs/mcp
 ```

+`--scope` is `local` (default, this project only), `user` (you, across all projects), or `project` (shared via a checked-in `.mcp.json`). Use `user` for a one-time global install.
+
 **Cursor** — one click:

 [](https://cursor.com/install-mcp?name=humanjs&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBodW1hbmpzL21jcCJdfQ==)
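The Cursor link above carries the server definition in its `config` query parameter. As a quick check of what that payload contains, the sketch below decodes the exact string from the link with Node's built-in `Buffer` and re-encodes it. Nothing here is part of the package; it only illustrates the encoding.

```javascript
// The `config` value copied verbatim from the Cursor install link.
const encoded = "eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBodW1hbmpzL21jcCJdfQ==";

// base64 -> UTF-8 JSON text -> object
const decodedText = Buffer.from(encoded, "base64").toString("utf8");
const config = JSON.parse(decodedText);

// Re-encoding the parsed object reproduces the original payload,
// because JSON.stringify emits the same compact form with the same key order.
const reencoded = Buffer.from(JSON.stringify(config), "utf8").toString("base64");

console.log(decodedText);
console.log(reencoded === encoded);
```

The same two steps in reverse are one way to build a link for a different command or argument list.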
@@ -66,6 +71,7 @@ The `config` payload is base64 of `{"command":"npx","args":["-y","@humanjs/mcp"]
 | `HUMANJS_SPEED` | `human` \| `fast` \| `instant` | `human` | Humanization pace. `human` = full realistic motion; `fast` = humanized but quick; `instant` = no humanized motion. Changes how long each action *executes*, not the wait between actions. |
 | `HUMANJS_HEADLESS` | `true` \| `false` | `false` | Headless browser. Default is visible — the point of the MCP. |
 | `HUMANJS_OUTPUT_DIR` | path | server's CWD | Where screenshots and recordings are written. |
+| `HUMANJS_UPLOAD_DIR` | path | server's CWD | Folder `human_upload` reads files from (basename only — can't escape it). |
 | `HUMANJS_VIEWPORT` | `WIDTHxHEIGHT` | `1440x900` | Default viewport for new sessions. Bump to `1920x1080` for crisper recordings. |
 | `HUMANJS_AUTO_INSTALL` | `true` \| `false` | `true` | Auto-download the Chromium binary on first launch if missing. Set `false` to require a manual `npx playwright install chromium`. |
 | `HUMANJS_PERSIST` | `true` \| `false` | `false` | Persist a profile across runs (logins/cookies survive). Uses `~/.humanjs/profile` unless `HUMANJS_USER_DATA_DIR` is set. See [Browser modes](#browser-modes). |
@@ -128,7 +134,7 @@ Click / rightClick / move / drag take a **selector or raw x/y coordinates** —
 | Tool | What it does |
 |---|---|
 | `human_start_recording` | Begin capturing (frames + action timeline) |
-| `human_stop_recording` | Finalize and write one or more files — `.mp4` / `.webm`
+| `human_stop_recording` | Finalize and write one or more files — `.mp4` / `.webm` (video), `.gif`, `.json` (timeline), `.ts` (HumanJS script), `.spec.ts` / `.test.ts` (Playwright test). Pass several to export multiple ways, e.g. a video + a ready-to-commit test

 **Sessions** — only needed for parallel browsers; the default session is implicit:

@@ -227,6 +233,7 @@ The server ships **built-in guidance** (sent to the agent on connect via MCP `in

 - **No arbitrary-JS `evaluate` tool.** Executing page-supplied JavaScript is a prompt-injection cliff — a malicious page could trick the agent into running code that exfiltrates data. The read-only inspection tools cover the legitimate "what's on the page" need.
 - **File-path safety.** Tools that write files accept a basename only; path components (`../`, absolute paths) are rejected, so a prompt-injected filename can't escape `HUMANJS_OUTPUT_DIR`.
+- **Upload path safety.** `human_upload` can attach a local file to a web form — a potential exfiltration path if a page prompt-injects the agent. So it reads files by **basename only** from `HUMANJS_UPLOAD_DIR` (default: the server's working dir); subdirectories, `../`, and absolute paths are rejected, so the agent can't reach (and send) files outside that folder. Point `HUMANJS_UPLOAD_DIR` at where your upload fixtures live.
 - **No credentials handling.** The server drives the browser; it doesn't manage logins, payment details, or secrets on your behalf.
 - **Attaching to your real browser (CDP) is opt-in and env-only.** When you point `HUMANJS_CDP_URL` at your running browser, the agent acts with *your* live sessions — a bigger blast radius if a page tries to manipulate it. That's why it's a deliberate config choice you make up front, never something a tool can switch on.

package/dist/index.cjs
CHANGED
@@ -23,6 +23,7 @@ function readEnv() {
     speed: parseSpeed(process.env.HUMANJS_SPEED),
     headless: parseBool(process.env.HUMANJS_HEADLESS, false),
     outputDir: process.env.HUMANJS_OUTPUT_DIR ?? process.cwd(),
+    uploadDir: process.env.HUMANJS_UPLOAD_DIR ?? process.cwd(),
     viewport: parseViewport(process.env.HUMANJS_VIEWPORT),
     autoInstall: parseBool(process.env.HUMANJS_AUTO_INSTALL, true),
     browser: resolveBrowserConfig(),
@@ -192,7 +193,10 @@ var SessionManager = class {
       stop = resolve;
     });
     const video = options.video ?? true;
-    const done = session.human.record(
+    const done = session.human.record(
+      { name: options.name, video, quality: options.quality ?? "high" },
+      () => signal
+    );
     session.recording = {
       name: options.name ?? "recording",
       startedAt: Date.now(),
@@ -589,6 +593,24 @@ function resolveOutputPath(outputDir, filename) {
   }
   return path.join(outputDir, base);
 }
+function resolveUploadPath(uploadDir, filename) {
+  const base = path.basename(filename);
+  if (base !== filename || base.length === 0) {
+    throw new Error(
+      `upload filename must be a plain name with no path components, got "${filename}". Files are read from HUMANJS_UPLOAD_DIR \u2014 place the file there (or point HUMANJS_UPLOAD_DIR at its folder) and pass just the name.`
+    );
+  }
+  return path.join(uploadDir, base);
+}
+function resolveRecordingFormat(filename) {
+  const lower = filename.toLowerCase();
+  if (lower.endsWith(".mp4") || lower.endsWith(".webm")) return "video";
+  if (lower.endsWith(".gif")) return "gif";
+  if (lower.endsWith(".json")) return "timeline";
+  if (lower.endsWith(".spec.ts") || lower.endsWith(".test.ts")) return "playwright";
+  if (lower.endsWith(".ts")) return "humanjs";
+  return null;
+}

 // src/tools/inspection.ts
 var sessionArg = zod.z.string().optional().describe("Session ID to act on. Omit to use the default session.");
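To make the basename rule in `resolveUploadPath` concrete, here is a small standalone sketch that repeats the same check (with a shortened error message) and runs a few filenames through it. The `/srv/fixtures` directory, the sample filenames, and the `accepts` helper are illustrative assumptions, not part of the package.

```javascript
const path = require("node:path");

// Same check as resolveUploadPath in the hunk above; only the message is shortened.
function resolveUploadPath(uploadDir, filename) {
  const base = path.basename(filename);
  if (base !== filename || base.length === 0) {
    throw new Error(`upload filename must be a plain name, got "${filename}"`);
  }
  return path.join(uploadDir, base);
}

// Demo-only helper: true when the name passes the check, false when it throws.
function accepts(filename) {
  try {
    resolveUploadPath("/srv/fixtures", filename); // hypothetical upload dir
    return true;
  } catch {
    return false;
  }
}

const verdicts = {
  plainName: accepts("avatar.png"),        // plain basename
  parentEscape: accepts("../secrets.env"), // contains a path component
  absolutePath: accepts("/etc/passwd"),    // absolute path
  subdirectory: accepts("sub/report.pdf"), // subdirectory
  emptyName: accepts(""),                  // empty string
};
console.log(verdicts);
```

Only the plain name is accepted; every input whose basename differs from the string passed in, and the empty string, is rejected before any file is read.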
@@ -634,6 +656,22 @@ function registerInspectionTools(server, ctx) {
       return { content: [{ type: "text", text }] };
     }
   );
+  server.registerTool(
+    "human_outline",
+    {
+      title: "Page outline (accessibility tree)",
+      description: 'Returns a compact accessibility-tree outline of the page (or a region) \u2014 every interactive element and landmark by its ARIA role + accessible name, as YAML (e.g. `- button "Sign in"`, `- textbox "Email"`). The most token-efficient way to see what is actionable and pick a selector: the names map directly to getByRole / accessible-name selectors. Prefer this over human_get_html for "what can I click or fill"; use human_screenshot when you need the visual layout.',
+      inputSchema: {
+        selector: zod.z.string().optional().describe("Optional region selector to scope the outline. Omit for the whole page."),
+        session: sessionArg
+      }
+    },
+    async ({ selector, session }) => {
+      const { human } = await ctx.sessions.get(session);
+      const text = await human.outline(selector);
+      return { content: [{ type: "text", text }] };
+    }
+  );
   server.registerTool(
     "human_get_text",
     {
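The `human_outline` description gives two sample lines of its YAML output (`- button "Sign in"`, `- textbox "Email"`). As a rough illustration of how a client might turn lines of that shape into role/name pairs, here is a minimal sketch. The `parseOutline` helper and the sample text are hypothetical, and the real outline may contain line shapes this simple pattern does not cover.

```javascript
// Hypothetical helper, not part of @humanjs/mcp: extract { role, name } from
// lines shaped like `- button "Sign in"`. Lines without a quoted name are skipped.
function parseOutline(outlineText) {
  const entries = [];
  for (const line of outlineText.split("\n")) {
    const match = line.match(/^\s*-\s+([A-Za-z]+)\s+"([^"]*)"/);
    if (match) entries.push({ role: match[1], name: match[2] });
  }
  return entries;
}

// Made-up sample in the shape the description shows; real output may differ.
const sampleOutline = [
  '- heading "Welcome back"',
  '- textbox "Email"',
  '  - button "Sign in"',
  "- main",
].join("\n");

const parsedEntries = parseOutline(sampleOutline);
console.log(parsedEntries);
```

Each pair corresponds to the role and accessible name that the description says map to getByRole-style selectors.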
@@ -712,7 +750,7 @@ function resolveTarget(input) {
 var sessionArg2 = zod.z.string().optional().describe(
   "Session ID to act on. Omit to use the default session (created lazily on first call). Use human_create_session for parallel browsers."
 );
-function registerPrimitiveTools(server, { sessions }) {
+function registerPrimitiveTools(server, { sessions, env }) {
   server.registerTool(
     "human_goto",
     {
@@ -759,6 +797,22 @@ function registerPrimitiveTools(server, { sessions }) {
       };
     }
   );
+  server.registerTool(
+    "human_doubleClick",
+    {
+      title: "Double-click (humanized)",
+      description: "Double-clicks the target \u2014 same humanized motion as human_click, but two presses within the OS double-click window. Use for things that open/activate on double-click (list rows, file items, editable cells). Target is a selector OR x/y coordinates.",
+      inputSchema: { ...targetFields, session: sessionArg2 }
+    },
+    async ({ selector, x, y, session }) => {
+      const { human } = await sessions.get(session);
+      const target = resolveTarget({ selector, x, y });
+      await human.doubleClick(target);
+      return {
+        content: [{ type: "text", text: `double-clicked ${describeTarget(selector, x, y)}` }]
+      };
+    }
+  );
   server.registerTool(
     "human_hover",
     {
@@ -853,6 +907,94 @@ function registerPrimitiveTools(server, { sessions }) {
       return { content: [{ type: "text", text: `pasted ${value.length} chars into ${selector}` }] };
     }
   );
+  server.registerTool(
+    "human_clear",
+    {
+      title: "Clear a field (humanized)",
+      description: "Clears a text field (input/textarea/contenteditable) with a real keyboard gesture \u2014 click to focus, select-all, then delete \u2014 firing the input events the page expects. Use before human_type when you need to replace an existing value rather than append to it.",
+      inputSchema: {
+        selector: zod.z.string().describe("Selector of the field to clear."),
+        session: sessionArg2
+      }
+    },
+    async ({ selector, session }) => {
+      const { human } = await sessions.get(session);
+      await human.clear(selector);
+      return { content: [{ type: "text", text: `cleared ${selector}` }] };
+    }
+  );
+  server.registerTool(
+    "human_check",
+    {
+      title: "Check a box (humanized)",
+      description: "Ticks a checkbox or radio \u2014 moves the cursor to it and clicks, but only if it is not already checked (a real user does not re-click a ticked box). Verifies the resulting state. Pass the checkbox/radio input itself (or a [role=checkbox]) \u2014 not a wrapping <label> \u2014 so the current state can be read and the click stays idempotent.",
+      inputSchema: {
+        selector: zod.z.string().describe("Selector of the checkbox/radio input."),
+        session: sessionArg2
+      }
+    },
+    async ({ selector, session }) => {
+      const { human } = await sessions.get(session);
+      await human.check(selector);
+      return { content: [{ type: "text", text: `checked ${selector}` }] };
+    }
+  );
+  server.registerTool(
+    "human_uncheck",
+    {
+      title: "Uncheck a box (humanized)",
+      description: "Unticks a checkbox \u2014 humanized click only if currently checked. Radios cannot be unchecked by clicking (select a different option instead). Pass the checkbox input itself (or a [role=checkbox]) \u2014 not a wrapping <label> \u2014 so its state can be read and the click stays idempotent.",
+      inputSchema: {
+        selector: zod.z.string().describe("Selector of the checkbox input."),
+        session: sessionArg2
+      }
+    },
+    async ({ selector, session }) => {
+      const { human } = await sessions.get(session);
+      await human.uncheck(selector);
+      return { content: [{ type: "text", text: `unchecked ${selector}` }] };
+    }
+  );
+  server.registerTool(
+    "human_selectOption",
+    {
+      title: "Select dropdown option (humanized)",
+      description: "Chooses option(s) in a native <select> \u2014 moves the cursor to the dropdown, then sets the value (native selects open an OS menu automation can't drive, so the value is set programmatically, firing change/input). For custom DOM dropdowns, use human_click on the rendered options instead. Match by value(s); pass one string or an array for multi-selects.",
+      inputSchema: {
+        selector: zod.z.string().describe("Selector of the <select> element."),
+        values: zod.z.union([zod.z.string(), zod.z.array(zod.z.string())]).describe("Option value, or array of values for a multi-select."),
+        session: sessionArg2
+      }
+    },
+    async ({ selector, values, session }) => {
+      const { human } = await sessions.get(session);
+      const selected = await human.selectOption(selector, values);
+      return {
+        content: [{ type: "text", text: `selected ${selected.join(", ")} in ${selector}` }]
+      };
+    }
+  );
+  server.registerTool(
+    "human_upload",
+    {
+      title: "Upload file(s) (humanized)",
+      description: `Attaches file(s) to a file input \u2014 moves the cursor to the control, then sets the files (never opens the OS dialog, which would hang). For safety, files are read by basename from HUMANJS_UPLOAD_DIR (default: the server working dir) \u2014 subdirectories, "../", and absolute paths are rejected, so the agent can't read and exfiltrate arbitrary local files. Pass the <input type="file"> selector and the filename(s).`,
+      inputSchema: {
+        selector: zod.z.string().describe("Selector of the file input."),
+        files: zod.z.union([zod.z.string(), zod.z.array(zod.z.string())]).describe("Filename(s) inside HUMANJS_UPLOAD_DIR \u2014 a basename only, no path components."),
+        session: sessionArg2
+      }
+    },
+    async ({ selector, files, session }) => {
+      const { human } = await sessions.get(session);
+      const names = Array.isArray(files) ? files : [files];
+      const paths = names.map((name) => resolveUploadPath(env.uploadDir, name));
+      await human.upload(selector, paths);
+      return {
+        content: [{ type: "text", text: `uploaded ${paths.length} file(s) to ${selector}` }]
+      };
+    }
+  );
   server.registerTool(
     "human_press",
     {
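The `human_check` and `human_uncheck` descriptions promise an idempotent click: act only when the current state differs from the desired one, then verify the result. The diff does not show how `human.check` implements this, so the following is only a sketch of that pattern against a hypothetical in-memory checkbox. `FakeCheckbox` and `setChecked` are invented names, and the sketch is synchronous for brevity where the real tools are async.

```javascript
// Hypothetical stand-in for a checkbox input; the real tools drive a live page.
class FakeCheckbox {
  constructor(checked) {
    this.checked = checked;
    this.clicks = 0;
  }
  click() {
    this.clicks += 1;
    this.checked = !this.checked;
  }
}

// Click only when the state differs from the desired one, then verify it changed.
function setChecked(box, desired) {
  if (box.checked === desired) return false; // already there: no click issued
  box.click();
  if (box.checked !== desired) {
    throw new Error(`checkbox did not reach checked=${desired}`);
  }
  return true; // one click was issued
}

const box = new FakeCheckbox(false);
const firstCall = setChecked(box, true);  // unchecked -> checked, clicks once
const secondCall = setChecked(box, true); // already checked, no click
const thirdCall = setChecked(box, false); // checked -> unchecked, clicks once
console.log({ firstCall, secondCall, thirdCall, clicks: box.clicks });
```

This is also why the descriptions ask for the input itself rather than a wrapping `<label>`: the pattern depends on reading the current checked state before deciding whether to click.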
@@ -955,32 +1097,32 @@ function registerRecordingTools(server, { sessions, env }) {
     "human_stop_recording",
     {
       title: "Stop recording and save",
-      description: `Stops the active recording and writes it to one or more files in HUMANJS_OUTPUT_DIR. Each filename's extension picks its format: .mp4/.webm = video, .gif = animated gif, .json = action timeline. Pass several to export the same recording multiple ways, e.g. ["demo.mp4", "
+      description: `Stops the active recording and writes it to one or more files in HUMANJS_OUTPUT_DIR. Each filename's extension picks its format: .mp4/.webm = video, .gif = animated gif, .json = action timeline, .ts = runnable HumanJS script, .spec.ts/.test.ts = @playwright/test spec (humanized, with derived assertions). Pass several to export the same recording multiple ways, e.g. ["demo.mp4", "checkout.spec.ts"] for a video plus a ready-to-commit test. Path components are rejected for safety.`,
       inputSchema: {
         filenames: zod.z.array(zod.z.string()).min(1).describe(
-          'One or more output filenames. The recording is saved to each, format chosen by extension. e.g. ["demo.mp4"] or ["demo.mp4", "demo.
+          'One or more output filenames. The recording is saved to each, format chosen by extension. e.g. ["demo.mp4"], ["checkout.spec.ts"], or ["demo.mp4", "demo.json", "demo.ts"].'
         ),
         session: zod.z.string().optional().describe("Session ID. Omit for the default session.")
       }
     },
     async ({ filenames, session }) => {
-      const targets = filenames.map((filename) =>
-
-
-      }));
-      for (const { ext } of targets) {
-        if (ext !== ".mp4" && ext !== ".webm" && ext !== ".gif" && ext !== ".json") {
+      const targets = filenames.map((filename) => {
+        const format = resolveRecordingFormat(filename);
+        if (format === null) {
           throw new Error(
-            `Unsupported output extension "${
+            `Unsupported output extension for "${filename}". Use .mp4/.webm (video), .gif, .json (timeline), .ts (HumanJS script), or .spec.ts/.test.ts (Playwright test).`
           );
         }
-
+        return { path: resolveOutputPath(env.outputDir, filename), format };
+      });
       const recording = await sessions.stopRecording(session);
       try {
         const saved = [];
-        for (const { path,
-          if (
-          else if (
+        for (const { path, format } of targets) {
+          if (format === "gif") saved.push(await recording.toGif(path));
+          else if (format === "timeline") saved.push(await recording.toTimeline(path));
+          else if (format === "humanjs") saved.push(await recording.toHumanJS(path));
+          else if (format === "playwright") saved.push(await recording.toPlaywright(path));
           else saved.push(await recording.toVideo(path));
         }
         return { content: [{ type: "text", text: `saved recording to:
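The handler above dispatches on the string returned by `resolveRecordingFormat`, and the order of checks in that function matters: `.spec.ts` and `.test.ts` must be tested before the plain `.ts` suffix, or every Playwright test name would be classified as a HumanJS script. The sketch below copies the function from the earlier hunk and runs a handful of made-up filenames through it.

```javascript
// Copied from the resolveRecordingFormat hunk earlier in this diff.
function resolveRecordingFormat(filename) {
  const lower = filename.toLowerCase();
  if (lower.endsWith(".mp4") || lower.endsWith(".webm")) return "video";
  if (lower.endsWith(".gif")) return "gif";
  if (lower.endsWith(".json")) return "timeline";
  // Must come before the plain ".ts" check: both suffixes also end in ".ts".
  if (lower.endsWith(".spec.ts") || lower.endsWith(".test.ts")) return "playwright";
  if (lower.endsWith(".ts")) return "humanjs";
  return null;
}

// Illustrative filenames only.
const sampleNames = [
  "demo.mp4",
  "demo.GIF",
  "demo.json",
  "checkout.spec.ts",
  "login.test.ts",
  "flow.ts",
  "notes.txt",
];
const formatsByName = Object.fromEntries(
  sampleNames.map((name) => [name, resolveRecordingFormat(name)])
);
console.log(formatsByName);
```

Matching is case-insensitive because the name is lowercased first, and an unrecognized extension yields `null`, which the handler turns into the "Unsupported output extension" error before the recording is stopped.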
@@ -1065,6 +1207,10 @@ Recording a flow (the natural-looking way):
 1. EXPLORE FIRST (un-recorded). Navigate the flow once to discover correct, unambiguous selectors (human_screenshot / human_get_html / human_get_attribute). Do this by default whenever the selectors aren't already known \u2014 no need for the user to ask. Skip it only if the selectors are already known or the user tells you not to explore.
 2. THEN RECORD ONE CLEAN RUN AS A SINGLE BATCH: human_start_recording + every action + human_stop_recording, all emitted in one turn. Keep selector-guessing and fumbles out of the take.

+Export as a test: human_stop_recording picks format by extension. A .spec.ts (or .test.ts) filename writes a ready-to-commit @playwright/test with derived assertions; a .ts writes a standalone HumanJS script; .mp4/.webm/.gif/.json are video/timeline. So "record this flow and save it as a test" = run the clean pass, then stop into e.g. "checkout.spec.ts".
+
+Captured input + passwords: typed/pasted text IS recorded into the timeline and code exports, so generated scripts/tests are runnable \u2014 EXCEPT password fields, which are always masked (emitted as an empty string with a "fill in" comment). This is intentional, not a bug; don't work around it by hand-editing the secret back in. If the user explicitly wants the flow to log in, edit the exported file to read the credential from an env var (e.g. process.env.APP_PASSWORD) and tell them to set it \u2014 never hardcode a real password into a file that may be committed.
+
 Dynamic UI: prefer specific selectors (role, aria-label) over text \u2014 the same visible text often matches several cards before a filter, or the wrong one after. If a click reports multiple matches, narrow the selector.

 Browser state: by default each run is a fresh, signed-out browser. If a flow needs a login, tell the user to enable persistence (human_enable_persistence or HUMANJS_PERSIST) or CDP attach \u2014 see human_browser_info.`;
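The guidance above tells the agent to replace a masked password in an exported file with a value read from an environment variable such as `process.env.APP_PASSWORD`. One possible shape for that edit is a small helper that fails loudly when the variable is unset, so a test does not silently type an empty password. `requireEnv` is a hypothetical name and is not something the exporter generates. The second parameter exists only so the sketch can be run without touching the real environment.

```javascript
// Hypothetical helper for a hand-edited export; not generated by @humanjs/mcp.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Set ${name} in the environment before running this flow.`);
  }
  return value;
}

// Demo with an injected environment object instead of the real process.env.
const demoPassword = requireEnv("APP_PASSWORD", { APP_PASSWORD: "example-only" });

let missingMessage = null;
try {
  requireEnv("APP_PASSWORD", {});
} catch (error) {
  missingMessage = error.message;
}
console.log(demoPassword.length, missingMessage);
```

In an exported script, the masked empty string would then be replaced by a call like `requireEnv("APP_PASSWORD")`, keeping the secret out of any file that may be committed.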