npm - @geometra/mcp - Versions diffs - 1.19.11 → 1.19.13 - Mend

@geometra/mcp 1.19.11 → 1.19.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +47 -22
package/dist/__tests__/connect-utils.test.js +44 -2
package/dist/__tests__/server-batch-results.test.js +219 -3
package/dist/__tests__/session-model.test.js +230 -1
package/dist/proxy-spawn.js +80 -4
package/dist/server.d.ts +1 -0
package/dist/server.js +455 -18
package/dist/session.d.ts +87 -0
package/dist/session.js +350 -6
package/package.json +2 -2

package/README.md CHANGED

@@ -19,12 +19,15 @@ Geometra proxy:       Chromium → DOM geometry → same WebSocket as native →
 | Tool | Description |
 |---|---|
 | `geometra_connect` | Connect with `url` (ws://…) **or** `pageUrl` (https://…) to auto-start geometra-proxy; `url: "https://…"` is auto-coerced onto the proxy path |
-| `geometra_query` | Find elements by stable id, role, name, text content, current value, or semantic state such as `invalid`, `required`, or `busy` |
+| `geometra_query` | Find elements by stable id, role, name, text content, ancestor/prompt context, current value, or semantic state such as `invalid`, `required`, or `busy` |
 | `geometra_wait_for` | Wait for a semantic condition instead of guessing sleeps (`busy`, `disabled`, alerts, values, etc.) |
+| `geometra_form_schema` | Compact, fill-oriented form schema with stable field ids and collapsed radio/button groups |
+| `geometra_fill_form` | Fill a form from `valuesById` / `valuesByLabel` in one MCP call; preferred low-token happy path for standard forms |
 | `geometra_fill_fields` | Fill labeled text/choice/toggle/file fields in one MCP call; can return final-only status for the smallest responses |
 | `geometra_run_actions` | Execute a batch of high-level actions in one MCP round trip and get one consolidated result, with optional final-only output |
 | `geometra_page_model` | Summary-first webpage model: archetypes, stable section ids, counts, top-level sections, primary actions |
-| `geometra_expand_section` | Expand one form/dialog/list/landmark from `geometra_page_model` on demand |
+| `geometra_expand_section` | Expand one form/dialog/list/landmark from `geometra_page_model` on demand, with paging/filtering for long sections |
+| `geometra_reveal` | Scroll until a matching node is visible instead of guessing wheel deltas |
 | `geometra_click` | Click an element by coordinates |
 | `geometra_type` | Type text into the focused element |
 | `geometra_key` | Send special keys (Enter, Tab, Escape, arrows) |
@@ -274,19 +277,15 @@ With `python3 -m http.server 8080` in `demos/proxy-mcp-sample` and `npx geometra
 Agent:  geometra_connect({ url: "ws://127.0.0.1:3200" })
         → Connected. UI includes textbox "Email", button "Save", …
-Agent:  geometra_page_model({})
-        → {"viewport":{"width":1024,"height":768},"archetypes":["shell","form"],"summary":{...},"forms":[{"id":"fm:1.0","fieldCount":3,"actionCount":1}], ...}
+Agent:  geometra_form_schema({})
+        → {"forms":[{"formId":"fm:1.0","fields":[{"id":"ff:1.0.0","label":"Email"}, ...]}]}
-Agent:  geometra_expand_section({ id: "fm:1.0" })
-        → {"id":"fm:1.0","kind":"form","fields":[{"id":"n:1.0.0","name":"Email"}, ...], "actions":[...]}
-Agent:  geometra_query({ role: "textbox", name: "Email" })
-        → bounds for the email field (viewport coordinates)
-Agent:  geometra_click({ x: <center-x>, y: <center-y> })
-        → Focuses the input
-Agent:  geometra_type({ text: "hello@example.com" })
+Agent:  geometra_fill_form({
+          formId: "fm:1.0",
+          valuesByLabel: { "Email": "hello@example.com" },
+          failOnInvalid: true
+        })
+        → {"completed":true,"successCount":1,"errorCount":0,"final":{"invalidCount":0,...}}
 Agent:  geometra_query({ role: "button", name: "Save" })
         → Click center to submit the sample form; status text updates in the DOM
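
The transcript above fills a form with `valuesByLabel`, while the schema response exposes stable field ids. As a rough illustration only, a caller could map labels onto those ids before sending `valuesById`. The helper name `resolveValuesByLabel` and the `unresolved` list are hypothetical and not part of `@geometra/mcp`; the schema shape is assumed from the `geometra_form_schema` response shown in the README.

```javascript
// Hypothetical client-side helper; NOT part of @geometra/mcp.
// Assumes a schema shaped like { formId, fields: [{ id, label }] }.
function resolveValuesByLabel(schema, valuesByLabel) {
  // Group field ids by label so duplicate labels can be detected.
  const idsByLabel = new Map();
  for (const field of schema.fields) {
    const ids = idsByLabel.get(field.label) ?? [];
    ids.push(field.id);
    idsByLabel.set(field.label, ids);
  }

  const valuesById = {};
  const unresolved = [];
  for (const [label, value] of Object.entries(valuesByLabel)) {
    const ids = idsByLabel.get(label) ?? [];
    if (ids.length === 1) {
      valuesById[ids[0]] = value; // exactly one field carries this label
    } else {
      unresolved.push({ label, matches: ids.length }); // missing or ambiguous
    }
  }
  return { formId: schema.formId, valuesById, unresolved };
}

const schema = { formId: "fm:1.0", fields: [{ id: "ff:1.0.0", label: "Email" }] };
const resolved = resolveValuesByLabel(schema, {
  Email: "hello@example.com",
  Name: "Taylor Applicant",
});
```

Labels that match zero or several fields are reported rather than guessed, which mirrors the README's advice to disambiguate repeated controls explicitly.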
@@ -298,20 +297,22 @@ Agent:  geometra_query({ role: "button", name: "Save" })
 2. It receives the computed layout (`{ x, y, width, height }` for every node) and the UI tree (`kind`, `semantic`, `props`, `handlers`, `children`).
 3. It builds an accessibility tree from that data — roles, names, focusable state, bounds.
 4. **`geometra_snapshot`** defaults to a **compact** flat list of viewport-visible actionable nodes (minified JSON) to reduce LLM tokens; use `view: "full"` for the complete nested tree.
-5. **`geometra_page_model`** is summary-first: page archetypes, stable section ids, counts, top-level landmarks/forms/dialogs/lists, and a few primary actions. It is designed to be cheaper than dumping full previews for every section.
-6. **`geometra_expand_section`** fetches richer details only for the section you care about (fields, actions, headings, nested lists, list items, text preview).
-7. After interactions, action tools return a **semantic delta** when possible (dialogs opened/closed, forms appeared/removed, list counts changed, named/focusable nodes added/removed/updated). If nothing meaningful changed, they fall back to a short current-UI overview.
-8. Tools expose query, click, type, snapshot, page-model, and section-expansion operations over this structured data.
-9. After each interaction, the peer sends updated geometry (full `frame` or `patch`) — the MCP tools interpret that into compact summaries.
+5. **`geometra_form_schema`** is the compact form-specific path: stable field ids, required/invalid state, current values, and collapsed choice groups without layout-heavy section detail.
+6. **`geometra_fill_form`** turns a compact values object into semantic field operations server-side, so the model does not need to emit one tool call per field.
+7. **`geometra_page_model`** is still the right summary-first path for non-form exploration: page archetypes, stable section ids, counts, top-level landmarks/forms/dialogs/lists, and a few primary actions.
+8. **`geometra_expand_section`** fetches richer details only for the section you care about (fields, actions, headings, nested lists, list items, text preview).
+9. After interactions, action tools return a **semantic delta** when possible (dialogs opened/closed, forms appeared/removed, list counts changed, named/focusable nodes added/removed/updated). If nothing meaningful changed, they fall back to a short current-UI overview.
+10. After each interaction, the peer sends updated geometry (full `frame` or `patch`) — the MCP tools interpret that into compact summaries.
 ## Long Forms
 For long application flows, prefer one of these patterns:
-1. `geometra_page_model`
-2. `geometra_expand_section`
-3. `geometra_fill_fields` for obvious field entry
+1. `geometra_form_schema`
+2. `geometra_fill_form`
+3. `geometra_reveal` for far-below-fold targets such as submit buttons
 4. `geometra_run_actions` when you need mixed navigation + waits + field entry
+5. `geometra_page_model` + `geometra_expand_section` when you are still exploring the page rather than filling it
 Typical batch:
@@ -333,6 +334,30 @@ For the smallest long-form responses, prefer:
 1. `detail: "minimal"` for structured step metadata instead of narrated deltas
 2. `includeSteps: false` when you only need aggregate success/error counts plus the final validation/state payload
+Typical low-token form fill:
+```json
+{
+  "formId": "fm:1.0",
+  "valuesById": {
+    "ff:1.0.0": "Taylor Applicant",
+    "ff:1.0.1": "taylor@example.com",
+    "ff:1.0.2": "Germany",
+    "ff:1.0.3": "No"
+  },
+  "failOnInvalid": true,
+  "includeSteps": false,
+  "detail": "minimal"
+}
+```
+For long single-page forms:
+1. Use `geometra_expand_section` with `fieldOffset` / `actionOffset` to page through large forms instead of taking a full snapshot.
+2. Add `onlyRequiredFields: true` or `onlyInvalidFields: true` when you want the actionable subset.
+3. Use `contextText` in `geometra_query` / `geometra_wait_for` to disambiguate repeated `Yes` / `No` controls by question text.
+4. Use `geometra_reveal` instead of manual wheel loops when the next target is offscreen.
 Typical field fill:
 ```json

package/dist/__tests__/connect-utils.test.js CHANGED

@@ -1,4 +1,4 @@
-import { existsSync, mkdirSync, mkdtempSync, rmSync, symlinkSync, writeFileSync } from 'node:fs';
+import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from 'node:fs';
 import { createRequire } from 'node:module';
 import { tmpdir } from 'node:os';
 import path from 'node:path';
@@ -78,7 +78,22 @@ describe('proxy ready helpers', () => {
             const packageDir = path.join(scopeDir, 'proxy');
             const probePath = path.join(tempRoot, 'probe.cjs');
             mkdirSync(scopeDir, { recursive: true });
-            symlinkSync(path.resolve(process.cwd(), 'packages/proxy'), packageDir, 'dir');
+            mkdirSync(path.join(packageDir, 'src'), { recursive: true });
+            writeFileSync(path.join(packageDir, 'package.json'), JSON.stringify({
+                name: '@geometra/proxy',
+                version: '0.0.0-test',
+                type: 'module',
+            }));
+            writeFileSync(path.join(packageDir, 'tsconfig.build.json'), JSON.stringify({
+                extends: path.resolve(process.cwd(), 'tsconfig.base.json'),
+                compilerOptions: {
+                    outDir: 'dist',
+                    rootDir: 'src',
+                    noEmit: false,
+                },
+                include: ['src'],
+            }));
+            writeFileSync(path.join(packageDir, 'src', 'index.ts'), 'console.log("proxy");\n');
             writeFileSync(probePath, 'module.exports = {}');
             const customRequire = createRequire(probePath);
             const scriptPath = resolveProxyScriptPathWith(customRequire);
@@ -89,6 +104,33 @@ describe('proxy ready helpers', () => {
             rmSync(tempRoot, { recursive: true, force: true });
         }
     });
+    it('prefers the current workspace proxy dist over a bundled nested dependency in source checkouts', () => {
+        const tempRoot = mkdtempSync(path.join(tmpdir(), 'geometra-proxy-workspace-prefer-'));
+        try {
+            const workspaceDistDir = path.join(tempRoot, 'packages', 'proxy', 'dist');
+            const bundledProxyDir = path.join(tempRoot, 'mcp', 'node_modules', '@geometra', 'proxy');
+            const bundledDistDir = path.join(bundledProxyDir, 'dist');
+            const mcpDistDir = path.join(tempRoot, 'mcp', 'dist');
+            const probePath = path.join(mcpDistDir, 'proxy-spawn.cjs');
+            mkdirSync(workspaceDistDir, { recursive: true });
+            mkdirSync(bundledDistDir, { recursive: true });
+            mkdirSync(mcpDistDir, { recursive: true });
+            writeFileSync(path.join(workspaceDistDir, 'index.js'), 'export const source = "workspace";\n');
+            writeFileSync(path.join(bundledDistDir, 'index.js'), 'export const source = "bundled";\n');
+            writeFileSync(path.join(bundledProxyDir, 'package.json'), JSON.stringify({
+                name: '@geometra/proxy',
+                version: '0.0.0-test',
+                type: 'module',
+            }));
+            writeFileSync(probePath, 'module.exports = {};\n');
+            const customRequire = createRequire(probePath);
+            const scriptPath = resolveProxyScriptPathWith(customRequire, mcpDistDir);
+            expect(scriptPath).toBe(path.join(workspaceDistDir, 'index.js'));
+        }
+        finally {
+            rmSync(tempRoot, { recursive: true, force: true });
+        }
+    });
     it('falls back to the packaged sibling proxy dist when package exports are stale', () => {
         const tempRoot = mkdtempSync(path.join(tmpdir(), 'geometra-proxy-stale-exports-'));
         try {
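
The new test above expects the workspace copy of the proxy (`packages/proxy/dist/index.js`) to win over a bundled copy under `mcp/node_modules/@geometra/proxy`. The actual lookup lives in `package/dist/proxy-spawn.js` and is not shown in this hunk, so the following is only a sketch of that ordering. The function `pickProxyScript`, the injected `exists` predicate, and the POSIX-style string paths are all assumptions made for illustration.

```javascript
// Hypothetical sketch of "prefer workspace dist over bundled dependency".
// NOT the code in proxy-spawn.js; paths follow the layout used by the test.
function pickProxyScript(workspaceRoot, exists) {
  // Candidates are listed in priority order: first existing file wins.
  const candidates = [
    { source: "workspace", file: `${workspaceRoot}/packages/proxy/dist/index.js` },
    { source: "bundled", file: `${workspaceRoot}/mcp/node_modules/@geometra/proxy/dist/index.js` },
  ];
  return candidates.find((candidate) => exists(candidate.file)) ?? null;
}

// Usage with an in-memory "file system" instead of node:fs.
const files = new Set([
  "/repo/packages/proxy/dist/index.js",
  "/repo/mcp/node_modules/@geometra/proxy/dist/index.js",
]);
const picked = pickProxyScript("/repo", (file) => files.has(file));
const bundledOnly = pickProxyScript("/repo", (file) => file.includes("node_modules"));
const nothing = pickProxyScript("/repo", () => false);
```

Passing the existence check as a parameter keeps the sketch testable without touching the disk; a real implementation would more likely use `existsSync` from `node:fs`.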

package/dist/__tests__/server-batch-results.test.js CHANGED

@@ -7,7 +7,7 @@ function node(role, name, options) {
         ...(options?.state ? { state: options.state } : {}),
         ...(options?.validation ? { validation: options.validation } : {}),
         ...(options?.meta ? { meta: options.meta } : {}),
-        bounds: { x: 0, y: 0, width: 120, height: 40 },
+        bounds: options?.bounds ?? { x: 0, y: 0, width: 120, height: 40 },
         path: options?.path ?? [],
         children: options?.children ?? [],
         focusable: role !== 'group',
@@ -23,6 +23,9 @@ const mockState = vi.hoisted(() => ({
         url: 'ws://127.0.0.1:3200',
         updateRevision: 1,
     },
+    formSchemas: [],
+    connect: vi.fn(),
+    connectThroughProxy: vi.fn(),
     sendClick: vi.fn(async () => ({ status: 'updated', timeoutMs: 2000 })),
     sendType: vi.fn(async () => ({ status: 'updated', timeoutMs: 2000 })),
     sendKey: vi.fn(async () => ({ status: 'updated', timeoutMs: 2000 })),
@@ -36,8 +39,8 @@ const mockState = vi.hoisted(() => ({
     waitForUiCondition: vi.fn(async () => true),
 }));
 vi.mock('../session.js', () => ({
-    connect: vi.fn(),
-    connectThroughProxy: vi.fn(),
+    connect: mockState.connect,
+    connectThroughProxy: mockState.connectThroughProxy,
     disconnect: vi.fn(),
     getSession: vi.fn(() => mockState.session),
     sendClick: mockState.sendClick,
@@ -62,6 +65,7 @@ vi.mock('../session.js', () => ({
         dialogs: [],
         lists: [],
     })),
+    buildFormSchemas: vi.fn(() => mockState.formSchemas),
     expandPageSection: vi.fn(() => null),
     buildUiDelta: vi.fn(() => ({})),
     hasUiDelta: vi.fn(() => false),
@@ -79,6 +83,9 @@ function getToolHandler(name) {
 describe('batch MCP result shaping', () => {
     beforeEach(() => {
         vi.clearAllMocks();
+        mockState.connect.mockResolvedValue(mockState.session);
+        mockState.connectThroughProxy.mockResolvedValue(mockState.session);
+        mockState.formSchemas = [];
         mockState.currentA11yRoot = node('group', undefined, {
             meta: { pageUrl: 'https://jobs.example.com/application', scrollX: 0, scrollY: 420 },
             children: [
@@ -203,4 +210,213 @@ describe('batch MCP result shaping', () => {
         expect(final.invalidFields.length).toBe(4);
         expect(final.alerts.length).toBe(1);
     });
+    it('returns a compact structured connect payload by default', async () => {
+        const handler = getToolHandler('geometra_connect');
+        const result = await handler({
+            pageUrl: 'https://jobs.example.com/application',
+            headless: true,
+        });
+        const payload = JSON.parse(result.content[0].text);
+        expect(payload).toMatchObject({
+            connected: true,
+            transport: 'proxy',
+            wsUrl: 'ws://127.0.0.1:3200',
+            pageUrl: 'https://jobs.example.com/application',
+        });
+        expect(payload).not.toHaveProperty('currentUi');
+    });
+    it('returns compact form schemas without requiring section expansion', async () => {
+        const handler = getToolHandler('geometra_form_schema');
+        mockState.formSchemas = [
+            {
+                formId: 'fm:0',
+                name: 'Application',
+                fieldCount: 4,
+                requiredCount: 3,
+                invalidCount: 0,
+                fields: [
+                    { id: 'ff:0.0', kind: 'text', label: 'Full name', required: true },
+                    { id: 'ff:0.1', kind: 'choice', label: 'Preferred location', required: true },
+                    { id: 'ff:0.2', kind: 'choice', label: 'Are you legally authorized to work in Germany?', options: ['Yes', 'No'], optionCount: 2 },
+                    { id: 'ff:0.3', kind: 'toggle', label: 'Share my profile for future roles', controlType: 'checkbox' },
+                ],
+            },
+        ];
+        const result = await handler({ maxFields: 20 });
+        const payload = JSON.parse(result.content[0].text);
+        expect(payload.forms).toEqual([
+            expect.objectContaining({
+                formId: 'fm:0',
+                fieldCount: 4,
+                requiredCount: 3,
+                invalidCount: 0,
+            }),
+        ]);
+    });
+    it('fills a form from ids and labels without echoing long essay content', async () => {
+        const longAnswer = 'B'.repeat(220);
+        const handler = getToolHandler('geometra_fill_form');
+        mockState.formSchemas = [
+            {
+                formId: 'fm:0',
+                name: 'Application',
+                fieldCount: 4,
+                requiredCount: 3,
+                invalidCount: 0,
+                fields: [
+                    { id: 'ff:0.0', kind: 'text', label: 'Full name', required: true },
+                    { id: 'ff:0.1', kind: 'choice', label: 'Are you legally authorized to work in Germany?', options: ['Yes', 'No'], optionCount: 2 },
+                    { id: 'ff:0.2', kind: 'toggle', label: 'Share my profile for future roles', controlType: 'checkbox' },
+                    { id: 'ff:0.3', kind: 'text', label: 'Why Geometra?' },
+                ],
+            },
+        ];
+        mockState.currentA11yRoot = node('group', undefined, {
+            meta: { pageUrl: 'https://jobs.example.com/application', scrollX: 0, scrollY: 640 },
+            children: [
+                node('textbox', 'Full name', { value: 'Taylor Applicant', path: [0] }),
+                node('textbox', 'Why Geometra?', { value: longAnswer, path: [1] }),
+                node('checkbox', 'Share my profile for future roles', {
+                    path: [2],
+                    state: { checked: true },
+                }),
+            ],
+        });
+        const result = await handler({
+            valuesById: {
+                'ff:0.0': 'Taylor Applicant',
+            },
+            valuesByLabel: {
+                'Are you legally authorized to work in Germany?': true,
+                'Share my profile for future roles': true,
+                'Why Geometra?': longAnswer,
+            },
+            includeSteps: true,
+            detail: 'minimal',
+        });
+        const text = result.content[0].text;
+        const payload = JSON.parse(text);
+        const steps = payload.steps;
+        expect(text).not.toContain(longAnswer);
+        expect(mockState.sendFieldChoice).toHaveBeenCalledWith(mockState.session, 'Are you legally authorized to work in Germany?', 'Yes', { exact: undefined, query: undefined }, undefined);
+        expect(payload).toMatchObject({
+            completed: true,
+            formId: 'fm:0',
+            requestedValueCount: 4,
+            fieldCount: 4,
+            successCount: 4,
+            errorCount: 0,
+        });
+        expect(steps[3]).toMatchObject({
+            kind: 'text',
+            fieldLabel: 'Why Geometra?',
+            valueLength: 220,
+            readback: { role: 'textbox', valueLength: 220 },
+        });
+    });
+});
+describe('query and reveal tools', () => {
+    beforeEach(() => {
+        vi.clearAllMocks();
+    });
+    it('lets query disambiguate repeated controls by context text', async () => {
+        const handler = getToolHandler('geometra_query');
+        mockState.currentA11yRoot = node('group', undefined, {
+            meta: { pageUrl: 'https://jobs.example.com/application', scrollX: 0, scrollY: 900 },
+            children: [
+                node('form', 'Application', {
+                    path: [0],
+                    children: [
+                        node('group', undefined, {
+                            path: [0, 0],
+                            children: [
+                                node('text', 'Are you legally authorized to work here?', { path: [0, 0, 0] }),
+                                node('button', 'Yes', { path: [0, 0, 1] }),
+                                node('button', 'No', { path: [0, 0, 2] }),
+                            ],
+                        }),
+                        node('group', undefined, {
+                            path: [0, 1],
+                            children: [
+                                node('text', 'Will you require sponsorship?', { path: [0, 1, 0] }),
+                                node('button', 'Yes', { path: [0, 1, 1] }),
+                                node('button', 'No', { path: [0, 1, 2] }),
+                            ],
+                        }),
+                    ],
+                }),
+            ],
+        });
+        const result = await handler({
+            role: 'button',
+            name: 'Yes',
+            contextText: 'sponsorship',
+        });
+        const payload = JSON.parse(result.content[0].text);
+        expect(payload).toHaveLength(1);
+        expect(payload[0]).toMatchObject({
+            role: 'button',
+            name: 'Yes',
+            context: {
+                prompt: 'Will you require sponsorship?',
+                section: 'Application',
+            },
+        });
+    });
+    it('reveals an offscreen target with semantic scrolling instead of requiring manual wheels', async () => {
+        const handler = getToolHandler('geometra_reveal');
+        mockState.currentA11yRoot = node('group', undefined, {
+            bounds: { x: 0, y: 0, width: 1280, height: 800 },
+            meta: { pageUrl: 'https://jobs.example.com/application', scrollX: 0, scrollY: 0 },
+            children: [
+                node('form', 'Application', {
+                    bounds: { x: 20, y: -200, width: 760, height: 1900 },
+                    path: [0],
+                    children: [
+                        node('button', 'Submit application', {
+                            bounds: { x: 60, y: 1540, width: 180, height: 40 },
+                            path: [0, 0],
+                        }),
+                    ],
+                }),
+            ],
+        });
+        mockState.sendWheel.mockImplementationOnce(async () => {
+            mockState.currentA11yRoot = node('group', undefined, {
+                bounds: { x: 0, y: 0, width: 1280, height: 800 },
+                meta: { pageUrl: 'https://jobs.example.com/application', scrollX: 0, scrollY: 1220 },
+                children: [
+                    node('form', 'Application', {
+                        bounds: { x: 20, y: -1420, width: 760, height: 1900 },
+                        path: [0],
+                        children: [
+                            node('button', 'Submit application', {
+                                bounds: { x: 60, y: 320, width: 180, height: 40 },
+                                path: [0, 0],
+                            }),
+                        ],
+                    }),
+                ],
+            });
+            return { status: 'updated', timeoutMs: 2500 };
+        });
+        const result = await handler({
+            role: 'button',
+            name: 'Submit application',
+            maxSteps: 3,
+            fullyVisible: true,
+            timeoutMs: 2500,
+        });
+        const payload = JSON.parse(result.content[0].text);
+        expect(mockState.sendWheel).toHaveBeenCalledWith(mockState.session, expect.any(Number), expect.objectContaining({ x: expect.any(Number), y: expect.any(Number) }), 2500);
+        expect(payload).toMatchObject({
+            revealed: true,
+            attempts: 1,
+            target: {
+                role: 'button',
+                name: 'Submit application',
+                visibility: { fullyVisible: true },
+            },
+        });
+    });
 });
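
The `geometra_reveal` test above only checks that one wheel step brings the target fully into view; it does not show how the scroll distance is chosen. One plausible approach, sketched here under that caveat, is to scroll by the distance between the target's vertical center and the viewport's vertical center. The names `isFullyVisible` and `centeringDeltaY` are hypothetical, only the vertical axis is handled, and bounds are assumed to be viewport-relative as in the test fixtures.

```javascript
// Hypothetical sketch; NOT the algorithm used by geometra_reveal.
// Bounds are viewport-relative: { x, y, width, height }.
function isFullyVisible(bounds, viewport) {
  return bounds.y >= 0 && bounds.y + bounds.height <= viewport.height;
}

// Wheel delta that would place the target's vertical center
// at the viewport's vertical center (positive scrolls down).
function centeringDeltaY(bounds, viewport) {
  const targetCenter = bounds.y + bounds.height / 2;
  return targetCenter - viewport.height / 2;
}

const viewport = { width: 1280, height: 800 };
const submit = { x: 60, y: 1540, width: 180, height: 40 };

const deltaY = centeringDeltaY(submit, viewport);
// After scrolling by deltaY, viewport-relative y shrinks by the same amount.
const afterScroll = { ...submit, y: submit.y - deltaY };
const visibleBefore = isFullyVisible(submit, viewport);
const visibleAfter = isFullyVisible(afterScroll, viewport);
```

With the fixture values from the test (viewport height 800, button at `y: 1540`), this sketch yields a different offset from the mocked `scrollY: 1220`, which is expected: the mock hard-codes its post-scroll state rather than deriving it.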