agentgate-mcp 0.2.0 → 0.3.0 (npm)


Files changed (19)

package/ARCHITECTURE.md +18 -34
package/MCP_TOOLS.md +50 -26
package/README.md +54 -75
package/package.json +1 -3
package/src/browser-session.js +230 -0
package/src/cli.js +14 -45
package/src/config.js +1 -9
package/src/mcp-server.js +136 -67
package/src/orchestrator.js +54 -67
package/services/_template.service.json +0 -34
package/src/browser-runtime.js +0 -411
package/src/integrations/captcha-solver.js +0 -128
package/src/integrations/gmail-watcher.js +0 -129
package/src/playwright-engine.js +0 -391
package/src/registry.js +0 -47
package/src/scaffold.js +0 -103
package/src/setup.js +0 -109
package/src/signup-engine.js +0 -24
package/src/vault.js +0 -105

package/ARCHITECTURE.md CHANGED

@@ -4,45 +4,29 @@
 1. User runs `agentgate login` → opens Chromium → signs into Google → browser profile saved
 2. `agentgate serve` starts MCP server over stdio
-3. AI agent calls `get_or_create_key({ service, signup_url, api_key_url })`
-4. Orchestrator checks SQLite cache for an active key
-5. If cached: returns immediately
-6. If not cached: launches browser with saved Google profile
-7. Smart navigation: finds Google sign-in → authenticates → navigates to API keys → extracts key
-8. Key is stored in SQLite and returned to agent
+3. AI agent calls `get_or_create_key("openai")` → no key cached
+4. Agent calls `open_browser("https://platform.openai.com")` → sees screenshot
+5. Agent decides what to click/fill based on the screenshot
+6. Agent calls `browser_action({ action: "click", selector: "text=Sign in" })` → sees result
+7. Agent navigates to API keys page, creates a key, extracts it
+8. Agent calls `save_key("openai", "sk-...")` → key cached for next time
+9. Agent calls `close_browser()` → done
+The AI agent is the brain. AgentGate is the hands.
 ## Core modules
-- `src/cli.js` — CLI entry point (`login`, `serve`, `doctor`, `scaffold`)
-- `src/mcp-server.js` — MCP JSON-RPC over stdio
-- `src/orchestrator.js` — Key lifecycle: cache check → create → store
-- `src/playwright-engine.js` — Browser automation with persistent Google session
-- `src/signup-engine.js` — Logging wrapper around PlaywrightEngine
-- `src/browser-runtime.js` — Workflow DSL executor (for service recipes)
-- `src/vault.js` — AES-256-GCM encrypted local vault
-- `src/db.js` — SQLite key + alias storage
-- `src/registry.js` — Optional service recipe loader
-- `src/scaffold.js` — Recipe template generator
-- `src/logger.js` — Structured JSON logging with rotation
-## Two modes of key creation
-### Smart mode (default)
-No recipe needed. Engine uses heuristics to:
-1. Find and click Google sign-in buttons
-2. Handle OAuth popup
-3. Navigate to API keys page
-4. Click "Create API Key" buttons
-5. Extract the key from the page
-### Recipe mode (optional)
-For services with non-standard flows, a JSON recipe in `services/` provides
-an explicit workflow with DSL actions (goto, click, fill, extract, etc.).
+- `src/cli.js` — CLI entry point (`login`, `serve`, `doctor`)
+- `src/mcp-server.js` — MCP JSON-RPC over stdio, 7 tools
+- `src/orchestrator.js` — Thin coordinator between DB and browser session
+- `src/browser-session.js` — Persistent Chromium session with Google profile
+- `src/db.js` — SQLite key storage
+- `src/config.js` — Path resolution and directory setup
+- `src/logger.js` — Structured JSON logging with secret masking
 ## Security
 - Browser profile stored locally in `~/.agentgate/browser-profile/`
-- Vault encrypted with AES-256-GCM (keyring + vault file)
 - SQLite database stored locally
 - API keys masked in logs (last 4 chars visible)
 - File permissions 0o600 on sensitive files
@@ -52,5 +36,5 @@ an explicit workflow with DSL actions (goto, click, fill, extract, etc.).
 - Version: 2024-11-05
 - Methods: `initialize`, `ping`, `tools/list`, `tools/call`
-- Notifications: `notifications/initialized`, `notifications/cancelled`
-- Tools: `get_or_create_key`, `list_my_keys`, `revoke_key`, `check_key_status`
+- Notifications: `notifications/initialized`
+- Tools: `get_or_create_key`, `open_browser`, `browser_action`, `save_key`, `close_browser`, `list_my_keys`, `revoke_key`
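
Reviewer note: the protocol section lists `tools/call` as the method behind every tool. A minimal sketch of what one such request could look like on the wire follows. It assumes the server reads newline-delimited JSON-RPC 2.0 messages on stdin, as the MCP stdio transport describes; the diff shown here does not confirm the framing used by `src/mcp-server.js`, and `buildToolCall` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: frame one MCP `tools/call` request as a single
// line of JSON-RPC 2.0. Assumes newline-delimited framing over stdio.
function buildToolCall(id, name, args = {}) {
  const message = {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args }
  };
  // JSON.stringify never emits raw newlines, so one message is one line.
  return JSON.stringify(message) + '\n';
}

// Example: ask whether a key for "openai" is already cached.
process.stdout.write(buildToolCall(1, 'get_or_create_key', { service: 'openai' }));
```

A client would typically write this line to the stdin of the `agentgate serve` process and read the matching response (same `id`) from its stdout.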

package/MCP_TOOLS.md CHANGED

@@ -2,44 +2,68 @@
 ## `get_or_create_key`
-Works for **any** service. No pre-configuration needed.
+Check if a cached API key exists for a service.
-Input:
+```json
+{ "service": "openai" }
+```
+Returns the cached key or `{ "exists": false }`.
+## `open_browser`
+Open a browser with the saved Google session and navigate to a URL. Returns a screenshot.
 ```json
-{
-  "service": "twelvelabs",
-  "signup_url": "https://api.twelvelabs.io/signup",
-  "api_key_url": "https://api.twelvelabs.io/dashboard/api-keys"
-}
+{ "url": "https://platform.openai.com/signup" }
 ```
-Parameters:
-- `service` (required) — Service name, used as cache key
-- `signup_url` (required) — Where to start the sign-up/login flow
-- `api_key_url` (optional) — Direct link to the API keys dashboard
+## `browser_action`
-Behavior:
-1. Returns cached key from SQLite if one exists
-2. Otherwise opens browser with saved Google session
-3. Navigates to signup_url, finds Google sign-in, authenticates
-4. Navigates to api_key_url (if provided), creates and extracts key
-5. Caches key in SQLite and returns it
+Perform an action in the open browser. Returns a screenshot after each action.
-## `list_my_keys`
+```json
+{ "action": "click", "selector": "text=Sign in with Google" }
+{ "action": "fill", "selector": "input[name=email]", "value": "test@example.com" }
+{ "action": "goto", "url": "https://platform.openai.com/api-keys" }
+{ "action": "extract_text", "selector": ".api-key" }
+{ "action": "extract_all_text" }
+{ "action": "scroll", "value": "500" }
+{ "action": "press", "key": "Enter" }
+{ "action": "wait", "selector": ".loaded", "ms": 5000 }
+{ "action": "screenshot" }
+```
-Input: `{}`
+Actions: `click`, `fill`, `select`, `press`, `scroll`, `goto`, `wait`, `screenshot`, `extract_text`, `extract_all_text`
-Returns all API keys in the local database (active and revoked).
+## `save_key`
-## `revoke_key`
+Store an API key the agent found on the page.
+```json
+{ "service": "openai", "api_key": "sk-..." }
+```
-Input: `{ "service": "openai" }`
+## `close_browser`
-Marks the local key as revoked. Does NOT revoke it on the provider side.
+Close the browser session.
-## `check_key_status`
+```json
+{}
+```
+## `list_my_keys`
+List all stored API keys.
+```json
+{}
+```
+## `revoke_key`
-Input: `{ "service": "openai" }`
+Delete a stored API key.
-Returns whether an active key exists. Does not create one.
+```json
+{ "service": "openai" }
+```
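
Reviewer note: the new `browser_action` tool accepts ten actions, and the checks in the `src/browser-session.js` diff show that only some of them require a field (`goto` needs `url`; `click`, `fill`, `select` and `extract_text` need `selector`). The sketch below mirrors those rules so an agent could check a payload before sending it. `validateAction` and `REQUIRED_FIELDS` are hypothetical names used for illustration and are not part of the package.

```javascript
// Required fields per action, mirroring the `throw new Error(...)` checks
// visible in the src/browser-session.js diff. Hypothetical helper.
const REQUIRED_FIELDS = new Map([
  ['goto', ['url']],
  ['click', ['selector']],
  ['fill', ['selector']],
  ['select', ['selector']],
  ['extract_text', ['selector']],
  ['press', []],
  ['scroll', []],
  ['wait', []],
  ['screenshot', []],
  ['extract_all_text', []]
]);

// Returns a list of problems; an empty list means the payload looks valid.
function validateAction(payload) {
  const required = REQUIRED_FIELDS.get(payload.action);
  if (!required) return [`unknown action: ${payload.action}`];
  return required
    .filter((field) => !payload[field])
    .map((field) => `${payload.action} requires "${field}"`);
}

console.log(validateAction({ action: 'click' }));
console.log(validateAction({ action: 'goto', url: 'https://platform.openai.com/api-keys' }));
```

Checking on the client side only saves a round trip; the server-side checks in `BrowserSession.action` remain the source of truth.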

package/README.md CHANGED

@@ -1,15 +1,23 @@
 # AgentGate
-MCP server that lets AI agents get API keys for **any** service. Sign in with Google once, then your agent can create keys anywhere.
+MCP server that gives AI agents a browser with your Google session. The agent sees screenshots, decides what to click, and grabs API keys from any service.
 ## How it works
-1. You sign into Google in a real browser (one time)
-2. AgentGate saves that browser session
-3. When your AI agent needs an API key, AgentGate opens a browser with your Google session, signs up for the service, and extracts the key
-4. Keys are cached locally — second request is instant
+```
+You: "Get me a Twelve Labs API key"
+Agent: get_or_create_key("twelvelabs")        → no key cached
+Agent: open_browser("https://twelvelabs.io")   → sees screenshot
+Agent: click("Sign in with Google")            → sees dashboard
+Agent: goto("https://twelvelabs.io/api-keys")  → sees API keys page
+Agent: click("Create API Key")                 → sees new key
+Agent: extract_text(".api-key")                → reads the key
+Agent: save_key("twelvelabs", "tl_key_abc...")  → cached for next time
+Agent: close_browser()                         → done
+```
-No hardcoded services. No config files per provider. Works for anything with "Sign in with Google".
+The AI agent is the brain. AgentGate is the hands.
 ## Install
@@ -17,7 +25,7 @@ No hardcoded services. No config files per provider. Works for anything with "Si
 npm install -g agentgate-mcp
 ```
-Requires Node.js 23+ (for built-in SQLite support).
+Requires Node.js 23+.
 ## Setup (one time)
@@ -25,7 +33,7 @@ Requires Node.js 23+ (for built-in SQLite support).
 agentgate login
 ```
-This opens Chromium — sign into your Google account, then close the browser. Done.
+Opens Chromium — sign into Google, close the browser. Done.
 ## Add to Claude Code
@@ -33,70 +41,55 @@ This opens Chromium — sign into your Google account, then close the browser. D
 claude mcp add agentgate -- agentgate serve
 ```
-Or manually add to your MCP config:
-```json
-{
-  "mcpServers": {
-    "agentgate": {
-      "command": "agentgate",
-      "args": ["serve"]
-    }
-  }
-}
-```
-## Usage
-Just ask your AI agent naturally:
-- *"Get me an API key for Twelve Labs"*
-- *"I need an OpenAI key"*
-- *"Set me up with a Replicate key"*
-- *"Show all my keys"*
-- *"Revoke my openai key"*
 ## MCP Tools
-### `get_or_create_key`
+| Tool | What it does |
+|------|-------------|
+| `get_or_create_key` | Check if a key is cached for a service |
+| `open_browser` | Open browser with Google session, go to URL, return screenshot |
+| `browser_action` | Click, fill, scroll, extract text — returns screenshot after each action |
+| `save_key` | Store an API key the agent found |
+| `close_browser` | Close the browser |
+| `list_my_keys` | List all stored keys |
+| `revoke_key` | Delete a stored key |
-Works for **any** service. Just provide the name and URLs.
+### `open_browser`
 ```json
-{
-  "service": "twelvelabs",
-  "signup_url": "https://api.twelvelabs.io/signup",
-  "api_key_url": "https://api.twelvelabs.io/dashboard/api-keys"
-}
+{ "url": "https://platform.openai.com/signup" }
 ```
-| Parameter | Required | Description |
-|-----------|----------|-------------|
-| `service` | Yes | Service name (used as cache key) |
-| `signup_url` | Yes | Signup or login page URL |
-| `api_key_url` | No | Direct link to API keys dashboard |
+Returns a **screenshot** of the page so the agent can see it.
-### `list_my_keys`
+### `browser_action`
-Returns all stored keys (active and revoked).
-### `revoke_key`
-Removes a key from local store. `{ "service": "openai" }`
+```json
+{ "action": "click", "selector": "text=Sign in with Google" }
+{ "action": "fill", "selector": "input[name=email]", "value": "test@example.com" }
+{ "action": "goto", "url": "https://platform.openai.com/api-keys" }
+{ "action": "extract_text", "selector": ".api-key" }
+{ "action": "scroll", "value": "500" }
+{ "action": "press", "key": "Enter" }
+{ "action": "wait", "selector": ".loaded", "ms": 5000 }
+{ "action": "screenshot" }
+{ "action": "extract_all_text" }
+```
-### `check_key_status`
+Every action returns a screenshot so the agent always sees what happened.
-Checks if an active key exists. `{ "service": "openai" }`
+### `save_key`
-## Service Recipes (optional)
+```json
+{ "service": "openai", "api_key": "sk-..." }
+```
-For services with non-standard flows, add a JSON recipe:
+### `get_or_create_key`
-```bash
-agentgate scaffold myservice https://myservice.com/signup
+```json
+{ "service": "openai" }
 ```
-Most services work without a recipe.
+Returns cached key or `{ "exists": false }`.
 ## Commands
@@ -105,27 +98,13 @@ Most services work without a recipe.
 | `agentgate login` | Sign in with Google (opens browser) |
 | `agentgate serve` | Start MCP server |
 | `agentgate doctor` | Health check |
-| `agentgate scaffold <name> <url>` | Generate a service recipe |
-## How it stays secure
-- Your Google session stays on **your machine** in `~/.agentgate/browser-profile/`
-- API keys stored in local SQLite, encrypted vault uses AES-256-GCM
-- Nothing is sent to any cloud — fully local
-- No telemetry
+## Security
-## Troubleshooting
-```bash
-# Check everything is working
-agentgate doctor
-# Check logs
-cat ~/.agentgate/logs/agentgate.log
-# Re-login if session expired
-agentgate login
-```
+- Google session stays on **your machine** (`~/.agentgate/browser-profile/`)
+- API keys stored in local SQLite database
+- No cloud, no telemetry
+- The agent can only use YOUR authenticated session
 ## Development
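
Reviewer note: the README's new "How it works" transcript moves the control loop from AgentGate into the agent. A compressed sketch of that loop follows. It makes several assumptions: `callTool` stands in for whatever MCP client the agent uses, tool results are treated as plain objects, and the screenshot-driven clicking is collapsed into a single `extract_text` step. `getKey` and `mockCallTool` are hypothetical names; the mock exists only so the sketch runs on its own.

```javascript
// Hypothetical agent-side flow: check the cache, otherwise drive the browser.
async function getKey(callTool, service, startUrl, keySelector) {
  const cached = await callTool('get_or_create_key', { service });
  if (cached.exists !== false) return cached; // cache hit, no browser needed

  await callTool('open_browser', { url: startUrl });
  try {
    // A real agent would inspect screenshots and click through sign-in here.
    const page = await callTool('browser_action', {
      action: 'extract_text',
      selector: keySelector
    });
    await callTool('save_key', { service, api_key: page.extracted_text });
    return { service, api_key: page.extracted_text };
  } finally {
    // Always release the browser, even if an intermediate step failed.
    await callTool('close_browser', {});
  }
}

// Mock client so the sketch is runnable without a server.
const calls = [];
const mockCallTool = async (name) => {
  calls.push(name);
  if (name === 'get_or_create_key') return { exists: false };
  if (name === 'browser_action') return { extracted_text: 'sk-demo' };
  return {};
};

getKey(mockCallTool, 'openai', 'https://platform.openai.com', '.api-key')
  .then((result) => console.log(calls.join(' -> '), result.api_key));
```

The `try`/`finally` reflects one design choice worth considering: since 0.3.0 leaves the session open until `close_browser` is called, an agent that fails mid-flow should still close it.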

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agentgate-mcp",
-  "version": "0.2.0",
+  "version": "0.3.0",
   "description": "MCP server that lets AI agents get API keys for any service via Google sign-in",
   "type": "module",
   "bin": {
@@ -8,7 +8,6 @@
   },
   "files": [
     "src/",
-    "services/",
     "README.md",
     "ARCHITECTURE.md",
     "MCP_TOOLS.md"
@@ -17,7 +16,6 @@
     "start": "node --disable-warning=ExperimentalWarning src/cli.js serve",
     "login": "node --disable-warning=ExperimentalWarning src/cli.js login",
     "doctor": "node --disable-warning=ExperimentalWarning src/cli.js doctor",
-    "scaffold": "node --disable-warning=ExperimentalWarning src/cli.js scaffold",
     "test": "node --disable-warning=ExperimentalWarning --test test/run.js",
     "postinstall": "npx playwright install chromium 2>/dev/null || echo 'Run: npx playwright install chromium'"
   },

package/src/browser-session.js ADDED

@@ -0,0 +1,230 @@
+import fs from 'node:fs';
+import { createLogger } from './logger.js';
+const log = createLogger('browser-session');
+async function importPlaywright() {
+  try {
+    return await import('playwright');
+  } catch {
+    throw new Error('Playwright is not installed. Run: npm i playwright && npx playwright install chromium');
+  }
+}
+export class BrowserSession {
+  constructor({ browserProfileDir }) {
+    this.browserProfileDir = browserProfileDir;
+    this.context = null;
+    this.page = null;
+  }
+  isOpen() {
+    return this.page !== null && this.context !== null;
+  }
+  /**
+   * Launch persistent browser, navigate to URL, return screenshot.
+   */
+  async open(url) {
+    if (!fs.existsSync(this.browserProfileDir)) {
+      throw new Error('No browser profile found. Run `agentgate login` first to sign in with Google.');
+    }
+    // Close existing session if open
+    if (this.isOpen()) {
+      await this.close();
+    }
+    const playwright = await importPlaywright();
+    log.info(`Opening browser: ${url}`);
+    try {
+      this.context = await playwright.chromium.launchPersistentContext(
+        this.browserProfileDir,
+        {
+          headless: true,
+          viewport: { width: 1366, height: 900 },
+          args: ['--disable-blink-features=AutomationControlled']
+        }
+      );
+    } catch (error) {
+      const msg = error instanceof Error ? error.message : String(error);
+      if (msg.includes('Permission denied') || msg.includes('Operation not permitted')) {
+        throw new Error('Playwright could not launch browser (permission denied).');
+      }
+      throw error;
+    }
+    this.page = this.context.pages()[0] || await this.context.newPage();
+    await this.page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30_000 });
+    await this.page.waitForTimeout(1_500);
+    const screenshot = await this.takeScreenshot();
+    const pageUrl = this.page.url();
+    const title = await this.page.title();
+    return {
+      url: pageUrl,
+      title,
+      screenshot
+    };
+  }
+  /**
+   * Perform a browser action and return screenshot.
+   */
+  async action({ action, selector, value, key, url, ms }) {
+    this.ensureOpen();
+    switch (action) {
+      case 'screenshot':
+        break; // just return screenshot below
+      case 'goto':
+        if (!url) throw new Error('goto requires "url"');
+        await this.page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30_000 });
+        await this.page.waitForTimeout(1_500);
+        break;
+      case 'click':
+        if (!selector) throw new Error('click requires "selector"');
+        await this.page.click(selector, { timeout: 10_000 });
+        await this.page.waitForTimeout(1_500);
+        break;
+      case 'fill':
+        if (!selector) throw new Error('fill requires "selector"');
+        await this.page.fill(selector, value || '', { timeout: 10_000 });
+        break;
+      case 'select':
+        if (!selector) throw new Error('select requires "selector"');
+        await this.page.selectOption(selector, value || '', { timeout: 10_000 });
+        break;
+      case 'press':
+        if (selector) {
+          await this.page.press(selector, key || 'Enter', { timeout: 10_000 });
+        } else {
+          await this.page.keyboard.press(key || 'Enter');
+        }
+        break;
+      case 'scroll':
+        await this.page.mouse.wheel(0, Number(value) || 500);
+        await this.page.waitForTimeout(500);
+        break;
+      case 'wait': {
+        const timeout = Number(ms) || 5_000;
+        if (selector) {
+          await this.page.waitForSelector(selector, { timeout, state: 'visible' });
+        } else {
+          await this.page.waitForTimeout(timeout);
+        }
+        break;
+      }
+      case 'extract_text': {
+        if (!selector) throw new Error('extract_text requires "selector"');
+        const text = await this.page.textContent(selector, { timeout: 10_000 });
+        const screenshot = await this.takeScreenshot();
+        return {
+          url: this.page.url(),
+          title: await this.page.title(),
+          extracted_text: (text || '').trim(),
+          screenshot
+        };
+      }
+      case 'extract_all_text': {
+        const body = await this.page.textContent('body');
+        const screenshot = await this.takeScreenshot();
+        return {
+          url: this.page.url(),
+          title: await this.page.title(),
+          extracted_text: (body || '').trim().slice(0, 10_000),
+          screenshot
+        };
+      }
+      default:
+        throw new Error(`Unknown browser action: "${action}". Available: screenshot, goto, click, fill, select, press, scroll, wait, extract_text, extract_all_text`);
+    }
+    // Default: return screenshot after action
+    const screenshot = await this.takeScreenshot();
+    return {
+      url: this.page.url(),
+      title: await this.page.title(),
+      screenshot
+    };
+  }
+  /**
+   * Close the browser session.
+   */
+  async close() {
+    if (this.context) {
+      log.info('Closing browser session');
+      try {
+        await this.context.close();
+      } catch {
+        // already closed
+      }
+      this.context = null;
+      this.page = null;
+    }
+  }
+  /**
+   * Open a visible browser for the user to sign into Google.
+   */
+  async login() {
+    const playwright = await importPlaywright();
+    log.info('Opening browser for Google login');
+    fs.mkdirSync(this.browserProfileDir, { recursive: true });
+    const context = await playwright.chromium.launchPersistentContext(
+      this.browserProfileDir,
+      {
+        headless: false,
+        viewport: { width: 1280, height: 900 },
+        args: ['--disable-blink-features=AutomationControlled']
+      }
+    );
+    const page = context.pages()[0] || await context.newPage();
+    await page.goto('https://accounts.google.com');
+    log.info('Waiting for user to complete Google sign-in...');
+    try {
+      await page.waitForURL(
+        (u) => u.href.includes('myaccount.google.com') || u.href.includes('google.com/search'),
+        { timeout: 300_000 }
+      );
+      log.info('Google sign-in detected');
+    } catch {
+      log.info('Login window closed or timed out — profile saved if login completed');
+    }
+    await context.close();
+    log.info('Browser profile saved');
+  }
+  // ── internal ──
+  ensureOpen() {
+    if (!this.isOpen()) {
+      throw new Error('No browser session open. Call open_browser first.');
+    }
+  }
+  async takeScreenshot() {
+    const buffer = await this.page.screenshot({ type: 'png' });
+    return buffer.toString('base64');
+  }
+}
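
Reviewer note: every `BrowserSession` method above returns `{ url, title, screenshot }` (plus `extracted_text` for the extract actions), with the screenshot as a base64 PNG string. The `src/mcp-server.js` diff is not shown in this excerpt, so how it forwards that result is unknown. The sketch below shows one plausible mapping onto MCP content blocks, using the `text` and `image` content types from the MCP specification. `toMcpContent` is a hypothetical helper.

```javascript
// Hypothetical mapping from a BrowserSession result to an MCP tool result.
// Assumption: metadata goes out as a JSON text block and the screenshot
// as an image block, so the agent receives it as a picture.
function toMcpContent(result) {
  const { screenshot, ...meta } = result;
  const content = [{ type: 'text', text: JSON.stringify(meta) }];
  if (screenshot) {
    content.push({ type: 'image', data: screenshot, mimeType: 'image/png' });
  }
  return { content };
}

// Example with a placeholder base64 payload instead of a real screenshot.
const example = toMcpContent({
  url: 'https://platform.openai.com/api-keys',
  title: 'API keys',
  extracted_text: 'sk-...',
  screenshot: Buffer.from('not a real png').toString('base64')
});
console.log(example.content.map((block) => block.type));
```

Keeping the base64 data out of the text block matters for size: a 1366×900 PNG can run to hundreds of kilobytes, which would otherwise be counted as text by the agent.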