hanzi-browse 2.3.1 → 2.3.2

package/README.md CHANGED
@@ -1,45 +1,51 @@
1
- # Hanzi Browse — MCP Server
1
+ # Hanzi Browse
2
2
 
3
- The MCP server exposes browser tools to MCP clients and forwards browser work to
4
- the Chrome extension over the local WebSocket relay.
3
+ Give your AI agent a real browser with your existing logins, cookies, and sessions.
5
4
 
6
- ## Setup
5
+ **Two ways to use it:**
6
+ - **Use locally** — MCP server for Claude Code, Cursor, Codex, and other AI coding agents
7
+ - **Build with it** — REST API + TypeScript SDK for embedding browser automation in your product
8
+
9
+ ## Quick Start (MCP)
7
10
 
8
11
  ```bash
9
- cd mcp-server
10
- npm install
11
- npm run build
12
+ npx hanzi-browse setup
12
13
  ```
13
14
 
14
- Add to your MCP config (e.g., `~/.claude/claude_desktop_config.json`):
15
+ This installs the Chrome extension and configures your AI agent. One command, done.
15
16
 
16
- ```json
17
- {
18
- "mcpServers": {
19
- "browser": {
20
- "command": "node",
21
- "args": ["/path/to/hanzi-browse/mcp-server/dist/index.js"]
22
- }
23
- }
24
- }
17
+ **Prerequisites:** Chrome must be open with the [Hanzi extension](https://chromewebstore.google.com/detail/hanzi-browse/iklpkemlmbhemkiojndpbhoakgikpmcd) installed.
18
+
19
+ ## Quick Start (API)
20
+
21
+ ```bash
22
+ npm install @hanzi/browser-agent
25
23
  ```
26
24
 
27
- **Prerequisites:** The Chrome extension must be installed and running. See the [main README](../README.md) for full setup.
25
+ ```typescript
26
+ import { HanziClient } from '@hanzi/browser-agent';
28
27
 
29
- ## How It Works
28
+ const client = new HanziClient({ apiKey: 'hic_live_...' });
30
29
 
31
- ```text
32
- MCP client
33
- -> mcp-server (stdio)
34
- -> relay (WebSocket)
35
- -> Chrome extension
36
- -> browser agent
30
+ // 1. Pair a browser — give the URL to your user
31
+ const { pairingToken } = await client.createPairingToken();
32
+ // User visits: https://api.hanzilla.co/pair/{pairingToken}
33
+
34
+ // 2. Find their connected session
35
+ const sessions = await client.listSessions();
36
+ const browser = sessions.find(s => s.status === 'connected');
37
+
38
+ // 3. Run a task (polls until complete)
39
+ const result = await client.runTask({
40
+ browserSessionId: browser.id,
41
+ task: 'Go to example.com and read the page title',
42
+ });
43
+ console.log(result.answer);
37
44
  ```
38
45
 
39
- The extension is the browser executor. The MCP server should only manage MCP
40
- tool calls, local session bookkeeping, and blocking waits for completion.
46
+ Full API docs: [browse.hanzilla.co/docs.html](https://browse.hanzilla.co/docs.html)
41
47
 
42
- ## Tools
48
+ ## MCP Tools
43
49
 
44
50
  ### `browser_start`
45
51
 
@@ -55,7 +61,6 @@ browser_start(
55
61
  → {
56
62
  "session_id": "abc123",
57
63
  "status": "complete",
58
- "task": "Search for flights to Tokyo...",
59
64
  "answer": "Found 3 flights: JAL $850, ANA $920, United $780",
60
65
  "total_steps": 8,
61
66
  "recent_steps": ["Opened Google Flights", "Set destination to Tokyo", ...]
@@ -64,7 +69,7 @@ browser_start(
64
69
 
65
70
  ### `browser_message`
66
71
 
67
- Send follow-up instructions to an existing session. Also blocks until the agent finishes.
72
+ Send follow-up instructions to an existing session.
68
73
 
69
74
  ```
70
75
  browser_message(session_id: "abc123", message: "Book the cheapest one")
@@ -75,7 +80,7 @@ browser_message(session_id: "abc123", message: "Book the cheapest one")
75
80
  Check known sessions and their latest status.
76
81
 
77
82
  ```
78
- browser_status() // all active sessions
83
+ browser_status() // all active sessions
79
84
  browser_status(session_id: "abc123") // specific session
80
85
  ```
81
86
 
@@ -85,7 +90,7 @@ Stop a task.
85
90
 
86
91
  ```
87
92
  browser_stop(session_id: "abc123")
88
- browser_stop(session_id: "abc123", remove: true) // also delete session
93
+ browser_stop(session_id: "abc123", remove: true) // also close window
89
94
  ```
90
95
 
91
96
  ### `browser_screenshot`
@@ -98,84 +103,63 @@ browser_screenshot(session_id: "abc123")
98
103
 
99
104
  ## Examples
100
105
 
101
- **Research:**
102
- ```
103
- browser_start("Find the top 3 competitors for Acme Corp and summarize their pricing")
104
- ```
105
-
106
106
  **Logged-in workflows:**
107
107
  ```
108
- browser_start("Go to Jira, find my open tickets, and summarize what needs attention this week")
108
+ browser_start("Go to Jira, find my open tickets, and summarize what needs attention")
109
109
  ```
110
110
 
111
111
  **Multi-turn:**
112
112
  ```
113
113
  s = browser_start("Go to LinkedIn and find AI Engineer jobs in Montreal")
114
- → { session_id: "x1", answer: "Found: Applied AI Engineer at Cohere" }
115
-
116
- browser_message("x1", "Click into that job and tell me the requirements")
117
- → { answer: "Requirements: 3+ years Python, ML experience..." }
118
-
119
- browser_message("x1", "Apply to this job using my profile")
120
- → { answer: "Application submitted successfully" }
114
+ browser_message(s.session_id, "Click into the Cohere job and tell me the requirements")
115
+ browser_message(s.session_id, "Apply to this job using my profile")
121
116
  ```
122
117
 
123
118
  **Parallel execution:**
124
119
  ```
125
120
  browser_start("Check flight prices to Tokyo")
126
121
  browser_start("Check hotel prices in Shibuya")
127
- browser_start("Look up train pass costs")
128
- // All three run simultaneously
122
+ // Both run simultaneously in separate windows
129
123
  ```
130
124
 
131
125
  ## Configuration
132
126
 
133
127
  | Environment Variable | Default | Description |
134
128
  |---|---|---|
135
- | `HANZI_IN_CHROME_MAX_SESSIONS` | `5` | Max concurrent browser tasks |
129
+ | `HANZI_BROWSE_MAX_SESSIONS` | `5` | Max concurrent browser tasks |
130
+ | `HANZI_BROWSE_TIMEOUT_MS` | `300000` | Task timeout (ms) |
136
131
  | `WS_RELAY_PORT` | `7862` | WebSocket relay port |
137
132
 
138
- ## Architecture
133
+ ## Skills
139
134
 
140
- ```
141
- AI Tool (Claude Code, Cursor, etc.)
142
- ↓ MCP Protocol (stdio)
143
- MCP Server
144
- ↓ WebSocket
145
- Relay Server
146
- ↓ WebSocket
147
- Chrome Extension
148
- ↓ Extension agent loop
149
- Target Website
150
- ```
151
-
152
- The relay server starts automatically when the MCP server connects. It routes
153
- messages between the MCP server and the Chrome extension and briefly queues
154
- messages while the extension service worker is asleep.
155
-
156
- > **Principle**: Hanzi is for real browser work in your signed-in Chrome.
157
- > Agents should prefer code, logs, APIs, and existing tools first. Use Hanzi when the job needs a real browser session.
158
-
159
- ## Prompts
160
-
161
- The server exposes MCP prompts that clients auto-discover as slash commands:
135
+ The server exposes MCP prompts that clients auto-discover:
162
136
 
163
137
  | Prompt | Description |
164
138
  |--------|-------------|
165
- | `linkedin-prospector` | Goal-driven LinkedIn outreach — networking, sales, partnerships, or hiring |
166
- | `e2e-tester` | Test your app in a real browser — reports bugs with screenshots and code references |
167
- | `social-poster` | Post across LinkedIn, Twitter, Reddit, HN — drafts per-platform, posts from your browser |
168
-
169
- In Claude Code, use the built-in `linkedin-prospector` prompt from the MCP prompt list.
170
-
171
- ## Skills CLI
139
+ | `linkedin-prospector` | Goal-driven LinkedIn outreach |
140
+ | `e2e-tester` | Test your app in a real browser with screenshots |
141
+ | `social-poster` | Post across LinkedIn, Twitter, Reddit from your browser |
142
+ | `x-marketer` | Find X/Twitter conversations and draft voice-matched replies |
172
143
 
173
144
  ```bash
174
145
 hanzi-browse skills # list available skills
175
146
 hanzi-browse skills install linkedin-prospector # install SKILL.md to your project
176
147
  ```
177
148
 
178
- Skills are portable SKILL.md files for agents that don't support MCP prompts (Cline, Codex). Each skill follows the same principle: use existing tools first, Hanzi only for real browser steps.
149
+ ## Architecture
150
+
151
+ ```
152
+ AI Agent (Claude Code, Cursor, etc.)
153
+ ↓ MCP Protocol (stdio)
154
+ MCP Server (this package)
155
+ ↓ WebSocket
156
+ Chrome Extension
157
+ ↓ Chrome DevTools Protocol
158
+ User's Real Browser
159
+ ```
160
+
161
+ > **Principle**: Hanzi is for real browser work in your signed-in Chrome.
162
+ > Agents should prefer code, logs, APIs, and existing tools first. Use Hanzi when the job needs a real browser session.
179
163
 
180
164
  ## License
181
165
 
@@ -1,10 +1,11 @@
1
1
  /**
2
- * Domain-specific knowledge for the server-side agent loop.
3
- * Matches the extension's domain-skills.js but only includes domains
4
- * relevant to managed/API tasks.
2
+ * Domain-specific knowledge for the agent loop.
3
+ * Single source of truth shared between server (managed API, MCP)
4
+ * and extension (via import at build time).
5
5
  */
6
6
  interface DomainEntry {
7
7
  domain: string;
8
+ antiBot?: boolean;
8
9
  skill: string;
9
10
  }
10
11
  /**
@@ -12,4 +13,8 @@ interface DomainEntry {
12
13
  * Returns the first matching entry, or null.
13
14
  */
14
15
  export declare function getDomainSkill(url: string): DomainEntry | null;
16
+ /**
17
+ * Get all domain skills. Used by extension to import the full list.
18
+ */
19
+ export declare function getAllDomainSkills(): DomainEntry[];
15
20
  export {};
@@ -1,51 +1,15 @@
1
1
  /**
2
- * Domain-specific knowledge for the server-side agent loop.
3
- * Matches the extension's domain-skills.js but only includes domains
4
- * relevant to managed/API tasks.
2
+ * Domain-specific knowledge for the agent loop.
3
+ * Single source of truth shared between server (managed API, MCP)
4
+ * and extension (via import at build time).
5
5
  */
6
- const DOMAIN_KNOWLEDGE = [
7
- {
8
- domain: "x.com",
9
- skill: `X/Twitter verified patterns (updated 2026-03-30)
10
-
11
- ## Reading pages (CRITICAL)
12
- - X loads content asynchronously — page looks empty for 3-5 seconds after navigation.
13
- - read_page often returns ONLY "To view keyboard shortcuts" — tweets haven't loaded yet.
14
- - DO NOT re-navigate to the same URL. That resets loading and makes it worse.
15
- - Instead: wait 5 seconds, then use get_page_text — it reads visible text and is more reliable.
16
- - If get_page_text returns nothing, scroll down once and try again.
17
-
18
- ## Search
19
- - URL: x.com/search?q={encoded_query}&src=typed_query&f=live
20
- - After navigating, wait 5 seconds, then get_page_text (NOT read_page).
21
- - Scroll down once to load more tweets, then get_page_text again.
22
- - Tweet URLs in page text follow pattern: /status/{id}
23
-
24
- ## Text input (CRITICAL — Draft.js)
25
- - form_input DOES NOT WORK — Draft.js ignores programmatic input.
26
- - computer type action GARBLES TEXT.
27
- - ONLY RELIABLE METHOD — use javascript_tool:
28
- document.querySelector('[data-testid="tweetTextarea_0"]').focus();
29
- document.execCommand('insertText', false, 'your reply text here');
30
- - Always verify text appeared by reading after insertion.
31
-
32
- ## Replying to a tweet
33
- 1. Navigate to tweet URL (x.com/{handle}/status/{id})
34
- 2. Wait 3 seconds, read the page
35
- 3. Click the reply/comment icon (speech bubble) in the action bar
36
- 4. Use javascript_tool to insert text (see above)
37
- 5. Verify text appeared, then click blue "Reply" button
38
- 6. Wait 2 seconds to confirm reply posted
39
-
40
- ## Known traps
41
- - DO NOT scroll looking for "Post your reply" — reply box appears after clicking comment icon
42
- - x.com/compose/post may open — that's fine, type and click Reply there
43
- - "Leave site?" dialog — ALWAYS click Cancel, finish posting first
44
- - Reply button is disabled until text is entered — verify first
45
- - Space replies 15+ seconds apart (rate limiting)
46
- - NEVER navigate to the same URL you're already on`,
47
- },
48
- ];
6
+ import { readFileSync } from "fs";
7
+ import { fileURLToPath } from "url";
8
+ import { dirname, join } from "path";
9
+ // Load from shared JSON file
10
+ const __filename = fileURLToPath(import.meta.url);
11
+ const __dirname = dirname(__filename);
12
+ const DOMAIN_SKILLS = JSON.parse(readFileSync(join(__dirname, "domain-skills.json"), "utf-8"));
49
13
  /**
50
14
  * Look up domain knowledge for a URL.
51
15
  * Returns the first matching entry, or null.
@@ -53,11 +17,17 @@ const DOMAIN_KNOWLEDGE = [
53
17
  export function getDomainSkill(url) {
54
18
  try {
55
19
  const hostname = new URL(url).hostname.toLowerCase();
56
- return DOMAIN_KNOWLEDGE.find((d) => hostname === d.domain || hostname.endsWith("." + d.domain)) || null;
20
+ return DOMAIN_SKILLS.find((d) => hostname === d.domain || hostname.endsWith("." + d.domain)) || null;
57
21
  }
58
22
  catch {
59
23
  // URL might not be a full URL — try matching as a bare domain
60
24
  const lower = url.toLowerCase();
61
- return DOMAIN_KNOWLEDGE.find((d) => lower.includes(d.domain)) || null;
25
+ return DOMAIN_SKILLS.find((d) => lower.includes(d.domain)) || null;
62
26
  }
63
27
  }
28
+ /**
29
+ * Get all domain skills. Used by extension to import the full list.
30
+ */
31
+ export function getAllDomainSkills() {
32
+ return DOMAIN_SKILLS;
33
+ }
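The matching rule in `getDomainSkill` can be exercised in isolation — a minimal sketch of the same exact-or-subdomain logic (the entries here are placeholders; the real list ships in `domain-skills.json`):

```javascript
// Minimal sketch of the lookup rule: exact hostname match, or any
// subdomain of a registered domain. Entries below are placeholders.
const SKILLS = [
  { domain: "x.com", skill: "(x.com guidance)" },
  { domain: "mail.google.com", skill: "(gmail guidance)" },
];

function findSkill(url) {
  try {
    const hostname = new URL(url).hostname.toLowerCase();
    return SKILLS.find((d) => hostname === d.domain || hostname.endsWith("." + d.domain)) || null;
  } catch {
    // Not a full URL — fall back to bare substring matching.
    const lower = url.toLowerCase();
    return SKILLS.find((d) => lower.includes(d.domain)) || null;
  }
}
```

Note that `hostname.endsWith("." + d.domain)` matches `www.x.com` but not `notx.com`, which is why the dot is prepended.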
@@ -0,0 +1,66 @@
1
+ [
2
+ {
3
+ "domain": "mail.google.com",
4
+ "skill": "Gmail best practices:\n- To open an email, click directly on the email subject/preview text, NOT the checkbox or star\n- Use keyboard shortcuts: 'c' to compose, 'r' to reply, 'a' to reply all, 'f' to forward, 'e' to archive\n- To search, use the search bar at the top with operators like 'from:', 'to:', 'subject:', 'is:unread'\n- Reading pane may be on the right or below depending on user settings - check which layout is active\n- Verification codes are often in emails from 'noreply@' addresses with subjects containing 'verification', 'code', or 'confirm'"
5
+ },
6
+ {
7
+ "domain": "docs.google.com",
8
+ "skill": "Google Docs best practices:\n- This is a canvas-based application - use screenshots to see content, read_page may not capture all text\n- Use keyboard shortcuts: Cmd/Ctrl+B for bold, Cmd/Ctrl+I for italic, Cmd/Ctrl+K for links\n- To navigate, use Cmd/Ctrl+F to find text, then click on the result\n- For editing, click to place cursor then type - triple-click to select a paragraph\n- Access menus via the menu bar at the top (File, Edit, View, Insert, Format, etc.)"
9
+ },
10
+ {
11
+ "domain": "sheets.google.com",
12
+ "skill": "Google Sheets best practices:\n- Click on cells to select them, double-click to edit cell content\n- Use Tab to move right, Enter to move down, arrow keys to navigate\n- Formulas start with '=' - e.g., =SUM(A1:A10), =VLOOKUP(), =IF()\n- Use Cmd/Ctrl+C and Cmd/Ctrl+V for copy/paste\n- Select ranges by clicking and dragging, or Shift+click for range selection"
13
+ },
14
+ {
15
+ "domain": "github.com",
16
+ "skill": "GitHub best practices:\n- Repository navigation: Code tab for files, Issues for bug tracking, Pull requests for code review\n- To view a file, click on the filename in the file tree\n- Use 't' to open file finder, 'l' to jump to a line\n- In PRs: 'Files changed' tab shows diffs, 'Conversation' tab shows comments\n- Use the search bar with qualifiers: 'is:open is:pr', 'is:issue label:bug'"
17
+ },
18
+ {
19
+ "domain": "reddit.com",
20
+ "antiBot": true,
21
+ "skill": "Reddit UI patterns:\n- Posts are listed in a feed - click on post title to view full post and comments\n- Comments are nested/threaded - each comment has its own reply button underneath\n- Upvote (up arrow) and downvote (down arrow) buttons are to the left of each post/comment\n- To comment, scroll to comment box at top of comments section, or click reply under a specific comment\n- Use the search bar at top to find subreddits or posts\n- r/subredditname format for community names"
22
+ },
23
+ {
24
+ "domain": "linkedin.com",
25
+ "antiBot": true,
26
+ "skill": "LinkedIn UI patterns:\n\n## Messaging & Connections\n- To message someone: first check if you're connected (1st degree) - if not, send a connection request first\n- Connection request: go to their profile, click 'Connect' button, optionally add a note\n- Once connected, use the 'Message' button on their profile or go to Messaging tab\n- InMail (messaging non-connections) requires Premium subscription\n\n## Easy Apply Forms\n- Contact Info page is pre-filled from LinkedIn profile - don't try to modify, just click Next\n- Modal forms may need scrolling to see all content and buttons\n- Use screenshots over read_page for modals - accessibility tree often misses modal content\n\n## Navigation\n- Main tabs: Home (feed), My Network, Jobs, Messaging, Notifications\n- Job search: Jobs tab → filter by location, experience level, date posted\n- 'Easy Apply' = apply within LinkedIn; 'Apply' = external site\n- Profile sections are collapsible - click 'Show all' to expand"
27
+ },
28
+ {
29
+ "domain": "indeed.com",
30
+ "skill": "Indeed best practices:\n- Search for jobs using the 'What' and 'Where' fields at the top\n- Filter results by date posted, salary, job type, experience level\n- Click job title to view full description\n- 'Apply now' or 'Apply on company site' buttons are typically on the right panel\n- Sign in to save jobs and track applications"
31
+ },
32
+ {
33
+ "domain": "calendar.google.com",
34
+ "skill": "Google Calendar best practices:\n- Click on a time slot to create a new event\n- Drag events to reschedule them\n- Click on an event to view details, edit, or delete\n- Use the mini calendar on the left to navigate to different dates\n- Keyboard: 'c' to create event, 't' to go to today, arrow keys to navigate"
35
+ },
36
+ {
37
+ "domain": "drive.google.com",
38
+ "skill": "Google Drive best practices:\n- Double-click files to open them, single-click to select\n- Right-click for context menu (download, share, rename, etc.)\n- Use the search bar to find files by name or content\n- Create new items with the '+ New' button on the left\n- Drag and drop to move files between folders"
39
+ },
40
+ {
41
+ "domain": "notion.so",
42
+ "skill": "Notion best practices:\n- Click to place cursor, type '/' to open command menu\n- Drag blocks using the ⋮⋮ handle on the left\n- Use sidebar for navigation between pages\n- Toggle blocks expand/collapse on click\n- Databases can be viewed as table, board, calendar, etc."
43
+ },
44
+ {
45
+ "domain": "figma.com",
46
+ "skill": "Figma best practices:\n- This is a canvas-based design tool - always use screenshots to see content\n- Use 'V' for select tool, 'R' for rectangle, 'T' for text\n- Zoom with Cmd/Ctrl+scroll or Cmd/Ctrl++ and Cmd/Ctrl+-\n- Navigate frames in the left sidebar\n- Right-click for context menus and additional options"
47
+ },
48
+ {
49
+ "domain": "slack.com",
50
+ "skill": "Slack best practices:\n- Channels listed in left sidebar - click to switch\n- Cmd/Ctrl+K to quickly switch channels/DMs\n- @ mentions notify users, # references channels\n- Thread replies keep conversations organized\n- Use the search bar to find messages, files, and people"
51
+ },
52
+ {
53
+ "domain": "twitter.com",
54
+ "antiBot": true,
55
+ "skill": "See x.com — twitter.com redirects to x.com."
56
+ },
57
+ {
58
+ "domain": "x.com",
59
+ "antiBot": true,
60
+ "skill": "X/Twitter — verified patterns (updated 2026-03-30)\n\n## Reading pages (CRITICAL)\n- X loads content asynchronously — page looks empty for 3-5 seconds after navigation.\n- read_page often returns ONLY \"To view keyboard shortcuts\" — tweets haven't loaded yet.\n- DO NOT re-navigate to the same URL. That resets loading and makes it worse.\n- Instead: wait 5 seconds, then use get_page_text — it reads visible text and is more reliable.\n- If get_page_text returns nothing, scroll down once and try again.\n\n## Search\n- URL: x.com/search?q={encoded_query}&src=typed_query&f=live\n- After navigating, wait 5 seconds, then get_page_text (NOT read_page).\n- Scroll down once to load more tweets, then get_page_text again.\n- Tweet URLs in page text follow pattern: /status/{id}\n\n## Text input (CRITICAL — Draft.js)\n- form_input DOES NOT WORK — Draft.js ignores programmatic input.\n- computer type action GARBLES TEXT.\n- ONLY RELIABLE METHOD — use javascript_tool:\n document.querySelector('[data-testid=\"tweetTextarea_0\"]').focus();\n document.execCommand('insertText', false, 'your reply text here');\n- Always verify text appeared by reading after insertion.\n\n## Replying to a tweet\n1. Navigate to tweet URL (x.com/{handle}/status/{id})\n2. Wait 3 seconds, read the page\n3. Click the reply/comment icon (speech bubble) in the action bar\n4. Use javascript_tool to insert text (see above)\n5. Verify text appeared, then click blue \"Reply\" button\n6. Wait 2 seconds to confirm reply posted\n\n## Known traps\n- DO NOT scroll looking for \"Post your reply\" — reply box appears after clicking comment icon\n- x.com/compose/post may open — that's fine, type and click Reply there\n- \"Leave site?\" dialog — ALWAYS click Cancel, finish posting first\n- Reply button is disabled until text is entered — verify first\n- Space replies 15+ seconds apart (rate limiting)\n- NEVER navigate to the same URL you're already on"
61
+ },
62
+ {
63
+ "domain": "amazon.com",
64
+ "skill": "Amazon best practices:\n- Use the search bar at the top for product search\n- Filter results using the left sidebar (price, ratings, Prime, etc.)\n- Click 'Add to Cart' or 'Buy Now' to purchase\n- Product details and reviews are on the product page\n- Check seller information and shipping times before purchasing"
65
+ }
66
+ ]
@@ -955,6 +955,7 @@ function parseBody(req) {
955
955
  const ALLOWED_ORIGINS = [
956
956
  "https://browse.hanzilla.co",
957
957
  "https://api.hanzilla.co",
958
+ "https://tools.hanzilla.co",
958
959
  ...(process.env.NODE_ENV === "production" ? [] : [
959
960
  "http://localhost:3000",
960
961
  "http://localhost:5173", // Vite dev server
@@ -1096,11 +1097,7 @@ async function handleRequest(req, res) {
1096
1097
  if (method === "GET" && url === "/v1/me") {
1097
1098
  let profile = await resolveSessionProfile(req);
1098
1099
  if (!profile) {
1099
- // Debug: try reading session directly from cookie + DB
1100
- const cookieHeader = req.headers.cookie || '';
1101
- const tokenMatch = cookieHeader.match(/better-auth[.\-]session_token=([^;.\s]+)/);
1102
- const rawToken = tokenMatch ? decodeURIComponent(tokenMatch[1]) : null;
1103
- sendJson(req, res, 401, { error: "Not signed in", debug: { rawToken: rawToken?.substring(0, 10), cookieHeader: cookieHeader.substring(0, 100) } });
1100
+ sendJson(req, res, 401, { error: "Not signed in" });
1104
1101
  return;
1105
1102
  }
1106
1103
  sendJson(req, res, 200, {
@@ -1363,8 +1360,8 @@ async function handleRequest(req, res) {
1363
1360
  sendJson(req, res, 404, { error: "Not found" });
1364
1361
  }
1365
1362
  catch (err) {
1366
- log.error("Request error", { requestId }, { method, url, error: err.message });
1367
- sendJson(req, res, 500, { error: err.message, request_id: requestId });
1363
+ log.error("Request error", { requestId }, { method, url, error: err.message, stack: err.stack });
1364
+ sendJson(req, res, 500, { error: "Internal server error", request_id: requestId });
1368
1365
  }
1369
1366
  }
1370
1367
  /**
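The hardened error path above follows a standard pattern: full details (including the stack) stay in the server log, while the client receives only a generic message plus the request id. A minimal sketch, with `console.error` standing in for the project's `log.error`:

```javascript
// Log everything server-side; expose only a generic message + request id.
function toClientError(err, requestId) {
  // Full details (message + stack) go to the server log only.
  console.error("Request error", { requestId, error: err.message, stack: err.stack });
  // The client response never leaks internals such as err.message.
  return { status: 500, body: { error: "Internal server error", request_id: requestId } };
}
```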
@@ -146,13 +146,11 @@ export async function resolveSessionProfile(req) {
146
146
  // Extract session token from cookie (handles both __Secure- and plain prefix)
147
147
  const cookieHeader = req.headers.cookie || '';
148
148
  const tokenMatch = cookieHeader.match(/better-auth[.\-]session_token=([^;]+)/);
149
- console.error(`[AUTH] step1: match=${!!tokenMatch} cookieLen=${cookieHeader.length}`);
150
149
  if (!tokenMatch)
151
150
  return null;
152
151
  // Token format: "rawToken.signature" — we only need the raw token for DB lookup
153
152
  const rawValue = decodeURIComponent(tokenMatch[1]);
154
153
  const token = rawValue.split('.')[0];
155
- console.error(`[AUTH] step2: token=${token.substring(0, 10)}... rawLen=${rawValue.length}`);
156
154
  if (!token)
157
155
  return null;
158
156
  const db = getProvisionPool();
@@ -160,12 +158,10 @@ export async function resolveSessionProfile(req) {
160
158
  FROM session s
161
159
  JOIN "user" u ON u.id = s."userId"
162
160
  WHERE s.token = $1 LIMIT 1`, [token]);
163
- console.error(`[AUTH] step3: rows=${sessionRes.rows.length}`);
164
161
  if (sessionRes.rows.length === 0)
165
162
  return null;
166
163
  const row = sessionRes.rows[0];
167
164
  // Check expiry
168
- console.error(`[AUTH] step4: userId=${row.userId} expires=${row.expiresAt} expired=${new Date(row.expiresAt) < new Date()}`);
169
165
  if (new Date(row.expiresAt) < new Date())
170
166
  return null;
171
167
  const wsRes = await db.query(`SELECT wm.workspace_id, w.name as workspace_name
@@ -173,7 +169,6 @@ export async function resolveSessionProfile(req) {
173
169
  JOIN workspaces w ON w.id = wm.workspace_id
174
170
  WHERE wm.user_id = $1
175
171
  ORDER BY wm.created_at ASC LIMIT 1`, [row.userId]);
176
- console.error(`[AUTH] step5: wsRows=${wsRes.rows.length}`);
177
172
  if (wsRes.rows.length === 0)
178
173
  return null;
179
174
  return {
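With the step-by-step debug logging removed, the token extraction above reduces to a small pure function. A sketch of that logic, easy to unit-test on its own:

```javascript
// Sketch of the cookie parsing above. The regex anchors on
// "better-auth" + separator, so it handles both the "__Secure-"
// prefixed cookie name and the plain one.
function extractSessionToken(cookieHeader) {
  const match = (cookieHeader || "").match(/better-auth[.\-]session_token=([^;]+)/);
  if (!match) return null;
  // Cookie value is "rawToken.signature" (URL-encoded); only the raw
  // token is needed for the DB lookup.
  const rawValue = decodeURIComponent(match[1]);
  return rawValue.split(".")[0] || null;
}
```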
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "hanzi-browse",
3
- "version": "2.3.1",
3
+ "version": "2.3.2",
4
4
  "description": "Give your AI agent a real browser — click, type, fill forms, test workflows, post content, and read authenticated pages",
5
5
  "license": "PolyForm-Noncommercial-1.0.0",
6
6
  "author": "hanzili",
@@ -20,7 +20,7 @@
20
20
  "README.md"
21
21
  ],
22
22
  "scripts": {
23
- "build": "tsc && rm -rf dist/managed/templates && cp -r src/managed/templates dist/managed/templates && cd dashboard && npm run build",
23
+ "build": "tsc && rm -rf dist/managed/templates && cp -r src/managed/templates dist/managed/templates && cp src/agent/domain-skills.json dist/agent/domain-skills.json && cd dashboard && npm run build",
24
24
  "build:server": "tsc",
25
25
  "build:dashboard": "cd dashboard && npm run build",
26
26
  "dev": "tsc --watch",
@@ -0,0 +1,290 @@
1
+ ---
2
+ name: competitor-monitor
3
+ description: Monitor competitor websites for changes. Visit a list of URLs, extract pricing, features, positioning, and key content, compare against previous snapshots stored locally, and generate a change report summarizing what's different. Use when the user says "check competitors", "what changed on their site", "monitor these URLs", or wants periodic competitive intelligence. Requires the hanzi browser automation MCP server and Chrome extension.
4
+ ---
5
+
6
+ # Competitor Monitor
7
+
8
+ You monitor competitor websites and report what changed. You visit each URL, extract the important content (pricing, features, positioning, messaging), compare it against the last saved snapshot, and produce a clear change report.
9
+
10
+ ## Tool Selection Rule
11
+
12
+ - **Prefer existing tools first**: If a page is public and simple, try `WebFetch` or `curl` before opening a browser. Use Hanzi only when the page requires JavaScript rendering, authentication, or interactive elements (tabs, accordions, lazy-loaded sections).
13
+ - **Use filesystem tools** to read/write snapshots — never store snapshots in the browser.
14
+ - **If a site blocks or shows a CAPTCHA**, stop that URL and move to the next. Report the failure.
15
+
16
+ ## Before Starting — Preflight Check
17
+
18
+ Try calling `browser_status` to verify the browser extension is reachable. If the tool doesn't exist or returns an error:
19
+
20
+ > **Hanzi isn't set up yet.** This skill needs the Hanzi browser extension running in Chrome.
21
+ >
22
+ > 1. Install from the Chrome Web Store: https://chromewebstore.google.com/detail/hanzi-browse/iklpkemlmbhemkiojndpbhoakgikpmcd
23
+ > 2. The extension will walk you through setup (~1 minute)
24
+ > 3. Then come back and run this again
25
+
26
+ ---
27
+
28
+ ## What You Need From the User
29
+
30
+ 1. **URLs** — list of competitor pages to monitor (pricing pages, feature pages, landing pages, etc.)
31
+ 2. **Focus areas** (optional) — what to pay attention to: pricing, features, positioning, team size, integrations, messaging, or "everything"
32
+ 3. **Label** (optional) — a name for this monitoring set (e.g., "competitor-pricing", "market-landscape"). Defaults to "default".
33
+
34
+ Optional:
35
+ - Specific sections or elements to watch (e.g., "only the pricing table", "the hero section tagline")
36
+ - Whether to take screenshots for visual comparison
37
+ - Authentication details if any pages require login
38
+
39
+ ---
40
+
41
+ ## Phase 1: Prepare the Monitoring Run
42
+
43
+ ### 1a. Load Previous Snapshots
44
+
45
+ Snapshots are stored in `~/.hanzi-browse/competitor-monitor/{label}/`. Each URL gets a file named by its sanitized hostname + path.
46
+
47
+ ```bash
48
+ mkdir -p ~/.hanzi-browse/competitor-monitor/{label}
49
+ ls ~/.hanzi-browse/competitor-monitor/{label}/ 2>/dev/null || echo "NO_PREVIOUS_SNAPSHOTS"
50
+ ```
51
+
52
+ For each URL, check if a previous snapshot exists:
53
+ ```bash
54
+ cat ~/.hanzi-browse/competitor-monitor/{label}/{sanitized_filename}.json 2>/dev/null || echo "NO_SNAPSHOT"
55
+ ```
56
+
57
+ If no previous snapshots exist, this is a **baseline run** — you'll capture the initial state without generating a diff.
58
+
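The skill doesn't prescribe a sanitization scheme; one possible mapping from URL to snapshot filename, shown only to make "sanitized hostname + path" concrete (the function name and rules are hypothetical):

```javascript
// Hypothetical sanitizer: hostname + path, with anything outside
// [a-z0-9.-] collapsed to "_" and trailing underscores trimmed.
function snapshotFilename(url) {
  const u = new URL(url);
  return (
    (u.hostname + u.pathname)
      .toLowerCase()
      .replace(/[^a-z0-9.-]+/g, "_")
      .replace(/_+$/, "") + ".json"
  );
}
```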
59
+ ### 1b. Plan the Extraction
60
+
61
+ For each URL, determine what to extract based on the page type and user's focus areas:
62
+
63
+ | Page Type | What to Extract |
64
+ |-----------|----------------|
65
+ | **Pricing page** | Plan names, prices, billing periods, feature lists per tier, CTAs, free tier details, enterprise contact options |
66
+ | **Features page** | Feature names, descriptions, categories, "new" or "coming soon" badges, comparison tables |
67
+ | **Landing/home page** | Hero headline, subheadline, value propositions, social proof (logos, testimonials, stats), CTAs |
68
+ | **About/team page** | Team size, key hires, office locations, funding mentions |
69
+ | **Blog/changelog** | Latest 3-5 post titles, dates, and summaries |
70
+ | **Integrations page** | List of integrations, categories, "new" badges |
71
+ | **Docs/API page** | Navigation structure, new sections, deprecation notices |
72
+
73
+ Present the plan: "I'll visit these N URLs and extract [focus areas]. Previous snapshots: [found/not found]. Ready to proceed?"
74
+
75
+ **Wait for user confirmation before visiting any URLs.**
76
+
77
+ ---
78
+
79
+ ## Phase 2: Visit and Extract (browser via Hanzi)
80
+
81
+ Visit each URL using `browser_start`. Run up to 3 URLs **in parallel** — each gets its own browser window.
82
+
83
+ For each URL:
84
+
85
+ ```
86
+ browser_start({
87
+ task: "Visit this page and extract all [focus areas]. Read the full page content including any sections behind tabs, accordions, or 'show more' buttons. Return structured data with: page_title, extraction_date, and each content section with its heading and text.",
88
+ url: "{competitor_url}",
89
+ context: "Focus areas: {focus_areas}. Extract exact text — do not paraphrase. Include prices with currency symbols. Expand any collapsed sections."
90
+ })
91
+ ```
92
+
93
+ After `browser_start` returns:
94
+ 1. Parse the result to extract structured content
95
+ 2. If the user requested screenshots, call `browser_screenshot` for each page
96
+ 3. Call `browser_stop` with `remove: true` to clean up
97
+
98
+ ### Extraction Format
99
+
100
+ Structure the extracted data consistently:
101
+
102
+ ```json
103
+ {
104
+ "url": "https://competitor.com/pricing",
105
+ "extracted_at": "2026-04-02T10:30:00Z",
106
+ "page_title": "Pricing - Competitor",
107
+ "sections": [
108
+ {
109
+ "name": "Plans",
110
+ "content": [
111
+ {
112
+ "plan": "Starter",
113
+ "price": "$9/mo",
114
+ "billing": "billed annually",
115
+ "features": ["Feature A", "Feature B", "5 users"]
116
+ }
117
+ ]
118
+ },
119
+ {
120
+ "name": "Hero",
121
+ "headline": "The fastest way to do X",
122
+ "subheadline": "Used by 10,000+ teams"
123
+ }
124
+ ]
125
+ }
126
+ ```
127
+
128
+ ### Error Handling
129
+
130
+ - **Page blocked / CAPTCHA**: Skip, note in report, move to next URL
131
+ - **Page not found (404)**: Record as "page removed" — this itself is a significant change
132
+ - **Timeout**: Call `browser_screenshot` to capture current state, then `browser_stop`. Retry once. If it fails again, skip.
133
+ - **Login required**: Stop and ask the user for credentials. Pass via `context` field, never in `task`.
134
+
135
+ ---
136
+
137
+ ## Phase 3: Compare Against Previous Snapshots (no browser)
138
+
139
+ For each URL, compare the new extraction against the stored snapshot.
140
+
141
+ ### Diff Categories
142
+
143
+ Classify every change into one of these categories:
144
+
145
+ | Category | What it means | Priority |
146
+ |----------|--------------|----------|
147
+ | **Pricing change** | Price increase/decrease, new tier, removed tier, changed billing | HIGH |
148
+ | **Feature change** | New feature added, feature removed, feature renamed or moved between tiers | HIGH |
149
+ | **Positioning change** | Headline, tagline, or value prop rewritten | MEDIUM |
150
+ | **Social proof change** | New logos, updated stats, new testimonials | LOW |
151
+ | **Structural change** | New page sections, reorganized layout, new navigation items | LOW |
152
+ | **Content update** | Minor text edits, typo fixes, updated dates | LOW |
153
+ | **Page removed** | URL now returns 404 or redirects | HIGH |
154
+ | **New page** | First time monitoring this URL (baseline) | INFO |
155
+
156
+ ### Comparison Rules
157
+
158
+ - Compare section by section, not character by character
159
+ - For pricing: flag exact dollar amounts, percentage changes, and tier restructuring
160
+ - For features: track additions, removals, and tier movements separately
161
+ - For text: ignore minor formatting changes (whitespace, punctuation). Flag substantive rewording.
162
+ - If a section existed before but is now missing, flag as removed
163
+ - If a new section appears, flag as added
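The whitespace rule above can be sketched as a small helper, assuming snapshots are stored as JSON files on disk (a sketch, not part of the skill's required tooling):

```bash
# Succeeds (exit 0) only when two snapshot files differ beyond whitespace.
# Collapsing whitespace runs implements the "ignore minor formatting
# changes" rule; punctuation-level filtering would need more logic.
snapshots_differ() {
  norm() { tr -s '[:space:]' ' ' < "$1"; }
  [ "$(norm "$1")" != "$(norm "$2")" ]
}
```

Substantive rewording still flags, since the normalized text itself changes.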
164
+
165
+ ---
166
+
167
+ ## Phase 4: Save Updated Snapshots
168
+
169
+ After comparison, save the new extraction as the current snapshot:
170
+
171
+ ```bash
172
+ mkdir -p ~/.hanzi-browse/competitor-monitor/{label}
173
+ ```
174
+
175
+ Write the JSON snapshot file for each URL:
176
+ ```bash
177
+ cat > ~/.hanzi-browse/competitor-monitor/{label}/{sanitized_filename}.json << 'SNAPSHOT_EOF'
178
+ {extracted_json_here}
179
+ SNAPSHOT_EOF
180
+ ```
181
+
182
+ Also append to the monitoring log:
183
+ ```bash
184
+ echo '{"url":"{url}","checked_at":"{timestamp}","changes_found":{count},"categories":["{cat1}","{cat2}"]}' >> ~/.hanzi-browse/competitor-monitor/{label}/monitor-log.jsonl
185
+ ```
186
+
187
+ ---
188
+
189
+ ## Phase 5: Generate Change Report
190
+
191
+ ### Baseline Run (no previous snapshots)
192
+
193
+ If this is the first run, present the captured state:
194
+
195
+ ```
196
+ Competitor Monitor — Baseline Captured
197
+
198
+ Label: {label}
199
+ Date: {date}
200
+ URLs monitored: {N}
201
+
202
+ Competitor: {name or domain}
203
+ URL: {url}
204
+ Pricing: {summary of tiers and prices}
205
+ Key features: {top features}
206
+ Positioning: "{headline}" — {subheadline}
207
+
208
+ Competitor: {name or domain}
209
+ ...
210
+
211
+ Snapshots saved to ~/.hanzi-browse/competitor-monitor/{label}/
212
+ Next run will compare against this baseline.
213
+ ```
214
+
215
+ ### Change Report (subsequent runs)
216
+
217
+ ```
218
+ Competitor Monitor — Change Report
219
+
220
+ Label: {label}
221
+ Date: {date}
222
+ URLs monitored: {N}
223
+ URLs with changes: {N}
224
+ URLs unchanged: {N}
225
+ URLs failed: {N}
226
+
227
+ --- HIGH PRIORITY CHANGES ---
228
+
229
+ [Competitor Name] — {url}
230
+ PRICING: Starter plan increased from $9/mo to $12/mo (+33%)
231
+ PRICING: New "Enterprise" tier added at custom pricing
232
+ FEATURE: "AI Assistant" added to Pro tier (was not listed before)
233
+
234
+ --- MEDIUM PRIORITY CHANGES ---
235
+
236
+ [Competitor Name] — {url}
237
+ POSITIONING: Hero headline changed
238
+ Was: "The simple way to manage projects"
239
+ Now: "The AI-powered way to manage projects"
240
+
241
+ --- LOW PRIORITY CHANGES ---
242
+
243
+ [Competitor Name] — {url}
244
+ SOCIAL PROOF: Customer count updated from "5,000+" to "10,000+"
245
+ CONTENT: Footer copyright year updated to 2026
246
+
247
+ --- NO CHANGES ---
248
+
249
+ [Competitor Name] — {url}: No changes detected
250
+
251
+ --- FAILED ---
252
+
253
+ [Competitor Name] — {url}: Blocked by CAPTCHA
254
+ ```
255
+
256
+ ### Strategic Insights
257
+
258
+ After the change report, provide analysis:
259
+
260
+ 1. **Pricing trends** — Are competitors raising or lowering prices? Adding tiers? Moving to usage-based?
261
+ 2. **Feature signals** — What are they building? What features are moving down-market (from enterprise to lower tiers)?
262
+ 3. **Positioning shifts** — How is their messaging evolving? Are they targeting a new audience?
263
+ 4. **Competitive implications** — What do these changes mean for the user's product? Any threats or opportunities?
264
+ 5. **Recommended actions** — Specific suggestions: "Consider matching their free tier offering", "Their new AI feature overlaps with your roadmap item X"
265
+
266
+ ---
267
+
268
+ ## Snapshot File Naming
269
+
270
+ Sanitize URLs to create filenames:
271
+ - Replace `https://` and `http://` with nothing
272
+ - Replace `/`, `?`, `&`, `=` with `_`
273
+ - Replace `.` with `_` (except file extensions)
274
+ - Truncate to 100 characters
275
+ - Example: `https://competitor.com/pricing?plan=all` becomes `competitor_com_pricing_plan_all`
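The rules above reduce to a short shell helper (a sketch; this simple version replaces every `.`, ignoring the file-extension exception):

```bash
# Sanitize a URL into a snapshot filename per the rules above:
# strip the scheme, map /?&=. to underscores, truncate to 100 chars.
sanitize_url() {
  local s="${1#https://}"
  s="${s#http://}"
  s=$(printf '%s' "$s" | tr '/?&=.' '_____')
  printf '%.100s\n' "$s"
}

sanitize_url "https://competitor.com/pricing?plan=all"
# → competitor_com_pricing_plan_all
```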
276
+
277
+ ---
278
+
279
+ ## Rules
280
+
281
+ - Always confirm the URL list and focus areas with the user before visiting any pages
282
+ - Never modify competitor websites — read only
283
+ - Save snapshots after every run so the next run has a baseline
284
+ - Classify all changes by priority — don't bury pricing changes in a wall of minor edits
285
+ - If a URL requires authentication, ask the user — never guess credentials
286
+ - Max 20 URLs per session to avoid rate limiting and long execution times
287
+ - Run up to 3 browser visits in parallel for speed, but not more (avoids overwhelming the browser)
288
+ - If a competitor site blocks automated access, note it and suggest the user visit manually
289
+ - Keep snapshots as structured JSON, not raw HTML — this makes comparison reliable across minor layout changes
290
+ - Log every monitoring run to `monitor-log.jsonl` for trend tracking across sessions
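Given the log-line format written above, a quick trend summary can be derived with standard tools (a sketch; the label `acme` is a hypothetical example):

```bash
# Count total runs and runs that found at least one change, assuming the
# exact '"changes_found":N' key format appended by the logging step above.
summarize_log() {
  local log="$1"
  printf 'runs=%s changed=%s\n' \
    "$(wc -l < "$log" | tr -d '[:space:]')" \
    "$(grep -c '"changes_found":[1-9]' "$log")"
}

# Usage:
#   summarize_log ~/.hanzi-browse/competitor-monitor/acme/monitor-log.jsonl
```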
@@ -0,0 +1,260 @@
1
+ ---
2
+ name: job-applier
3
+ description: Apply to jobs from your real browser. Reads job postings, matches requirements against your resume, fills out application forms, and handles multi-step flows on Lever, Greenhouse, Workday, Indeed, LinkedIn Jobs, and other platforms. Reviews everything before submitting. Requires the hanzi browser automation MCP server and Chrome extension.
4
+ ---
5
+
6
+ # Job Application Helper
7
+
8
+ You help users apply to jobs by reading postings, matching qualifications, and filling out application forms in a real browser with their signed-in sessions.
9
+
10
+ ## Tool Selection Rule
11
+
12
+ - **Prefer existing tools first**: code search, file reads, APIs, and other MCP integrations.
13
+ - **Use Hanzi only for browser-required steps**: navigating job boards, reading authenticated postings, and filling application forms.
14
+ - **If a platform shows a CAPTCHA, rate limit, or bot detection**, stop immediately and tell the user.
15
+
16
+ ## Before Starting — Preflight Check
17
+
18
+ Try calling `browser_status` to verify the browser extension is reachable. If the tool doesn't exist or returns an error:
19
+
20
+ > **Hanzi isn't set up yet.** This skill needs the hanzi browser extension running in Chrome.
21
+ >
22
+ > 1. Install from the Chrome Web Store: https://chromewebstore.google.com/detail/hanzi-browse/iklpkemlmbhemkiojndpbhoakgikpmcd
23
+ > 2. The extension will walk you through setup (~1 minute)
24
+ > 3. Then come back and run this again
25
+
26
+ ---
27
+
28
+ ## What You Need From the User
29
+
30
+ 1. **Job URL** — link to the job posting
31
+ 2. **Resume / profile context** — either a file path to their resume, a pasted summary, or "use my LinkedIn profile"
32
+ 3. **Additional context** — cover letter tone, salary expectations, visa status, availability, or any field-specific answers
33
+
34
+ Optional:
35
+ - Preferred name or contact info overrides
36
+ - Answers to common screening questions (years of experience, willing to relocate, etc.)
37
+ - Whether to actually submit or just fill and pause for review
38
+
39
+ ---
40
+
41
+ ## Phase 1: Gather Context BEFORE Opening the Browser
42
+
43
+ ### Read the user's resume
44
+
45
+ If the user provided a file path, read it. If they pasted text, use that. Extract:
46
+ - Name, email, phone, location
47
+ - Current title and company
48
+ - Skills and technologies
49
+ - Years of experience
50
+ - Education
51
+ - Notable achievements
52
+
53
+ Store these as structured data — you'll need them for every form field.
54
+
55
+ ### Read the job posting
56
+
57
+ If the job URL is publicly accessible, try fetching it with `WebFetch` or `curl` first — no browser needed for public postings.
58
+
59
+ If it requires authentication (LinkedIn Jobs behind login, internal company portals), use `browser_start` to navigate and read the posting.
60
+
61
+ Extract from the posting:
62
+ - Job title, company, location, remote/hybrid/onsite
63
+ - Required qualifications
64
+ - Preferred qualifications
65
+ - Responsibilities
66
+ - Application deadline (if listed)
67
+
68
+ ### Match qualifications
69
+
70
+ Compare the user's profile against the job requirements. Present a brief match summary:
71
+
72
+ ```
73
+ Match summary for [Job Title] at [Company]:
74
+
75
+ Strong matches:
76
+ - [Requirement] — [How the user matches]
77
+ - [Requirement] — [How the user matches]
78
+
79
+ Gaps:
80
+ - [Requirement] — [What's missing or weak]
81
+
82
+ Overall fit: [Strong / Moderate / Weak]
83
+ ```
84
+
85
+ If the fit is weak, tell the user honestly. Ask if they want to proceed anyway.
86
+
87
+ ---
88
+
89
+ ## Phase 2: Prepare Application Answers
90
+
91
+ Before touching the browser, prepare answers for common application fields:
92
+
93
+ **Standard fields** (auto-fill from resume):
94
+ - Full name, email, phone, location
95
+ - Current company, current title
96
+ - LinkedIn URL, portfolio/website
97
+ - Resume upload (file path)
98
+
99
+ **Screening questions** (need user input if not provided):
100
+ - Are you authorized to work in [country]?
101
+ - Do you require visa sponsorship?
102
+ - Years of experience in [skill]
103
+ - Desired salary / salary expectations
104
+ - Earliest start date
105
+ - Willing to relocate?
106
+ - How did you hear about this role?
107
+
108
+ **Cover letter** (generate if needed):
109
+ - Tailor to the specific role and company
110
+ - Reference 2-3 specific requirements from the posting that match the user's experience
111
+ - Keep it under 300 words
112
+ - Match the tone the user requested (professional, conversational, etc.)
113
+
114
+ Present all prepared answers to the user. Ask them to confirm or adjust before proceeding.
115
+
116
+ ---
117
+
118
+ ## Phase 3: Fill the Application Form
119
+
120
+ Navigate to the application page using `browser_start`. Apply **one job at a time, sequentially**.
121
+
122
+ ### Platform-specific patterns
123
+
124
+ **Lever** (jobs.lever.co):
125
+ - Single-page form with resume upload, standard fields, and custom questions
126
+ - Upload resume first — Lever sometimes auto-fills fields from it
127
+ - Custom questions appear below the standard fields
128
+
129
+ **Greenhouse** (boards.greenhouse.io):
130
+ - Multi-step form: personal info, resume, custom questions, voluntary self-identification
131
+ - Each step has a "Next" or "Continue" button
132
+ - EEOC/voluntary fields are optional — skip unless the user wants to fill them
133
+
134
+ **Workday** (*.myworkdayjobs.com):
135
+ - Requires account creation — warn the user before proceeding
136
+ - Multi-page flow with "Save and Continue" between sections
137
+ - Often requires re-entering information already on the resume
138
+ - May have autofill from LinkedIn — use it if the user is signed in
139
+
140
+ **LinkedIn Jobs** (linkedin.com/jobs):
141
+ - "Easy Apply" is a modal overlay, not a new page
142
+ - May have 1-3 steps within the modal
143
+ - Can upload resume or use LinkedIn profile
144
+ - Some postings redirect to the company's external ATS
145
+
146
+ **Indeed** (indeed.com):
147
+ - May require an Indeed account
148
+ - "Apply Now" sometimes redirects to external site
149
+ - Indeed's own application flow has screening questions upfront
150
+
151
+ **Generic ATS / company career pages**:
152
+ - Look for the application form — usually behind "Apply" or "Apply Now"
153
+ - Fill fields based on label matching (name, email, phone, etc.)
154
+ - For file uploads, use the resume file path from the user
155
+ - For dropdowns, select the closest matching option
156
+
157
+ ### Filling strategy
158
+
159
+ 1. Navigate to the application URL
160
+ 2. If the platform auto-fills from resume upload, upload first and let it populate
161
+ 3. Fill remaining empty fields using prepared answers
162
+ 4. For multi-step forms, complete each step before moving to the next
163
+ 5. On the final step, **STOP before clicking Submit**
164
+
165
+ Pass all form data via the `context` field in `browser_start`:
166
+
167
+ ```
168
+ browser_start({
169
+ task: "Fill out the job application form. Upload the resume, fill all fields, answer screening questions. DO NOT click Submit — stop on the final review page.",
170
+ url: "https://jobs.lever.co/company/position-id/apply",
171
+ context: "Resume file: /path/to/resume.pdf\nName: Jane Smith\nEmail: jane@example.com\nPhone: 555-0123\nLinkedIn: linkedin.com/in/janesmith\nCover letter: [prepared text]\nYears of Python experience: 6\nVisa sponsorship needed: No\nDesired salary: $150,000"
172
+ })
173
+ ```
174
+
175
+ If `browser_start` times out mid-form, call `browser_screenshot` to see progress, then `browser_message` to continue from where it left off.
176
+
177
+ ---
178
+
179
+ ## Phase 4: Review Before Submitting
180
+
181
+ This is the most important phase. **Never submit without explicit user approval.**
182
+
183
+ After the form is filled, call `browser_screenshot` to capture the final state. Present to the user:
184
+
185
+ ```
186
+ Application ready for [Job Title] at [Company]:
187
+
188
+ Filled fields:
189
+ - Name: Jane Smith
190
+ - Email: jane@example.com
191
+ - Phone: 555-0123
192
+ - Resume: uploaded
193
+ - Cover letter: [first 2 lines]...
194
+ - [Screening question]: [answer]
195
+ - [Screening question]: [answer]
196
+
197
+ Screenshot attached showing the completed form.
198
+
199
+ Ready to submit? (yes / no / edit [field])
200
+ ```
201
+
202
+ If the user says yes:
203
+ ```
204
+ browser_message({
205
+ session_id: "abc123",
206
+ message: "Click the Submit / Apply button to submit the application."
207
+ })
208
+ ```
209
+
210
+ If the user wants edits, use `browser_message` to make the changes, take a new screenshot, and confirm again.
211
+
212
+ After successful submission, take a final screenshot as confirmation. Log the application:
213
+ ```bash
214
+ mkdir -p ~/.hanzi-browse && echo "[date] | [company] | [job title] | [url] | submitted" >> ~/.hanzi-browse/applications.txt
215
+ ```
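The same log can optionally be consulted before starting a new application, to avoid duplicate submissions (a hypothetical helper, not part of the required flow):

```bash
# Exit 0 if this job URL already appears in the applications log.
# Second argument overrides the default log path for testing.
already_applied() {
  local url="$1" log="${2:-$HOME/.hanzi-browse/applications.txt}"
  [ -f "$log" ] && grep -Fq "$url" "$log"
}

# already_applied "https://jobs.lever.co/company/position-id" && echo "skip: already applied"
```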
216
+
217
+ ---
218
+
219
+ ## Batch Applications
220
+
221
+ If the user provides multiple job URLs:
222
+
223
+ 1. Analyze all postings first (Phase 1 for each)
224
+ 2. Present a summary table:
225
+
226
+ | # | Company | Role | Fit | Platform | Status |
227
+ |---|---------|------|-----|----------|--------|
228
+ | 1 | Acme Corp | Senior Engineer | Strong | Lever | Ready |
229
+ | 2 | Beta Inc | Staff Engineer | Moderate | Greenhouse | Ready |
230
+ | 3 | Gamma Co | Principal Engineer | Weak | Workday | Needs discussion |
231
+
232
+ 3. Ask which ones to proceed with
233
+ 4. Apply one at a time, reviewing each before submission
234
+ 5. Report progress: "Applied 2/5 — continuing with #3..."
235
+
236
+ ---
237
+
238
+ ## Safety Rules
239
+
240
+ - **Never submit without explicit user approval** — always pause on the final step
241
+ - **Never create accounts** on job platforms without asking the user first
242
+ - **Never provide false information** — if a field asks something you don't have an answer for, ask the user
243
+ - **One application at a time** — don't run parallel browser sessions for applications
244
+ - If the form requires payment or credit card information, stop and warn the user
245
+ - If the application asks for SSN, government ID, or similarly sensitive data, stop and tell the user to fill those fields manually
246
+ - If a CAPTCHA appears, pause and ask the user to solve it, then continue
247
+ - Max 10 applications per session to avoid triggering platform rate limits
248
+
249
+ ---
250
+
251
+ ## When Done
252
+
253
+ Summarize:
254
+ - Total applications: filled / submitted / skipped
255
+ - Per-application status with confirmation screenshots
256
+ - Any issues encountered (CAPTCHAs, missing fields, platform errors)
257
+ - Running total from the applications log:
258
+ ```bash
259
+ wc -l ~/.hanzi-browse/applications.txt 2>/dev/null || echo "0 applications logged"
260
+ ```
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: seo-checker
3
+ description: Audit web pages for SEO issues in a real browser. Checks rendered meta tags, heading hierarchy, image alt text, structured data, canonical URLs, mobile rendering, and performance signals. Produces a scored report with specific findings and fixes. Read-only — inspects, doesn't modify. Requires the hanzi browser automation MCP server and Chrome extension.
4
+ ---
5
+
6
+ # SEO Checker
7
+
8
+ You audit web pages for SEO issues using a real browser — rendered meta tags, actual heading structure, real schema markup, mobile viewport behavior. This skill is read-only: observe and report, don't modify.
9
+
10
+ ## Tool Selection Rule
11
+
12
+ - Prefer existing tools first (code search, local files, `curl`). Review HTML source, meta tags, and sitemap before opening the browser.
13
+ - Use Hanzi for Phases 2–5: always open the browser for these phases, even if Phase 1 found all the static data. Do not substitute `curl` or WebFetch for browser phases.
14
+
15
+ ## Before Starting
16
+
17
+ Call `browser_status` to verify the extension is reachable. If unavailable, tell the user to install from: https://chromewebstore.google.com/detail/hanzi-browse/iklpkemlmbhemkiojndpbhoakgikpmcd
18
+
19
+ ## What You Need
20
+
21
+ 1. **URL** — page or site to audit
22
+ 2. **Scope** — single page, specific section, or full site (default: single page)
23
+ 3. **Focus** — any specific SEO concerns (e.g., "we're not showing up in rich results", "mobile traffic dropped")
24
+
25
+ ## Audit Phases
26
+
27
+ ### Phase 1 — Source Review (before browser)
28
+
29
+ Check what you can without a browser:
30
+
31
+ - **Robots.txt**: Fetch `<domain>/robots.txt` — check for accidental `Disallow: /` or blocked important paths
32
+ - **Sitemap**: Fetch `<domain>/sitemap.xml` — verify it exists and includes the target URL
33
+ - **HTML source**: If accessible, review raw `<head>` for meta tags, canonical, hreflang
34
+ - **Codebase** (if source available): Scan for hardcoded noindex, missing meta tag templates, SEO component patterns
35
+
36
+ Summarize findings before opening the browser.
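The robots.txt sanity check above can be sketched as a filter over the fetched text (the warning strings are illustrative):

```bash
# Flag an accidental site-wide block in robots.txt content piped on stdin.
check_robots() {
  if grep -qiE '^[[:space:]]*Disallow:[[:space:]]*/[[:space:]]*$'; then
    echo "WARNING: site-wide 'Disallow: /' found"
  else
    echo "OK: no site-wide Disallow"
  fi
}

printf 'User-agent: *\nDisallow: /\n' | check_robots
# → WARNING: site-wide 'Disallow: /' found
```

This only catches the blanket `Disallow: /` case; blocked important paths still need a manual read.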
37
+
38
+ ### Phase 2 — Meta & Head Tags (browser)
39
+
40
+ Use `browser_start` to open the page and inspect the rendered DOM. JavaScript-rendered SPAs may have different meta tags than the raw HTML source.
41
+
42
+ - **Title tag**: Exists, 50-60 characters, unique, descriptive (not "Home" or "Untitled")
43
+ - **Meta description**: Exists, 150-160 characters, includes target keywords, compelling for CTR
44
+ - **Canonical URL**: Present, points to the correct URL (not a duplicate or wrong domain)
45
+ - **Robots meta**: Check for unintentional `noindex`, `nofollow`, or `none` directives
46
+ - **Open Graph tags**: `og:title`, `og:description`, `og:image`, `og:url` — all present and correct
47
+ - **Twitter Card tags**: `twitter:card`, `twitter:title`, `twitter:description`, `twitter:image`
48
+ - **Viewport meta**: `<meta name="viewport" content="width=device-width, initial-scale=1">` present
49
+ - **Charset & lang**: `<meta charset="utf-8">` and `<html lang="...">` set correctly
50
+
51
+ Screenshot the page after loading.
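The length rules above reduce to simple character counts once the rendered tag values are extracted (a sketch; the label strings are illustrative):

```bash
# Check a tag's text against a character range, e.g. title 50-60
# or meta description 150-160.
check_length() {
  local label="$1" text="$2" min="$3" max="$4" n=${#2}
  if [ "$n" -ge "$min" ] && [ "$n" -le "$max" ]; then
    echo "$label: OK ($n chars)"
  else
    echo "$label: FLAG ($n chars, target $min-$max)"
  fi
}

# check_length "title" "$rendered_title" 50 60
```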
52
+
53
+ ### Phase 3 — Content Structure (browser)
54
+
55
+ - **H1 tag**: Exactly one per page, descriptive, contains primary keyword
56
+ - **Heading hierarchy**: H1 → H2 → H3 — no skipped levels (e.g., H1 → H3 with no H2)
57
+ - **Image alt text**: All meaningful images have descriptive alt text; decorative images use `alt=""`
58
+ - **Internal links**: Key pages are linked, anchor text is descriptive (not "click here")
59
+ - **Broken links**: Check for obvious 404s or dead links on the page
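The no-skipped-levels rule can be sketched as a scan over the heading levels in document order (levels passed as numbers, e.g. extracted from the rendered DOM):

```bash
# Flag skipped heading levels (e.g. H1 -> H3 with no H2 in between).
# Moving back up the hierarchy (H3 -> H2) is fine.
check_heading_order() {
  local prev=0 l
  for l in "$@"; do
    if [ "$l" -gt $((prev + 1)) ]; then
      echo "skipped level: H$prev -> H$l"
      return 1
    fi
    prev=$l
  done
  echo "hierarchy ok"
}

check_heading_order 1 2 3 2 3
# → hierarchy ok
```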
60
+
61
+ ### Phase 4 — Structured Data (browser)
62
+
63
+ - **JSON-LD / Microdata**: Check `<script type="application/ld+json">` blocks in the rendered DOM
64
+ - **Schema types**: Verify appropriate types are used (Article, Product, LocalBusiness, BreadcrumbList, FAQ, etc.)
65
+ - **Required properties**: Each schema type has required fields — check they're populated (e.g., Article needs `headline`, `datePublished`, `author`)
66
+ - **Validation**: Flag malformed JSON-LD or schemas with empty/placeholder values
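A minimal validity check for an extracted JSON-LD block might look like this (a sketch that shells out to Python's stdlib JSON parser; it checks parseability and `@type` only, not per-schema required fields):

```bash
# Naive JSON-LD check: text on stdin must parse as JSON and declare @type.
check_jsonld() {
  local doc
  doc=$(cat)
  if ! printf '%s' "$doc" | python3 -m json.tool >/dev/null 2>&1; then
    echo "malformed JSON-LD"
    return 1
  fi
  if printf '%s' "$doc" | grep -q '"@type"'; then
    echo "parsed, @type present"
  else
    echo "parsed, but missing @type"
  fi
}

printf '{"@context":"https://schema.org","@type":"Article"}' | check_jsonld
# → parsed, @type present
```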
67
+
68
+ ### Phase 5 — Mobile & Performance (browser)
69
+
70
+ Render the page at a mobile viewport (375×812, iPhone-sized):
71
+
72
+ - **Mobile layout**: No horizontal scrolling, text readable without zooming, tap targets at least 48×48px
73
+ - **Content parity**: Mobile version has the same key content as desktop (Google uses mobile-first indexing)
74
+ - **Image optimization**: Check for oversized images (e.g., 2000px wide image in a 375px container), missing `loading="lazy"` on below-fold images
75
+ - **CLS indicators**: Elements that visibly shift during load (ads, images without dimensions, dynamically injected content)
76
+
77
+ Screenshot at mobile viewport.
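The image-size heuristic above (natural width far exceeding rendered width) can be sketched as a filter over `natural rendered` pixel pairs collected from the DOM; the 2x threshold is an assumption, tune to taste:

```bash
# Flag images whose natural width is more than twice their rendered width.
flag_oversized() {
  awk '$1 > 2 * $2 { printf "oversized: natural=%spx rendered=%spx\n", $1, $2 }'
}

printf '2000 375\n800 600\n' | flag_oversized
# → oversized: natural=2000px rendered=375px
```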
78
+
79
+ ## Scoring
80
+
81
+ Rate each category on a 0-10 scale:
82
+
83
+ | Category | What's checked |
84
+ |----------|---------------|
85
+ | **Meta Tags** | Title, description, canonical, robots, OG, Twitter cards |
86
+ | **Content Structure** | H1, heading hierarchy, image alt text, internal links |
87
+ | **Structured Data** | JSON-LD presence, correct types, required properties |
88
+ | **Mobile** | Responsive layout, content parity, tap targets |
89
+ | **Performance Signals** | Image sizes, lazy loading, CLS indicators |
90
+ | **Internationalisation** | hreflang alternates, lang attribute, multilingual implementation |
91
+
92
+ **Overall score** = average of the 6 category scores (out of 10).
93
+
94
+ - **9-10**: Excellent — production-ready SEO
95
+ - **7-8**: Good — minor improvements needed
96
+ - **5-6**: Needs work — several issues affecting visibility
97
+ - **3-4**: Poor — significant SEO problems
98
+ - **0-2**: Critical — major issues blocking indexing or ranking
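The averaging step, as a one-liner sketch:

```bash
# Average of the six category scores (0-10 integers), one decimal place.
overall_score() {
  local sum=0 s
  for s in "$@"; do sum=$((sum + s)); done
  awk -v t="$sum" -v n="$#" 'BEGIN { printf "%.1f\n", t / n }'
}

overall_score 9 7 8 6 10 8
# → 8.0
```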
99
+
100
+ ## Report Format
101
+
102
+ ```
103
+ # SEO Audit: [URL]
104
+ Overall Score: [X]/10
105
+
106
+ ## Meta Tags — [X]/10
107
+ ✓ [What passed — one line each]
108
+ ✗ [What failed — element, issue, specific fix]
109
+ 📸 Screenshot: [evidence]
110
+
111
+ ## Content Structure — [X]/10
112
+ ✓ / ✗ [same format]
113
+
114
+ ## Structured Data — [X]/10
115
+ ✓ / ✗ [same format]
116
+
117
+ ## Mobile — [X]/10
118
+ ✓ / ✗ [same format]
119
+
120
+ ## Performance Signals — [X]/10
121
+ ✓ / ✗ [same format]
122
+
123
+ ## Internationalisation — [X]/10
124
+ ✓ / ✗ [same format]
125
+
126
+ ## Top 3 Priorities
127
+ 1. [Most impactful fix — what to do and why]
128
+ 2. [Second priority]
129
+ 3. [Third priority]
130
+ ```
131
+
132
+ For each failing item, include:
133
+ - **What's wrong**: specific element and current value
134
+ - **Why it matters**: impact on search visibility or user experience
135
+ - **How to fix**: concrete action (e.g., "Add `<meta name="description" content="...">` with 150-160 chars describing the page")
136
+ - **Reference**: link to relevant Google/web.dev guideline where helpful
137
+
138
+ ## Rules
139
+
140
+ - One page at a time — screenshot at each phase
141
+ - Be specific: "the hero image (1920×1080, 2.4MB)" not "some images are large"
142
+ - Cite standards: Google's SEO guidelines, web.dev, Schema.org specs
143
+ - Don't report unverified issues — if the rendered DOM differs from source, note both
144
+ - If `browser_start` times out, call `browser_screenshot` to diagnose
145
+ - Read-only — never modify the page, submit forms, or click CTAs
146
+ - SPA handling: always check rendered DOM, not just HTML source — SPAs may inject meta tags via JavaScript