claude-for-safari 1.0.0
- package/LICENSE +21 -0
- package/README.md +148 -0
- package/README_CN.md +148 -0
- package/SKILL.md +383 -0
- package/package.json +37 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 SDLLL

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,148 @@
<h1 align="center">Claude for Safari</h1>

<p align="center">
  <strong>Give your AI Agent the power to control Safari</strong>
</p>

<p align="center">
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg?style=for-the-badge" alt="MIT License"></a>
  <a href="https://www.apple.com/macos/"><img src="https://img.shields.io/badge/macOS-only-black.svg?style=for-the-badge&logo=apple" alt="macOS"></a>
  <a href="https://github.com/SDLLL/claude-for-safari/stargazers"><img src="https://img.shields.io/github/stars/SDLLL/claude-for-safari?style=for-the-badge" alt="GitHub Stars"></a>
</p>

<p align="center">
  <a href="https://safari.skilljam.dev">Website</a> · <a href="#quick-start">Quick Start</a> · <a href="#features">Features</a> · <a href="#faq">FAQ</a>
</p>

<p align="center">
  <a href="README.md">English</a> | <a href="README_CN.md">中文</a>
</p>

---

## Why This?

You want your AI Agent to help with browser tasks, and then you discover:

- 🔒 **Playwright** → Separate browser instance, hijacks your session
- 🧩 **Claude for Chrome** → Requires Chrome extension, doesn't work with Safari
- 📝 **Copy & paste** → Manually feeding page content to AI every time

**You just want AI to use your Safari, as if you were doing it yourself.**

**Claude for Safari makes it one command:**

```
npx skills add SDLLL/claude-for-safari
```

After installing, tell Claude "check what's open in my Safari", and it reads and controls your real browser directly.

---

## Quick Start

Run this in your terminal:

```bash
npx skills add SDLLL/claude-for-safari
```

Then launch [Claude Code](https://claude.ai/download):

```bash
claude
```

Say "show me what tabs are open in Safari". The agent will guide you through permission setup automatically.

> Compatible with any AI Agent that supports Skills: Claude Code, Cursor, Windsurf, etc.

### First-Time Setup

The agent auto-detects and guides you, but you can configure ahead of time:

1. **System Settings > Privacy & Security > Automation** → Allow terminal to control Safari
2. **Safari > Settings > Advanced** → Enable "Show features for web developers"
3. **Safari > Develop menu** → Check "Allow JavaScript from Apple Events"
4. **(Optional) System Settings > Privacy & Security > Screen Recording** → Allow terminal (enables background screenshots)

---

## Features

Zero install. Pure macOS native capabilities. One Skill covers all browser operations:

| Capability | What the Agent Does | How |
|---|---|---|
| **List tabs** | List all windows and tabs with title & URL | AppleScript |
| **Read pages** | Extract text, structured data, simplified DOM | AppleScript + JavaScript |
| **Execute JS** | Run arbitrary JavaScript in page context | AppleScript `do JavaScript` |
| **Screenshot** | Capture the Safari window so the AI can "see" the page | `screencapture` |
| **Navigate** | Open URLs, new tabs, new windows | AppleScript |
| **Click** | Click elements (React/Vue/Angular compatible) | JavaScript `dispatchEvent` |
| **Type** | Fill forms, simulate keyboard input | JavaScript + System Events |
| **Scroll** | Scroll up/down, scroll to element | JavaScript `scrollBy/scrollTo` |
| **Switch tabs** | Switch by index or URL keyword | AppleScript |
| **Wait for load** | Wait until page is fully loaded | JavaScript `readyState` |

### Screenshot Modes

| Mode | Permission Required | Window Switch | Best For |
|---|---|---|---|
| **Background** | Screen Recording | No switch | Recommended, seamless |
| **Foreground** | None | Brief (~0.3s) | Default, auto-switches back |

---

## How It Works

```
Claude Code ──osascript──► Safari (reads/controls your real browser)
     │
     └──screencapture──► screenshot ──► Claude sees the page
```

No extensions. No proxy servers. No extra processes.

Everything runs through macOS native AppleScript and screencapture. Websites see a real user, with no automation fingerprints.

---

## FAQ

<details>
<summary><strong>Do I need to install anything?</strong></summary>

No. This Skill relies entirely on macOS built-in AppleScript and screencapture. Just grant a few system permissions on first use. The one exception is the optional background screenshot mode, which compiles a small Swift helper with `swiftc` and therefore needs the Xcode Command Line Tools.
</details>

<details>
<summary><strong>Does it support Chrome / Firefox / Arc?</strong></summary>

Safari only. For other browsers, use <a href="https://github.com/nicepkg/playwright-mcp">Playwright MCP</a> or <a href="https://github.com/Areo-Joe/chrome-acp">Chrome ACP</a>. Safari is the only macOS browser with full AppleScript automation support.
</details>

<details>
<summary><strong>Is it safe? Will AI do random things?</strong></summary>

Claude Code's permission system asks for your confirmation before every sensitive action. You can approve individually or in bulk. All operations are visible in your terminal.
</details>

<details>
<summary><strong>The window flickers when taking screenshots?</strong></summary>

Without Screen Recording permission, Safari briefly activates (~0.3s) then switches back. Grant Screen Recording permission for fully background screenshots with zero window switching.
</details>

<details>
<summary><strong>Which AI Agents are compatible?</strong></summary>

Any agent supporting Claude Code Skills: Claude Code, Cursor, Windsurf, etc.
</details>

---

## License

[MIT](LICENSE)
package/README_CN.md
ADDED
@@ -0,0 +1,148 @@
<h1 align="center">Claude for Safari</h1>

<p align="center">
  <strong>Give your AI Agent the ability to control the Safari browser</strong>
</p>

<p align="center">
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg?style=for-the-badge" alt="MIT License"></a>
  <a href="https://www.apple.com/macos/"><img src="https://img.shields.io/badge/macOS-only-black.svg?style=for-the-badge&logo=apple" alt="macOS"></a>
  <a href="https://github.com/SDLLL/claude-for-safari/stargazers"><img src="https://img.shields.io/github/stars/SDLLL/claude-for-safari?style=for-the-badge" alt="GitHub Stars"></a>
</p>

<p align="center">
  <a href="#quick-start">Quick Start</a> · <a href="#what-it-can-do">Features</a> · <a href="#faq">FAQ</a>
</p>

<p align="center">
  <a href="README.md">English</a> | <a href="README_CN.md">中文</a>
</p>

---

## Why Do You Need This?

You want your AI Agent to operate the browser for you, and then you discover:

- 🔒 **Playwright** → Separate browser instance that competes with you for control
- 🧩 **Claude for Chrome** → Requires installing a Chrome extension and does not work with Safari
- 📝 **Manual copy & paste** → You have to feed page content to the AI yourself every time, which is extremely inefficient

**You just want AI to use your Safari directly, as if you were operating it yourself.**

**Claude for Safari turns all of this into a single command:**

```
npx skills add SDLLL/claude-for-safari
```

After installing, tell Claude "take a look at what's open in my Safari", and it can read and control your real browser directly.

---

## Quick Start

Copy this command and run it in your terminal:

```bash
npx skills add SDLLL/claude-for-safari
```

Then launch [Claude Code](https://claude.ai/download):

```bash
claude
```

Tell it "show me which pages are currently open in Safari". The Agent will automatically guide you through permission setup.

> Compatible with any AI Agent that supports Skills: Claude Code, Cursor, Windsurf, etc.

### Prerequisite Setup (First Time Only)

The Agent detects what is missing and guides you through it, but you can also configure everything ahead of time:

1. **System Settings > Privacy & Security > Automation** → Allow the terminal to control Safari
2. **Safari > Settings > Advanced** → Enable "Show features for web developers"
3. **Safari > Develop menu** → Check "Allow JavaScript from Apple Events"
4. **(Optional) System Settings > Privacy & Security > Screen Recording** → Allow the terminal to record the screen (enables background screenshots)

---

## What It Can Do

Zero install, pure macOS native capabilities, one Skill covering all browser operations:

| Capability | What the Agent Does | Implementation |
|------|------------|---------|
| **View tabs** | List the title and URL of all windows and tabs | AppleScript |
| **Read pages** | Extract page text, structured data, simplified DOM | AppleScript + JavaScript |
| **Execute JS** | Run arbitrary JavaScript in the page context | AppleScript `do JavaScript` |
| **Screenshot** | Capture the Safari window so the AI can "see" the page | `screencapture` |
| **Navigate** | Open URLs, new tabs, new windows | AppleScript |
| **Click** | Click page elements (React/Vue/Angular compatible) | JavaScript `dispatchEvent` |
| **Type** | Fill forms, simulate keyboard input | JavaScript + System Events |
| **Scroll** | Scroll up/down, scroll to a specific element | JavaScript `scrollBy/scrollTo` |
| **Switch tabs** | Switch tabs by index or URL keyword | AppleScript |
| **Wait for load** | Wait for the page to finish loading before acting | JavaScript `readyState` |

### Screenshot Modes

| Mode | Permission Required | Switches Window? | Best For |
|------|---------|------------|---------|
| **Background** | Screen Recording permission | No switch | Recommended, unobtrusive |
| **Foreground** | No extra permission | Brief switch (~0.3s) | Default, switches back automatically |

---

## How It Works

```
Claude Code ──osascript──► Safari (reads/controls your real browser)
     │
     └──screencapture──► screenshot ──► Claude sees the page content
```

No extensions, no proxy servers, no extra processes.

All operations go through macOS native AppleScript and screencapture, so what Safari sees is real user activity.

---

## FAQ

<details>
<summary><strong>Do I need to install anything beforehand?</strong></summary>

No software needs to be installed. This Skill relies entirely on macOS built-in AppleScript and screencapture and works out of the box. You only need to grant a few system permissions on first use.
</details>

<details>
<summary><strong>Does it support Chrome / Firefox / Arc?</strong></summary>

Currently Safari only. For other browsers, we recommend <a href="https://github.com/nicepkg/playwright-mcp">Playwright MCP</a> or <a href="https://github.com/Areo-Joe/chrome-acp">Chrome ACP</a>. Safari is the only browser on macOS with full AppleScript automation support.
</details>

<details>
<summary><strong>Is it safe? Will the AI do random things?</strong></summary>

Claude Code's permission system asks for your confirmation before every sensitive action. You can approve actions one by one or authorize them in bulk. All operations are visible in your terminal.
</details>

<details>
<summary><strong>The window flashes when taking a screenshot?</strong></summary>

Without Screen Recording permission, taking a screenshot requires briefly activating the Safari window (about 0.3 seconds), after which it switches back automatically. Once Screen Recording permission is granted, screenshots run fully in the background with no window switching.
</details>

<details>
<summary><strong>Which AI Agents are compatible?</strong></summary>

Any AI Agent that supports Claude Code Skills works, including Claude Code, Cursor, Windsurf, and others.
</details>

---

## License

[MIT](LICENSE)
package/SKILL.md
ADDED
@@ -0,0 +1,383 @@
---
name: claude-for-safari
description: Control the user's real Safari browser on macOS using AppleScript and screencapture. This skill should be used when the user asks to interact with Safari, browse websites, read web pages, automate browser tasks, take screenshots of web content, or when any task would benefit from seeing or interacting with what's in their browser. Triggers on keywords like "safari", "browser", "web page", "open tab", "screenshot the page", "read this site", "browse", "click on", "fill in the form".
---

# Claude for Safari

Operate the user's real Safari browser on macOS via AppleScript (`osascript`) and `screencapture`. This provides full access to the user's actual browser session, including login state, cookies, and open tabs, without any extensions or additional software.

## Prerequisites

Before first use, verify two settings are enabled. Run this check at the start of every session:

```bash
osascript -e 'tell application "Safari" to get name of front window' 2>&1
```

This command exercises only the Automation permission, and it also fails when Safari has no open window. The JavaScript setting is first exercised when a `do JavaScript` call runs.

If this fails, instruct the user to enable:
1. **System Settings > Privacy & Security > Automation**: grant the terminal app permission to control Safari
2. **Safari > Settings > Advanced**: enable "Show features for web developers", then **Develop menu > Allow JavaScript from Apple Events**
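
When the check fails, it can help to map the error text to the setting that needs attention. The following is a minimal sketch, not part of the skill itself: `diagnose_safari_error` is a hypothetical helper, and the substrings it matches (`-1743`, "Not authorized to send Apple events", "Allow JavaScript from Apple Events") are assumptions about what `osascript` prints, which may vary across macOS and Safari versions.

```bash
# Sketch: map the output of a failed osascript call to the setting to fix.
# The matched substrings are assumptions about the error text; adjust as needed.
diagnose_safari_error() {
  case "$1" in
    *"-1743"*|*"Not authorized to send Apple events"*)
      echo "automation" ;;   # System Settings > Privacy & Security > Automation
    *"Allow JavaScript from Apple Events"*)
      echo "javascript" ;;   # Safari > Develop > Allow JavaScript from Apple Events
    *)
      echo "unknown" ;;      # e.g. Safari has no open window
  esac
}

# Usage on macOS: capture the combined output of a probe, then classify it.
# out=$(osascript -e 'tell application "Safari" to do JavaScript "1" in current tab of front window' 2>&1)
# diagnose_safari_error "$out"
```

A probe that calls `do JavaScript` touches both settings, so a single failed call can be classified either way.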

## Core Capabilities

### 1. List All Open Tabs

```bash
osascript -e '
tell application "Safari"
    set output to ""
    repeat with w from 1 to (count of windows)
        repeat with t from 1 to (count of tabs of window w)
            set tabName to name of tab t of window w
            set tabURL to URL of tab t of window w
            set output to output & "W" & w & "T" & t & " | " & tabName & " | " & tabURL & linefeed
        end repeat
    end repeat
    return output
end tell'
```
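
The listing prints one `W<window>T<tab> | title | URL` line per tab, which makes it easy to post-process in the shell. As an illustration only, the sketch below picks the first tab whose URL contains a keyword; `find_tab` is a hypothetical helper name, and it assumes the URL is always the last ` | `-separated field (titles may themselves contain ` | `).

```bash
# Sketch: print the "W<window>T<tab>" id of the first listed tab whose URL
# contains a keyword. Reads the tab listing on stdin.
find_tab() {
  # Split on " | "; the id is the first field and the URL is the last one.
  awk -F' [|] ' -v kw="$1" 'index($NF, kw) { print $1; exit }'
}

# Usage: pipe the output of the listing command above into the helper.
# osascript -e '...listing script...' | find_tab github.com
```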

### 2. Read Page Content

Read the full text content of the current tab:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "document.body.innerText" in current tab of front window
end tell'
```

Read structured content (title, URL, meta description, headings):

```bash
osascript -e '
tell application "Safari"
    do JavaScript "JSON.stringify({
        title: document.title,
        url: location.href,
        description: document.querySelector(\"meta[name=description]\")?.content || \"\",
        h1: [...document.querySelectorAll(\"h1\")].map(e => e.textContent).join(\" | \"),
        h2: [...document.querySelectorAll(\"h2\")].map(e => e.textContent).join(\" | \")
    })" in current tab of front window
end tell'
```

Read a simplified DOM (similar to Chrome ACP's `browser_read`):

```bash
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            const walk = (node, depth) => {
                let result = \"\";
                for (const child of node.childNodes) {
                    if (child.nodeType === 3) {
                        const text = child.textContent.trim();
                        if (text) result += text + \"\\n\";
                    } else if (child.nodeType === 1) {
                        const tag = child.tagName.toLowerCase();
                        if ([\"script\",\"style\",\"noscript\",\"svg\"].includes(tag)) continue;
                        const style = getComputedStyle(child);
                        if (style.display === \"none\" || style.visibility === \"hidden\") continue;
                        if ([\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag))
                            result += \"#\".repeat(parseInt(tag[1])) + \" \";
                        if (tag === \"a\") result += \"[\";
                        if (tag === \"img\") result += \"[Image: \" + (child.alt || \"\") + \"]\\n\";
                        else if (tag === \"input\") result += \"[Input \" + child.type + \": \" + (child.value || child.placeholder || \"\") + \"]\\n\";
                        else if (tag === \"button\") result += \"[Button: \" + child.textContent.trim() + \"]\\n\";
                        else result += walk(child, depth + 1);
                        if (tag === \"a\") result += \"](\" + child.href + \")\\n\";
                        if ([\"p\",\"div\",\"li\",\"tr\",\"br\",\"h1\",\"h2\",\"h3\",\"h4\",\"h5\",\"h6\"].includes(tag))
                            result += \"\\n\";
                    }
                }
                return result;
            };
            return walk(document.body, 0).substring(0, 50000);
        })()
    " in current tab of front window
end tell'
```

### 3. Execute JavaScript

Run arbitrary JavaScript in the page context and get the return value:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "YOUR_JS_CODE_HERE" in current tab of front window
end tell'
```

For multi-line scripts, use a heredoc:

```bash
osascript << 'APPLESCRIPT'
tell application "Safari"
    do JavaScript "
        (function() {
            // Multi-line JS here
            return 'result';
        })()
    " in current tab of front window
end tell
APPLESCRIPT
```
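
Because the JavaScript sits inside an AppleScript double-quoted string, any backslash or double quote in it has to be escaped, as the `\"` sequences in the earlier examples show. The sketch below automates that step under the assumption that only those two characters need escaping; `as_escape` is a hypothetical helper, not something the skill ships.

```bash
# Sketch: escape a JavaScript snippet for use inside an AppleScript
# double-quoted string.
as_escape() {
  # Escape backslashes first, then double quotes, so that the backslashes
  # added for the quotes are not doubled again.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

js='document.querySelector("h1").textContent'
# Print the AppleScript statement that would be handed to osascript.
printf 'do JavaScript "%s" in current tab of front window\n' "$(as_escape "$js")"
```

The printed statement can then be placed inside a `tell application "Safari"` block like the ones above.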

### 4. Screenshot

Two approaches are available. Auto-detect which to use at session start:

```bash
# Test whether background screenshots are available (Screen Recording permission granted).
# Requires the /tmp/safari_wid helper compiled in the Background Screenshot step below;
# until that helper exists, this check reports false.
/tmp/safari_wid 2>/dev/null && echo "BACKGROUND_SCREENSHOT=true" || echo "BACKGROUND_SCREENSHOT=false"
```

#### Background Screenshot (requires Screen Recording permission)

If the user has granted Screen Recording permission to the terminal app, use `screencapture -l` to capture Safari **without activating it**:

```bash
# Compile the helper once per session (if not already compiled)
if [ ! -f /tmp/safari_wid ]; then
cat > /tmp/safari_wid.swift << 'SWIFT'
import CoreGraphics
import Foundation
let options: CGWindowListOption = [.optionOnScreenOnly, .excludeDesktopElements]
guard let windowList = CGWindowListCopyWindowInfo(options, kCGNullWindowID) as? [[String: Any]] else { exit(1) }
for window in windowList {
    guard let owner = window[kCGWindowOwnerName as String] as? String,
          owner == "Safari",
          let layer = window[kCGWindowLayer as String] as? Int,
          layer == 0,
          let wid = window[kCGWindowNumber as String] as? Int else { continue }
    print(wid)
    exit(0)
}
exit(1)
SWIFT
swiftc /tmp/safari_wid.swift -o /tmp/safari_wid
fi

# Capture Safari window in background (no activation needed)
WID=$(/tmp/safari_wid)
screencapture -l "$WID" -o -x /tmp/safari_screenshot.png
```

To enable this, instruct the user to open **System Settings > Privacy & Security > Screen Recording** and grant permission to the terminal app (Terminal / iTerm / Warp).
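
The capture step above passes whatever the helper prints straight to `screencapture -l`. One possible guard, sketched here with a hypothetical `safari_window_id` wrapper, is to accept the result only when the helper exists, exits successfully, and prints nothing but digits. The helper path is a parameter so the wrapper can be tried without the compiled binary.

```bash
# Sketch: print a window ID only if the helper succeeds and outputs digits.
# $1 is the helper path (defaults to the /tmp/safari_wid binary built above).
safari_window_id() {
  local helper="${1:-/tmp/safari_wid}"
  local wid
  [ -x "$helper" ] || return 1            # helper missing or not executable
  wid=$("$helper" 2>/dev/null) || return 1 # helper found no Safari window
  case "$wid" in
    ''|*[!0-9]*) return 1 ;;               # empty or non-numeric output
  esac
  printf '%s\n' "$wid"
}

# Usage on macOS: fall back to the foreground method when no ID is available.
# WID=$(safari_window_id) && screencapture -l "$WID" -o -x /tmp/safari_screenshot.png
```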

#### Foreground Screenshot (no extra permissions needed)

If Screen Recording is not granted, fall back to region-based capture. This briefly activates Safari (~0.5s), then switches back:

```bash
# Remember current frontmost app
FRONT_APP=$(osascript -e 'tell application "System Events" to get name of first process whose frontmost is true')

# Activate Safari and capture its window region
osascript -e 'tell application "Safari" to activate'
sleep 0.3
BOUNDS=$(osascript -e '
tell application "System Events"
    tell process "Safari"
        -- Safari may expose a thin toolbar as window 1; find the largest window
        set bestW to 0
        set bestBounds to ""
        repeat with i from 1 to (count of windows)
            set {x, y} to position of window i
            set {w, h} to size of window i
            if w * h > bestW then
                set bestW to w * h
                set bestBounds to (x as text) & "," & (y as text) & "," & (w as text) & "," & (h as text)
            end if
        end repeat
        return bestBounds
    end tell
end tell')
screencapture -x -R "$BOUNDS" /tmp/safari_screenshot.png

# Switch back to the previous app
osascript -e "tell application \"$FRONT_APP\" to activate"
```
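
If Safari has no windows, the AppleScript above returns an empty string and `screencapture -R` receives a malformed region. A small check before capturing can catch that case. This is a sketch only: `valid_bounds` is a hypothetical helper, and allowing negative `x`/`y` values (for windows on a display to the left of or above the main one) is an assumption.

```bash
# Sketch: accept only "x,y,w,h" made of integers; x and y may be negative.
valid_bounds() {
  printf '%s\n' "$1" | grep -Eq '^-?[0-9]+,-?[0-9]+,[0-9]+,[0-9]+$'
}

# Usage on macOS: capture only when the region looks well-formed.
# valid_bounds "$BOUNDS" && screencapture -x -R "$BOUNDS" /tmp/safari_screenshot.png
```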

After capturing with either method, read the screenshot to see what's on screen:

```
Use the Read tool on /tmp/safari_screenshot.png to view the captured image.
```

### 5. Navigate

Open a URL in the current tab:

```bash
osascript -e '
tell application "Safari"
    set URL of current tab of front window to "https://example.com"
end tell'
```

Open a URL in a new tab:

```bash
osascript -e '
tell application "Safari"
    tell front window
        set newTab to make new tab with properties {URL:"https://example.com"}
        set current tab to newTab
    end tell
end tell'
```

Open a URL in a new window:

```bash
osascript -e 'tell application "Safari" to make new document with properties {URL:"https://example.com"}'
```

### 6. Click Elements

Click using JavaScript (preferred; works with SPAs and reactive frameworks):

```bash
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            // Wrapped in a function so that `el` stays out of the page global scope
            // and the snippet can be run repeatedly on the same page.
            const el = document.querySelector(\"button.submit\");
            if (!el) return \"element not found\";
            el.dispatchEvent(new MouseEvent(\"click\", {bubbles: true, cancelable: true}));
            return \"clicked\";
        })()
    " in current tab of front window
end tell'
```

**Important**: Use `dispatchEvent(new MouseEvent(..., {bubbles: true}))` instead of `.click()` for React/Vue/Angular compatibility. Native `.click()` may bypass synthetic event handlers.

### 7. Type and Fill Forms

Set input values via JavaScript:

```bash
osascript -e '
tell application "Safari"
    do JavaScript "
        (function() {
            const input = document.querySelector(\"input[name=search]\");
            if (!input) return \"element not found\";
            // Use the native value setter so that framework-controlled inputs see the change
            const nativeSetter = Object.getOwnPropertyDescriptor(window.HTMLInputElement.prototype, \"value\").set;
            nativeSetter.call(input, \"search text\");
            input.dispatchEvent(new Event(\"input\", {bubbles: true}));
            input.dispatchEvent(new Event(\"change\", {bubbles: true}));
            return \"filled\";
        })()
    " in current tab of front window
end tell'
```

**Important**: For React-controlled inputs, use the native setter + `dispatchEvent` pattern shown above. Directly setting `.value` will not trigger React's state update.

Type via System Events (simulates a real keyboard; useful when JS injection is blocked):

```bash
osascript -e '
tell application "Safari" to activate
delay 0.3
tell application "System Events"
    keystroke "hello world"
end tell'
```

Press special keys:

```bash
osascript -e '
tell application "System Events"
    key code 36 -- Enter/Return
    key code 48 -- Tab
    key code 51 -- Delete/Backspace
    keystroke "a" using command down -- Cmd+A (select all)
    keystroke "c" using command down -- Cmd+C (copy)
end tell'
```

### 8. Scroll

```bash
# Scroll down 500px
osascript -e 'tell application "Safari" to do JavaScript "window.scrollBy(0, 500)" in current tab of front window'

# Scroll to top
osascript -e 'tell application "Safari" to do JavaScript "window.scrollTo(0, 0)" in current tab of front window'

# Scroll to bottom
osascript -e 'tell application "Safari" to do JavaScript "window.scrollTo(0, document.body.scrollHeight)" in current tab of front window'

# Scroll element into view
osascript -e 'tell application "Safari" to do JavaScript "document.querySelector(\"#target\").scrollIntoView({behavior: \"smooth\"})" in current tab of front window'
```

### 9. Switch Tabs

```bash
# Switch to tab 2 in the front window
osascript -e 'tell application "Safari" to set current tab of front window to tab 2 of front window'

# Switch to a tab by URL match
osascript -e '
tell application "Safari"
    repeat with t from 1 to (count of tabs of front window)
        if URL of tab t of front window contains "github.com" then
            set current tab of front window to tab t of front window
            exit repeat
        end if
    end repeat
end tell'
```

### 10. Wait for Page Load

```bash
osascript -e '
tell application "Safari"
    -- Wait until page finishes loading (max 10 seconds)
    repeat 20 times
        set readyState to do JavaScript "document.readyState" in current tab of front window
        if readyState is "complete" then exit repeat
        delay 0.5
    end repeat
end tell'
```
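
The AppleScript loop above simply falls through after 10 seconds, so the caller cannot tell a loaded page from a timeout. One way to surface that difference is to poll from the shell instead. The sketch below is an alternative under that assumption, not the skill's own method; `wait_until` and `page_ready` are hypothetical names.

```bash
# Sketch: retry a command until it succeeds or the attempts run out.
# Arguments: max attempts, delay in seconds between attempts, command...
wait_until() {
  local attempts=$1 delay=$2 i=0
  shift 2
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep "$delay"
  done
  return 1   # timed out; the caller decides what to do next
}

# Readiness probe for Safari (macOS only): succeeds once the page is loaded.
page_ready() {
  [ "$(osascript -e 'tell application "Safari" to do JavaScript "document.readyState" in current tab of front window' 2>/dev/null)" = "complete" ]
}

# Usage: 20 attempts, 0.5s apart, mirrors the 10-second limit above.
# wait_until 20 0.5 page_ready || echo "page did not finish loading in time"
```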

## Workflow: Browsing with Screenshot Feedback Loop

For tasks that require visual confirmation, use the screenshot loop:

1. Perform action (navigate, click, scroll, etc.)
2. Wait for page load if needed
3. Take screenshot (background or foreground) → Read the image to see result
4. Decide next action based on what is visible

## Operating on Specific Tabs

To operate on a tab other than the current one, use `tab N of window M` syntax:

```bash
# Read content of tab 3 in window 1
osascript -e 'tell application "Safari" to do JavaScript "document.title" in tab 3 of window 1'

# Execute JS in a specific tab
osascript -e 'tell application "Safari" to do JavaScript "document.body.innerText.substring(0, 1000)" in tab 2 of front window'
```

Note: Background screenshots capture the entire Safari window (whichever tab is active). To screenshot a specific tab, first switch to it via AppleScript.
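
The `W<window>T<tab>` ids printed by the tab listing in section 1 map directly onto this syntax. As a convenience sketch, the hypothetical `tab_spec` helper below converts such an id into the AppleScript specifier and fails on anything that does not match the pattern.

```bash
# Sketch: turn "W<window>T<tab>" into "tab <tab> of window <window>".
tab_spec() {
  # sed prints only on a full match; grep makes the exit status reflect that.
  printf '%s\n' "$1" | sed -nE 's/^W([0-9]+)T([0-9]+)$/tab \2 of window \1/p' | grep .
}

# Usage on macOS: address a tab found in the listing.
# osascript -e "tell application \"Safari\" to do JavaScript \"document.title\" in $(tab_spec W1T3)"
```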

## Limitations

- **macOS only**: AppleScript and screencapture are macOS-specific
- **Cannot intercept network requests**: only page content and JS execution
- **Cannot access cross-origin iframes**: browser security applies
- **Private browsing windows**: AppleScript cannot control private windows
- **System Events keystroke is "blind"**: it types into whatever is focused; ensure Safari is frontmost before using
package/package.json
ADDED
@@ -0,0 +1,37 @@
{
  "name": "claude-for-safari",
  "version": "1.0.0",
  "description": "Give your AI Agent the power to control Safari on macOS. No extensions, no separate browser.",
  "keywords": [
    "claude",
    "safari",
    "ai",
    "agent",
    "browser",
    "automation",
    "macos",
    "applescript",
    "claude-code",
    "skill",
    "safari-extension"
  ],
  "homepage": "https://safari.skilljam.dev",
  "repository": {
    "type": "git",
    "url": "https://github.com/SDLLL/claude-for-safari"
  },
  "bugs": {
    "url": "https://github.com/SDLLL/claude-for-safari/issues"
  },
  "author": {
    "name": "SDLLL",
    "url": "https://x.com/Sidrel0610"
  },
  "license": "MIT",
  "files": [
    "SKILL.md",
    "README.md",
    "README_CN.md",
    "LICENSE"
  ]
}