agentic-browsing-auditor 1.0.0__tar.gz

@@ -0,0 +1,98 @@
+ Metadata-Version: 2.4
+ Name: agentic-browsing-auditor
+ Version: 1.0.0
+ Summary: Lighthouse Agentic Browsing Audit CLI and local Web Dashboard
+ Author-email: Amal Alexander <amalalex95@gmail.com>
+ Project-URL: Homepage, https://github.com/amal-alexander/agentic-browsing-auditor
+ Project-URL: LinkedIn, https://www.linkedin.com/in/amal-alexander-305780131/
+ Classifier: Programming Language :: Python :: 3
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ Requires-Dist: Flask>=3.0.0
+ Requires-Dist: click>=8.0.0
+ Requires-Dist: rich>=13.0.0
+
+ # Agentic Browsing Auditor
+
+ [![PyPI Version](https://img.shields.io/pypi/v/agentic-browsing-auditor.svg)](https://pypi.org/project/agentic-browsing-auditor/)
+ [![LinkedIn](https://img.shields.io/badge/LinkedIn-Amal%20Alexander-blue)](https://www.linkedin.com/in/amal-alexander-305780131/)
+
+ A Python CLI and local web dashboard that audits how well a website works for LLM agents, using Google Lighthouse's experimental **Agentic Browsing** category (which evaluates `llms.txt`, `WebMCP`, agent-centric accessibility, and layout stability).
+
+ Developed by **Amal Alexander** ([LinkedIn](https://www.linkedin.com/in/amal-alexander-305780131/)).
+
+ ---
+
+ ## Why you need Chrome + Node
+
+ The Agentic Browsing category:
+ - Shipped in **Lighthouse 13.3** (May 2026) as part of the default config.
+ - Requires **Chrome 150+** (or Chrome Canary).
+ - Requires a local Node.js environment, because the tool shells out to the `lighthouse` CLI.
+
+ ---
+
+ ## Setup & Installation
+
+ Install the package from PyPI:
+
+ ```bash
+ pip install agentic-browsing-auditor
+ ```
+
+ ### Prerequisites
+
+ 1. **Install Node.js** (18+): https://nodejs.org
+ 2. **Install Lighthouse** globally:
+    ```bash
+    npm install -g lighthouse
+    ```
+ 3. **Get a compatible Chrome build.** Easiest path: install [Chrome Canary](https://www.google.com/chrome/canary/).
+ 4. **Point the tool at that Chrome binary** via the `CHROME_PATH` environment variable:
+    - **Windows (PowerShell)**:
+      ```powershell
+      $env:CHROME_PATH = "C:\Users\<YourUsername>\AppData\Local\Google\Chrome SxS\Application\chrome.exe"
+      ```
+    - **macOS**:
+      ```bash
+      export CHROME_PATH="/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
+      ```
+    - **Linux**:
+      ```bash
+      export CHROME_PATH="/usr/bin/google-chrome-canary"
+      ```
+
+ ---
+
+ ## Usage
+
+ Once installed, the auditor is available as the `agentic-auditor` command.
+
+ ### 1. Audit a Single URL
+ Audit a website and print the results as a table in the terminal:
+ ```bash
+ agentic-auditor audit example.com
+ ```
+
+ ### 2. Bulk Audit URLs (with CSV export)
+ Audit multiple URLs listed in a text file (one URL per line) and export the results to a CSV file:
+ ```bash
+ agentic-auditor bulk urls.txt --output results.csv
+ ```
+
+ ### 3. Launch the Local Web Dashboard
+ Serve the interactive, Lighthouse-style dashboard locally:
+ ```bash
+ agentic-auditor serve
+ ```
+ Then visit http://localhost:5000 in your browser.
+
+ ---
+
+ ## Author & Contact
+
+ Built and maintained by **Amal Alexander**.
+ - **Email**: [amalalex95@gmail.com](mailto:amalalex95@gmail.com)
+ - **LinkedIn**: [amal-alexander-305780131](https://www.linkedin.com/in/amal-alexander-305780131/)
@@ -0,0 +1,82 @@
+ # Agentic Browsing Auditor
+
+ [![PyPI Version](https://img.shields.io/pypi/v/agentic-browsing-auditor.svg)](https://pypi.org/project/agentic-browsing-auditor/)
+ [![LinkedIn](https://img.shields.io/badge/LinkedIn-Amal%20Alexander-blue)](https://www.linkedin.com/in/amal-alexander-305780131/)
+
+ A Python CLI and local web dashboard that audits how well a website works for LLM agents, using Google Lighthouse's experimental **Agentic Browsing** category (which evaluates `llms.txt`, `WebMCP`, agent-centric accessibility, and layout stability).
+
+ Developed by **Amal Alexander** ([LinkedIn](https://www.linkedin.com/in/amal-alexander-305780131/)).
+
+ ---
+
+ ## Why you need Chrome + Node
+
+ The Agentic Browsing category:
+ - Shipped in **Lighthouse 13.3** (May 2026) as part of the default config.
+ - Requires **Chrome 150+** (or Chrome Canary).
+ - Requires a local Node.js environment, because the tool shells out to the `lighthouse` CLI.
+
+ ---
+
+ ## Setup & Installation
+
+ Install the package from PyPI:
+
+ ```bash
+ pip install agentic-browsing-auditor
+ ```
+
+ ### Prerequisites
+
+ 1. **Install Node.js** (18+): https://nodejs.org
+ 2. **Install Lighthouse** globally:
+    ```bash
+    npm install -g lighthouse
+    ```
+ 3. **Get a compatible Chrome build.** Easiest path: install [Chrome Canary](https://www.google.com/chrome/canary/).
+ 4. **Point the tool at that Chrome binary** via the `CHROME_PATH` environment variable:
+    - **Windows (PowerShell)**:
+      ```powershell
+      $env:CHROME_PATH = "C:\Users\<YourUsername>\AppData\Local\Google\Chrome SxS\Application\chrome.exe"
+      ```
+    - **macOS**:
+      ```bash
+      export CHROME_PATH="/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
+      ```
+    - **Linux**:
+      ```bash
+      export CHROME_PATH="/usr/bin/google-chrome-canary"
+      ```
+
+ ---
+
+ ## Usage
+
+ Once installed, the auditor is available as the `agentic-auditor` command.
+
+ ### 1. Audit a Single URL
+ Audit a website and print the results as a table in the terminal:
+ ```bash
+ agentic-auditor audit example.com
+ ```
+
+ ### 2. Bulk Audit URLs (with CSV export)
+ Audit multiple URLs listed in a text file (one URL per line) and export the results to a CSV file:
+ ```bash
+ agentic-auditor bulk urls.txt --output results.csv
+ ```
+
+ ### 3. Launch the Local Web Dashboard
+ Serve the interactive, Lighthouse-style dashboard locally:
+ ```bash
+ agentic-auditor serve
+ ```
+ Then visit http://localhost:5000 in your browser.
+
+ ---
+
+ ## Author & Contact
+
+ Built and maintained by **Amal Alexander**.
+ - **Email**: [amalalex95@gmail.com](mailto:amalalex95@gmail.com)
+ - **LinkedIn**: [amal-alexander-305780131](https://www.linkedin.com/in/amal-alexander-305780131/)
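
The README's prerequisites (Node.js, a `lighthouse` or `npx` executable, and a `CHROME_PATH` that points at a real file) can be checked before running an audit. The following is a minimal sketch, not part of the package: `check_prerequisites` is a hypothetical helper, and its `which` and `isfile` parameters exist only so the logic can be exercised without touching the real system.

```python
import os
import shutil


def check_prerequisites(env=None, which=shutil.which, isfile=os.path.isfile):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: `env`, `which`, and `isfile` are injectable so the
    checks can be tested without a real Node.js or Chrome install.
    """
    env = os.environ if env is None else env
    problems = []

    # Node.js is needed because the auditor shells out to the Lighthouse CLI.
    if which("node") is None:
        problems.append("Node.js not found on PATH")

    # Either a `lighthouse` executable or `npx` (as a fallback) must exist.
    if which("lighthouse") is None and which("npx") is None:
        problems.append("neither `lighthouse` nor `npx` found on PATH")

    # CHROME_PATH should be set and should point at an existing file.
    chrome_path = env.get("CHROME_PATH")
    if not chrome_path:
        problems.append("CHROME_PATH is not set")
    elif not isfile(chrome_path):
        problems.append(f"CHROME_PATH points at a missing file: {chrome_path}")

    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current environment; printing each returned string gives a quick pre-flight report.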
@@ -0,0 +1,36 @@
+ [build-system]
+ requires = ["setuptools>=61.0.0", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "agentic-browsing-auditor"
+ version = "1.0.0"
+ description = "Lighthouse Agentic Browsing Audit CLI and local Web Dashboard"
+ readme = "README.md"
+ requires-python = ">=3.8"
+ authors = [
+     { name = "Amal Alexander", email = "amalalex95@gmail.com" }
+ ]
+ classifiers = [
+     "Programming Language :: Python :: 3",
+     "License :: OSI Approved :: MIT License",
+     "Operating System :: OS Independent",
+ ]
+ dependencies = [
+     "Flask>=3.0.0",
+     "click>=8.0.0",
+     "rich>=13.0.0",
+ ]
+
+ [project.urls]
+ "Homepage" = "https://github.com/amal-alexander/agentic-browsing-auditor"
+ "LinkedIn" = "https://www.linkedin.com/in/amal-alexander-305780131/"
+
+ [project.scripts]
+ agentic-auditor = "agentic_auditor.cli:main"
+
+ [tool.setuptools.packages.find]
+ where = ["src"]
+
+ [tool.setuptools.package-data]
+ agentic_auditor = ["templates/*", "static/*"]
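
The `[project.scripts]` table maps the `agentic-auditor` command to the object named by `agentic_auditor.cli:main`. The `cli` module itself does not appear in this diff, so the sketch below only illustrates how a `module:attribute` reference of that form resolves to a callable. `resolve_entry_point` is a hypothetical name, and the demo resolves `json:dumps` from the standard library so that it runs without the package installed.

```python
import importlib
import json


def resolve_entry_point(spec: str):
    """Resolve a 'module:attr' reference (hypothetical helper).

    Imports the module named before the colon, then walks the dotted
    attribute path after it. A spec without a colon returns the module.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for attr in attr_path.split(".") if attr_path else []:
        obj = getattr(obj, attr)
    return obj


# Demo with a stdlib target; the package's own spec would be
# "agentic_auditor.cli:main" once the package is installed.
dumps = resolve_entry_point("json:dumps")
```

Installers generate a small wrapper script that performs an equivalent lookup and then calls the resulting object, which is why the command is available on `PATH` after `pip install`.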
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
@@ -0,0 +1,10 @@
+ from .auditor import run_lighthouse, build_report, normalize_url, AuditError
+ from .app import app
+
+ __all__ = [
+     "run_lighthouse",
+     "build_report",
+     "normalize_url",
+     "AuditError",
+     "app",
+ ]
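
Of the names exported here, `normalize_url` is the one that decides what input the rest of the pipeline accepts. As a standalone illustration of the rules it applies (per the `auditor` module in this diff), the sketch below restates them under a hypothetical name, `normalize`, and raises `ValueError` instead of the package's `AuditError` so it runs without the package installed.

```python
import re
from urllib.parse import urlparse


def normalize(raw):
    """Standalone restatement of the URL normalization rules (a sketch)."""
    raw = (raw or "").strip()
    if not raw:
        raise ValueError("empty input")
    # Anything without an explicit http(s) scheme gets https:// prepended.
    if not re.match(r"^https?://", raw, re.IGNORECASE):
        raw = "https://" + raw
    # Reject inputs that still have no host component.
    if not urlparse(raw).netloc:
        raise ValueError("no host in URL")
    return raw
```

One consequence of these rules is that a non-HTTP scheme is not rejected: an input such as `ftp://example.com` is prefixed and passed through as `https://ftp://example.com`, because the resulting string still parses with a non-empty host.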
@@ -0,0 +1,34 @@
+ import os
+ from flask import Flask, jsonify, render_template, request
+ from .auditor import normalize_url, run_lighthouse, build_report, AuditError
+
+ # Resolve template and static folders relative to this file
+ base_dir = os.path.dirname(os.path.abspath(__file__))
+ app = Flask(
+     __name__,
+     template_folder=os.path.join(base_dir, "templates"),
+     static_folder=os.path.join(base_dir, "static")
+ )
+
+
+ @app.route("/")
+ def index():
+     return render_template("index.html")
+
+
+ @app.route("/api/audit", methods=["POST"])
+ def audit():
+     payload = request.get_json(silent=True) or {}
+     try:
+         url = normalize_url(payload.get("url", ""))
+         raw_report = run_lighthouse(url)
+         report = build_report(raw_report)
+         return jsonify({"ok": True, "report": report})
+     except AuditError as exc:
+         return jsonify({"ok": False, "error": str(exc)}), 400
+     except Exception as exc:  # safety net
+         return jsonify({"ok": False, "error": f"Unexpected error: {exc}"}), 500
+
+
+ if __name__ == "__main__":
+     app.run(host="127.0.0.1", port=5000, debug=False)  # local only; the debugger must not be network-reachable
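
The `/api/audit` route accepts a JSON body with a `url` key and answers with `{"ok": true, "report": ...}` on success or `{"ok": false, "error": ...}` on failure. A client for it could be sketched as follows, using only the standard library. The function names and the default `base_url` are assumptions for illustration; nothing here is shipped with the package.

```python
import json
import urllib.request


def build_audit_request(target, base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /api/audit route."""
    body = json.dumps({"url": target}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/audit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_audit_response(raw_body):
    """Return the report dict, or raise RuntimeError with the server's message."""
    payload = json.loads(raw_body)
    if not payload.get("ok"):
        raise RuntimeError(payload.get("error", "unknown error"))
    return payload["report"]
```

With the dashboard running, `urllib.request.urlopen(build_audit_request("example.com"))` would send the request. Because the route returns 400 and 500 on failure, `urlopen` raises `urllib.error.HTTPError` in those cases; the JSON error body can still be read from the exception with `exc.read()` and passed to `parse_audit_response`.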
@@ -0,0 +1,241 @@
+ import json
+ import os
+ import re
+ import shutil
+ import subprocess
+ import tempfile
+ from urllib.parse import urlparse
+
+ CATEGORY_ID = "agentic-browsing"
+ LIGHTHOUSE_TIMEOUT_SECONDS = 180
+
+
+ class AuditError(Exception):
+     """Raised for any expected failure while running/parsing Lighthouse."""
+
+
+ def normalize_url(raw: str) -> str:
+     raw = (raw or "").strip()
+     if not raw:
+         raise AuditError("Please enter a domain or URL.")
+     if not re.match(r"^https?://", raw, re.IGNORECASE):
+         raw = "https://" + raw
+     parsed = urlparse(raw)
+     if not parsed.netloc:
+         raise AuditError("That doesn't look like a valid domain or URL.")
+     return raw
+
+
+ def find_lighthouse_binary() -> list:
+     """
+     Prefer a locally/globally installed `lighthouse` binary. Fall back to
+     `npx lighthouse`, which will download it on first run if needed.
+     """
+     # Check project-local node_modules first
+     # Walk up from this file's location to check if node_modules is present
+     base_dir = os.path.dirname(os.path.abspath(__file__))
+     # In an installed package, node_modules might be located in parent directories
+     # like the workspace root or site-packages.
+     # Search from base_dir upwards for node_modules/.bin/lighthouse
+     curr = base_dir
+     while True:
+         local_bin = os.path.join(curr, "node_modules", ".bin")
+         local_lh = os.path.join(local_bin, "lighthouse.cmd" if os.name == "nt" else "lighthouse")
+         if os.path.isfile(local_lh):
+             return [local_lh]
+         parent = os.path.dirname(curr)
+         if parent == curr:
+             break
+         curr = parent
+
+     lighthouse_path = shutil.which("lighthouse")
+     if lighthouse_path:
+         return [lighthouse_path]
+     npx_path = shutil.which("npx")
+     if npx_path:
+         return [npx_path, "--yes", "lighthouse"]
+     raise AuditError(
+         "Neither `lighthouse` nor `npx` was found on this machine. "
+         "Install Node.js, then run: npm install -g lighthouse"
+     )
+
+
+ def resolve_chrome_path():
+     """
+     Resolve CHROME_PATH and validate it actually points at a file, so we
+     can fail loudly instead of silently falling back to whatever Chrome
+     chrome-launcher happens to find (usually stable Chrome).
+     """
+     chrome_path = os.environ.get("CHROME_PATH")
+     if not chrome_path:
+         return None
+     if not os.path.isfile(chrome_path):
+         raise AuditError(
+             f"CHROME_PATH is set to '{chrome_path}' but no file exists there. "
+             "Double-check the path (right-click your Chrome Canary shortcut "
+             "-> Properties -> Target on Windows)."
+         )
+     return chrome_path
+
+
+ def run_lighthouse(url: str) -> dict:
+     binary_cmd = find_lighthouse_binary()
+     chrome_path = resolve_chrome_path()
+
+     with tempfile.TemporaryDirectory() as tmp_dir:
+         output_path = os.path.join(tmp_dir, "report.json")
+
+         chrome_flags = "--headless=new --no-sandbox --disable-gpu"
+
+         cmd = binary_cmd + [
+             url,
+             f"--only-categories={CATEGORY_ID}",
+             "--output=json",
+             f"--output-path={output_path}",
+             f"--chrome-flags={chrome_flags}",
+             "--quiet",
+             "--max-wait-for-load=45000",
+         ]
+
+         # chrome-launcher (used internally by Lighthouse) primarily reads
+         # the CHROME_PATH *environment variable*, not a CLI flag - so we
+         # pass it explicitly into the subprocess's environment, and also
+         # append --chrome-path for the (newer) Lighthouse versions that
+         # support it directly. Belt and suspenders.
+         run_env = os.environ.copy()
+         if chrome_path:
+             run_env["CHROME_PATH"] = chrome_path
+             cmd.append(f"--chrome-path={chrome_path}")
+             print(f"[agentic-audit] Using Chrome at: {chrome_path}")
+         else:
+             print(
+                 "[agentic-audit] WARNING: CHROME_PATH is not set. "
+                 "Lighthouse will auto-discover a Chrome install, which is "
+                 "likely your regular stable Chrome (won't support the "
+                 "agentic-browsing category)."
+             )
+
+         try:
+             result = subprocess.run(
+                 cmd,
+                 capture_output=True,
+                 text=True,
+                 timeout=LIGHTHOUSE_TIMEOUT_SECONDS,
+                 env=run_env,
+             )
+         except subprocess.TimeoutExpired as exc:
+             raise AuditError(
+                 f"Lighthouse timed out after {LIGHTHOUSE_TIMEOUT_SECONDS}s "
+                 "auditing this page."
+             ) from exc
+         except FileNotFoundError as exc:
+             raise AuditError(f"Couldn't launch Lighthouse: {exc}") from exc
+
+         if result.returncode != 0 or not os.path.exists(output_path):
+             stderr_tail = (result.stderr or "").strip()[-2000:]
+             raise AuditError(
+                 "Lighthouse failed to produce a report. This usually means "
+                 "Chrome couldn't be launched, the site couldn't be reached, "
+                 "or your Chrome version doesn't support the Agentic Browsing "
+                 "category yet (needs Chrome 150+, or 130-149 with the "
+                 f"webmcp-testing flag).\n\nDetails: {stderr_tail}"
+             )
+
+         with open(output_path, "r", encoding="utf-8") as f:
+             try:
+                 return json.load(f)
+             except json.JSONDecodeError as exc:
+                 raise AuditError("Lighthouse returned an unreadable report.") from exc
+
+
+ def classify_audit(audit: dict) -> str:
+     """
+     Map a Lighthouse audit result to one of:
+     'fail', 'warning', 'pass', 'not_applicable', 'informative'
+     """
+     display_mode = audit.get("scoreDisplayMode")
+     score = audit.get("score")
+
+     if display_mode == "notApplicable":
+         return "not_applicable"
+     if display_mode == "informative":
+         return "informative"
+     if display_mode in ("manual", "error"):
+         return "warning"
+     if display_mode == "numeric":
+         # e.g. Cumulative Layout Shift - use score thresholds like Lighthouse does
+         if score is None:
+             return "informative"
+         if score >= 0.9:
+             return "pass"
+         if score >= 0.5:
+             return "warning"
+         return "fail"
+     # binary
+     if score == 1:
+         return "pass"
+     if score == 0:
+         return "fail"
+     return "warning"
+
+
+ def build_report(raw: dict) -> dict:
+     categories = raw.get("categories", {})
+     category = categories.get(CATEGORY_ID)
+     if category is None:
+         raise AuditError(
+             "This Lighthouse report has no 'agentic-browsing' category. "
+             "Your Chrome/Lighthouse version likely doesn't support it yet."
+         )
+
+     audits_by_id = raw.get("audits", {})
+     groups_meta = raw.get("categoryGroups") or raw.get("groups") or {}
+
+     grouped: dict = {}
+     ungrouped = []
+
+     pass_count = 0
+     fail_or_warn_count = 0
+
+     for ref in category.get("auditRefs", []):
+         audit_id = ref.get("id")
+         audit = audits_by_id.get(audit_id, {})
+         status = classify_audit(audit)
+
+         entry = {
+             "id": audit_id,
+             "title": audit.get("title", audit_id),
+             "description": audit.get("description", ""),
+             "status": status,
+             "display_value": audit.get("displayValue"),
+             "score": audit.get("score"),
+         }
+
+         if status == "pass":
+             pass_count += 1
+         elif status in ("fail", "warning"):
+             fail_or_warn_count += 1
+
+         group_id = ref.get("group")
+         if group_id:
+             group_title = (groups_meta.get(group_id) or {}).get("title", group_id)
+             grouped.setdefault(group_id, {"title": group_title, "audits": []})
+             grouped[group_id]["audits"].append(entry)
+         else:
+             ungrouped.append(entry)
+
+     total_scored = pass_count + fail_or_warn_count
+     ratio_label = f"{pass_count}/{total_scored}" if total_scored else "N/A"
+
+     return {
+         "final_url": raw.get("finalUrl") or raw.get("requestedUrl"),
+         "fetch_time": raw.get("fetchTime"),
+         "lighthouse_version": raw.get("lighthouseVersion"),
+         "category_title": category.get("title", "Agentic Browsing"),
+         "category_description": category.get("description", ""),
+         "pass_ratio_label": ratio_label,
+         "pass_count": pass_count,
+         "scored_count": total_scored,
+         "ungrouped_audits": ungrouped,
+         "groups": list(grouped.values()),
+     }
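
To make the pass-ratio logic in `classify_audit` and `build_report` concrete, the following sketch applies a condensed restatement of the same status mapping to a tiny, hand-written report. The audit IDs (`audit-a` through `audit-d`) and the helper names `classify` and `pass_ratio` are invented for illustration; a real Lighthouse report carries many more fields.

```python
def classify(audit):
    """Condensed restatement of the status mapping in classify_audit."""
    mode, score = audit.get("scoreDisplayMode"), audit.get("score")
    if mode == "notApplicable":
        return "not_applicable"
    if mode == "informative" or (mode == "numeric" and score is None):
        return "informative"
    if mode in ("manual", "error"):
        return "warning"
    if mode == "numeric":
        return "pass" if score >= 0.9 else "warning" if score >= 0.5 else "fail"
    # Binary audits: exactly 1 passes, exactly 0 fails, anything else warns.
    return {1: "pass", 0: "fail"}.get(score, "warning")


# Hypothetical minimal report: only the keys the tally needs.
SAMPLE_REPORT = {
    "categories": {
        "agentic-browsing": {
            "auditRefs": [
                {"id": "audit-a"},
                {"id": "audit-b"},
                {"id": "audit-c"},
                {"id": "audit-d"},
            ]
        }
    },
    "audits": {
        "audit-a": {"scoreDisplayMode": "binary", "score": 1},
        "audit-b": {"scoreDisplayMode": "numeric", "score": 0.72},
        "audit-c": {"scoreDisplayMode": "binary", "score": 0},
        "audit-d": {"scoreDisplayMode": "notApplicable", "score": None},
    },
}


def pass_ratio(raw, category_id="agentic-browsing"):
    """Count passes against all scored audits (pass, fail, or warning)."""
    refs = raw["categories"][category_id]["auditRefs"]
    statuses = [classify(raw["audits"].get(ref["id"], {})) for ref in refs]
    passed = statuses.count("pass")
    scored = passed + statuses.count("fail") + statuses.count("warning")
    return f"{passed}/{scored}" if scored else "N/A"


print(pass_ratio(SAMPLE_REPORT))  # prints 1/3
```

Not-applicable and informative audits are excluded from the denominator, so the sample yields one pass out of three scored audits. Note also that an audit reference with no matching entry under `audits` falls through to the binary branch with a score of `None` and is therefore counted as a warning.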