openclaw-smart-fetch 0.2.34 → 0.2.36
- package/README.md +2 -0
- package/dist/index.js +1046 -90
- package/dist/index.js.map +1 -1
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -9,6 +9,7 @@
 - 🧠 **Useful metadata** — title, author, site, language, published date when available
 - 📦 **Downloads + large file support** — stream attachments and binaries to temp files
 - 🔁 **Client-side `<meta>` redirects** — follows sane meta refresh redirects with loop limits
+- 🔗 **Alternate content fallback** — when extraction produces no/thin content, follows qualified `<link rel="alternate" type="...">` entries in `<head>` that match the requested output format
 - ⚡ **Batch fetch** — fetch many URLs with bounded concurrency
 - 📝 **Multiple output formats** — `markdown`, `html`, `text`, `json`
 - 🔄 **Built-in `web_fetch` fallback** — automatically improves the core web_fetch tool
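The new README bullet describes following `<link rel="alternate" type="...">` entries found in `<head>`. As a rough illustration of the discovery step only, here is a minimal sketch. The function name `findAlternateLinks` and the regex-based scan are assumptions made for this example and are not taken from the package's `dist/index.js`, which may well use a real HTML parser.

```javascript
// Hypothetical helper (not the package's API): collect <link rel="alternate">
// entries that appear inside <head>. A regex scan keeps the sketch
// dependency-free; a real HTML parser would be more robust.
function findAlternateLinks(html, baseUrl) {
  // Only look inside <head>; links in <body> are ignored.
  const headMatch = /<head(?:\s[^>]*)?>([\s\S]*?)<\/head>/i.exec(html);
  if (!headMatch) return [];

  const links = [];
  for (const tag of headMatch[1].match(/<link\b[^>]*>/gi) ?? []) {
    // Gather double-quoted attributes into a plain object.
    const attrs = {};
    for (const m of tag.matchAll(/([\w-]+)\s*=\s*"([^"]*)"/g)) {
      attrs[m[1].toLowerCase()] = m[2];
    }
    const rels = (attrs.rel ?? "").toLowerCase().split(/\s+/);
    // "Qualified" is interpreted here as: rel includes "alternate" and both
    // type and href are present. That interpretation is an assumption.
    if (rels.includes("alternate") && attrs.type && attrs.href) {
      links.push({
        type: attrs.type.toLowerCase(),
        href: new URL(attrs.href, baseUrl).href, // resolve relative URLs
      });
    }
  }
  return links;
}
```

Calling `findAlternateLinks(html, "https://example.com/post")` on a page whose head contains `<link rel="alternate" type="text/markdown" href="/post.md">` would yield one entry with the href resolved against the page URL. When extraction counts as "thin" is not stated in the README, so that check is left out of the sketch.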
@@ -35,6 +36,7 @@ from Defuddle's extractors and cleanup:
 Notes:
 - Defuddle is the cleanup layer: it strips common page chrome like nav, sidebars, related links, share widgets, and footers
 - It does **not** execute JavaScript or solve interactive anti-bot/login flows
+- If an HTML shell advertises alternate content in `<head>`, smart-fetch can follow matching alternates such as `text/markdown`, `text/plain`, `text/html`, or JSON media types according to the requested `format`
 
 ## Install
 
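The added note says alternates are matched against the requested `format`. One way to picture that selection is sketched below. The names `FORMAT_MATCHERS` and `pickAlternate` are hypothetical, and the exact format-to-media-type mapping (in particular, treating any `+json` suffix as JSON) is an assumption rather than something the README spells out.

```javascript
// Assumed mapping from the requested output format to acceptable media types.
// The README lists text/markdown, text/plain, text/html and "JSON media types";
// the precise rules below are illustrative only.
const FORMAT_MATCHERS = new Map([
  ["markdown", (type) => type === "text/markdown"],
  ["text", (type) => type === "text/plain"],
  ["html", (type) => type === "text/html"],
  ["json", (type) => type === "application/json" || type.endsWith("+json")],
]);

// Hypothetical helper: return the first alternate whose media type fits the
// requested format, or undefined when nothing matches.
function pickAlternate(links, format) {
  const matches = FORMAT_MATCHERS.get(format);
  if (!matches) return undefined;
  return links.find((link) => {
    // Drop parameters such as "; charset=utf-8" before comparing.
    const essence = link.type.split(";")[0].trim().toLowerCase();
    return matches(essence);
  });
}
```

With this sketch, a request for `markdown` would skip an RSS feed alternate (`application/rss+xml`) and select a `text/markdown` one, while a format with no matching alternate falls through to `undefined`, leaving the caller to keep whatever the normal extraction produced.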