visus-mcp 0.2.0 → 0.6.0
- package/.claude/settings.local.json +22 -0
- package/LINKEDIN-STRATEGY.md +367 -0
- package/README.md +491 -16
- package/ROADMAP.md +214 -34
- package/SECURITY-AUDIT-v1.md +277 -0
- package/STATUS.md +801 -42
- package/TROUBLESHOOT-AUTH-20260322-2019.md +291 -0
- package/TROUBLESHOOT-JEST-20260323-1357.md +139 -0
- package/TROUBLESHOOT-LAMBDA-20260322-1945.md +183 -0
- package/VISUS-CLAUDE-CODE-PROMPT.md +1 -1
- package/VISUS-PROJECT-PLAN.md +7 -0
- package/dist/browser/playwright-renderer.d.ts.map +1 -1
- package/dist/browser/playwright-renderer.js +7 -0
- package/dist/browser/playwright-renderer.js.map +1 -1
- package/dist/browser/reader.d.ts +31 -0
- package/dist/browser/reader.d.ts.map +1 -0
- package/dist/browser/reader.js +98 -0
- package/dist/browser/reader.js.map +1 -0
- package/dist/index.d.ts +1 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +37 -5
- package/dist/index.js.map +1 -1
- package/dist/lambda-handler.d.ts +0 -6
- package/dist/lambda-handler.d.ts.map +1 -1
- package/dist/lambda-handler.js +97 -25
- package/dist/lambda-handler.js.map +1 -1
- package/dist/sanitizer/framework-mapper.d.ts +22 -0
- package/dist/sanitizer/framework-mapper.d.ts.map +1 -0
- package/dist/sanitizer/framework-mapper.js +296 -0
- package/dist/sanitizer/framework-mapper.js.map +1 -0
- package/dist/sanitizer/index.d.ts +10 -2
- package/dist/sanitizer/index.d.ts.map +1 -1
- package/dist/sanitizer/index.js +22 -6
- package/dist/sanitizer/index.js.map +1 -1
- package/dist/sanitizer/patterns.js +1 -1
- package/dist/sanitizer/patterns.js.map +1 -1
- package/dist/sanitizer/pii-allowlist.d.ts +49 -0
- package/dist/sanitizer/pii-allowlist.d.ts.map +1 -0
- package/dist/sanitizer/pii-allowlist.js +231 -0
- package/dist/sanitizer/pii-allowlist.js.map +1 -0
- package/dist/sanitizer/pii-redactor.d.ts +13 -1
- package/dist/sanitizer/pii-redactor.d.ts.map +1 -1
- package/dist/sanitizer/pii-redactor.js +26 -2
- package/dist/sanitizer/pii-redactor.js.map +1 -1
- package/dist/sanitizer/severity-classifier.d.ts +33 -0
- package/dist/sanitizer/severity-classifier.d.ts.map +1 -0
- package/dist/sanitizer/severity-classifier.js +113 -0
- package/dist/sanitizer/severity-classifier.js.map +1 -0
- package/dist/sanitizer/threat-reporter.d.ts +65 -0
- package/dist/sanitizer/threat-reporter.d.ts.map +1 -0
- package/dist/sanitizer/threat-reporter.js +160 -0
- package/dist/sanitizer/threat-reporter.js.map +1 -0
- package/dist/tools/fetch-structured.d.ts +5 -0
- package/dist/tools/fetch-structured.d.ts.map +1 -1
- package/dist/tools/fetch-structured.js +59 -8
- package/dist/tools/fetch-structured.js.map +1 -1
- package/dist/tools/fetch.d.ts +5 -0
- package/dist/tools/fetch.d.ts.map +1 -1
- package/dist/tools/fetch.js +43 -9
- package/dist/tools/fetch.js.map +1 -1
- package/dist/tools/read.d.ts +51 -0
- package/dist/tools/read.d.ts.map +1 -0
- package/dist/tools/read.js +127 -0
- package/dist/tools/read.js.map +1 -0
- package/dist/tools/search.d.ts +45 -0
- package/dist/tools/search.d.ts.map +1 -0
- package/dist/tools/search.js +220 -0
- package/dist/tools/search.js.map +1 -0
- package/dist/types.d.ts +74 -0
- package/dist/types.d.ts.map +1 -1
- package/dist/types.js.map +1 -1
- package/dist/utils/format-converter.d.ts +39 -0
- package/dist/utils/format-converter.d.ts.map +1 -0
- package/dist/utils/format-converter.js +191 -0
- package/dist/utils/format-converter.js.map +1 -0
- package/dist/utils/truncate.d.ts +26 -0
- package/dist/utils/truncate.d.ts.map +1 -0
- package/dist/utils/truncate.js +54 -0
- package/dist/utils/truncate.js.map +1 -0
- package/infrastructure/stack.ts +55 -6
- package/jest.config.js +3 -0
- package/package.json +9 -2
- package/src/browser/playwright-renderer.ts +8 -0
- package/src/browser/reader.ts +129 -0
- package/src/index.ts +49 -5
- package/src/lambda-handler.ts +131 -26
- package/src/sanitizer/framework-mapper.ts +347 -0
- package/src/sanitizer/index.ts +28 -6
- package/src/sanitizer/patterns.ts +1 -1
- package/src/sanitizer/pii-allowlist.ts +273 -0
- package/src/sanitizer/pii-redactor.ts +43 -2
- package/src/sanitizer/severity-classifier.ts +132 -0
- package/src/sanitizer/threat-reporter.ts +261 -0
- package/src/tools/fetch-structured.ts +63 -8
- package/src/tools/fetch.ts +45 -9
- package/src/tools/read.ts +143 -0
- package/src/tools/search.ts +263 -0
- package/src/types.ts +71 -0
- package/src/utils/format-converter.ts +236 -0
- package/src/utils/truncate.ts +64 -0
- package/tests/auth-smoke.test.ts +480 -0
- package/tests/fetch-tool.test.ts +595 -2
- package/tests/pii-allowlist.test.ts +282 -0
- package/tests/reader.test.ts +353 -0
- package/tests/sanitizer.test.ts +52 -0
- package/tests/search.test.ts +456 -0
- package/tests/threat-reporter.test.ts +266 -0
package/README.md
CHANGED
|
@@ -147,13 +147,22 @@ Restart Claude Desktop. Visus tools are now available to Claude.
|
|
|
147
147
|
|
|
148
148
|
### `visus_fetch`
|
|
149
149
|
|
|
150
|
-
Fetch and sanitize a web page.
|
|
150
|
+
Fetch and sanitize a web page with automatic format detection. Supports HTML, JSON, XML, and RSS/Atom feeds. Includes a threat report aligned with NIST AI 600-1, OWASP LLM, and MITRE ATLAS when injection or PII is detected.
|
|
151
|
+
|
|
152
|
+
**Supported Formats:**
|
|
153
|
+
- **HTML** (`text/html`, `application/xhtml+xml`) - Standard web pages, returned as-is
|
|
154
|
+
- **JSON** (`application/json`) - API responses, formatted with 2-space indentation
|
|
155
|
+
- **XML** (`application/xml`, `text/xml`) - XML documents, converted to clean text representation
|
|
156
|
+
- **RSS/Atom** (`application/rss+xml`, `application/atom+xml`) - Feeds converted to Markdown with up to 10 items
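
As a rough illustration of the format detection described in the list above, the sketch below maps a `Content-Type` header to a format label. The function name `detectFormat`, the `"unknown"` fallback, and the choice to label Atom feeds as `"rss"` are assumptions made for this example; only the MIME types come from the list, and this is not the package's actual implementation.

```typescript
// Hypothetical helper: name, signature, and fallback are assumptions for illustration.
type DetectedFormat = "html" | "json" | "xml" | "rss" | "unknown";

function detectFormat(contentTypeHeader: string): DetectedFormat {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const mime = (contentTypeHeader.split(";")[0] || "").trim().toLowerCase();

  switch (mime) {
    case "text/html":
    case "application/xhtml+xml":
      return "html";
    case "application/json":
      return "json";
    case "application/rss+xml":
    case "application/atom+xml":
      // Assumption: both feed types share one label.
      return "rss";
    case "application/xml":
    case "text/xml":
      return "xml";
    default:
      return "unknown";
  }
}
```

Matching on the exact MIME type rather than a substring keeps `application/rss+xml` from being mistaken for generic XML.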
|
|
157
|
+
|
|
158
|
+
### `visus_read`
|
|
159
|
+
|
|
160
|
+
Extract clean article content from a web page using Mozilla Readability (reader mode). Includes a threat report aligned with NIST AI 600-1, OWASP LLM, and MITRE ATLAS when injection or PII is detected.
|
|
151
161
|
|
|
152
162
|
**Input:**
|
|
153
163
|
```json
|
|
154
164
|
{
|
|
155
|
-
"url": "https://example.com",
|
|
156
|
-
"format": "markdown", // or "text"
|
|
165
|
+
"url": "https://example.com/article",
|
|
157
166
|
"timeout_ms": 10000 // optional
|
|
158
167
|
}
|
|
159
168
|
```
|
|
@@ -161,25 +170,59 @@ Fetch and sanitize a web page.
|
|
|
161
170
|
**Output:**
|
|
162
171
|
```json
|
|
163
172
|
{
|
|
164
|
-
"url": "https://example.com",
|
|
165
|
-
"content": "
|
|
166
|
-
"sanitization": {
|
|
167
|
-
"patterns_detected": ["direct_instruction_injection"],
|
|
168
|
-
"pii_types_redacted": ["email", "phone"],
|
|
169
|
-
"content_modified": true
|
|
170
|
-
},
|
|
173
|
+
"url": "https://example.com/article",
|
|
174
|
+
"content": "This is the main article content, stripped of navigation, ads, and boilerplate...",
|
|
171
175
|
"metadata": {
|
|
172
|
-
"title": "
|
|
173
|
-
"
|
|
174
|
-
"
|
|
175
|
-
"
|
|
176
|
+
"title": "Article Title",
|
|
177
|
+
"author": "Jane Doe",
|
|
178
|
+
"published": "2024-01-15T10:00:00Z",
|
|
179
|
+
"word_count": 1250,
|
|
180
|
+
"reader_mode_available": true,
|
|
181
|
+
"sanitized": true,
|
|
182
|
+
"injections_removed": 0,
|
|
183
|
+
"pii_redacted": 1,
|
|
184
|
+
"truncated": false,
|
|
185
|
+
"fetched_at": "2024-01-15T10:30:00.000Z"
|
|
176
186
|
}
|
|
177
187
|
}
|
|
178
188
|
```
|
|
179
189
|
|
|
190
|
+
### `visus_search`
|
|
191
|
+
|
|
192
|
+
Search the web via DuckDuckGo and return sanitized results with prompt injection and PII removed. Use before `visus_fetch` or `visus_read` to safely discover and then read pages. Includes a threat report aligned with NIST AI 600-1, OWASP LLM, and MITRE ATLAS when injection or PII is detected.
|
|
193
|
+
|
|
194
|
+
**Input:**
|
|
195
|
+
```json
|
|
196
|
+
{
|
|
197
|
+
"query": "TypeScript programming",
|
|
198
|
+
"max_results": 5 // optional, default: 5, max: 10
|
|
199
|
+
}
|
|
200
|
+
```
|
|
201
|
+
|
|
202
|
+
**Output:**
|
|
203
|
+
```json
|
|
204
|
+
{
|
|
205
|
+
"query": "TypeScript programming",
|
|
206
|
+
"result_count": 5,
|
|
207
|
+
"sanitized": true,
|
|
208
|
+
"results": [
|
|
209
|
+
{
|
|
210
|
+
"title": "TypeScript is a strongly typed programming language.",
|
|
211
|
+
"url": "https://typescriptlang.org",
|
|
212
|
+
"snippet": "TypeScript is a strongly typed programming language that builds on JavaScript...",
|
|
213
|
+
"injections_removed": 0,
|
|
214
|
+
"pii_redacted": 0
|
|
215
|
+
}
|
|
216
|
+
],
|
|
217
|
+
"total_injections_removed": 0
|
|
218
|
+
}
|
|
219
|
+
```
|
|
220
|
+
|
|
221
|
+
All search result titles and snippets are independently sanitized before reaching the LLM.
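
To make "independently sanitized" concrete, here is a minimal sketch of per-result sanitization with per-result counters and a total. The two regular expressions are simplified stand-ins for the real pattern set, and all type and function names (`RawSearchHit`, `redactField`, `sanitizeHits`) are hypothetical; only the output field names and the `max_results` limits (default 5, max 10) come from the text above.

```typescript
// Illustrative only: these regexes stand in for the real pattern set.
interface RawSearchHit { title: string; url: string; snippet: string; }
interface CleanSearchHit extends RawSearchHit { injections_removed: number; pii_redacted: number; }

const DEMO_INJECTION_RE = /ignore (all )?previous instructions[^.]*\.?/gi;
const DEMO_EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Redact one text field and count what was removed from it.
function redactField(text: string): { text: string; injections: number; pii: number } {
  let injections = 0;
  let pii = 0;
  const cleaned = text
    .replace(DEMO_INJECTION_RE, () => { injections += 1; return "[REDACTED:INSTRUCTION_INJECTION]"; })
    .replace(DEMO_EMAIL_RE, () => { pii += 1; return "[REDACTED:EMAIL]"; });
  return { text: cleaned, injections: injections, pii: pii };
}

function sanitizeHits(hits: RawSearchHit[], maxResults: number = 5) {
  // Clamp to the documented range: default 5, maximum 10.
  const limit = Math.min(Math.max(1, maxResults), 10);
  const results: CleanSearchHit[] = hits.slice(0, limit).map((hit) => {
    const title = redactField(hit.title);
    const snippet = redactField(hit.snippet);
    return {
      title: title.text,
      url: hit.url,
      snippet: snippet.text,
      injections_removed: title.injections + snippet.injections,
      pii_redacted: title.pii + snippet.pii,
    };
  });
  const total = results.reduce((sum, r) => sum + r.injections_removed, 0);
  return { result_count: results.length, sanitized: true, results: results, total_injections_removed: total };
}
```

Because each title and snippet is processed on its own, one poisoned result does not affect how its neighbors are counted or reported.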
|
|
222
|
+
|
|
180
223
|
### `visus_fetch_structured`
|
|
181
224
|
|
|
182
|
-
Extract structured data from a web page according to a schema.
|
|
225
|
+
Extract structured data from a web page according to a schema. Includes a threat report aligned with NIST AI 600-1, OWASP LLM, and MITRE ATLAS when injection or PII is detected.
|
|
183
226
|
|
|
184
227
|
**Input:**
|
|
185
228
|
```json
|
|
@@ -221,6 +264,418 @@ All extracted fields are individually sanitized.
|
|
|
221
264
|
|
|
222
265
|
---
|
|
223
266
|
|
|
267
|
+
## Threat Reporting
|
|
268
|
+
|
|
269
|
+
When prompt injection or PII is detected, Visus automatically generates a structured threat report with two output layers:
|
|
270
|
+
|
|
271
|
+
### 1. TOON-Formatted Findings (Token-Efficient)
|
|
272
|
+
|
|
273
|
+
Findings are encoded using [TOON format](https://toonformat.dev) for token efficiency while preserving machine readability. Each finding includes:
|
|
274
|
+
|
|
275
|
+
- Pattern ID and category
|
|
276
|
+
- Severity level (CRITICAL, HIGH, MEDIUM, LOW)
|
|
277
|
+
- Confidence score
|
|
278
|
+
- Framework alignments (OWASP LLM Top 10, NIST AI 600-1, MITRE ATLAS)
|
|
279
|
+
- Remediation status
|
|
280
|
+
|
|
281
|
+
### 2. Markdown Compliance Report (Human-Readable)
|
|
282
|
+
|
|
283
|
+
A formatted Markdown table renders cleanly in Claude Desktop and GitHub, showing:
|
|
284
|
+
|
|
285
|
+
- Overall severity assessment
|
|
286
|
+
- Findings summary by severity
|
|
287
|
+
- Detailed findings table with framework mappings
|
|
288
|
+
- PII redaction statistics
|
|
289
|
+
- Remediation confirmation
|
|
290
|
+
|
|
291
|
+
### Framework Alignments
|
|
292
|
+
|
|
293
|
+
Every detected threat is mapped to three compliance frameworks:
|
|
294
|
+
|
|
295
|
+
- **[OWASP LLM Top 10 (2025)](https://owasp.org/www-project-top-10-for-large-language-model-applications/)**: Industry-standard LLM security risks
|
|
296
|
+
- **[NIST AI 600-1](https://csrc.nist.gov/pubs/ai/600/1/final)**: Generative AI Profile for risk management
|
|
297
|
+
- **[MITRE ATLAS](https://atlas.mitre.org/)**: Adversarial Threat Landscape for AI Systems
|
|
298
|
+
|
|
299
|
+
### When Reports Are Generated
|
|
300
|
+
|
|
301
|
+
Threat reports are included in tool responses **only when findings exist**:
|
|
302
|
+
- ✅ Injections detected → Report included
|
|
303
|
+
- ✅ PII redacted → Report included
|
|
304
|
+
- ❌ Clean content → Report omitted (zero overhead)
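
The rule in the list above can be sketched as a small conditional. The shapes `ScanOutcome` and `ToolReply`, the `threat_report` field name, and the `buildReply` helper are hypothetical and chosen for illustration; the actual response layout may differ.

```typescript
// Hypothetical shapes: field names are assumptions, not the package's API.
interface ScanOutcome { injectionsRemoved: number; piiRedacted: number; }
interface ToolReply { content: string; threat_report?: string; }

function buildReply(
  content: string,
  scan: ScanOutcome,
  renderReport: (scan: ScanOutcome) => string
): ToolReply {
  const reply: ToolReply = { content: content };
  // Attach a report only when something was actually found.
  if (scan.injectionsRemoved > 0 || scan.piiRedacted > 0) {
    reply.threat_report = renderReport(scan);
  }
  return reply;
}
```

Leaving the key out entirely for clean content, rather than sending an empty report, is what keeps the overhead at zero.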
|
|
305
|
+
|
|
306
|
+
### Example Threat Report
|
|
307
|
+
|
|
308
|
+
When a HIGH severity injection is detected:
|
|
309
|
+
|
|
310
|
+
```markdown
|
|
311
|
+
---
|
|
312
|
+
## 🟠 Visus Threat Report
|
|
313
|
+
**Generated:** 2026-03-23T14:30:00.000Z
|
|
314
|
+
**Source:** https://malicious.example.com
|
|
315
|
+
**Overall Severity:** HIGH
|
|
316
|
+
**Framework:** OWASP LLM Top 10 | NIST AI 600-1 | MITRE ATLAS
|
|
317
|
+
|
|
318
|
+
### Findings Summary
|
|
319
|
+
| Severity | Count |
|
|
320
|
+
|---|---|
|
|
321
|
+
| 🔴 CRITICAL | 0 |
|
|
322
|
+
| 🟠 HIGH | 1 |
|
|
323
|
+
| 🟡 MEDIUM | 0 |
|
|
324
|
+
| 🟢 LOW | 0 |
|
|
325
|
+
|
|
326
|
+
### Findings Detail
|
|
327
|
+
| # | Category | Severity | Confidence | OWASP | MITRE |
|
|
328
|
+
|---|---|---|---|---|---|
|
|
329
|
+
| 1 | role_hijacking | HIGH | 95% | LLM01:2025 | AML.T0051.000 |
|
|
330
|
+
|
|
331
|
+
### Remediation Status
|
|
332
|
+
✅ All findings sanitized. Content delivered clean.
|
|
333
|
+
|
|
334
|
+
*Report generated by Visus MCP — Security-first web access for Claude*
|
|
335
|
+
---
|
|
336
|
+
```
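
A minimal sketch of how the Findings Summary table and the overall severity in a report like this could be derived from a list of findings. It assumes the overall severity is simply the highest level with at least one finding; that rule, and the names `ThreatFinding`, `summarizeFindings`, and `renderSummaryTable`, are assumptions for illustration.

```typescript
type SeverityLevel = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";
interface ThreatFinding { category: string; severity: SeverityLevel; confidence: number; }

// Ordered from most to least severe.
const SEVERITY_ORDER: SeverityLevel[] = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

function summarizeFindings(findings: ThreatFinding[]) {
  const counts: Record<SeverityLevel, number> = { CRITICAL: 0, HIGH: 0, MEDIUM: 0, LOW: 0 };
  for (const finding of findings) counts[finding.severity] += 1;

  // Assumption: overall severity is the highest level that has at least one finding.
  let overall: SeverityLevel | null = null;
  for (const level of SEVERITY_ORDER) {
    if (counts[level] > 0) { overall = level; break; }
  }
  return { counts: counts, overall: overall };
}

// Render the two-column Markdown table (emoji markers omitted for brevity).
function renderSummaryTable(counts: Record<SeverityLevel, number>): string {
  const rows = SEVERITY_ORDER.map((level) => `| ${level} | ${counts[level]} |`);
  return ["| Severity | Count |", "|---|---|"].concat(rows).join("\n");
}
```

Deriving the header and the table from the same counts keeps the per-finding rows, the summary, and the overall severity consistent with each other.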
|
|
337
|
+
|
|
338
|
+
**Note:** PDF export for compliance artifacts is on the roadmap for a future `visus_report` tool.
|
|
339
|
+
|
|
340
|
+
---
|
|
341
|
+
|
|
342
|
+
## Examples
|
|
343
|
+
|
|
344
|
+
### Example 1: Public Health Page with PII Allowlist
|
|
345
|
+
|
|
346
|
+
Fetching a MedlinePlus health information page demonstrates PII redaction together with the domain-scoped PII allowlist: ordinary contact details are redacted, while a trusted hotline number is preserved.
|
|
347
|
+
|
|
348
|
+
**Tool Call:**
|
|
349
|
+
```json
|
|
350
|
+
{
|
|
351
|
+
"url": "https://medlineplus.gov/poisoning.html",
|
|
352
|
+
"format": "markdown"
|
|
353
|
+
}
|
|
354
|
+
```
|
|
355
|
+
|
|
356
|
+
**Sanitized Output (excerpt):**
|
|
357
|
+
```json
|
|
358
|
+
{
|
|
359
|
+
"url": "https://medlineplus.gov/poisoning.html",
|
|
360
|
+
"content": "# Poisoning\n\n**Call 1-800-222-1222** for immediate help...\n\n**Contact:** [REDACTED:EMAIL] for general inquiries...",
|
|
361
|
+
"sanitization": {
|
|
362
|
+
"patterns_detected": [],
|
|
363
|
+
"pii_types_redacted": ["email"],
|
|
364
|
+
"pii_allowlisted": [
|
|
365
|
+
{
|
|
366
|
+
"type": "phone",
|
|
367
|
+
"value": "1-800-222-1222",
|
|
368
|
+
"reason": "Trusted health authority number on medlineplus.gov (Poison Control)"
|
|
369
|
+
}
|
|
370
|
+
],
|
|
371
|
+
"content_modified": true
|
|
372
|
+
},
|
|
373
|
+
"metadata": {
|
|
374
|
+
"title": "Poisoning: MedlinePlus",
|
|
375
|
+
"content_length_original": 15234,
|
|
376
|
+
"content_length_sanitized": 15180
|
|
377
|
+
}
|
|
378
|
+
}
|
|
379
|
+
```
|
|
380
|
+
|
|
381
|
+
**What Visus caught:** Regular email addresses were redacted (`[REDACTED:EMAIL]`), but the Poison Control hotline number was preserved because it appears on a trusted `.gov` health domain. This demonstrates the PII allowlist in action — critical health resources remain accessible while general contact info is scrubbed.
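
One way a domain-scoped allowlist like this could work is sketched below. Only the `medlineplus.gov` Poison Control entry is taken from the example above; the data model, the simplified US phone regex, the `[REDACTED:PHONE]` token, and the function names are assumptions for illustration.

```typescript
// Hypothetical data model for a domain-scoped allowlist.
type PiiKind = "phone" | "email";
interface AllowlistEntry { domain: string; piiType: PiiKind; value: string; reason: string; }

const TRUSTED_PII: AllowlistEntry[] = [
  {
    domain: "medlineplus.gov",
    piiType: "phone",
    value: "1-800-222-1222",
    reason: "Trusted health authority number on medlineplus.gov (Poison Control)",
  },
];

function digitsOnly(text: string): string {
  return text.replace(/\D/g, "");
}

function findAllowlistEntry(hostname: string, piiType: PiiKind, value: string): AllowlistEntry | undefined {
  const host = hostname.toLowerCase();
  for (const entry of TRUSTED_PII) {
    // Match the domain itself or any subdomain, but not look-alike suffixes.
    const onDomain = host === entry.domain ||
      host.slice(-(entry.domain.length + 1)) === "." + entry.domain;
    const sameValue = piiType === "phone"
      ? digitsOnly(value) === digitsOnly(entry.value)
      : value.toLowerCase() === entry.value.toLowerCase();
    if (onDomain && entry.piiType === piiType && sameValue) return entry;
  }
  return undefined;
}

// Redact phone numbers unless the (hostname, number) pair is allowlisted.
function redactPhones(text: string, hostname: string): string {
  return text.replace(/\b(?:1-)?\d{3}-\d{3}-\d{4}\b/g, (match) =>
    findAllowlistEntry(hostname, "phone", match) ? match : "[REDACTED:PHONE]");
}
```

Scoping each entry to a domain means the same number stays redacted anywhere else, so a page on an untrusted host cannot borrow the exemption.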
|
|
382
|
+
|
|
383
|
+
---
|
|
384
|
+
|
|
385
|
+
### Example 2: Structured Data Extraction from Documentation
|
|
386
|
+
|
|
387
|
+
Extract navigation links and headings from a documentation page.
|
|
388
|
+
|
|
389
|
+
**Tool Call:**
|
|
390
|
+
```json
|
|
391
|
+
{
|
|
392
|
+
"url": "https://docs.github.com/en",
|
|
393
|
+
"schema": {
|
|
394
|
+
"main_heading": "h1",
|
|
395
|
+
"first_link": "link url",
|
|
396
|
+
"first_link_text": "link text",
|
|
397
|
+
"description": "paragraph text"
|
|
398
|
+
}
|
|
399
|
+
}
|
|
400
|
+
```
|
|
401
|
+
|
|
402
|
+
**Sanitized Output:**
|
|
403
|
+
```json
|
|
404
|
+
{
|
|
405
|
+
"url": "https://docs.github.com/en",
|
|
406
|
+
"data": {
|
|
407
|
+
"main_heading": "GitHub Docs",
|
|
408
|
+
"first_link": "/en/get-started",
|
|
409
|
+
"first_link_text": "Get started",
|
|
410
|
+
"description": "Help for wherever you are on your GitHub journey."
|
|
411
|
+
},
|
|
412
|
+
"sanitization": {
|
|
413
|
+
"patterns_detected": [],
|
|
414
|
+
"pii_types_redacted": [],
|
|
415
|
+
"pii_allowlisted": [],
|
|
416
|
+
"content_modified": false
|
|
417
|
+
},
|
|
418
|
+
"metadata": {
|
|
419
|
+
"title": "GitHub Docs",
|
|
420
|
+
"content_length_original": 45123,
|
|
421
|
+
"content_length_sanitized": 45123
|
|
422
|
+
}
|
|
423
|
+
}
|
|
424
|
+
```
|
|
425
|
+
|
|
426
|
+
**What Visus caught:** This page was clean — no injection patterns or PII detected. The structured extraction returned all requested fields with `content_modified: false`, indicating the sanitizer validated the content but made no changes.
|
|
427
|
+
|
|
428
|
+
---
|
|
429
|
+
|
|
430
|
+
### Example 3: JavaScript-Heavy SPA with Playwright Rendering
|
|
431
|
+
|
|
432
|
+
Modern single-page applications require JavaScript execution. Visus uses headless Chromium via Playwright to render dynamic content before sanitization.
|
|
433
|
+
|
|
434
|
+
**Tool Call:**
|
|
435
|
+
```json
|
|
436
|
+
{
|
|
437
|
+
"url": "https://github.com/anthropics/anthropic-sdk-typescript",
|
|
438
|
+
"format": "markdown",
|
|
439
|
+
"timeout_ms": 15000
|
|
440
|
+
}
|
|
441
|
+
```
|
|
442
|
+
|
|
443
|
+
**Sanitized Output (excerpt):**
|
|
444
|
+
```json
|
|
445
|
+
{
|
|
446
|
+
"url": "https://github.com/anthropics/anthropic-sdk-typescript",
|
|
447
|
+
"content": "# anthropic-sdk-typescript\n\n**Repository:** anthropics/anthropic-sdk-typescript\n\n**Description:** TypeScript SDK for Anthropic's Claude API...\n\n**Latest commit:** [REDACTED:COMMIT_HASH] by [REDACTED:EMAIL]...",
|
|
448
|
+
"sanitization": {
|
|
449
|
+
"patterns_detected": [],
|
|
450
|
+
"pii_types_redacted": ["email"],
|
|
451
|
+
"pii_allowlisted": [],
|
|
452
|
+
"content_modified": true
|
|
453
|
+
},
|
|
454
|
+
"metadata": {
|
|
455
|
+
"title": "GitHub - anthropics/anthropic-sdk-typescript",
|
|
456
|
+
"content_length_original": 23456,
|
|
457
|
+
"content_length_sanitized": 23401
|
|
458
|
+
}
|
|
459
|
+
}
|
|
460
|
+
```
|
|
461
|
+
|
|
462
|
+
**What Visus caught:** The page rendered completely via Playwright (including React components, lazy-loaded content, and dynamic navigation). Email addresses in commit author fields were redacted. No injection patterns were detected in this legitimate repository page.
|
|
463
|
+
|
|
464
|
+
**Key difference from static fetchers:** Tools like `curl` or basic HTTP clients would return an empty `<div id="root">` for SPAs. Visus renders the full JavaScript application before sanitization, so Claude receives the content a browser user would actually see.
|
|
465
|
+
|
|
466
|
+
---
|
|
467
|
+
|
|
468
|
+
### Example 4: Reader Mode for Context-Efficient Article Reading
|
|
469
|
+
|
|
470
|
+
When you need clean article content without navigation clutter, use `visus_read` to extract the main text using Mozilla Readability.
|
|
471
|
+
|
|
472
|
+
**Tool Call:**
|
|
473
|
+
```json
|
|
474
|
+
{
|
|
475
|
+
"url": "https://en.wikipedia.org/wiki/Prompt_injection",
|
|
476
|
+
"timeout_ms": 15000
|
|
477
|
+
}
|
|
478
|
+
```
|
|
479
|
+
|
|
480
|
+
**Sanitized Output (excerpt):**
|
|
481
|
+
```json
|
|
482
|
+
{
|
|
483
|
+
"url": "https://en.wikipedia.org/wiki/Prompt_injection",
|
|
484
|
+
"content": "Prompt injection is a type of cyberattack that involves adding malicious instructions to a prompt for an AI system...\n\n[Main article content continues, stripped of navigation, sidebars, and Wikipedia UI elements]\n\nSee also:\n- AI safety\n- Adversarial machine learning\n- Computer security...",
|
|
485
|
+
"metadata": {
|
|
486
|
+
"title": "Prompt injection - Wikipedia",
|
|
487
|
+
"author": null,
|
|
488
|
+
"published": null,
|
|
489
|
+
"word_count": 892,
|
|
490
|
+
"reader_mode_available": true,
|
|
491
|
+
"sanitized": true,
|
|
492
|
+
"injections_removed": 0,
|
|
493
|
+
"pii_redacted": 0,
|
|
494
|
+
"truncated": false,
|
|
495
|
+
"fetched_at": "2024-01-15T14:22:00.000Z"
|
|
496
|
+
}
|
|
497
|
+
}
|
|
498
|
+
```
|
|
499
|
+
|
|
500
|
+
**What Visus caught:** Readability successfully extracted the main article content, removing Wikipedia's navigation sidebar, footer links, and UI chrome. The extracted text is ~70% smaller than the full page HTML, saving tokens while preserving all essential information. No injection patterns or PII were detected in this educational content.
|
|
501
|
+
|
|
502
|
+
**Use case:** Reader mode is ideal for documentation pages, news articles, blog posts, and any content-heavy page where you want the text without the surrounding UI. The `word_count` field helps you estimate token usage before processing.
|
|
503
|
+
|
|
504
|
+
---
|
|
505
|
+
|
|
506
|
+
### Example 5: Safe Web Search with Injection Detection
|
|
507
|
+
|
|
508
|
+
Search the web safely using `visus_search` with DuckDuckGo, demonstrating how search results are sanitized before reaching the LLM.
|
|
509
|
+
|
|
510
|
+
**Tool Call:**
|
|
511
|
+
```json
|
|
512
|
+
{
|
|
513
|
+
"query": "AI prompt injection attacks",
|
|
514
|
+
"max_results": 3
|
|
515
|
+
}
|
|
516
|
+
```
|
|
517
|
+
|
|
518
|
+
**Sanitized Output (with detected injection):**
|
|
519
|
+
```json
|
|
520
|
+
{
|
|
521
|
+
"query": "AI prompt injection attacks",
|
|
522
|
+
"result_count": 3,
|
|
523
|
+
"sanitized": true,
|
|
524
|
+
"results": [
|
|
525
|
+
{
|
|
526
|
+
"title": "Prompt injection is a type of cyberattack...",
|
|
527
|
+
"url": "https://en.wikipedia.org/wiki/Prompt_injection",
|
|
528
|
+
"snippet": "Prompt injection is a type of cyberattack that involves adding malicious instructions to a prompt...",
|
|
529
|
+
"injections_removed": 0,
|
|
530
|
+
"pii_redacted": 0
|
|
531
|
+
},
|
|
532
|
+
{
|
|
533
|
+
"title": "[REDACTED:INSTRUCTION_INJECTION] for details contact...",
|
|
534
|
+
"url": "https://suspicious-seo-spam.example",
|
|
535
|
+
"snippet": "[REDACTED:INSTRUCTION_INJECTION] [REDACTED:EMAIL]",
|
|
536
|
+
"injections_removed": 2,
|
|
537
|
+
"pii_redacted": 1
|
|
538
|
+
},
|
|
539
|
+
{
|
|
540
|
+
"title": "AI Safety: Understanding Prompt Injection.",
|
|
541
|
+
"url": "https://example.com/ai-safety",
|
|
542
|
+
"snippet": "Learn how to protect your AI systems from prompt injection vulnerabilities...",
|
|
543
|
+
"injections_removed": 0,
|
|
544
|
+
"pii_redacted": 0
|
|
545
|
+
}
|
|
546
|
+
],
|
|
547
|
+
"total_injections_removed": 2
|
|
548
|
+
}
|
|
549
|
+
```
|
|
550
|
+
|
|
551
|
+
**What Visus caught:** The second search result contained both a prompt injection pattern ("Ignore previous instructions and...") and an email address. Both were detected and redacted before the result reached the LLM. The other results were clean and passed through unmodified.
|
|
552
|
+
|
|
553
|
+
**Use case:** Always use `visus_search` before fetching pages to safely discover content. Search results can contain SEO spam, malicious instructions, or PII that would compromise your AI agent.
|
|
554
|
+
|
|
555
|
+
---
|
|
556
|
+
|
|
557
|
+
### Example 6: JSON API Response with Format Detection
|
|
558
|
+
|
|
559
|
+
Fetch JSON data from an API endpoint with automatic formatting and sanitization.
|
|
560
|
+
|
|
561
|
+
**Tool Call:**
|
|
562
|
+
```json
|
|
563
|
+
{
|
|
564
|
+
"url": "https://api.github.com/repos/anthropics/anthropic-sdk-typescript",
|
|
565
|
+
"format": "text"
|
|
566
|
+
}
|
|
567
|
+
```
|
|
568
|
+
|
|
569
|
+
**Sanitized Output (excerpt):**
|
|
570
|
+
```json
|
|
571
|
+
{
|
|
572
|
+
"url": "https://api.github.com/repos/anthropics/anthropic-sdk-typescript",
|
|
573
|
+
"content": "JSON Response:\n\n{\n \"name\": \"anthropic-sdk-typescript\",\n \"full_name\": \"anthropics/anthropic-sdk-typescript\",\n \"description\": \"TypeScript library for the Anthropic API\",\n \"stargazers_count\": 1234,\n \"forks_count\": 89\n}",
|
|
574
|
+
"sanitization": {
|
|
575
|
+
"patterns_detected": [],
|
|
576
|
+
"pii_types_redacted": [],
|
|
577
|
+
"content_modified": false
|
|
578
|
+
},
|
|
579
|
+
"metadata": {
|
|
580
|
+
"title": "",
|
|
581
|
+
"fetched_at": "2024-01-15T16:30:00.000Z",
|
|
582
|
+
"content_length_original": 3456,
|
|
583
|
+
"content_length_sanitized": 3456,
|
|
584
|
+
"format_detected": "json",
|
|
585
|
+
"content_type": "application/json"
|
|
586
|
+
}
|
|
587
|
+
}
|
|
588
|
+
```
|
|
589
|
+
|
|
590
|
+
**What Visus caught:** The Content-Type header `application/json` was detected, and the raw JSON was automatically formatted with 2-space indentation for readability. The sanitizer validated the content and found no injection patterns or PII (clean API response).
|
|
591
|
+
|
|
592
|
+
**Format detection features:**
|
|
593
|
+
- Automatically detects Content-Type from HTTP response headers
|
|
594
|
+
- JSON responses are pretty-printed with indentation
|
|
595
|
+
- XML/RSS feeds are converted to clean Markdown
|
|
596
|
+
- All formats pass through the sanitizer pipeline
|
|
597
|
+
- `format_detected` and `content_type` included in metadata
|
|
598
|
+
|
|
599
|
+
---
|
|
600
|
+
|
|
601
|
+
### Example 7: RSS Feed with Automatic Markdown Conversion
|
|
602
|
+
|
|
603
|
+
Fetch an RSS feed and have it automatically converted to clean Markdown format.
|
|
604
|
+
|
|
605
|
+
**Tool Call:**
|
|
606
|
+
```json
|
|
607
|
+
{
|
|
608
|
+
"url": "https://blog.example.com/feed.xml"
|
|
609
|
+
}
|
|
610
|
+
```
|
|
611
|
+
|
|
612
|
+
**Sanitized Output (excerpt):**
|
|
613
|
+
```json
|
|
614
|
+
{
|
|
615
|
+
"url": "https://blog.example.com/feed.xml",
|
|
616
|
+
"content": "RSS Feed:\n\n# Example Blog\nThe latest news and updates\n\n## Items\n\n### New Feature Release\n\nWe're excited to announce our latest feature update...\n\nLink: https://blog.example.com/new-feature\nPublished: Mon, 15 Jan 2024 10:00:00 GMT\n\n---\n\n### Security Best Practices\n\nLearn about the latest security recommendations...\n\nLink: https://blog.example.com/security\nPublished: Tue, 16 Jan 2024 14:30:00 GMT\n\n---",
|
|
617
|
+
"sanitization": {
|
|
618
|
+
"patterns_detected": [],
|
|
619
|
+
"pii_types_redacted": [],
|
|
620
|
+
"content_modified": false
|
|
621
|
+
},
|
|
622
|
+
"metadata": {
|
|
623
|
+
"title": "",
|
|
624
|
+
"fetched_at": "2024-01-15T16:45:00.000Z",
|
|
625
|
+
"content_length_original": 5678,
|
|
626
|
+
"content_length_sanitized": 5678,
|
|
627
|
+
"format_detected": "rss",
|
|
628
|
+
"content_type": "application/rss+xml"
|
|
629
|
+
}
|
|
630
|
+
}
|
|
631
|
+
```
|
|
632
|
+
|
|
633
|
+
**What Visus caught:** The Content-Type header `application/rss+xml` triggered RSS feed parsing. The feed XML was converted to clean Markdown showing the channel title, description, and up to 10 feed items with titles, links, descriptions (truncated to 200 chars), and publication dates. All content was sanitized for injection patterns.
|
|
634
|
+
|
|
635
|
+
**RSS/Atom support:**
|
|
636
|
+
- RSS 2.0, RSS 1.0 (RDF), and Atom feed formats supported
|
|
637
|
+
- Extracts channel metadata and up to 10 items
|
|
638
|
+
- Converts to clean Markdown with proper formatting
|
|
639
|
+
- Item descriptions truncated to 200 characters for readability
|
|
640
|
+
- Graceful fallback to XML parsing for invalid feeds
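
The conversion described in the list above can be sketched for a feed that has already been parsed into a plain object. XML parsing is deliberately left out. The `ParsedFeed` shape, the function names, and the `...` suffix on truncated descriptions are assumptions; the 10-item cap, the 200-character limit, and the output layout follow the example in this section.

```typescript
// Hypothetical shape for an already-parsed feed.
interface FeedEntry { title: string; link: string; description: string; published: string; }
interface ParsedFeed { title: string; description: string; items: FeedEntry[]; }

const MAX_FEED_ITEMS = 10;
const MAX_DESCRIPTION_CHARS = 200;

// Assumption: truncated descriptions get a "..." suffix.
function clipText(text: string, limit: number): string {
  return text.length > limit ? text.slice(0, limit) + "..." : text;
}

function feedToMarkdown(feed: ParsedFeed): string {
  const lines: string[] = ["RSS Feed:", "", `# ${feed.title}`, feed.description, "", "## Items", ""];
  for (const item of feed.items.slice(0, MAX_FEED_ITEMS)) {
    lines.push(
      `### ${item.title}`, "",
      clipText(item.description, MAX_DESCRIPTION_CHARS), "",
      `Link: ${item.link}`,
      `Published: ${item.published}`, "",
      "---", ""
    );
  }
  return lines.join("\n").trim();
}
```

Capping both the item count and the description length bounds the output size regardless of how large the source feed is.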
|
|
641
|
+
|
|
642
|
+
---
|
|
643
|
+
|
|
644
|
+
### Safe Research Loop (3-Step Workflow)
|
|
645
|
+
|
|
646
|
+
Combine `visus_search`, `visus_read`, and `visus_fetch_structured` for safe, context-efficient web research:
|
|
647
|
+
|
|
648
|
+
**Step 1: Discover** – Use `visus_search` to find relevant pages safely:
|
|
649
|
+
```json
|
|
650
|
+
{
|
|
651
|
+
"query": "TypeScript async patterns",
|
|
652
|
+
"max_results": 5
|
|
653
|
+
}
|
|
654
|
+
```
|
|
655
|
+
|
|
656
|
+
**Step 2: Read** – Use `visus_read` to extract clean article content:
|
|
657
|
+
```json
|
|
658
|
+
{
|
|
659
|
+
"url": "https://blog.example.com/typescript-async-guide"
|
|
660
|
+
}
|
|
661
|
+
```
|
|
662
|
+
|
|
663
|
+
**Step 3: Extract** – Use `visus_fetch_structured` to pull specific data:
|
|
664
|
+
```json
|
|
665
|
+
{
|
|
666
|
+
"url": "https://docs.typescript.com/reference/async",
|
|
667
|
+
"schema": {
|
|
668
|
+
"syntax": "async/await syntax",
|
|
669
|
+
"example": "code example",
|
|
670
|
+
"best_practices": "recommended patterns"
|
|
671
|
+
}
|
|
672
|
+
}
|
|
673
|
+
```
|
|
674
|
+
|
|
675
|
+
All three steps run content through the sanitization pipeline, ensuring end-to-end security from search to extraction.
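
The hand-off from Step 1 to Steps 2 and 3 can be sketched as a small planning function that takes `visus_search` results and produces the follow-up tool calls. The policy of preferring the first result that needed no injection removal is an assumption added for illustration, not something the README prescribes, and the `PlannedCall` shape is hypothetical; the tool names and argument fields come from the steps above.

```typescript
interface SearchHitSummary { url: string; injections_removed: number; }
interface PlannedCall { tool: string; arguments: Record<string, unknown>; }

function planFollowUpCalls(hits: SearchHitSummary[], schema: Record<string, string>): PlannedCall[] {
  // Assumed policy: prefer the first hit that required no injection removal.
  let chosen: SearchHitSummary | undefined;
  for (const hit of hits) {
    if (hit.injections_removed === 0) { chosen = hit; break; }
  }
  if (!chosen) return [];

  return [
    { tool: "visus_read", arguments: { url: chosen.url } },
    { tool: "visus_fetch_structured", arguments: { url: chosen.url, schema: schema } },
  ];
}
```

Returning the calls as data leaves the transport to whichever MCP client is in use.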
|
|
676
|
+
|
|
677
|
+
---
|
|
678
|
+
|
|
224
679
|
## Environment Variables
|
|
225
680
|
|
|
226
681
|
```bash
|
|
@@ -243,7 +698,7 @@ Visus is part of the **Lateos** platform — a security-by-design AI agent frame
|
|
|
243
698
|
|
|
244
699
|
- **AWS Serverless**: Lambda, Step Functions, API Gateway, Cognito
|
|
245
700
|
- **Security**: Bedrock Guardrails, KMS encryption, Secrets Manager
|
|
246
|
-
- **Validated Patterns**: 43 injection patterns,
|
|
701
|
+
- **Validated Patterns**: 43 injection patterns, 122/122 passing tests
|
|
247
702
|
- **CISSP/CEH-Informed**: Designed by security professionals
|
|
248
703
|
|
|
249
704
|
Learn more: [lateos.ai](https://lateos.ai) (Phase 2)
|
|
@@ -252,6 +707,26 @@ Learn more: [lateos.ai](https://lateos.ai) (Phase 2)
|
|
|
252
707
|
|
|
253
708
|
## Development
|
|
254
709
|
|
|
710
|
+
### Prerequisites
|
|
711
|
+
|
|
712
|
+
**macOS / Windows:** No additional setup required.
|
|
713
|
+
|
|
714
|
+
**Linux:** Playwright requires the following system libraries. Install them before running `npm install`:
|
|
715
|
+
|
|
716
|
+
```bash
|
|
717
|
+
# Ubuntu / Debian
|
|
718
|
+
sudo apt-get install -y \
|
|
719
|
+
libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
|
|
720
|
+
libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 \
|
|
721
|
+
libxrandr2 libgbm1 libnss3 libxss1 libasound2
|
|
722
|
+
|
|
723
|
+
# Fedora / RHEL
|
|
724
|
+
sudo dnf install -y atk at-spi2-atk libXrandr libgbm \
|
|
725
|
+
nss alsa-lib libXss cups-libs libdrm libxkbcommon
|
|
726
|
+
```
|
|
727
|
+
|
|
728
|
+
> If `npm test` fails with a Chromium launch error on Linux, see [TROUBLESHOOT-PLAYWRIGHT.md](./TROUBLESHOOT-PLAYWRIGHT-20260321-1549.md) for detailed troubleshooting steps.
|
|
729
|
+
|
|
255
730
|
```bash
|
|
256
731
|
# Clone repo
|
|
257
732
|
git clone https://github.com/visus-mcp/visus-mcp.git
|