visus-mcp 0.2.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (107)
  1. package/.claude/settings.local.json +22 -0
  2. package/LINKEDIN-STRATEGY.md +367 -0
  3. package/README.md +491 -16
  4. package/ROADMAP.md +214 -34
  5. package/SECURITY-AUDIT-v1.md +277 -0
  6. package/STATUS.md +801 -42
  7. package/TROUBLESHOOT-AUTH-20260322-2019.md +291 -0
  8. package/TROUBLESHOOT-JEST-20260323-1357.md +139 -0
  9. package/TROUBLESHOOT-LAMBDA-20260322-1945.md +183 -0
  10. package/VISUS-CLAUDE-CODE-PROMPT.md +1 -1
  11. package/VISUS-PROJECT-PLAN.md +7 -0
  12. package/dist/browser/playwright-renderer.d.ts.map +1 -1
  13. package/dist/browser/playwright-renderer.js +7 -0
  14. package/dist/browser/playwright-renderer.js.map +1 -1
  15. package/dist/browser/reader.d.ts +31 -0
  16. package/dist/browser/reader.d.ts.map +1 -0
  17. package/dist/browser/reader.js +98 -0
  18. package/dist/browser/reader.js.map +1 -0
  19. package/dist/index.d.ts +1 -1
  20. package/dist/index.d.ts.map +1 -1
  21. package/dist/index.js +37 -5
  22. package/dist/index.js.map +1 -1
  23. package/dist/lambda-handler.d.ts +0 -6
  24. package/dist/lambda-handler.d.ts.map +1 -1
  25. package/dist/lambda-handler.js +97 -25
  26. package/dist/lambda-handler.js.map +1 -1
  27. package/dist/sanitizer/framework-mapper.d.ts +22 -0
  28. package/dist/sanitizer/framework-mapper.d.ts.map +1 -0
  29. package/dist/sanitizer/framework-mapper.js +296 -0
  30. package/dist/sanitizer/framework-mapper.js.map +1 -0
  31. package/dist/sanitizer/index.d.ts +10 -2
  32. package/dist/sanitizer/index.d.ts.map +1 -1
  33. package/dist/sanitizer/index.js +22 -6
  34. package/dist/sanitizer/index.js.map +1 -1
  35. package/dist/sanitizer/patterns.js +1 -1
  36. package/dist/sanitizer/patterns.js.map +1 -1
  37. package/dist/sanitizer/pii-allowlist.d.ts +49 -0
  38. package/dist/sanitizer/pii-allowlist.d.ts.map +1 -0
  39. package/dist/sanitizer/pii-allowlist.js +231 -0
  40. package/dist/sanitizer/pii-allowlist.js.map +1 -0
  41. package/dist/sanitizer/pii-redactor.d.ts +13 -1
  42. package/dist/sanitizer/pii-redactor.d.ts.map +1 -1
  43. package/dist/sanitizer/pii-redactor.js +26 -2
  44. package/dist/sanitizer/pii-redactor.js.map +1 -1
  45. package/dist/sanitizer/severity-classifier.d.ts +33 -0
  46. package/dist/sanitizer/severity-classifier.d.ts.map +1 -0
  47. package/dist/sanitizer/severity-classifier.js +113 -0
  48. package/dist/sanitizer/severity-classifier.js.map +1 -0
  49. package/dist/sanitizer/threat-reporter.d.ts +65 -0
  50. package/dist/sanitizer/threat-reporter.d.ts.map +1 -0
  51. package/dist/sanitizer/threat-reporter.js +160 -0
  52. package/dist/sanitizer/threat-reporter.js.map +1 -0
  53. package/dist/tools/fetch-structured.d.ts +5 -0
  54. package/dist/tools/fetch-structured.d.ts.map +1 -1
  55. package/dist/tools/fetch-structured.js +59 -8
  56. package/dist/tools/fetch-structured.js.map +1 -1
  57. package/dist/tools/fetch.d.ts +5 -0
  58. package/dist/tools/fetch.d.ts.map +1 -1
  59. package/dist/tools/fetch.js +43 -9
  60. package/dist/tools/fetch.js.map +1 -1
  61. package/dist/tools/read.d.ts +51 -0
  62. package/dist/tools/read.d.ts.map +1 -0
  63. package/dist/tools/read.js +127 -0
  64. package/dist/tools/read.js.map +1 -0
  65. package/dist/tools/search.d.ts +45 -0
  66. package/dist/tools/search.d.ts.map +1 -0
  67. package/dist/tools/search.js +220 -0
  68. package/dist/tools/search.js.map +1 -0
  69. package/dist/types.d.ts +74 -0
  70. package/dist/types.d.ts.map +1 -1
  71. package/dist/types.js.map +1 -1
  72. package/dist/utils/format-converter.d.ts +39 -0
  73. package/dist/utils/format-converter.d.ts.map +1 -0
  74. package/dist/utils/format-converter.js +191 -0
  75. package/dist/utils/format-converter.js.map +1 -0
  76. package/dist/utils/truncate.d.ts +26 -0
  77. package/dist/utils/truncate.d.ts.map +1 -0
  78. package/dist/utils/truncate.js +54 -0
  79. package/dist/utils/truncate.js.map +1 -0
  80. package/infrastructure/stack.ts +55 -6
  81. package/jest.config.js +3 -0
  82. package/package.json +9 -2
  83. package/src/browser/playwright-renderer.ts +8 -0
  84. package/src/browser/reader.ts +129 -0
  85. package/src/index.ts +49 -5
  86. package/src/lambda-handler.ts +131 -26
  87. package/src/sanitizer/framework-mapper.ts +347 -0
  88. package/src/sanitizer/index.ts +28 -6
  89. package/src/sanitizer/patterns.ts +1 -1
  90. package/src/sanitizer/pii-allowlist.ts +273 -0
  91. package/src/sanitizer/pii-redactor.ts +43 -2
  92. package/src/sanitizer/severity-classifier.ts +132 -0
  93. package/src/sanitizer/threat-reporter.ts +261 -0
  94. package/src/tools/fetch-structured.ts +63 -8
  95. package/src/tools/fetch.ts +45 -9
  96. package/src/tools/read.ts +143 -0
  97. package/src/tools/search.ts +263 -0
  98. package/src/types.ts +71 -0
  99. package/src/utils/format-converter.ts +236 -0
  100. package/src/utils/truncate.ts +64 -0
  101. package/tests/auth-smoke.test.ts +480 -0
  102. package/tests/fetch-tool.test.ts +595 -2
  103. package/tests/pii-allowlist.test.ts +282 -0
  104. package/tests/reader.test.ts +353 -0
  105. package/tests/sanitizer.test.ts +52 -0
  106. package/tests/search.test.ts +456 -0
  107. package/tests/threat-reporter.test.ts +266 -0
package/README.md CHANGED
@@ -147,13 +147,22 @@ Restart Claude Desktop. Visus tools are now available to Claude.
147
147
 
148
148
  ### `visus_fetch`
149
149
 
150
- Fetch and sanitize a web page.
150
+ Fetch and sanitize a web page with automatic format detection. Supports HTML, JSON, XML, and RSS/Atom feeds. Includes a NIST AI 600-1 / OWASP LLM / MITRE ATLAS aligned threat report when injection or PII is detected.
151
+
152
+ **Supported Formats:**
153
+ - **HTML** (`text/html`, `application/xhtml+xml`) - Standard web pages, returned as-is
154
+ - **JSON** (`application/json`) - API responses, formatted with 2-space indentation
155
+ - **XML** (`application/xml`, `text/xml`) - XML documents, converted to clean text representation
156
+ - **RSS/Atom** (`application/rss+xml`, `application/atom+xml`) - Feeds converted to Markdown with up to 10 items
157
+
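The mapping from Content-Type to format in the list above can be sketched as a small lookup. This is an illustrative sketch, not the package's `format-converter` code: the function name `detectFormat` and the `"atom"` and `"unknown"` labels are assumptions, and only the MIME types come from the README.

```typescript
// Illustrative sketch only; names and return labels are hypothetical.
type DetectedFormat = "html" | "json" | "xml" | "rss" | "atom" | "unknown";

// MIME types are the ones listed under "Supported Formats".
const FORMAT_BY_MIME = new Map<string, DetectedFormat>([
  ["text/html", "html"],
  ["application/xhtml+xml", "html"],
  ["application/json", "json"],
  ["application/xml", "xml"],
  ["text/xml", "xml"],
  ["application/rss+xml", "rss"],
  ["application/atom+xml", "atom"], // assumption: the package may report feeds differently
]);

function detectFormat(contentTypeHeader: string | null): DetectedFormat {
  if (!contentTypeHeader) return "unknown";
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const [rawMime = ""] = contentTypeHeader.split(";");
  const mime = rawMime.trim().toLowerCase();
  return FORMAT_BY_MIME.get(mime) ?? "unknown";
}
```

Stripping the parameters before the lookup keeps headers like `application/json; charset=utf-8` from falling through to the default.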
158
+ ### `visus_read`
159
+
160
+ Extract clean article content from a web page using Mozilla Readability (reader mode). Includes a NIST AI 600-1 / OWASP LLM / MITRE ATLAS aligned threat report when injection or PII is detected.
151
161
 
152
162
  **Input:**
153
163
  ```json
154
164
  {
155
- "url": "https://example.com",
156
- "format": "markdown", // or "text"
165
+ "url": "https://example.com/article",
157
166
  "timeout_ms": 10000 // optional
158
167
  }
159
168
  ```
@@ -161,25 +170,59 @@ Fetch and sanitize a web page.
161
170
  **Output:**
162
171
  ```json
163
172
  {
164
- "url": "https://example.com",
165
- "content": "# Page Title\n\nSanitized page content...",
166
- "sanitization": {
167
- "patterns_detected": ["direct_instruction_injection"],
168
- "pii_types_redacted": ["email", "phone"],
169
- "content_modified": true
170
- },
173
+ "url": "https://example.com/article",
174
+ "content": "This is the main article content, stripped of navigation, ads, and boilerplate...",
171
175
  "metadata": {
172
- "title": "Example Domain",
173
- "fetched_at": "2024-01-15T10:30:00.000Z",
174
- "content_length_original": 5000,
175
- "content_length_sanitized": 4800
176
+ "title": "Article Title",
177
+ "author": "Jane Doe",
178
+ "published": "2024-01-15T10:00:00Z",
179
+ "word_count": 1250,
180
+ "reader_mode_available": true,
181
+ "sanitized": true,
182
+ "injections_removed": 0,
183
+ "pii_redacted": 1,
184
+ "truncated": false,
185
+ "fetched_at": "2024-01-15T10:30:00.000Z"
176
186
  }
177
187
  }
178
188
  ```
179
189
 
190
+ ### `visus_search`
191
+
192
+ Search the web via DuckDuckGo and return sanitized results with prompt injection and PII removed. Use before `visus_fetch` or `visus_read` to safely discover and then read pages. Includes a NIST AI 600-1 / OWASP LLM / MITRE ATLAS aligned threat report when injection or PII is detected.
193
+
194
+ **Input:**
195
+ ```json
196
+ {
197
+ "query": "TypeScript programming",
198
+ "max_results": 5 // optional, default: 5, max: 10
199
+ }
200
+ ```
201
+
202
+ **Output:**
203
+ ```json
204
+ {
205
+ "query": "TypeScript programming",
206
+ "result_count": 5,
207
+ "sanitized": true,
208
+ "results": [
209
+ {
210
+ "title": "TypeScript is a strongly typed programming language.",
211
+ "url": "https://typescriptlang.org",
212
+ "snippet": "TypeScript is a strongly typed programming language that builds on JavaScript...",
213
+ "injections_removed": 0,
214
+ "pii_redacted": 0
215
+ }
216
+ ],
217
+ "total_injections_removed": 0
218
+ }
219
+ ```
220
+
221
+ All search result titles and snippets are independently sanitized before reaching the LLM.
222
+
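Sanitizing each title and snippet independently, then rolling the counts up into `total_injections_removed`, might look roughly like this. The output field names follow the README's `visus_search` example; the `Sanitizer` signature and the toy regex sanitizer are hypothetical stand-ins for the real pipeline.

```typescript
// Illustrative sketch only; the Sanitizer interface is an assumption.
interface SanitizeResult { text: string; injectionsRemoved: number; piiRedacted: number; }
type Sanitizer = (input: string) => SanitizeResult;

interface RawResult { title: string; url: string; snippet: string; }
interface SanitizedResult extends RawResult { injections_removed: number; pii_redacted: number; }

function sanitizeResults(query: string, raw: RawResult[], sanitize: Sanitizer) {
  const results: SanitizedResult[] = raw.map((r) => {
    // Title and snippet go through the sanitizer separately.
    const title = sanitize(r.title);
    const snippet = sanitize(r.snippet);
    return {
      title: title.text,
      url: r.url,
      snippet: snippet.text,
      injections_removed: title.injectionsRemoved + snippet.injectionsRemoved,
      pii_redacted: title.piiRedacted + snippet.piiRedacted,
    };
  });
  return {
    query,
    result_count: results.length,
    sanitized: true,
    results,
    total_injections_removed: results.reduce((n, r) => n + r.injections_removed, 0),
  };
}

// Toy sanitizer used only to exercise the sketch; far simpler than a real pattern set.
const toySanitizer: Sanitizer = (input) => {
  let injectionsRemoved = 0;
  let piiRedacted = 0;
  const text = input
    .replace(/ignore previous instructions/gi, () => {
      injectionsRemoved += 1;
      return "[REDACTED:INSTRUCTION_INJECTION]";
    })
    .replace(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g, () => {
      piiRedacted += 1;
      return "[REDACTED:EMAIL]";
    });
  return { text, injectionsRemoved, piiRedacted };
};
```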
180
223
  ### `visus_fetch_structured`
181
224
 
182
- Extract structured data from a web page according to a schema.
225
+ Extract structured data from a web page according to a schema. Includes a NIST AI 600-1 / OWASP LLM / MITRE ATLAS aligned threat report when injection or PII is detected.
183
226
 
184
227
  **Input:**
185
228
  ```json
@@ -221,6 +264,418 @@ All extracted fields are individually sanitized.
221
264
 
222
265
  ---
223
266
 
267
+ ## Threat Reporting
268
+
269
+ When prompt injection or PII is detected, Visus automatically generates a structured threat report with two output layers:
270
+
271
+ ### 1. TOON-Formatted Findings (Token-Efficient)
272
+
273
+ Findings are encoded using [TOON format](https://toonformat.dev) for token efficiency while preserving machine readability. Each finding includes:
274
+
275
+ - Pattern ID and category
276
+ - Severity level (CRITICAL, HIGH, MEDIUM, LOW)
277
+ - Confidence score
278
+ - Framework alignments (OWASP LLM Top 10, NIST AI 600-1, MITRE ATLAS)
279
+ - Remediation status
280
+
281
+ ### 2. Markdown Compliance Report (Human-Readable)
282
+
283
+ A formatted Markdown table renders cleanly in Claude Desktop and GitHub, showing:
284
+
285
+ - Overall severity assessment
286
+ - Findings summary by severity
287
+ - Detailed findings table with framework mappings
288
+ - PII redaction statistics
289
+ - Remediation confirmation
290
+
291
+ ### Framework Alignments
292
+
293
+ Every detected threat is mapped to three compliance frameworks:
294
+
295
+ - **[OWASP LLM Top 10 (2025)](https://owasp.org/www-project-top-10-for-large-language-model-applications/)**: Industry-standard LLM security risks
296
+ - **[NIST AI 600-1](https://csrc.nist.gov/pubs/ai/600/1/final)**: Generative AI Profile for risk management
297
+ - **[MITRE ATLAS](https://atlas.mitre.org/)**: Adversarial Threat Landscape for AI Systems
298
+
299
+ ### When Reports Are Generated
300
+
301
+ Threat reports are included in tool responses **only when findings exist**:
302
+ - ✅ Injections detected → Report included
303
+ - ✅ PII redacted → Report included
304
+ - ❌ Clean content → Report omitted (zero overhead)
305
+
306
+ ### Example Threat Report
307
+
308
+ When a HIGH severity injection is detected:
309
+
310
+ ```markdown
311
+ ---
312
+ ## 🟠 Visus Threat Report
313
+ **Generated:** 2026-03-23T14:30:00.000Z
314
+ **Source:** https://malicious.example.com
315
+ **Overall Severity:** HIGH
316
+ **Framework:** OWASP LLM Top 10 | NIST AI 600-1 | MITRE ATLAS
317
+
318
+ ### Findings Summary
319
+ | Severity | Count |
320
+ |---|---|
321
+ | 🔴 CRITICAL | 0 |
322
+ | 🟠 HIGH | 1 |
323
+ | 🟡 MEDIUM | 0 |
324
+ | 🟢 LOW | 0 |
325
+
326
+ ### Findings Detail
327
+ | # | Category | Severity | Confidence | OWASP | MITRE |
328
+ |---|---|---|---|---|---|
329
+ | 1 | role_hijacking | HIGH | 95% | LLM01:2025 | AML.T0051.000 |
330
+
331
+ ### Remediation Status
332
+ ✅ All findings sanitized. Content delivered clean.
333
+
334
+ *Report generated by Visus MCP — Security-first web access for Claude*
335
+ ---
336
+ ```
337
+
338
+ **Note:** PDF export for compliance artifacts is on the roadmap for a future `visus_report` tool.
339
+
340
+ ---
341
+
342
+ ## Examples
343
+
344
+ ### Example 1: Public Health Page with PII Allowlist
345
+
346
+ Fetching a MedlinePlus health information page demonstrates both injection pattern detection and the domain-scoped PII allowlist feature.
347
+
348
+ **Tool Call:**
349
+ ```json
350
+ {
351
+ "url": "https://medlineplus.gov/poisoning.html",
352
+ "format": "markdown"
353
+ }
354
+ ```
355
+
356
+ **Sanitized Output (excerpt):**
357
+ ```json
358
+ {
359
+ "url": "https://medlineplus.gov/poisoning.html",
360
+ "content": "# Poisoning\n\n**Call 1-800-222-1222** for immediate help...\n\n**Contact:** [REDACTED:EMAIL] for general inquiries...",
361
+ "sanitization": {
362
+ "patterns_detected": [],
363
+ "pii_types_redacted": ["email"],
364
+ "pii_allowlisted": [
365
+ {
366
+ "type": "phone",
367
+ "value": "1-800-222-1222",
368
+ "reason": "Trusted health authority number on medlineplus.gov (Poison Control)"
369
+ }
370
+ ],
371
+ "content_modified": true
372
+ },
373
+ "metadata": {
374
+ "title": "Poisoning: MedlinePlus",
375
+ "content_length_original": 15234,
376
+ "content_length_sanitized": 15180
377
+ }
378
+ }
379
+ ```
380
+
381
+ **What Visus caught:** Regular email addresses were redacted (`[REDACTED:EMAIL]`), but the Poison Control hotline number was preserved because it appears on a trusted `.gov` health domain. This demonstrates the PII allowlist in action — critical health resources remain accessible while general contact info is scrubbed.
382
+
383
+ ---
384
+
385
+ ### Example 2: Structured Data Extraction from Documentation
386
+
387
+ Extract navigation links and headings from a documentation page.
388
+
389
+ **Tool Call:**
390
+ ```json
391
+ {
392
+ "url": "https://docs.github.com/en",
393
+ "schema": {
394
+ "main_heading": "h1",
395
+ "first_link": "link url",
396
+ "first_link_text": "link text",
397
+ "description": "paragraph text"
398
+ }
399
+ }
400
+ ```
401
+
402
+ **Sanitized Output:**
403
+ ```json
404
+ {
405
+ "url": "https://docs.github.com/en",
406
+ "data": {
407
+ "main_heading": "GitHub Docs",
408
+ "first_link": "/en/get-started",
409
+ "first_link_text": "Get started",
410
+ "description": "Help for wherever you are on your GitHub journey."
411
+ },
412
+ "sanitization": {
413
+ "patterns_detected": [],
414
+ "pii_types_redacted": [],
415
+ "pii_allowlisted": [],
416
+ "content_modified": false
417
+ },
418
+ "metadata": {
419
+ "title": "GitHub Docs",
420
+ "content_length_original": 45123,
421
+ "content_length_sanitized": 45123
422
+ }
423
+ }
424
+ ```
425
+
426
+ **What Visus caught:** This page was clean — no injection patterns or PII detected. The structured extraction returned all requested fields with `content_modified: false`, indicating the sanitizer validated the content but made no changes.
427
+
428
+ ---
429
+
430
+ ### Example 3: JavaScript-Heavy SPA with Playwright Rendering
431
+
432
+ Modern single-page applications require JavaScript execution. Visus uses headless Chromium via Playwright to render dynamic content before sanitization.
433
+
434
+ **Tool Call:**
435
+ ```json
436
+ {
437
+ "url": "https://github.com/anthropics/anthropic-sdk-typescript",
438
+ "format": "markdown",
439
+ "timeout_ms": 15000
440
+ }
441
+ ```
442
+
443
+ **Sanitized Output (excerpt):**
444
+ ```json
445
+ {
446
+ "url": "https://github.com/anthropics/anthropic-sdk-typescript",
447
+ "content": "# anthropic-sdk-typescript\n\n**Repository:** anthropics/anthropic-sdk-typescript\n\n**Description:** TypeScript SDK for Anthropic's Claude API...\n\n**Latest commit:** [REDACTED:COMMIT_HASH] by [REDACTED:EMAIL]...",
448
+ "sanitization": {
449
+ "patterns_detected": [],
450
+ "pii_types_redacted": ["email"],
451
+ "pii_allowlisted": [],
452
+ "content_modified": true
453
+ },
454
+ "metadata": {
455
+ "title": "GitHub - anthropics/anthropic-sdk-typescript",
456
+ "content_length_original": 23456,
457
+ "content_length_sanitized": 23401
458
+ }
459
+ }
460
+ ```
461
+
462
+ **What Visus caught:** The page rendered completely via Playwright (including React components, lazy-loaded content, and dynamic navigation). Email addresses in commit author fields were redacted. No injection patterns were detected in this legitimate repository page.
463
+
464
+ **Key difference from static fetchers:** Tools like `curl` or basic HTTP clients would return an empty `<div id="root">` for SPAs. Visus renders the full JavaScript application before sanitization, so Claude receives the page content as it actually renders.
465
+
466
+ ---
467
+
468
+ ### Example 4: Reader Mode for Context-Efficient Article Reading
469
+
470
+ When you need clean article content without navigation clutter, use `visus_read` to extract the main text using Mozilla Readability.
471
+
472
+ **Tool Call:**
473
+ ```json
474
+ {
475
+ "url": "https://en.wikipedia.org/wiki/Prompt_injection",
476
+ "timeout_ms": 15000
477
+ }
478
+ ```
479
+
480
+ **Sanitized Output (excerpt):**
481
+ ```json
482
+ {
483
+ "url": "https://en.wikipedia.org/wiki/Prompt_injection",
484
+ "content": "Prompt injection is a type of cyberattack that involves adding malicious instructions to a prompt for an AI system...\n\n[Main article content continues, stripped of navigation, sidebars, and Wikipedia UI elements]\n\nSee also:\n- AI safety\n- Adversarial machine learning\n- Computer security...",
485
+ "metadata": {
486
+ "title": "Prompt injection - Wikipedia",
487
+ "author": null,
488
+ "published": null,
489
+ "word_count": 892,
490
+ "reader_mode_available": true,
491
+ "sanitized": true,
492
+ "injections_removed": 0,
493
+ "pii_redacted": 0,
494
+ "truncated": false,
495
+ "fetched_at": "2024-01-15T14:22:00.000Z"
496
+ }
497
+ }
498
+ ```
499
+
500
+ **What Visus caught:** Readability successfully extracted the main article content, removing Wikipedia's navigation sidebar, footer links, and UI chrome. The extracted text is ~70% smaller than the full page HTML, saving tokens while preserving all essential information. No injection patterns or PII were detected in this educational content.
501
+
502
+ **Use case:** Reader mode is ideal for documentation pages, news articles, blog posts, and any content-heavy page where you want the text without the surrounding UI. The `word_count` field helps you estimate token usage before processing.
503
+
504
+ ---
505
+
506
+ ### Example 5: Safe Web Search with Injection Detection
507
+
508
+ Search the web safely using `visus_search` with DuckDuckGo, demonstrating how search results are sanitized before reaching the LLM.
509
+
510
+ **Tool Call:**
511
+ ```json
512
+ {
513
+ "query": "AI prompt injection attacks",
514
+ "max_results": 3
515
+ }
516
+ ```
517
+
518
+ **Sanitized Output (with detected injection):**
519
+ ```json
520
+ {
521
+ "query": "AI prompt injection attacks",
522
+ "result_count": 3,
523
+ "sanitized": true,
524
+ "results": [
525
+ {
526
+ "title": "Prompt injection is a type of cyberattack...",
527
+ "url": "https://en.wikipedia.org/wiki/Prompt_injection",
528
+ "snippet": "Prompt injection is a type of cyberattack that involves adding malicious instructions to a prompt...",
529
+ "injections_removed": 0,
530
+ "pii_redacted": 0
531
+ },
532
+ {
533
+ "title": "[REDACTED:INSTRUCTION_INJECTION] for details contact...",
534
+ "url": "https://suspicious-seo-spam.example",
535
+ "snippet": "[REDACTED:INSTRUCTION_INJECTION] [REDACTED:EMAIL]",
536
+ "injections_removed": 2,
537
+ "pii_redacted": 1
538
+ },
539
+ {
540
+ "title": "AI Safety: Understanding Prompt Injection.",
541
+ "url": "https://example.com/ai-safety",
542
+ "snippet": "Learn how to protect your AI systems from prompt injection vulnerabilities...",
543
+ "injections_removed": 0,
544
+ "pii_redacted": 0
545
+ }
546
+ ],
547
+ "total_injections_removed": 2
548
+ }
549
+ ```
550
+
551
+ **What Visus caught:** The second search result contained both a prompt injection pattern ("Ignore previous instructions and...") and an email address. Both were detected and redacted before the result reached the LLM. The other results were clean and passed through unmodified.
552
+
553
+ **Use case:** Always use `visus_search` before fetching pages to safely discover content. Search results can contain SEO spam, malicious instructions, or PII that would compromise your AI agent.
554
+
555
+ ---
556
+
557
+ ### Example 6: JSON API Response with Format Detection
558
+
559
+ Fetch JSON data from an API endpoint with automatic formatting and sanitization.
560
+
561
+ **Tool Call:**
562
+ ```json
563
+ {
564
+ "url": "https://api.github.com/repos/anthropics/anthropic-sdk-typescript",
565
+ "format": "text"
566
+ }
567
+ ```
568
+
569
+ **Sanitized Output (excerpt):**
570
+ ```json
571
+ {
572
+ "url": "https://api.github.com/repos/anthropics/anthropic-sdk-typescript",
573
+ "content": "JSON Response:\n\n{\n \"name\": \"anthropic-sdk-typescript\",\n \"full_name\": \"anthropics/anthropic-sdk-typescript\",\n \"description\": \"TypeScript library for the Anthropic API\",\n \"stargazers_count\": 1234,\n \"forks_count\": 89\n}",
574
+ "sanitization": {
575
+ "patterns_detected": [],
576
+ "pii_types_redacted": [],
577
+ "content_modified": false
578
+ },
579
+ "metadata": {
580
+ "title": "",
581
+ "fetched_at": "2024-01-15T16:30:00.000Z",
582
+ "content_length_original": 3456,
583
+ "content_length_sanitized": 3456,
584
+ "format_detected": "json",
585
+ "content_type": "application/json"
586
+ }
587
+ }
588
+ ```
589
+
590
+ **What Visus caught:** The Content-Type header `application/json` was detected, and the raw JSON was automatically formatted with 2-space indentation for readability. The sanitizer validated the content and found no injection patterns or PII (clean API response).
591
+
592
+ **Format detection features:**
593
+ - Automatically detects Content-Type from HTTP response headers
594
+ - JSON responses are pretty-printed with indentation
595
+ - XML/RSS feeds are converted to clean Markdown
596
+ - All formats pass through the sanitizer pipeline
597
+ - `format_detected` and `content_type` included in metadata
598
+
599
+ ---
600
+
601
+ ### Example 7: RSS Feed with Automatic Markdown Conversion
602
+
603
+ Fetch an RSS feed and have it automatically converted to clean Markdown format.
604
+
605
+ **Tool Call:**
606
+ ```json
607
+ {
608
+ "url": "https://blog.example.com/feed.xml"
609
+ }
610
+ ```
611
+
612
+ **Sanitized Output (excerpt):**
613
+ ```json
614
+ {
615
+ "url": "https://blog.example.com/feed.xml",
616
+ "content": "RSS Feed:\n\n# Example Blog\nThe latest news and updates\n\n## Items\n\n### New Feature Release\n\nWe're excited to announce our latest feature update...\n\nLink: https://blog.example.com/new-feature\nPublished: Mon, 15 Jan 2024 10:00:00 GMT\n\n---\n\n### Security Best Practices\n\nLearn about the latest security recommendations...\n\nLink: https://blog.example.com/security\nPublished: Tue, 16 Jan 2024 14:30:00 GMT\n\n---",
617
+ "sanitization": {
618
+ "patterns_detected": [],
619
+ "pii_types_redacted": [],
620
+ "content_modified": false
621
+ },
622
+ "metadata": {
623
+ "title": "",
624
+ "fetched_at": "2024-01-15T16:45:00.000Z",
625
+ "content_length_original": 5678,
626
+ "content_length_sanitized": 5678,
627
+ "format_detected": "rss",
628
+ "content_type": "application/rss+xml"
629
+ }
630
+ }
631
+ ```
632
+
633
+ **What Visus caught:** The Content-Type header `application/rss+xml` triggered RSS feed parsing. The feed XML was converted to clean Markdown showing the channel title, description, and up to 10 feed items with titles, links, descriptions (truncated to 200 chars), and publication dates. All content was sanitized for injection patterns.
634
+
635
+ **RSS/Atom support:**
636
+ - RSS 2.0, RSS 1.0 (RDF), and Atom feed formats supported
637
+ - Extracts channel metadata and up to 10 items
638
+ - Converts to clean Markdown with proper formatting
639
+ - Item descriptions truncated to 200 characters for readability
640
+ - Graceful fallback to XML parsing for invalid feeds
641
+
642
+ ---
643
+
644
+ ### Safe Research Loop (3-Step Workflow)
645
+
646
+ Combine all three tools for safe, context-efficient web research:
647
+
648
+ **Step 1: Discover** – Use `visus_search` to find relevant pages safely:
649
+ ```json
650
+ {
651
+ "query": "TypeScript async patterns",
652
+ "max_results": 5
653
+ }
654
+ ```
655
+
656
+ **Step 2: Read** – Use `visus_read` to extract clean article content:
657
+ ```json
658
+ {
659
+ "url": "https://blog.example.com/typescript-async-guide"
660
+ }
661
+ ```
662
+
663
+ **Step 3: Extract** – Use `visus_fetch_structured` to pull specific data:
664
+ ```json
665
+ {
666
+ "url": "https://docs.typescript.com/reference/async",
667
+ "schema": {
668
+ "syntax": "async/await syntax",
669
+ "example": "code example",
670
+ "best_practices": "recommended patterns"
671
+ }
672
+ }
673
+ ```
674
+
675
+ All three steps run content through the sanitization pipeline, ensuring end-to-end security from search to extraction.
676
+
677
+ ---
678
+
224
679
  ## Environment Variables
225
680
 
226
681
  ```bash
@@ -243,7 +698,7 @@ Visus is part of the **Lateos** platform — a security-by-design AI agent frame
243
698
 
244
699
  - **AWS Serverless**: Lambda, Step Functions, API Gateway, Cognito
245
700
  - **Security**: Bedrock Guardrails, KMS encryption, Secrets Manager
246
- - **Validated Patterns**: 43 injection patterns, 73/73 passing tests
701
+ - **Validated Patterns**: 43 injection patterns, 122/122 passing tests
247
702
  - **CISSP/CEH-Informed**: Designed by security professionals
248
703
 
249
704
  Learn more: [lateos.ai](https://lateos.ai) (Phase 2)
@@ -252,6 +707,26 @@ Learn more: [lateos.ai](https://lateos.ai) (Phase 2)
252
707
 
253
708
  ## Development
254
709
 
710
+ ### Prerequisites
711
+
712
+ **macOS / Windows:** No additional setup required.
713
+
714
+ **Linux:** Playwright requires the following system libraries. Install them before running `npm install`:
715
+
716
+ ```bash
717
+ # Ubuntu / Debian
718
+ sudo apt-get install -y \
719
+ libatk1.0-0 libatk-bridge2.0-0 libcups2 libdrm2 \
720
+ libxkbcommon0 libxcomposite1 libxdamage1 libxfixes3 \
721
+ libxrandr2 libgbm1 libnss3 libxss1 libasound2
722
+
723
+ # Fedora / RHEL
724
+ sudo dnf install -y atk at-spi2-atk libXrandr libgbm \
725
+ nss alsa-lib libXss cups-libs libdrm libxkbcommon
726
+ ```
727
+
728
+ > If `npm test` fails with a Chromium launch error on Linux, see [TROUBLESHOOT-PLAYWRIGHT.md](./TROUBLESHOOT-PLAYWRIGHT-20260321-1549.md) for detailed troubleshooting steps.
729
+
255
730
  ```bash
256
731
  # Clone repo
257
732
  git clone https://github.com/visus-mcp/visus-mcp.git