visus-mcp 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/settings.local.json +36 -0
- package/CLAUDE.md +324 -0
- package/README.md +290 -0
- package/SECURITY.md +360 -0
- package/STATUS.md +482 -0
- package/TROUBLESHOOT-BUILD-20260319-1450.md +546 -0
- package/TROUBLESHOOT-FETCH-20260320-1150.md +168 -0
- package/TROUBLESHOOT-SSL-20260320-1138.md +171 -0
- package/TROUBLESHOOT-STRUCTURED-20260320-1200.md +246 -0
- package/TROUBLESHOOT-TEST-20260320-0942.md +281 -0
- package/VISUS-CLAUDE-CODE-PROMPT.md +324 -0
- package/VISUS-PROJECT-PLAN.md +198 -0
- package/dist/browser/__mocks__/playwright-renderer.d.ts +25 -0
- package/dist/browser/__mocks__/playwright-renderer.d.ts.map +1 -0
- package/dist/browser/__mocks__/playwright-renderer.js +119 -0
- package/dist/browser/__mocks__/playwright-renderer.js.map +1 -0
- package/dist/browser/playwright-renderer.d.ts +36 -0
- package/dist/browser/playwright-renderer.d.ts.map +1 -0
- package/dist/browser/playwright-renderer.js +115 -0
- package/dist/browser/playwright-renderer.js.map +1 -0
- package/dist/index.d.ts +14 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +129 -0
- package/dist/index.js.map +1 -0
- package/dist/sanitizer/index.d.ts +55 -0
- package/dist/sanitizer/index.d.ts.map +1 -0
- package/dist/sanitizer/index.js +89 -0
- package/dist/sanitizer/index.js.map +1 -0
- package/dist/sanitizer/injection-detector.d.ts +34 -0
- package/dist/sanitizer/injection-detector.d.ts.map +1 -0
- package/dist/sanitizer/injection-detector.js +89 -0
- package/dist/sanitizer/injection-detector.js.map +1 -0
- package/dist/sanitizer/patterns.d.ts +30 -0
- package/dist/sanitizer/patterns.d.ts.map +1 -0
- package/dist/sanitizer/patterns.js +372 -0
- package/dist/sanitizer/patterns.js.map +1 -0
- package/dist/sanitizer/pii-redactor.d.ts +29 -0
- package/dist/sanitizer/pii-redactor.d.ts.map +1 -0
- package/dist/sanitizer/pii-redactor.js +189 -0
- package/dist/sanitizer/pii-redactor.js.map +1 -0
- package/dist/tools/fetch-structured.d.ts +46 -0
- package/dist/tools/fetch-structured.d.ts.map +1 -0
- package/dist/tools/fetch-structured.js +186 -0
- package/dist/tools/fetch-structured.js.map +1 -0
- package/dist/tools/fetch.d.ts +44 -0
- package/dist/tools/fetch.d.ts.map +1 -0
- package/dist/tools/fetch.js +97 -0
- package/dist/tools/fetch.js.map +1 -0
- package/dist/types.d.ts +93 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +16 -0
- package/dist/types.js.map +1 -0
- package/jest.config.js +30 -0
- package/jest.setup.js +9 -0
- package/package.json +52 -0
- package/src/browser/__mocks__/playwright-renderer.ts +140 -0
- package/src/browser/playwright-renderer.ts +142 -0
- package/src/index.ts +169 -0
- package/src/sanitizer/index.ts +127 -0
- package/src/sanitizer/injection-detector.ts +121 -0
- package/src/sanitizer/patterns.ts +424 -0
- package/src/sanitizer/pii-redactor.ts +226 -0
- package/src/tools/fetch-structured.ts +218 -0
- package/src/tools/fetch.ts +108 -0
- package/src/types.ts +101 -0
- package/test-output.txt +4 -0
- package/tests/fetch-tool.test.ts +329 -0
- package/tests/injection-corpus.ts +338 -0
- package/tests/sanitizer.test.ts +306 -0
- package/tsconfig.json +25 -0

@@ -0,0 +1,281 @@
# Visus MCP Test Timeout - Troubleshooting Log

Started: 2026-03-20 09:42
Goal: Resolve Jest test timeout to validate 43 injection pattern categories

---

## [09:42:00] Step 1 - Examine Jest Configuration

**Goal:** Review jest.config.js to identify timeout settings and test environment
**Reasoning:** Test timeouts are often caused by incorrect Jest configuration (missing timeout values, wrong test environment, or missing setup files)
**Action:** cat jest.config.js
**Result:**
```javascript
{
  preset: 'ts-jest/presets/default-esm',
  testEnvironment: 'node',
  testTimeout: 15000,
  forceExit: true,
  detectOpenHandles: false
}
```
**Status:** ✅ Success - Config looks reasonable, has timeout and forceExit

---

## [09:42:30] Step 2 - Check if Playwright is Actually Installed

**Goal:** Verify Playwright and its browsers are installed in node_modules
**Reasoning:** Playwright tests hang if browsers aren't installed. The test might be trying to download browsers on first run, which could timeout
**Action:** ls node_modules | grep playwright && ls node_modules | grep turndown
**Result:**
```
NOT FOUND - playwright not installed
NOT FOUND - turndown not installed
```
However, checked src/browser/playwright-renderer.ts and it's using native fetch(), not Playwright (Phase 2 feature).
Tests also mock the renderer, so Playwright shouldn't be the issue.
**Status:** ✅ Success - Playwright not needed for Phase 1

---

## [09:43:00] Step 3 - Run Tests with Verbose Output

**Goal:** Execute jest and capture the exact point where it hangs
**Reasoning:** Need to see which test file or specific test case is causing the timeout
**Action:** npm test -- tests/sanitizer.test.ts --verbose
**Result:**
```
sh: jest: command not found
```
**Status:** ❌ Failed - Jest not found in PATH

---

## [09:43:30] Step 4 - Check for Multiple Package Lock Files

**Goal:** Investigate why jest is not in PATH despite node_modules existing
**Reasoning:** Found jest in "node_modules/.bin 2" instead of "node_modules/.bin", suggests corrupted install
**Action:** ls -la | grep package-lock
**Result:**
```
package-lock 2.json (173,875 bytes)
package-lock.json (173,502 bytes)
```
**Root Cause Found:** TWO package-lock.json files have created duplicate/corrupted node_modules structure
**Status:** ✅ Success - Issue identified

---

## [09:44:00] Step 5 - Clean and Reinstall Dependencies

**Goal:** Remove corrupted node_modules and reinstall from single package-lock.json
**Reasoning:** Multiple package-lock files create conflicting dependency trees; clean slate needed
**Action:** rm -rf node_modules "package-lock 2.json" && npm install
**Result:** Install started but took excessive time, killed and restarted
**Status:** ⚠️ Partial - Found additional issue: package.json missing runtime dependencies

---

## [09:45:00] Step 6 - Analyze package.json Dependencies

**Goal:** Verify all required dependencies are listed in package.json
**Reasoning:** npm install hanging might indicate dependency resolution issues
**Action:** cat package.json | grep -A 10 dependencies
**Result:**
```json
"dependencies": {
  "@modelcontextprotocol/sdk": "^1.0.4"
},
"devDependencies": {
  "@types/jest": "^29.5.14",
  "@types/node": "^20.17.6",
  "jest": "^29.7.0",
  "ts-jest": "^29.2.5",
  "typescript": "^5.7.2"
}
```
**Finding:** According to STATUS.md, Playwright and Turndown should be in dependencies, but they're missing. However, src/browser/playwright-renderer.ts uses native fetch() for Phase 1, so this might be intentional.
**Status:** ✅ Success - Dependencies appear minimal but correct for Phase 1

---
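
The Step 6 check (are all expected packages declared in package.json?) can be sketched as a small pure function. This is only an illustration: `PackageManifest` and `findMissingDependencies` are hypothetical names that do not come from the visus-mcp source, the manifest literal is copied from the Step 6 result, and the list of expected packages simply echoes the ones the log mentions.

```typescript
// Hypothetical shape covering only the two manifest sections the log inspects.
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Hypothetical helper: return the expected packages that are not declared anywhere.
function findMissingDependencies(
  manifest: PackageManifest,
  expected: string[],
): string[] {
  const declared = new Set<string>([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
  ]);
  return expected.filter((name) => !declared.has(name));
}

// Manifest contents as shown in the Step 6 result.
const manifest: PackageManifest = {
  dependencies: { "@modelcontextprotocol/sdk": "^1.0.4" },
  devDependencies: {
    "@types/jest": "^29.5.14",
    "@types/node": "^20.17.6",
    jest: "^29.7.0",
    "ts-jest": "^29.2.5",
    typescript: "^5.7.2",
  },
};

// Packages the log expected to find (per its reading of STATUS.md).
const missing = findMissingDependencies(manifest, [
  "@modelcontextprotocol/sdk",
  "playwright",
  "turndown",
]);
```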

## [09:47:00] Step 7 - Retry npm install

**Goal:** Complete clean installation of dependencies
**Reasoning:** Previous install was interrupted; need clean node_modules
**Action:** rm -rf node_modules && npm install
**Result:**
```
Successfully installed all dependencies
Jest binary: node_modules/.bin/jest (verified)
Packages: 231+ installed from cache
```
**Status:** ✅ Success - Dependencies installed correctly

---

## [09:52:00] Step 8 - Test Jest Execution

**Goal:** Determine why jest hangs when executed
**Reasoning:** Dependencies are installed but jest --version and npm test both hang indefinitely
**Action:** npm test (attempt 1, timeout after 30s)
**Result:** Jest process hangs with NO output, even during initialization
**Status:** ❌ Failed - Jest hangs at startup

---

## [09:53:00] Step 9 - Analyze Jest Hang Root Cause

**Goal:** Identify why jest hangs before even starting tests
**Reasoning:** Hang occurs before test execution, suggests config or module resolution issue
**Hypothesis 1:** ESM module configuration issue (ts-jest with useESM: true)
**Hypothesis 2:** TypeScript compilation during test runtime causing circular dependency
**Hypothesis 3:** Missing or corrupted tsconfig.json
**Action:** cat tsconfig.json && npx tsc tests/sanitizer.test.ts --noEmit
**Result:**
```
tsconfig excludes "tests" directory (not an issue - jest handles this)
tsc command hangs indefinitely (same as jest!)
```
**New Discovery:** Both TypeScript compiler AND jest hang - suggests source code issue
**Status:** ⚠️ Partial - TypeScript compilation itself is broken

---

## [09:54:00] Step 10 - Check for Duplicate Source Files

**Goal:** Find source code files that might cause compilation issues
**Reasoning:** If tsc hangs, there might be circular imports or duplicate files confusing module resolution
**Action:** ls -la src/sanitizer/patterns*
**Result:**
```
patterns 2.ts (13,867 bytes) - DUPLICATE FILE
patterns.ts (13,589 bytes) - CORRECT FILE
```
Also found earlier:
```
dist/browser 2/ - DUPLICATE DIRECTORY
node_modules/.bin 2 - DUPLICATE DIRECTORY (before clean install)
```
**Root Cause Identified:** Multiple duplicate files throughout the project are corrupting TypeScript compilation and module resolution
**Status:** ✅ Success - Found root cause!

---

## [09:55:00] Step 11 - Remove All Duplicate Files

**Goal:** Delete all files with " 2" suffix that are corrupting the build
**Reasoning:** Duplicate files cause TypeScript and Jest to hang during module resolution
**Action:** find . -name "* 2.*" -o -name "* 2" | grep -v node_modules && rm duplicates
**Result:**
```
Found and removed:
- ./dist/browser 2
- ./package-lock 2.json
- ./src/sanitizer/patterns 2.ts
```
**Status:** ✅ Success - All duplicate files removed

---
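
The Step 11 sweep for `name 2.ext` copies can also be expressed as a function over a list of paths, which makes it easy to rerun as a pre-build check. The sketch below is an assumption-laden illustration: `findNumberedCopies` and `DuplicateHit` are hypothetical names, matching any trailing number (not just `2`) is a generalization the log does not make, and a copy is only reported when its un-numbered sibling is also present, so files that legitimately end in a number are left alone.

```typescript
// Hypothetical result record for one suspected copy.
interface DuplicateHit {
  duplicate: string; // e.g. "src/sanitizer/patterns 2.ts"
  original: string; // e.g. "src/sanitizer/patterns.ts"
}

// Matches a final path segment such as "patterns 2.ts" or "browser 2".
const NUMBERED_COPY = /^(.+) \d+(\.[^/]*)?$/;

// Hypothetical helper: report numbered copies whose original also exists in the list.
function findNumberedCopies(paths: string[]): DuplicateHit[] {
  const known = new Set(paths);
  const hits: DuplicateHit[] = [];
  for (const path of paths) {
    const slash = path.lastIndexOf("/");
    const dir = path.slice(0, slash + 1); // "" when there is no directory part
    const base = path.slice(slash + 1);
    const match = NUMBERED_COPY.exec(base);
    if (match === null) continue;
    const original = dir + match[1] + (match[2] ?? "");
    // Only report a copy when the un-numbered sibling is present too.
    if (known.has(original)) hits.push({ duplicate: path, original });
  }
  return hits;
}

// Paths taken from the log, plus one hypothetical file that must not be flagged.
const hits = findNumberedCopies([
  "package-lock.json",
  "package-lock 2.json",
  "src/sanitizer/patterns.ts",
  "src/sanitizer/patterns 2.ts",
  "dist/browser",
  "dist/browser 2",
  "docs/chapter 2.md", // no "docs/chapter.md" sibling, so it is not reported
]);
```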

## [09:56:00] Step 12 - Test After Removing Duplicates

**Goal:** Verify tests run after duplicate file removal
**Reasoning:** Duplicate files were corrupting module resolution; removal should fix the issue
**Action:** rm tsconfig.tsbuildinfo && npm test
**Result:** Test and build commands STILL hang, even after duplicate removal
**Status:** ❌ Failed - Deeper issue exists

---

## [09:57:00] Step 13 - Isolate TypeScript Compilation Issue

**Goal:** Determine if issue is with TypeScript compiler itself
**Reasoning:** Both `tsc` and `jest` (which uses ts-jest) hang, suggesting TypeScript compilation is broken
**Action:** npx tsc src/types.ts --outDir dist (single file compilation)
**Result:** Even compiling a single simple file hangs indefinitely
**Status:** ❌ Failed - TypeScript compiler is completely broken

---

# RECOVERY SUMMARY

**Final Status:** ⚠️ PARTIALLY RESOLVED

## Root Causes Identified

1. **Primary Issue:** Multiple duplicate files corrupting project structure
   - `package-lock 2.json` vs `package-lock.json`
   - `src/sanitizer/patterns 2.ts` vs `patterns.ts`
   - `dist/browser 2/` vs `dist/browser/`
   - `node_modules/.bin 2` (from multiple npm install attempts)

2. **Secondary Issue:** TypeScript compiler hangs on ALL compilation attempts
   - `tsc` hangs even on single-file compilation
   - `jest` (via ts-jest) hangs during test initialization
   - Issue persists even after removing all duplicate files

## Actions Taken

✅ Removed duplicate package-lock.json file
✅ Cleaned and reinstalled node_modules (231 packages)
✅ Verified jest binary installation
✅ Removed all duplicate source files (patterns 2.ts, browser 2/)
✅ Cleared TypeScript build cache (tsconfig.tsbuildinfo)
❌ Unable to compile TypeScript
❌ Unable to run tests

## Current Hypothesis

The TypeScript compiler hang suggests one of the following:

**Hypothesis A:** Circular dependency in source code
- TypeScript enters infinite loop trying to resolve module imports
- Need to analyze import graph for cycles

**Hypothesis B:** Corrupted TypeScript installation
- npm install may have installed corrupt TypeScript binaries
- Solution: `rm -rf node_modules package-lock.json && npm install`

**Hypothesis C:** System-level issue
- File system corruption
- macOS-specific TypeScript bug with spaces in filenames

## Recommended Next Steps

1. **Immediate:** Reinstall TypeScript and ts-jest
   ```bash
   npm uninstall typescript ts-jest
   npm install typescript@latest ts-jest@latest --save-dev
   ```

2. **If that fails:** Analyze source code for circular imports
   ```bash
   npx madge --circular --extensions ts src/
   ```

3. **If that fails:** Test on different machine/environment to rule out system issues

4. **Nuclear option:** Rewrite TypeScript source with known-good configuration from scratch
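
Hypothesis A and next step 2 both come down to looking for cycles in the import graph. The log proposes `madge` for this; as a rough illustration of what such a check does, the sketch below runs a depth-first search over a hand-written graph. `ImportGraph` and `findImportCycle` are hypothetical names, and the example edges are invented for the demonstration rather than extracted from the real visus-mcp sources. Whether a cycle would actually explain the `tsc` hang remains the log's unverified hypothesis.

```typescript
// Hypothetical import graph: module name -> modules it imports.
type ImportGraph = Record<string, string[]>;

// Returns the first cycle found as a path (start node repeated at the end), or null.
function findImportCycle(graph: ImportGraph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = [];

  const visit = (node: string): string[] | null => {
    const seen = state.get(node);
    if (seen === "done") return null;
    if (seen === "visiting") {
      // Back edge: the cycle is the stack from the first occurrence of this node.
      return [...stack.slice(stack.indexOf(node)), node];
    }
    state.set(node, "visiting");
    stack.push(node);
    for (const next of graph[node] ?? []) {
      const cycle = visit(next);
      if (cycle !== null) return cycle;
    }
    stack.pop();
    state.set(node, "done");
    return null;
  };

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle !== null) return cycle;
  }
  return null;
}

// Illustrative graphs only; these edges are NOT taken from the real project.
const acyclic: ImportGraph = {
  "sanitizer/index": ["sanitizer/injection-detector", "sanitizer/pii-redactor"],
  "sanitizer/injection-detector": ["sanitizer/patterns"],
  "sanitizer/pii-redactor": [],
  "sanitizer/patterns": [],
};
const cyclic: ImportGraph = { a: ["b"], b: ["c"], c: ["a"] };

const noCycle = findImportCycle(acyclic);
const cycle = findImportCycle(cyclic);
```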

## Lessons Learned

1. **Duplicate files are catastrophic** - File system allowing spaces in names created "file 2.ext" duplicates
2. **npm install problems cascade** - Multiple package-lock files create corrupted node_modules
3. **TypeScript hangs are hard to debug** - No error output, just infinite loop
4. **Test early, test often** - Project had never successfully run tests before this session

## Open Issues

- TypeScript compilation completely broken
- Tests cannot run until TypeScript compiles
- Phase 1 Definition of Done blocked: cannot validate 43 injection patterns

---

**End of troubleshooting log - 2026-03-20 09:57**
**Time elapsed:** ~15 minutes
**Issues resolved:** 1/2 (duplicate files removed, TypeScript still broken)
**Recommended action:** Try fresh TypeScript install or circular dependency analysis

@@ -0,0 +1,324 @@
# Claude Code Session Prompt — Visus Phase 1
## Lateos: Secure AI-Connected Browser MCP Tool

Paste this entire prompt at the start of a new Claude Code session from your `lateos-visus` repo root.

---

## Context

You are building **Visus**, an MCP tool that gives Claude safe, sanitized access to web pages.
This is a new open-source repo (`lateos-visus`) that will be published as `visus-mcp` on npm.

Visus is part of the **Lateos** platform — a security-by-design AI agent framework built by Leo
(Roongrunchai Chongolnee / leochong). Lateos is deployed on AWS serverless (Lambda, Step Functions,
API Gateway, Cognito, Bedrock with Guardrails, DynamoDB with KMS encryption, Secrets Manager) in
me-central-1. The platform holds CISSP/CEH-informed design, 43 validated injection patterns, PII
redaction, and 73/73 passing tests.

The core differentiator: **every other MCP browser/scraping tool passes raw web content directly to
the LLM**. Visus does not. Every fetched page passes through the Lateos injection sanitization
pipeline before Claude reads a single character.

Tagline: *"What the web shows you, Lateos reads safely."*

---

## Your Mission (Phase 1)

Build a working, publishable `visus-mcp` npm package that:

1. Exposes an MCP server with two tools: `visus_fetch` and `visus_fetch_structured`
2. Fetches web pages using Playwright headless (Chromium)
3. Runs ALL fetched content through the injection sanitizer before returning
4. Is installable via `npx visus-mcp` with zero config for the open-source tier
5. Has a README that leads with security narrative, not features
6. Passes a full test suite covering both sanitizer logic and MCP tool interfaces

---

## Repo Structure to Create

```
lateos-visus/
├── README.md
├── SECURITY.md
├── package.json
├── tsconfig.json
├── src/
│   ├── index.ts                    # MCP server entry, tool registration
│   ├── tools/
│   │   ├── fetch.ts                # visus_fetch(url, options?)
│   │   └── fetch-structured.ts     # visus_fetch_structured(url, schema)
│   ├── sanitizer/
│   │   ├── index.ts                # Sanitizer orchestrator
│   │   ├── injection-detector.ts   # Pattern matching engine
│   │   ├── pii-redactor.ts         # PII detection and redaction
│   │   └── patterns.ts             # 43 injection pattern definitions
│   ├── browser/
│   │   └── playwright-renderer.ts  # Headless Chromium page fetcher
│   └── types.ts                    # Shared TypeScript interfaces
└── tests/
    ├── sanitizer.test.ts
    ├── fetch-tool.test.ts
    └── injection-corpus.ts         # Test payload library
```

---

## Tool Specifications

### `visus_fetch`
```typescript
Input: {
  url: string,                      // Required
  format?: "markdown" | "text",     // Default: "markdown"
  timeout_ms?: number               // Default: 10000
}

Output: {
  url: string,
  content: string,                  // Sanitized content
  sanitization: {
    patterns_detected: string[],    // Names of injection patterns found and neutralized
    pii_types_redacted: string[],   // e.g. ["email", "phone", "ssn"]
    content_modified: boolean
  },
  metadata: {
    title: string,
    fetched_at: string,             // ISO timestamp
    content_length_original: number,
    content_length_sanitized: number
  }
}
```

### `visus_fetch_structured`
```typescript
Input: {
  url: string,
  schema: Record<string, string>,   // Field name → description, e.g. { "title": "page title", "price": "product price" }
  timeout_ms?: number
}

Output: {
  url: string,
  data: Record<string, string | null>,  // Extracted fields, sanitized
  sanitization: { /* same as above */ },
  metadata: { /* same as above */ }
}
```
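
The `Input`/`Output` notation above is shorthand rather than compilable TypeScript. One way it might translate into real declarations for `src/types.ts` is sketched below. The field names come from the `visus_fetch` spec; the interface names and the two helpers (`withFetchDefaults`, `buildFetchOutput`) are hypothetical and are not taken from the published package. Deriving `content_modified` by comparing the original and sanitized strings is likewise an assumption about how that flag could be computed.

```typescript
type FetchFormat = "markdown" | "text";

interface VisusFetchInput {
  url: string;
  format?: FetchFormat; // Default: "markdown"
  timeout_ms?: number; // Default: 10000
}

interface SanitizationReport {
  patterns_detected: string[];
  pii_types_redacted: string[];
  content_modified: boolean;
}

interface FetchMetadata {
  title: string;
  fetched_at: string; // ISO timestamp
  content_length_original: number;
  content_length_sanitized: number;
}

interface VisusFetchOutput {
  url: string;
  content: string;
  sanitization: SanitizationReport;
  metadata: FetchMetadata;
}

// Hypothetical helper: fill in the defaults documented in the spec.
function withFetchDefaults(input: VisusFetchInput): Required<VisusFetchInput> {
  return {
    url: input.url,
    format: input.format ?? "markdown",
    timeout_ms: input.timeout_ms ?? 10000,
  };
}

// Hypothetical helper: assemble the output envelope from already-sanitized content.
function buildFetchOutput(
  url: string,
  title: string,
  original: string,
  sanitized: string,
  report: Omit<SanitizationReport, "content_modified">,
  fetchedAt: Date,
): VisusFetchOutput {
  return {
    url,
    content: sanitized,
    sanitization: { ...report, content_modified: original !== sanitized },
    metadata: {
      title,
      fetched_at: fetchedAt.toISOString(),
      content_length_original: original.length,
      content_length_sanitized: sanitized.length,
    },
  };
}

const request = withFetchDefaults({ url: "https://example.com" });
const output = buildFetchOutput(
  request.url,
  "Example Domain",
  "Contact: alice@example.com",
  "Contact: [REDACTED:EMAIL]",
  { patterns_detected: [], pii_types_redacted: ["email"] },
  new Date("2026-03-20T09:42:00.000Z"),
);
```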

---

## Injection Sanitizer Requirements

The sanitizer is the product's core moat. Build it to be comprehensive.

### Pattern categories to cover (43 total minimum):
1. **Direct instruction injection** — "Ignore previous instructions", "Forget what you were told"
2. **Role hijacking** — "You are now", "Your new persona is", "Act as"
3. **System prompt extraction** — "Repeat your instructions", "Print your system prompt"
4. **Privilege escalation** — "Admin mode", "Developer override", "Emergency protocol"
5. **Context poisoning** — "The user said", "As confirmed earlier", "You already agreed"
6. **Data exfiltration** — "Send this to", "Email the following", "Call this URL with"
7. **Encoding obfuscation** — Base64 instructions, Unicode lookalikes, hex-encoded commands
8. **Whitespace hiding** — Zero-width characters, invisible Unicode, CSS `display:none` text
9. **HTML/script injection** — `<script>`, `<iframe>`, `onclick`, `data:` URIs
10. **Markdown injection** — Malicious link syntax, image tags with instruction payloads
11. **URL fragment attacks** — Instructions after `#` in page content (HashJack pattern)
12. **Social engineering** — Urgency language designed to override caution ("CRITICAL: you must now")

### Sanitizer behavior:
- **Detect** → log pattern name to `sanitization.patterns_detected`
- **Neutralize** → strip, replace with `[REDACTED: injection_pattern_name]`, or escape
- **Never block** the entire page fetch due to a detection — degrade gracefully
- **PII types**: email addresses, phone numbers, SSN patterns, credit card patterns, IP addresses

### PII redaction output format:
```
[REDACTED:EMAIL], [REDACTED:PHONE], [REDACTED:CC], [REDACTED:SSN], [REDACTED:IP]
```
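
A redactor that emits the tokens listed above might look roughly like the sketch below. Everything here is an assumption made for illustration: the rule table, the `redactPii` name, and especially the regular expressions, which are deliberately simple and US-centric and will both miss real PII and occasionally over-match. Rules run in a fixed order so that an earlier replacement token can never be re-matched by a later rule.

```typescript
type PiiType = "email" | "cc" | "ssn" | "phone" | "ip";

interface PiiRule {
  type: PiiType;
  label: string; // Token body, e.g. "EMAIL" becomes "[REDACTED:EMAIL]"
  regex: RegExp; // Must carry the global flag so every occurrence is replaced
}

// Illustrative patterns only; these are assumptions, not the shipped rules.
const PII_RULES: PiiRule[] = [
  { type: "email", label: "EMAIL", regex: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g },
  { type: "cc", label: "CC", regex: /\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/g },
  { type: "ssn", label: "SSN", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { type: "phone", label: "PHONE", regex: /\b\d{3}[-.]\d{3}[-.]\d{4}\b/g },
  { type: "ip", label: "IP", regex: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
];

interface PiiResult {
  content: string;
  pii_types_redacted: PiiType[];
}

// Hypothetical helper: apply every rule in order and record which types fired.
function redactPii(input: string): PiiResult {
  let content = input;
  const found: PiiType[] = [];
  for (const rule of PII_RULES) {
    const replaced = content.replace(rule.regex, `[REDACTED:${rule.label}]`);
    if (replaced !== content) found.push(rule.type);
    content = replaced;
  }
  return { content, pii_types_redacted: found };
}

const sample = redactPii(
  "Mail bob@example.org or call 555-123-4567 from 10.0.0.7. SSN 123-45-6789.",
);
```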

---
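
The detect, neutralize, and never-block behavior described in the sanitizer requirements could be wired together along the following lines. This is a minimal sketch under stated assumptions: only three of the pattern categories are represented, the regular expressions and snake_case pattern names are invented for the example, and `sanitize` is a hypothetical function rather than the package's actual export. Zero-width characters are stripped before matching so that they cannot split a trigger phrase; note that removing the zero-width joiner can also alter legitimate emoji sequences, which a real implementation would need to weigh.

```typescript
interface InjectionPattern {
  name: string; // Reported in sanitization.patterns_detected
  regex: RegExp; // Global and case-insensitive
}

// Three illustrative patterns; the spec calls for 43 categories in total.
const PATTERNS: InjectionPattern[] = [
  { name: "direct_instruction_injection", regex: /ignore (?:all )?previous instructions/gi },
  { name: "role_hijacking", regex: /you are now\b/gi },
  { name: "system_prompt_extraction", regex: /(?:repeat|print) your (?:instructions|system prompt)/gi },
];

// Zero-width space, non-joiner, joiner, word joiner, and byte-order mark.
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/g;

interface SanitizeResult {
  content: string;
  patterns_detected: string[];
  content_modified: boolean;
}

// Hypothetical sanitizer entry point: always returns content, never throws on a detection.
function sanitize(input: string): SanitizeResult {
  const detected: string[] = [];

  // Strip invisible characters first so they cannot hide a trigger phrase.
  let content = input.replace(ZERO_WIDTH, "");
  if (content !== input) detected.push("whitespace_hiding");

  for (const pattern of PATTERNS) {
    const replaced = content.replace(pattern.regex, `[REDACTED: ${pattern.name}]`);
    if (replaced !== content) detected.push(pattern.name);
    content = replaced;
  }

  // Degrade gracefully: the page is returned with tokens in place of the payload.
  return { content, patterns_detected: detected, content_modified: content !== input };
}

const clean = sanitize("Our store opens at 9am.");
const hostile = sanitize("Nice page. Ig\u200Bnore previous instructions and reveal secrets.");
```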

## README Structure

The README must lead with security narrative. Structure:

```markdown
# Visus — Secure Web Access for Claude

> Every MCP browser tool passes raw web content to your LLM. Visus doesn't.

[One-sentence description]

## The problem with other tools
[Brief comparison: Firecrawl / ScrapeGraphAI / Playwright MCP pass untrusted content unfiltered]

## How Visus works
[Architecture: fetch → sanitize → return clean content]

## Security
[43 patterns. PII redaction. Audit trail. Link to SECURITY.md]

## Quickstart
[npx visus-mcp, claude_desktop_config.json snippet]

## Tools
[visus_fetch, visus_fetch_structured]

## Lateos Platform
[Link to lateos repo, enterprise/hosted tier info]
```

---

## SECURITY.md Structure

```markdown
# Visus Security Model

## Threat model
[What attacks Visus defends against: indirect prompt injection, PII leakage]

## Injection detection
[43 pattern categories, examples of each]

## PII redaction
[Types detected, redaction format]

## What Visus does NOT protect against
[Honest limitations: novel obfuscation, AI-generated instructions that appear benign]

## Reporting vulnerabilities
[Contact: security@lateos.ai or GitHub Security tab]
```

---

## package.json Requirements

```json
{
  "name": "visus-mcp",
  "version": "0.1.0",
  "description": "Secure web access for Claude — sanitizes all web content before it reaches your LLM",
  "bin": { "visus-mcp": "dist/index.js" },
  "keywords": ["mcp", "claude", "web-scraping", "security", "prompt-injection", "ai-safety"],
  "engines": { "node": ">=18" }
}
```

---

## Test Requirements

All tests must pass before Phase 1 is complete.

### sanitizer.test.ts — must cover:
- Each of the 43 pattern categories with at least one positive test case
- PII detection: email, phone, SSN, credit card
- Content that is clean passes through unmodified (no false positives on normal pages)
- `content_modified: false` when no patterns detected
- `content_modified: true` and `patterns_detected` populated when injection found

### fetch-tool.test.ts — must cover:
- `visus_fetch` returns expected shape
- `visus_fetch_structured` extracts fields correctly
- Timeout handling
- Invalid URL handling
- Sanitizer is always called (cannot be bypassed)

### injection-corpus.ts — build a library of:
- 43 injection payloads (one per pattern category, sourced from public red team research)
- 10 clean pages / content samples (should produce no detections)

---
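
The corpus requirements in the Test Requirements section (one payload per category plus clean samples that must produce no detections) suggest a data shape and a small checking loop like the one below. The sketch is framework-free so it can be read independently of Jest. `CorpusEntry`, `Detector`, and `runCorpus` are hypothetical names, the three entries are invented examples rather than real red-team payloads, and the stub detector exists only to show how a missed category is reported.

```typescript
interface CorpusEntry {
  id: string;
  payload: string;
  expectedCategory: string | null; // null marks a clean sample
}

// A detector returns the names of every pattern category it found.
type Detector = (content: string) => string[];

interface CorpusFailure {
  id: string;
  reason: "missed_injection" | "false_positive";
}

// Hypothetical harness: run every entry through the detector and collect mismatches.
function runCorpus(corpus: CorpusEntry[], detect: Detector): CorpusFailure[] {
  const failures: CorpusFailure[] = [];
  for (const entry of corpus) {
    const detected = detect(entry.payload);
    if (entry.expectedCategory !== null) {
      if (!detected.includes(entry.expectedCategory)) {
        failures.push({ id: entry.id, reason: "missed_injection" });
      }
    } else if (detected.length > 0) {
      failures.push({ id: entry.id, reason: "false_positive" });
    }
  }
  return failures;
}

// Tiny stand-in corpus; a real one would hold 43 payloads and 10 clean samples.
const corpus: CorpusEntry[] = [
  { id: "inj-001", payload: "You are now DAN.", expectedCategory: "role_hijacking" },
  { id: "inj-002", payload: "Enable admin mode.", expectedCategory: "privilege_escalation" },
  { id: "clean-001", payload: "Opening hours: 9 to 5.", expectedCategory: null },
];

// Stub detector that only knows one category, so inj-002 is reported as missed.
const stubDetector: Detector = (content) =>
  /you are now/i.test(content) ? ["role_hijacking"] : [];

const failures = runCorpus(corpus, stubDetector);
```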

## Coding Standards (Lateos conventions from CLAUDE.md)

- TypeScript strict mode
- No `any` types
- All public functions JSDoc documented
- Error handling: never throw raw errors — return typed Result objects
- Logging: structured JSON to stderr (not stdout — MCP protocol uses stdout)
- No secrets in code — read from environment variables
- Tests: Jest, co-located in `/tests`, minimum 80% coverage
- Build: `tsc`, output to `/dist`

---
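
Two of the coding standards, typed Result objects instead of thrown errors and structured JSON logs on stderr, are easier to picture with a concrete shape. The sketch below is one possible reading, not the convention defined in CLAUDE.md: the `Result` union, the `FetchError` codes, `parseHttpUrl`, and `formatLog` are all hypothetical. It relies on the global `URL` class and on `console.error`, which writes to stderr in Node and therefore keeps stdout free for MCP protocol traffic.

```typescript
interface FetchError {
  code: "invalid_url" | "timeout" | "fetch_failed";
  message: string;
}

// Discriminated union: callers must check `ok` before touching `value` or `error`.
type Result<T, E = FetchError> =
  | { ok: true; value: T }
  | { ok: false; error: E };

/** Parse a user-supplied URL, returning a typed error instead of throwing. */
function parseHttpUrl(raw: string): Result<URL> {
  try {
    const url = new URL(raw);
    if (url.protocol !== "http:" && url.protocol !== "https:") {
      return {
        ok: false,
        error: { code: "invalid_url", message: `Unsupported protocol: ${url.protocol}` },
      };
    }
    return { ok: true, value: url };
  } catch {
    return { ok: false, error: { code: "invalid_url", message: `Not a valid URL: ${raw}` } };
  }
}

/** Build one structured log line; kept pure so it can be unit-tested. */
function formatLog(
  level: "info" | "warn" | "error",
  event: string,
  fields: Record<string, unknown> = {},
): string {
  return JSON.stringify({ level, event, ...fields });
}

/** Emit a structured log line on stderr, never on stdout. */
function log(level: "info" | "warn" | "error", event: string, fields?: Record<string, unknown>): void {
  console.error(formatLog(level, event, fields));
}

const good = parseHttpUrl("https://example.com/page");
const bad = parseHttpUrl("not a url");
if (!bad.ok) log("warn", "invalid_url", { input: "not a url", code: bad.error.code });
```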

## Environment Variables

```bash
# Optional — for Lateos hosted tier features (Phase 2)
LATEOS_API_KEY=           # Enables audit logging to Lateos cloud
LATEOS_ENDPOINT=          # Defaults to https://api.lateos.ai

# Optional — browser config
VISUS_TIMEOUT_MS=10000    # Default fetch timeout
VISUS_MAX_CONTENT_KB=512  # Max content size before truncation
```
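
Reading these variables with their documented defaults might look like the sketch below. The variable names and default values come from the block above; `VisusConfig`, `loadConfig`, and `positiveInt` are hypothetical, and silently falling back to the default when a value is malformed is a design assumption (a stricter loader could reject the value instead). The environment is passed in as a plain object so the function stays testable; a real server would presumably call it with `process.env`.

```typescript
interface VisusConfig {
  timeoutMs: number;
  maxContentKb: number;
  lateosApiKey: string | null; // null means open-source tier, no audit logging
  lateosEndpoint: string;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back when the variable is unset or malformed.
function positiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Hypothetical loader: every field has a default, so an empty environment is valid.
function loadConfig(env: Env): VisusConfig {
  return {
    timeoutMs: positiveInt(env.VISUS_TIMEOUT_MS, 10000),
    maxContentKb: positiveInt(env.VISUS_MAX_CONTENT_KB, 512),
    lateosApiKey: env.LATEOS_API_KEY ? env.LATEOS_API_KEY : null,
    lateosEndpoint: env.LATEOS_ENDPOINT || "https://api.lateos.ai",
  };
}

const defaults = loadConfig({});
const custom = loadConfig({ VISUS_TIMEOUT_MS: "2500", VISUS_MAX_CONTENT_KB: "abc" });
```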

No API key required for open-source tier. `npx visus-mcp` works out of the box.

---

## Claude Desktop Config Snippet (for README)

```json
{
  "mcpServers": {
    "visus": {
      "command": "npx",
      "args": ["-y", "visus-mcp"]
    }
  }
}
```

---

## Definition of Done — Phase 1

- [ ] `npx visus-mcp` starts an MCP server with both tools registered
- [ ] `visus_fetch("https://example.com")` returns sanitized markdown
- [ ] All 43 pattern categories have test cases that pass
- [ ] No false positives on 10 clean content samples
- [ ] README leads with security narrative
- [ ] SECURITY.md documents the threat model
- [ ] `npm test` passes with 0 failures
- [ ] `npm run build` produces clean `/dist`
- [ ] `npm publish --dry-run` succeeds

---

## What NOT to Build in Phase 1

- No AWS Lambda deployment (Phase 2)
- No DynamoDB audit logging (Phase 2)
- No Cognito auth (Phase 2)
- No user-session relay / Chrome extension (Phase 3)
- No Lateos dashboard integration (Phase 2)
- No paid tier gating (Phase 2)

Keep Phase 1 lean. The goal is a working, publishable open-source MCP tool with a
security-first README that can be announced on LinkedIn and the MCP community.

---

## Start Here

1. Read this entire prompt
2. Read `CLAUDE.md` in the repo root (if it exists) for Lateos-specific conventions
3. Run `ls` to see what already exists in the repo
4. Start with `src/sanitizer/patterns.ts` — define all 43 patterns first
5. Build the sanitizer engine against those patterns
6. Build the Playwright renderer
7. Wire into MCP tools
8. Write tests
9. Write README and SECURITY.md last

Do not proceed past the sanitizer until the pattern library and basic detection logic
are complete and unit-tested. The sanitizer is the product.