@bryan-thompson/inspector-assessment 1.43.2 → 1.43.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (50)

package/README.md CHANGED

@@ -3,364 +3,1202 @@
 [![npm version](https://badge.fury.io/js/@bryan-thompson%2Finspector-assessment.svg)](https://www.npmjs.com/package/@bryan-thompson/inspector-assessment)
 [![npm downloads](https://img.shields.io/npm/dm/@bryan-thompson/inspector-assessment.svg)](https://www.npmjs.com/package/@bryan-thompson/inspector-assessment)
-**Comprehensive MCP server validation with 18 automated assessment modules (17 active + 1 deprecated).**
-Test functionality, security, documentation, code quality, and policy compliance from the command line.
+The MCP inspector is a developer tool for testing and debugging MCP servers with comprehensive assessment capabilities for validating server functionality, security, documentation, and compliance.
 ![MCP Inspector Screenshot](./mcp-inspector.png)
----
 ## Installation
+**npm (global installation):**
 ```bash
-# Install globally
 npm install -g @bryan-thompson/inspector-assessment
+```
+**Or use directly with bunx (no installation):**
-# Or use directly with bunx (no installation)
+```bash
 bunx @bryan-thompson/inspector-assessment
 ```
----
+**Local installation for development:**
-## Quick Start: Assess an MCP Server
+```bash
+git clone https://github.com/triepod-ai/inspector-assessment.git
+cd inspector-assessment
+npm install
+npm run build
+npm run dev
+```
-Run a full assessment on any MCP server:
+## Quick Start
+After installation, launch the inspector:
 ```bash
-# Create a config file
-cat > /tmp/config.json << 'EOF'
-{
-  "transport": "http",
-  "url": "http://localhost:8000/mcp"
-}
-EOF
+# Using global install
+mcp-inspector-assess
+# Using bunx
+bunx @bryan-thompson/inspector-assessment
+```
+The web interface will open at http://localhost:6274
+## For MCP Directory Reviewers
+If you're reviewing MCP servers for the Anthropic MCP Directory, see our **[Reviewer Quick Start Guide](docs/REVIEWER_QUICK_START.md)** for:
+- **60-second fast screening** workflow for approve/reject decisions
+- **5-minute detailed review** process for borderline cases
+- **Common pitfalls** explanation (false positives in security, informational vs scored tests)
+- **Decision matrix** with clear approval criteria
+- **Fast CLI analysis** commands for troubleshooting
+The quick start guide is optimized for fast reviewer onboarding and provides clear guidance on interpreting assessment results.
+## About This Fork
+This is an enhanced fork of [Anthropic's MCP Inspector](https://github.com/modelcontextprotocol/inspector) with significantly expanded assessment capabilities for MCP server validation and testing.
+**Original Repository**: https://github.com/modelcontextprotocol/inspector
+**Our Enhanced Fork**: https://github.com/triepod-ai/inspector-assessment
+**⚠️ Important**: This is a published fork with assessment enhancements. If you want the official Anthropic inspector without assessment features, use `npx @modelcontextprotocol/inspector`.
+### What We Added
+We've built a comprehensive assessment framework on top of the original inspector that transforms it from a debugging tool into a full validation suite for MCP servers. Our enhancements focus on accuracy, depth, and actionable insights for MCP server developers.
+## Key Features
+- **Interactive Testing**: Visual interface for testing MCP server tools, resources, and prompts
+- **Comprehensive Assessment**: Automated validation of server functionality, error handling, documentation, security, and usability using multi-scenario testing with progressive complexity
+- **Business Logic Validation**: Distinguishes between proper error handling and unintended failures
+- **Detailed Test Reports**: Confidence scoring, test scenario details, and actionable recommendations
+- **Multiple Transport Support**: STDIO, SSE, and Streamable HTTP transports
+## Quality Metrics
+Our enhanced fork maintains high code quality standards with comprehensive testing and validation:
+- **Test Coverage**: ✅ 665/665 tests passing (100% pass rate)
+  - **Assessment Module Tests**: 291 tests specifically validating our assessment enhancements (including 83 new MCP Directory compliance tests)
+    - Business logic error detection with confidence scoring
+    - Progressive complexity testing (2 levels: minimal → simple)
+    - Context-aware security testing with zero false positives
+    - Realistic test data generation and boundary testing
+  - **Total Project Tests**: 665 tests including assessment modules, UI components, and core inspector functionality
+  - All tests updated to reflect focused backend testing (8 security patterns × 3 payloads per tool)
+  - Test files: `client/src/services/__tests__/` and `client/src/services/assessment/__tests__/`
+- **Code Quality**: ✅ Production code uses proper TypeScript types
+  - 229 lint issues remaining (down 18% from 280 after recent cleanup)
+  - All source files migrated from `any` to `unknown` or proper types
+  - React Hooks follow best practices with proper dependency arrays
+- **Build Status**: ✅ Production builds pass cleanly
+  - TypeScript compilation successful for all production code
+  - Vite build optimized and validated
+- **Upstream Sync**: ✅ Up-to-date with v0.17.0
+  - Successfully integrated 121 commits from upstream
+  - New features: CustomHeaders, OAuth improvements, parameter validation
+  - All enhancements preserved during merge
+**Testing Commands**:
+```bash
+npm test                         # Run all 665 tests
+npm test -- assessment           # Run all 291 assessment module tests
+npm test -- assessmentService    # Run assessment service integration tests (54 tests)
+npm test -- SecurityAssessor     # Run security assessment tests (16 tests)
+npm test -- FunctionalityAssessor # Run functionality tests (11 tests)
+npm test -- AUPCompliance        # Run AUP compliance tests (26 tests)
+npm test -- ToolAnnotation       # Run tool annotation tests (13 tests)
+npm run coverage                 # Generate coverage report
+npm run lint                     # Check code quality
+```
+## Our Enhancements to the MCP Inspector
+We've significantly expanded the original MCP Inspector's capabilities with advanced assessment features that go far beyond basic debugging. Here's what makes our fork unique:
+### 1. Enhanced Business Logic Error Detection
+**Problem**: The original inspector couldn't distinguish between broken tools and tools that correctly validate input. A tool returning "user not found" would be marked as broken.
+**Our Solution**: Confidence-based validation system (ResponseValidator.ts:client/src/services/assessment/ResponseValidator.ts)
+- **MCP Standard Error Code Recognition**: Properly identifies error codes like `-32602` (Invalid params) as successful validation
+- **Confidence Scoring**: Multi-factor analysis determines if errors represent proper business logic
+- **Tool Type Awareness**: Different validation thresholds for CRUD vs utility tools
+- **Impact**: Estimated 80% reduction in false positives for resource-based tools (based on analysis in [FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md](docs/FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md#key-problems-addressed))
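As a rough illustration of the confidence-based idea described in the added README text above, a sketch in TypeScript might look like the following. This is not the package's actual `ResponseValidator.ts`: the `ToolResponse` shape, the phrase list, the 0.5 weights, and the 0.5 threshold are all assumptions made for the example.

```typescript
// Hypothetical response shape; the real ResponseValidator.ts may differ.
interface ToolResponse {
  isError: boolean;
  errorCode?: number; // JSON-RPC error code, if the server returned one
  message: string;
}

type Verdict = "working" | "business_logic_error" | "broken";

// Phrases that usually signal deliberate validation rather than a crash (assumed list).
const VALIDATION_PHRASES = ["not found", "invalid", "required", "does not exist"];

function classifyResponse(res: ToolResponse): { verdict: Verdict; confidence: number } {
  if (!res.isError) return { verdict: "working", confidence: 1 };

  let confidence = 0;
  // -32602 is the JSON-RPC "Invalid params" code, which points to input validation.
  if (res.errorCode === -32602) confidence += 0.5;
  const text = res.message.toLowerCase();
  if (VALIDATION_PHRASES.some((phrase) => text.includes(phrase))) confidence += 0.5;

  // Illustrative threshold: at or above 0.5 the error is treated as intended behaviour.
  const verdict: Verdict = confidence >= 0.5 ? "business_logic_error" : "broken";
  return { verdict, confidence };
}
```

Under these assumptions, a "User not found" error counts as proper validation, while an unrecognised error with no standard code is reported as a broken tool.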
+### 2. Optimized Progressive Complexity Testing
+**Problem**: Testing tools with only complex inputs makes it hard to identify where functionality breaks down.
+**Our Solution**: Two-level progressive diagnostic testing (TestScenarioEngine.ts:client/src/services/assessment/TestScenarioEngine.ts)
+1. **Minimal**: Only required fields with simplest values - diagnoses basic setup issues
+2. **Simple**: Required fields with realistic simple values - validates core functionality
+Combined with multi-scenario comprehensive testing (Happy Path, Edge Cases, Boundary Testing) for full coverage.
+**Benefits**:
+- **50% faster** than previous 4-level approach (removed redundant typical/maximum tests)
+- Identifies exact complexity level where tools fail (minimal vs simple)
+- Provides specific, actionable recommendations
+- Zero coverage loss - comprehensive scenarios provide full validation
+- Helps developers understand tool limitations and requirements
+**Performance**: For 10-tool servers, comprehensive testing now runs in ~4.2-8.3 minutes (down from ~7.5-11.7 minutes)
+### 3. Realistic Test Data Generation
+**Problem**: Generic test data like "test_value" and fake IDs trigger validation errors, causing false failures.
+**Our Solution**: Context-aware test data generation (TestDataGenerator.ts:client/src/services/assessment/TestDataGenerator.ts)
+- **Publicly Accessible URLs**: `https://www.google.com`, `https://api.github.com/users/octocat`
+- **Real API Endpoints**: Uses actual test APIs like jsonplaceholder.typicode.com
+- **Valid UUIDs**: Properly formatted identifiers that won't trigger format validation
+- **Context Awareness**: Generates appropriate data based on field names (email, url, id, etc.)
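The field-name heuristic described above could be sketched as follows. The matching rules, the function name `sampleValueFor`, the email address, and the UUID are assumptions for illustration; only the example URL is taken from the README text, and the real `TestDataGenerator.ts` may work differently.

```typescript
// Illustrative generator: picks a realistic-looking value from the field name.
// The rules below are assumptions, not the package's actual logic.
function sampleValueFor(fieldName: string): string {
  const name = fieldName.toLowerCase();
  if (name.includes("email")) return "test@example.com";
  if (name.includes("url") || name.includes("uri")) return "https://www.google.com";
  // Match "id", snake_case "user_id", and camelCase "userId".
  if (name === "id" || name.endsWith("_id") || /Id$/.test(fieldName)) {
    return "550e8400-e29b-41d4-a716-446655440000"; // well-formed UUID
  }
  return "test"; // generic fallback for unrecognised fields
}
```

For example, `sampleValueFor("homepage_url")` returns the public URL, while an unrecognised name such as `"paid"` falls through to the generic value.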
+### 4. Context-Aware Security Assessment with Zero False Positives
+**Problem**: Traditional security scanners flag tools as vulnerable when they safely store or echo malicious input as data. For example, a database tool that stores `"<script>alert(1)"` would be incorrectly flagged as vulnerable to XSS.
+**Our Solution**: Intelligent reflection detection (SecurityAssessor.ts:client/src/services/assessment/modules/SecurityAssessor.ts)
+**Key Innovation**: Distinguishing **data reflection** (safe) from **command execution** (vulnerable)
+**Examples**:
+✅ **SAFE - Data Reflection**:
+```
+Payload: "<script>alert(1)</script>"
+Response: "Stored in collection: <script>alert(1)</script>"
+→ Tool is just storing data, not executing it
+```
+❌ **VULNERABLE - Command Execution** (Calculator Injection):
+```
+Payload: "2+2"
+Response: "The answer is 4"
+→ Tool executed the arithmetic expression via eval()!
+```
+**Detection Approach**:
+1. **Reflection Pattern Recognition**: Identifies safe data operations through patterns like "stored", "created", "error getting info for", "collection doesn't exist"
+2. **Execution Evidence Detection**: Only flags as vulnerable when actual execution is detected (calculator returning "4", API keys leaked, admin mode activated)
+3. **Error Message Handling**: Recognizes that error messages echoing invalid input are safe reflection, not vulnerabilities
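A minimal sketch of the reflection-versus-execution distinction listed above, assuming each payload is paired with a string that would appear in the response only if the payload had been executed. The function name, the `executionEvidence` parameter, and the verdict labels are hypothetical; the reflection phrases are the ones quoted in the README text.

```typescript
type SecurityVerdict = "safe_reflection" | "vulnerable" | "inconclusive";

// Reflection phrases quoted from the README text; matched case-insensitively.
const REFLECTION_PATTERNS = [
  "stored",
  "created",
  "error getting info for",
  "collection doesn't exist",
];

// `executionEvidence` is what the response would contain only if the payload
// had actually been executed (for example "4" for the payload "2+2").
function classifySecurityResponse(
  payload: string,
  response: string,
  executionEvidence: string,
): SecurityVerdict {
  // Strip the echoed payload so evidence inside the echo itself does not count.
  const withoutEcho = response.split(payload).join("");
  if (withoutEcho.includes(executionEvidence)) return "vulnerable";

  const lower = response.toLowerCase();
  const looksReflected =
    response.includes(payload) || REFLECTION_PATTERNS.some((p) => lower.includes(p));
  return looksReflected ? "safe_reflection" : "inconclusive";
}
```

With the README's own examples, the payload `2+2` answered by "The answer is 4" is flagged as vulnerable, while a response that merely stores and echoes `<script>alert(1)</script>` is classified as safe reflection.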
+**Impact**:
+- **Zero false positives** on data storage/retrieval tools (qdrant, databases, file systems)
+- **18 injection patterns tested** (9 original + 9 advanced patterns)
+- **Dual-mode testing**: Reviewer mode (3 critical patterns, fast) + Developer mode (all 13 patterns, comprehensive)
+- **Real vulnerabilities still detected**: 100% test pass rate on detecting actual command injection, calculator injection, role override, data exfiltration
+**Supported Injection Types**:
+- **Command Injection**: System commands (whoami, ls -la, pwd)
+- **Calculator Injection**: Arithmetic expressions and code injection via eval() (NEW - 7 payloads)
+- **SQL Injection**: Database command injection
+- **Path Traversal**: File system access outside intended directory
+- Plus 9 additional patterns (Unicode Bypass, Nested Injection, Package Squatting, etc.)
+**Validation**: See [VULNERABILITY_TESTING.md](VULNERABILITY_TESTING.md) for detailed testing guide and examples.
+### 5. Streamlined Assessment Architecture
+**Based on Real-World Testing**: Our methodology has been validated through systematic testing using the taskmanager MCP server as a case study (11 tools tested with 8 backend security patterns, detailed in [ASSESSMENT_METHODOLOGY.md](docs/ASSESSMENT_METHODOLOGY.md)).
+**Eleven Core Assessors** aligned with Anthropic's MCP directory submission requirements:
+**Original MCP Inspector Assessors:**
+1. **FunctionalityAssessor** (225 lines)
+   - Multi-scenario validation with progressive complexity
+   - Coverage tracking and reliability scoring
+   - Business logic error detection
+   - Performance measurement
+2. **SecurityAssessor** (443 lines)
+   - 13 distinct injection attack patterns (including Calculator Injection) with context-aware reflection detection
+   - Direct command injection, calculator injection (eval detection), role override, data exfiltration detection
+   - Vulnerability analysis with risk levels (HIGH/MEDIUM/LOW)
+   - Zero false positives through intelligent distinction between data reflection and command execution
+3. **ErrorHandlingAssessor** (692 lines)
+   - MCP protocol compliance scoring
+   - Error response quality analysis
+   - Invalid input resilience testing
+4. **DocumentationAssessor** (274 lines)
+   - README structure and completeness
+   - Code example extraction and validation
+   - API reference quality assessment
+5. **UsabilityAssessor** (290 lines)
+   - Naming convention consistency
+   - Parameter clarity assessment
+   - Best practices compliance
+6. **MCPSpecComplianceAssessor** (560 lines) - Extended
+   - JSON-RPC 2.0 compliance validation
+   - Protocol message format verification
+   - MCP specification adherence
+**NEW: MCP Directory Compliance Assessors** (added 2025-12):
+7. **AUPComplianceAssessor** - Policy compliance
+   - 14 AUP category violation detection (A-N)
+   - High-risk domain identification (Healthcare, Financial, Legal, Children)
+   - Tool name/description pattern analysis
+   - Source code scanning (enhanced mode)
+8. **ToolAnnotationAssessor** - Policy #17 compliance
+   - readOnlyHint/destructiveHint verification
+   - Tool behavior inference from name patterns
+   - Annotation misalignment detection
+   - Policy #17 compliance reporting
+9. **ProhibitedLibrariesAssessor** - Policy #28-30 compliance
+   - Financial library detection (Stripe, PayPal, Plaid, etc.)
+   - Media library detection (Sharp, FFmpeg, OpenCV, PIL)
+   - package.json and requirements.txt scanning
+   - Source code import analysis
+10. **ManifestValidationAssessor** - MCPB manifest compliance
+    - manifest_version 0.3 validation
+    - Required field verification (name, version, mcp_config)
+    - Icon presence check
+    - ${BUNDLE_ROOT} anti-pattern detection
+11. **PortabilityAssessor** - Cross-platform compatibility
+    - Hardcoded path detection (/Users/, /home/, C:\)
+    - Platform-specific code patterns
+    - ${\_\_dirname} usage validation
+**Recent Updates**:
+- (2025-12-07): Added 5 new MCP Directory compliance assessors with 83 tests
+- (2025-10-05): Removed 2,707 lines of out-of-scope assessment modules to focus on core MCP validation requirements
+### 6. Advanced Assessment Components
+We've built a complete assessment architecture with specialized modules:
+- **AssessmentOrchestrator.ts**: Coordinates multi-phase testing across all assessment dimensions
+- **ResponseValidator.ts**: Advanced response validation with confidence scoring and business logic detection
+- **TestScenarioEngine.ts**: Generates and executes optimized 2-level progressive complexity tests
+- **TestDataGenerator.ts**: Context-aware realistic test data generation with conditional boundary testing
+- **Assessment UI Components**: Rich visualization of test results and recommendations
+### Documentation
+Our enhancements include comprehensive documentation:
+- **ASSESSMENT_METHODOLOGY.md**: Complete methodology with examples and best practices
+- **FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md**: Implementation details and impact analysis
+- **COMPREHENSIVE_TESTING_OPTIMIZATION_PLAN.md**: Detailed optimization strategy (Phases 1-2 complete)
+- **PHASE1_OPTIMIZATION_COMPLETED.md**: Progressive complexity optimization (50% faster)
+- **PHASE2_OPTIMIZATION_COMPLETED.md**: Business logic error detection enhancements
+- **PROJECT_STATUS.md**: Current status, recent changes, and development roadmap
+- **Test Coverage Reports**: Detailed validation of our assessment accuracy
+## Architecture Overview
+The MCP Inspector consists of two main components that work together:
+- **MCP Inspector Client (MCPI)**: A React-based web UI that provides an interactive interface for testing and debugging MCP servers
+- **MCP Proxy (MCPP)**: A Node.js server that acts as a protocol bridge, connecting the web UI to MCP servers via various transport methods (stdio, SSE, streamable-http)
+Note that the proxy is not a network proxy for intercepting traffic. Instead, it functions as both an MCP client (connecting to your MCP server) and an HTTP server (serving the web UI), enabling browser-based interaction with MCP servers that use different transport protocols.
+## Assessment Capabilities
+Our enhanced MCP Inspector includes a comprehensive assessment system that validates MCP servers against Anthropic's directory submission requirements and MCP protocol standards:
+### Assessment Categories
+1. **Functionality Testing**
+   - Multi-scenario validation with happy path, edge cases, and boundary testing
+   - Optimized progressive complexity testing (2 levels: minimal → simple)
+   - Business logic validation to distinguish proper error handling from failures
+   - Confidence scoring based on test coverage and consistency
+   - Evidence-based recommendations with transparent methodology (e.g., "5/5 scenarios verified (happy path, edge cases, boundaries, error handling)")
+2. **Error Handling**
+   - Invalid input resilience testing
+   - Comprehensive error message analysis
+   - Resource validation vs. unintended failures
+   - MCP protocol compliance scoring
+   - Quality scoring for descriptive error messages
+3. **Documentation**
+   - Tool description completeness and clarity
+   - Parameter documentation validation
+   - README structure and examples evaluation
+   - API documentation quality assessment
+4. **Security**
+   - 17 distinct injection attack patterns (8 original + 9 advanced patterns)
+   - Context-aware reflection detection distinguishes safe data operations from command execution
+   - Zero false positives - correctly handles tools that echo/store malicious input as data
+   - Input validation and sanitization checks
+   - Authentication/authorization testing
+   - Sensitive data exposure detection
+   - Dual-mode testing: Reviewer mode (3 critical patterns) + Developer mode (all 13 patterns)
+5. **Usability**
+   - Tool naming consistency analysis
+   - Description quality assessment
+   - Schema completeness validation
+   - Parameter clarity evaluation
+6. **MCP Spec Compliance** (Extended)
+   - JSON-RPC 2.0 protocol compliance
+   - MCP message format verification
+   - Error code standard compliance
+   - Protocol specification adherence
+### Testing Features
+**Note**: All testing uses comprehensive multi-scenario validation. See the "Our Enhancements" section above for detailed technical descriptions.
+#### Multi-Scenario Validation
+The inspector tests each tool with multiple scenarios:
+- **Happy Path**: Valid inputs with expected success cases
+- **Edge Cases**: Boundary values and unusual but valid inputs
+- **Error Cases**: Invalid inputs to test error handling
+- **Boundary Testing**: Maximum/minimum values and limits
+#### Progressive Complexity Testing
+Tools are tested with progressive diagnostic levels to identify where functionality breaks:
+1. **Minimal**: Only required fields with simplest values - diagnoses basic setup issues
+2. **Simple**: Required fields with realistic simple values - validates core functionality
+This optimized 2-level approach is **50% faster** than previous 4-level testing while maintaining full coverage through comprehensive multi-scenario validation.
+#### Business Logic Validation
+The assessment distinguishes between:
+- **Proper Validation**: Expected errors for invalid business logic (e.g., "User not found")
+- **Tool Failures**: Unexpected errors indicating implementation issues
+- **Resource Validation**: Proper handling of non-existent resources
+- **Input Validation**: Appropriate rejection of malformed inputs
+### Assessment Configuration
+Configure assessment behavior through the UI:
+| Setting                           | Description                                                         | Default   |
+| --------------------------------- | ------------------------------------------------------------------- | --------- |
+| Tool Selection for Error Handling | Multi-select dropdown with checkboxes to choose which tools to test | All tools |
+**Tool Selection** (as of 2025-10-10):
+- Visual multi-select dropdown with checkboxes for each tool
+- Search/filter functionality for large tool lists
+- "Select All" / "Deselect All" bulk operations
+- Shows "X of Y tools selected" count
+- Select 0 tools to skip error handling tests entirely (fastest option)
+**Note**: The old numeric "Error Handling Test Limit" has been replaced with the tool selector. The `maxToolsToTestForErrors` config field is deprecated but still works for backward compatibility.
+### Viewing Assessment Results
+The Assessment tab provides:
+- **Overall Score**: Weighted aggregate score with letter grade (A-F)
+- **Category Breakdown**: Individual scores for each assessment category
+- **Tool Details**: Click any tool name to see detailed test results including:
+  - Test scenarios executed
+  - Input parameters used
+  - Actual responses received
+  - Pass/fail status with confidence scores
+  - Specific issues identified
+- **Per-Tool JSON Display** (New in 2025-10-10): Each tool in security assessment has its own "Show JSON" button to view only that tool's test results without scrolling through all tools
+- **Recommendations**: Actionable suggestions for improvement
+- **Test Coverage**: Visual indicators of testing completeness
+### Assessment Result Persistence
+**New in 2025-10-06**: Assessment results are automatically saved to JSON files for fast CLI-based analysis and troubleshooting.
+**Automatic Save**:
+- Every assessment run automatically saves to `/tmp/inspector-assessment-{serverName}.json`
+- Old results are automatically deleted before new runs
+- No manual export needed - completely transparent
+- Console shows: `✅ Assessment auto-saved: /tmp/inspector-assessment-{name}.json`
+**Quick Analysis Examples**:
+```bash
+# View full assessment results
+cat /tmp/inspector-assessment-memory-mcp.json | jq
+# Check only functionality results
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.functionality'
+# List broken tools
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.functionality.brokenTools'
+# Get specific tool test results
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.functionality.enhancedResults[] | select(.toolName == "search_nodes")'
+# Summary of all tools and their status
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.functionality.enhancedResults[] | {tool: .toolName, status: .overallStatus}'
+# Count security vulnerabilities found
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.security.vulnerabilities | length'
+# Check error handling coverage
+cat /tmp/inspector-assessment-memory-mcp.json | jq '.errorHandling.metrics.validationCoverage'
+```
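The same queries can also be run from a script. The sketch below summarizes an already-parsed report object; the `AssessmentReport` interface covers only the fields that appear in the `jq` examples above, and everything else about the file's structure is an assumption.

```typescript
// Partial report shape inferred from the jq queries above; other fields are omitted.
interface AssessmentReport {
  functionality?: {
    brokenTools?: unknown[];
    enhancedResults?: { toolName: string; overallStatus: string }[];
  };
  security?: { vulnerabilities?: unknown[] };
}

// Returns one line per tool, followed by broken-tool and vulnerability counts.
function summarizeReport(report: AssessmentReport): string[] {
  const lines: string[] = [];
  for (const result of report.functionality?.enhancedResults ?? []) {
    lines.push(`${result.toolName}: ${result.overallStatus}`);
  }
  lines.push(`broken tools: ${(report.functionality?.brokenTools ?? []).length}`);
  lines.push(`vulnerabilities: ${(report.security?.vulnerabilities ?? []).length}`);
  return lines;
}
```

In a Node.js script, the input would typically come from `JSON.parse` applied to the contents of the auto-saved file. Missing sections are treated as empty rather than as errors, which keeps the summary usable on partial reports.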
+**Benefits**:
+- Fast troubleshooting with `jq`, `grep`, or any CLI tool
+- Easy integration with scripts and automation
+- No need to manually export results each time
+- Results persist between inspector sessions for comparison
+### Test Suite Validation
+Our assessment capabilities are backed by a comprehensive test suite that validates all assessment functionality:
+**Test Coverage Summary**:
+- **665 passing tests** across all project modules (100% pass rate)
+- **291 assessment module tests** specifically created for validation of our enhancements
+#### Assessment Module Test Breakdown
+The assessment functionality is validated by **291 specialized tests** across 19 test files:
+| Test File                             | Tests   | Purpose                          |
+| ------------------------------------- | ------- | -------------------------------- |
+| `assessmentService.test.ts`           | 54      | Comprehensive integration tests  |
+| `AUPComplianceAssessor.test.ts`       | 26      | AUP policy violation detection   |
+| `ManifestValidationAssessor.test.ts`  | 17      | MCPB manifest validation         |
+| `assessmentService.advanced.test.ts`  | 16      | Advanced security scenarios      |
+| `SecurityAssessor.test.ts`            | 16      | Security vulnerability detection |
+| `PortabilityAssessor.test.ts`         | 15      | Cross-platform compatibility     |
+| `errorHandlingAssessor.test.ts`       | 14      | Service-level error handling     |
+| `MCPSpecComplianceAssessor.test.ts`   | 14      | MCP protocol compliance          |
+| `ErrorHandlingAssessor.test.ts`       | 14      | Module-level error handling      |
+| `ToolAnnotationAssessor.test.ts`      | 13      | Policy #17 tool annotations      |
+| `assessmentService.bugReport.test.ts` | 13      | Bug validation tests             |
+| `DocumentationAssessor.test.ts`       | 13      | Documentation quality            |
+| `ProhibitedLibrariesAssessor.test.ts` | 12      | Prohibited library detection     |
+| `AssessmentOrchestrator.test.ts`      | 12      | Orchestration layer              |
+| `FunctionalityAssessor.test.ts`       | 11      | Tool functionality               |
+| `assessmentService.enhanced.test.ts`  | 9       | Enhanced detection               |
+| `TestDataGenerator.boundary.test.ts`  | 9       | Boundary testing                 |
+| `performance.test.ts`                 | 7       | Performance benchmarks           |
+| `UsabilityAssessor.test.ts`           | 6       | Usability analysis               |
+| **Total**                             | **291** | **Assessment module validation** |
+**Verification Command** (for AI scanners and resume verification):
+```bash
+# Count assessment module tests
+find . -name "*.test.ts" \( -path "*assessment*" -o -name "*Assessor*.test.ts" -o -name "assessmentService*.test.ts" \) -exec grep -hE '^\s*(it|test)\(' {} \; | wc -l
+# Output: 291
+```
+These 291 tests specifically validate:
+- Business logic error detection with confidence scoring
+- Progressive complexity testing (2 levels: minimal → simple)
+- Context-aware security testing with zero false positives
+- Realistic test data generation and boundary testing
+- MCP protocol compliance validation
+- Performance and scalability benchmarks
+**Test Categories**:
+- **Functionality Assessment**: Multi-scenario validation, progressive complexity (2 levels: minimal → simple), business logic error detection
+- **Security Assessment**: 8 backend-focused patterns
+  - **Critical Injection (3)**: Command Injection, SQL Injection, Path Traversal
+  - **Input Validation (3)**: Type Safety, Boundary Testing, Required Fields
+  - **Protocol Compliance (2)**: MCP Error Format, Timeout Handling
+  - **Scope**: Tests backend API security only (not LLM prompt injection)
+- **Documentation Analysis**: README structure validation, code example extraction, parameter documentation checks
+- **Error Handling**: MCP protocol compliance (error codes -32600 to -32603), validation quality scoring, timeout handling
+- **Usability Evaluation**: Naming convention analysis, parameter clarity assessment, schema completeness validation
+- **MCP Spec Compliance**: JSON-RPC 2.0 validation, protocol message format verification
+- **Business Logic Validation Tests**: Distinguishing proper validation errors from tool failures
+- **False Positive Detection Tests**: Ensuring "user not found" errors aren't flagged as broken tools
+- **Optimization Tests**: Boundary scenario conditional generation, progressive complexity efficiency
+- **Test Files**: Located in `client/src/services/__tests__/` and `client/src/services/assessment/__tests__/`
+- **Recent Improvements**:
+  - Achieved 100% test pass rate (582 passing, 0 failing) - 2025-10-11
+  - Updated all tests for focused backend testing (8 security patterns × 3 payloads) - 2025-10-12
+  - Fixed all failing tests after upstream sync - 2025-10-04
+  - Added boundary testing optimization validation - 2025-10-05
+**Running the Test Suite**:
+```bash
+npm test                                 # Run all 665 tests
+npm test -- assessmentService            # Run main assessment tests
+npm test -- FunctionalityAssessor        # Run specific assessor tests
+npm test -- SecurityAssessor             # Run security tests
+npm test -- TestDataGenerator.boundary   # Run optimization tests
+npm run coverage                         # Generate coverage report
+```
+**Test Quality**:
+- All tests use realistic test data (not placeholder values)
+- Tests validate both positive and negative cases
+- Progressive complexity levels (2 levels) tested systematically
+- Security tests cover all 8 injection attack patterns
+- Error handling tests verify MCP standard error codes
+- Business logic error detection validated with confidence scoring
+- Optimization logic validated with dedicated test suites
+### Assessment API
+Programmatically run assessments using the CLI:
+```bash
 # Run full assessment
-mcp-assess-full --server my-server --config /tmp/config.json
+mcp-inspector-assess-cli node build/index.js --assess
+# or with npx
+npx @bryan-thompson/inspector-assessment-cli node build/index.js --assess
+# Run specific category
+mcp-inspector-assess-cli node build/index.js --assess functionality
-# Results saved to /tmp/inspector-full-assessment-my-server.json
+# Export assessment results
+mcp-inspector-assess-cli node build/index.js --assess --output assessment-report.json
 ```
-For STDIO servers (local commands):
+## Running the Inspector
+### Requirements
+- Node.js: ^22.7.5
+### Quick Start (UI mode)
+To get up and running right away with the UI, just execute the following:
 ```bash
-cat > /tmp/config.json << 'EOF'
-{
-  "command": "python3",
-  "args": ["server.py"],
-  "env": {}
-}
-EOF
+bunx @bryan-thompson/inspector-assessment
+# or with npx
+npx @bryan-thompson/inspector-assessment
+```
-mcp-assess-full --server my-server --config /tmp/config.json
+The server will start up and the UI will be accessible at `http://localhost:6274`.
+### Docker Container
+**Note**: A Docker container is not yet available for `@bryan-thompson/inspector-assessment`. The Docker image below is for the upstream inspector only (without assessment features):
+```bash
+docker run --rm --network host -p 6274:6274 -p 6277:6277 ghcr.io/modelcontextprotocol/inspector:latest
 ```
----
+### From an MCP server repository
+To inspect an MCP server implementation, there's no need to clone this repo. Instead, use `bunx` or `npx`. For example, if your server is built at `build/index.js`:
+```bash
+bunx @bryan-thompson/inspector-assessment node build/index.js
+# or with npx
+npx @bryan-thompson/inspector-assessment node build/index.js
+```
+You can pass both arguments and environment variables to your MCP server. Arguments are passed directly to your server, while environment variables can be set using the `-e` flag:
-## CLI Commands
+```bash
+# Pass arguments only
+bunx @bryan-thompson/inspector-assessment node build/index.js arg1 arg2
+# Pass environment variables only
+bunx @bryan-thompson/inspector-assessment -e key=value -e key2=$VALUE2 node build/index.js
-The inspector provides three CLI commands for different workflows:
+# Pass both environment variables and arguments
+bunx @bryan-thompson/inspector-assessment -e key=value -e key2=$VALUE2 node build/index.js arg1 arg2
-| Command                | Purpose                       | Use Case                     |
-| ---------------------- | ----------------------------- | ---------------------------- |
-| `mcp-assess-full`      | Complete 18-module assessment | Full validation, CI/CD gates |
-| `mcp-assess-security`  | Security-only testing         | Quick vulnerability scan     |
-| `mcp-inspector-assess` | Interactive web UI            | Debugging, exploration       |
+# Use -- to separate inspector flags from server arguments
+bunx @bryan-thompson/inspector-assessment -e key=$VALUE -- node build/index.js -e server-flag
+```
-### Common Options
+The inspector runs both an MCP Inspector (MCPI) client UI (default port 6274) and an MCP Proxy (MCPP) server (default port 6277). Open the MCPI client UI in your browser to use the inspector. (As a mnemonic, these ports are the T9 dialpad spellings of MCPI and MCPP, respectively.) You can customize the ports if needed:
 ```bash
-# Full assessment with all modules
-mcp-assess-full --server <name> --config <path>
+CLIENT_PORT=8080 SERVER_PORT=9000 bunx @bryan-thompson/inspector-assessment node build/index.js
+```
+For more details on ways to use the inspector, see the [Inspector section of the MCP docs site](https://modelcontextprotocol.io/docs/tools/inspector). For help with debugging, see the [Debugging guide](https://modelcontextprotocol.io/docs/tools/debugging).
+### Servers File Export
+The MCP Inspector provides convenient buttons to export server launch configurations for use in clients such as Cursor, Claude Code, or the Inspector's CLI. The file is usually called `mcp.json`.
+- **Server Entry** - Copies a single server configuration entry to your clipboard. This can be added to your `mcp.json` file inside the `mcpServers` object with your preferred server name.
+  **STDIO transport example:**
+  ```json
+  {
+    "command": "node",
+    "args": ["build/index.js", "--debug"],
+    "env": {
+      "API_KEY": "your-api-key",
+      "DEBUG": "true"
+    }
+  }
+  ```
+  **SSE transport example:**
+  ```json
+  {
+    "type": "sse",
+    "url": "http://localhost:3000/events",
+    "note": "For SSE connections, add this URL directly in Client"
+  }
+  ```
+  **Streamable HTTP transport example:**
+  ```json
+  {
+    "type": "streamable-http",
+    "url": "http://localhost:3000/mcp",
+    "note": "For Streamable HTTP connections, add this URL directly in your MCP Client"
+  }
+  ```
+- **Servers File** - Copies a complete MCP configuration file structure to your clipboard, with your current server configuration added as `default-server`. This can be saved directly as `mcp.json`.
+  **STDIO transport example:**
+  ```json
+  {
+    "mcpServers": {
+      "default-server": {
+        "command": "node",
+        "args": ["build/index.js", "--debug"],
+        "env": {
+          "API_KEY": "your-api-key",
+          "DEBUG": "true"
+        }
+      }
+    }
+  }
+  ```
+  **SSE transport example:**
+  ```json
+  {
+    "mcpServers": {
+      "default-server": {
+        "type": "sse",
+        "url": "http://localhost:3000/events",
+        "note": "For SSE connections, add this URL directly in Client"
+      }
+    }
+  }
+  ```
+  **Streamable HTTP transport example:**
+  ```json
+  {
+    "mcpServers": {
+      "default-server": {
+        "type": "streamable-http",
+        "url": "http://localhost:3000/mcp",
+        "note": "For Streamable HTTP connections, add this URL directly in your MCP Client"
+      }
+    }
+  }
+  ```
-# Security-only (faster)
-mcp-assess-security --server <name> --config <path>
+These buttons appear in the Inspector UI after you've configured your server settings, making it easy to save and reuse your configurations.
-# Skip slow modules for CI/CD
-mcp-assess-full --server <name> --skip-modules temporal,security
+For SSE and Streamable HTTP transport connections, the Inspector provides similar functionality for both buttons. The "Server Entry" button copies the configuration that can be added to your existing configuration file, while the "Servers File" button creates a complete configuration file containing the URL for direct use in clients.
-# Run only specific modules
-mcp-assess-full --server <name> --only-modules functionality,toolAnnotations
+You can paste the Server Entry into your existing `mcp.json` file under your chosen server name, or use the complete Servers File payload to create a new configuration file.
-# Generate markdown report
-mcp-assess-full --server <name> --format markdown --output report.md
+### Authentication
+The inspector supports bearer token authentication for SSE connections. Enter your token in the UI when connecting to an MCP server, and it will be sent in the Authorization header. You can override the header name using the input field in the sidebar.
+### Security Considerations
+The MCP Inspector includes a proxy server that can run and communicate with local MCP processes. The proxy server should not be exposed to untrusted networks as it has permissions to spawn local processes and can connect to any specified MCP server.
+#### Authentication
+The MCP Inspector proxy server requires authentication by default. When starting the server, a random session token is generated and printed to the console:
-# Pre-flight validation (quick check)
-mcp-assess-full --server <name> --preflight
 ```
+🔑 Session token: 3a1c267fad21f7150b7d624c160b7f09b0b8c4f623c7107bbf13378f051538d4
+🔗 Open inspector with token pre-filled:
+   http://localhost:6274/?MCP_PROXY_AUTH_TOKEN=3a1c267fad21f7150b7d624c160b7f09b0b8c4f623c7107bbf13378f051538d4
+```
+This token must be included as a Bearer token in the Authorization header for all requests to the server.
+**Automatic browser opening** - The inspector now automatically opens your browser with the token pre-filled in the URL when authentication is enabled.
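For scripted launches, the pre-filled link can be assembled from the token that the proxy prints. The following is a minimal TypeScript sketch, assuming only the `MCP_PROXY_AUTH_TOKEN` query parameter and the default client port 6274 shown above. `buildInspectorUrl` is a hypothetical helper and not part of the package.

```typescript
// Hypothetical helper (not part of the package): builds the browser URL with
// the proxy session token pre-filled, mirroring the link printed on startup.
function buildInspectorUrl(token: string, clientPort: number = 6274): string {
  const url = new URL(`http://localhost:${clientPort}/`);
  // The query parameter name is taken from the console output shown above.
  url.searchParams.set("MCP_PROXY_AUTH_TOKEN", token);
  return url.toString();
}

// Example with a placeholder token; pass a second argument if CLIENT_PORT is customized.
console.log(buildInspectorUrl("example-token"));
```

Using `URL` and `searchParams` rather than string concatenation keeps the token correctly encoded if it ever contains characters that are not URL-safe.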
+**Alternative: Manual configuration** - If you already have the inspector open:
+1. Click the "Configuration" button in the sidebar
+2. Find "Proxy Session Token" and enter the token displayed in the proxy console
+3. Click "Save" to apply the configuration
-For complete CLI documentation, see [CLI Assessment Guide](docs/CLI_ASSESSMENT_GUIDE.md).
+The token will be saved in your browser's local storage for future use.
+If you need to disable authentication (NOT RECOMMENDED), you can set the `DANGEROUSLY_OMIT_AUTH` environment variable:
+```bash
+DANGEROUSLY_OMIT_AUTH=true npm start
+```
 ---
-## Assessment Modules (19 Total: 16 Active + 3 Opt-In)
-### Active Modules (16)
-| Module                   | Purpose                      | Key Features                                        |
-| ------------------------ | ---------------------------- | --------------------------------------------------- |
-| **Functionality**        | Tool execution validation    | Multi-scenario testing, business logic detection    |
-| **Security**             | Vulnerability detection      | Comprehensive attack patterns, zero false positives |
-| **Error Handling**       | MCP protocol compliance      | Error code validation, response quality             |
-| **Protocol Compliance**  | Protocol adherence           | JSON-RPC 2.0, MCP message formats, conformance      |
-| **AUP Compliance**       | Policy violation detection   | 14 AUP categories (A-N)                             |
-| **Temporal**             | Rug pull detection           | Behavior changes over invocations                   |
-| **Tool Annotations**     | readOnlyHint/destructiveHint | Policy #17 compliance                               |
-| **Prohibited Libraries** | Dependency security          | Blocked packages (Stripe, FFmpeg, etc.)             |
-| **Manifest Validation**  | MCPB manifest compliance     | manifest.json schema validation                     |
-| **Authentication**       | OAuth/auth evaluation        | Auth pattern validation, deployment context         |
-| **Resources**            | Resource capability          | Discovery, read success, errors                     |
-| **Prompts**              | Prompt capability            | Execution, multimodal support                       |
-| **Cross-Capability**     | Chained vulnerabilities      | Multi-tool attack patterns                          |
-| **Developer Experience** | Doc + usability assessment   | Documentation quality, naming conventions           |
-| **Portability**          | Cross-platform compatibility | Platform-specific code detection                    |
-| **External API Scanner** | External service detection   | API URLs, affiliation warnings                      |
-> **v1.25.2+**: Protocol Compliance is a unified module combining MCP Spec Compliance and Protocol Conformance. See [CLI Guide](docs/CLI_ASSESSMENT_GUIDE.md) for details.
-### Opt-In Modules (3)
-| Module                       | Purpose                        | Requirement                                            |
-| ---------------------------- | ------------------------------ | ------------------------------------------------------ |
-| **Dependency Vulnerability** | npm/yarn/pnpm audit scanning   | `--source` flag (requires shell execution)             |
-| **File Modularization**      | Code organization quality      | `--source` flag (source code analysis)                 |
-| **MCP Conformance Testing**  | Official conformance scenarios | HTTP/SSE transport + @modelcontextprotocol/conformance |
-For detailed module documentation, see [Assessment Catalog](docs/ASSESSMENT_CATALOG.md).
+**🚨 WARNING 🚨**
+Disabling authentication with `DANGEROUSLY_OMIT_AUTH` is incredibly dangerous! Disabling auth leaves your machine open to attack not just when exposed to the public internet, but also **via your web browser**. This means that visiting a malicious website or viewing a malicious advertisement could allow an attacker to remotely compromise your computer. Do not disable this feature unless you truly understand the risks.
+Read more about the risks of this vulnerability on Oligo's blog: [Critical RCE Vulnerability in Anthropic MCP Inspector - CVE-2025-49596](https://www.oligo.security/blog/critical-rce-vulnerability-in-anthropic-mcp-inspector-cve-2025-49596)
 ---
-## Security Testing: Pure Behavior Detection
+You can also set the token via the `MCP_PROXY_AUTH_TOKEN` environment variable when starting the server:
-The inspector uses **pure behavior-based detection** for security assessment, analyzing tool responses to identify actual code execution vs safe data handling.
+```bash
+MCP_PROXY_AUTH_TOKEN=$(openssl rand -hex 32) npm start
+```
-### How It Works
+#### Local-only Binding
+By default, both the MCP Inspector proxy server and client bind only to `localhost` to prevent network access. This ensures they are not accessible from other devices on the network. If you need to bind to all interfaces for development purposes, you can override this with the `HOST` environment variable:
 ```bash
-# Run security assessment
-mcp-assess-security --server my-server --config config.json
+HOST=0.0.0.0 npm start
 ```
-**Detection Strategy:**
+**Warning:** Only bind to all interfaces in trusted network environments. Doing so exposes both services to the network, including the proxy server's ability to execute local processes.
-1. **Reflection Detection**: Identifies when tools safely echo malicious input as data
-   - `"Stored query: ../../../etc/passwd"` → SAFE (reflection)
-   - `"Query results for: ..."` → SAFE (search results)
+#### DNS Rebinding Protection
-2. **Execution Evidence**: Detects actual code execution
-   - Response contains `"root:x:0:0"` → VULNERABLE (file accessed)
-   - Response contains `"total 42 drwx"` → VULNERABLE (directory listed)
+To prevent DNS rebinding attacks, the MCP Inspector validates the `Origin` header on incoming requests. By default, only requests from the client origin are allowed (respects `CLIENT_PORT` if set, defaulting to port 6274). You can configure additional allowed origins by setting the `ALLOWED_ORIGINS` environment variable (comma-separated list):
-3. **Category Classification**: Distinguishes safe tool types
-   - Search/retrieval tools return data, not code execution
-   - CRUD operations create resources, not execute code
+```bash
+ALLOWED_ORIGINS=http://localhost:6274,http://localhost:8000 npm start
+```
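The origin check described above can be pictured with a short TypeScript sketch. It is illustrative only: the function names and the `env` parameter are assumptions, and the sketch treats `ALLOWED_ORIGINS` as extending the default client origin, following the word "additional" in the text. The proxy's real implementation may differ, for example in how it handles requests without an `Origin` header.

```typescript
// Illustrative Origin check. Names and structure are assumptions, not the
// proxy's actual code.
type Env = Record<string, string | undefined>;

function allowedOrigins(env: Env): string[] {
  // Default: only the client UI origin, honoring CLIENT_PORT when set.
  const clientPort = env.CLIENT_PORT ?? "6274";
  const defaults = [`http://localhost:${clientPort}`];
  // ALLOWED_ORIGINS is a comma-separated list; blank entries are ignored.
  const extra = (env.ALLOWED_ORIGINS ?? "")
    .split(",")
    .map((origin) => origin.trim())
    .filter((origin) => origin.length > 0);
  return [...defaults, ...extra];
}

function isOriginAllowed(origin: string, env: Env): boolean {
  // Exact string comparison: scheme, host, and port must all match.
  return allowedOrigins(env).includes(origin);
}

console.log(isOriginAllowed("http://localhost:6274", {}));
console.log(isOriginAllowed("http://attacker.example:6277", {}));
```

In a DNS rebinding attack the browser still reports the attacker's page as the `Origin`, so a request that resolves to `localhost` is rejected because its origin is not on the list.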
-### Supported Attack Patterns
+### Configuration
-- Command Injection, SQL Injection, Path Traversal, XXE, NoSQL Injection
-- Calculator Injection, Code Execution (Python/JS)
-- Data Exfiltration, Token Theft, Permission Scope
-- Unicode Bypass, Nested Injection, Package Squatting
-- DoS/Resource Exhaustion, Insecure Deserialization
-- Configuration Drift, Tool Shadowing
+The MCP Inspector supports the following configuration settings. To change them, click on the `Configuration` button in the MCP Inspector UI:
-See [Security Patterns Catalog](docs/SECURITY_PATTERNS_CATALOG.md) for complete pattern documentation.
+| Setting                                 | Description                                                                                                                                         | Default |
+| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| `MCP_SERVER_REQUEST_TIMEOUT`            | Client-side timeout (ms) - Inspector will cancel the request if no response is received within this time. Note: servers may have their own timeouts | 300000  |
+| `MCP_REQUEST_TIMEOUT_RESET_ON_PROGRESS` | Reset timeout on progress notifications                                                                                                             | true    |
+| `MCP_REQUEST_MAX_TOTAL_TIMEOUT`         | Maximum total timeout for requests sent to the MCP server (ms) (Use with progress notifications)                                                    | 60000   |
+| `MCP_PROXY_FULL_ADDRESS`                | Set this if you are running the MCP Inspector Proxy on a non-default address. Example: http://10.1.1.22:5577                                        | ""      |
+| `MCP_AUTO_OPEN_ENABLED`                 | Enable automatic browser opening when inspector starts (works with authentication enabled). Only as environment var, not configurable in browser.   | true    |
----
+**Note on Timeouts:** The timeout settings above control when the Inspector (as an MCP client) will cancel requests. These are independent of any server-side timeouts. For example, if a server tool has a 10-minute timeout but the Inspector's timeout is set to 30 seconds, the Inspector will cancel the request after 30 seconds. Conversely, if the Inspector's timeout is 10 minutes but the server times out after 30 seconds, you'll receive the server's timeout error. For tools that require user interaction (like elicitation) or long-running operations, ensure the Inspector's timeout is set appropriately.
-## Testbed Validation
+These settings can be adjusted in real-time through the UI and will persist across sessions.
-The inspector is validated against purpose-built testbed servers with ground-truth labeled tools:
+The inspector also supports configuration files to store settings for different MCP servers. This is useful when working with multiple servers or complex configurations:
 ```bash
-# Test against vulnerable-mcp testbed (10 vulnerable + 6 safe tools)
-npm run assess -- --server vulnerable-mcp --config /tmp/vulnerable-mcp-config.json
-# Results: 200 vulnerabilities detected, 0 false positives (100% precision)
+bunx @bryan-thompson/inspector-assessment --config path/to/config.json --server everything
+```
-# Test against hardened-mcp testbed (same tool names, safe implementations)
-npm run assess -- --server hardened-mcp --config /tmp/hardened-mcp-config.json
-# Results: 0 vulnerabilities (proves behavior-based detection, not name-based)
+Example server configuration file:
+```json
+{
+  "mcpServers": {
+    "everything": {
+      "command": "npx",
+      "args": ["@modelcontextprotocol/server-everything"],
+      "env": {
+        "hello": "Hello MCP!"
+      }
+    },
+    "my-server": {
+      "command": "node",
+      "args": ["build/index.js", "arg1", "arg2"],
+      "env": {
+        "key": "value",
+        "key2": "value2"
+      }
+    }
+  }
+}
 ```
-**Key Insight**: Both servers have tools named `vulnerable_calculator_tool`, `vulnerable_system_exec_tool`, etc. The inspector detects 200 vulnerabilities on one server and 0 on the other - proving pure behavior-based detection, not name-based heuristics.
+#### Transport Types in Config Files
-See [Testbed Setup Guide](docs/TESTBED_SETUP_GUIDE.md) for detailed validation results.
+The inspector automatically detects the transport type from your config file. You can specify different transport types:
----
+**STDIO (default):**
+```json
+{
+  "mcpServers": {
+    "my-stdio-server": {
+      "type": "stdio",
+      "command": "npx",
+      "args": ["@modelcontextprotocol/server-everything"]
+    }
+  }
+}
+```
-## Assessment Output
+**SSE (Server-Sent Events):**
-### JSON Results
+```json
+{
+  "mcpServers": {
+    "my-sse-server": {
+      "type": "sse",
+      "url": "http://localhost:3000/sse"
+    }
+  }
+}
+```
-Every assessment saves results to JSON:
+**Streamable HTTP:**
+```json
+{
+  "mcpServers": {
+    "my-http-server": {
+      "type": "streamable-http",
+      "url": "http://localhost:3000/mcp"
+    }
+  }
+}
+```
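As a rough model of the three entry shapes above, the TypeScript sketch below resolves the transport from the `type` field and falls back to `stdio` when the field is omitted. The `ServerEntry` type and both functions are hypothetical and written for illustration. The checks for `command` and `url` are assumptions about what each transport needs, not a description of the inspector's own validation.

```typescript
type Transport = "stdio" | "sse" | "streamable-http";

// Hypothetical shape of one entry under "mcpServers".
interface ServerEntry {
  type?: Transport;
  command?: string;
  args?: string[];
  url?: string;
}

// STDIO is the default when "type" is omitted.
function detectTransport(entry: ServerEntry): Transport {
  return entry.type ?? "stdio";
}

// Summarizes an entry and flags the field each transport relies on (assumed).
function describeEntry(entry: ServerEntry): string {
  const transport = detectTransport(entry);
  if (transport === "stdio") {
    if (!entry.command) throw new Error('stdio entries need a "command"');
    return `stdio: ${[entry.command, ...(entry.args ?? [])].join(" ")}`;
  }
  if (!entry.url) throw new Error(`${transport} entries need a "url"`);
  return `${transport}: ${entry.url}`;
}

console.log(describeEntry({ command: "npx", args: ["@modelcontextprotocol/server-everything"] }));
console.log(describeEntry({ type: "sse", url: "http://localhost:3000/sse" }));
```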
+#### Default Server Selection
+You can launch the inspector without specifying a server name if your config has:
+1. **A single server** - automatically selected:
 ```bash
-# Default location
-/tmp/inspector-full-assessment-<server-name>.json
+# Automatically uses "my-server" if it's the only one
+bunx @bryan-thompson/inspector-assessment --config mcp.json
+```
+2. **A server named "default-server"** - automatically selected:
-# Custom output
-mcp-assess-full --server my-server --output ./results.json
+```json
+{
+  "mcpServers": {
+    "default-server": {
+      "command": "npx",
+      "args": ["@modelcontextprotocol/server-everything"]
+    },
+    "other-server": {
+      "command": "node",
+      "args": ["other.js"]
+    }
+  }
+}
 ```
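The selection rules above can be summarized in a few lines of TypeScript. This is a sketch under stated assumptions: `selectServer` and `McpConfig` are hypothetical names, and the error messages are invented for illustration. Only the two fallback rules (a single server, or a server named `default-server`) come from the text.

```typescript
// Hypothetical model of a config file; server entries are left opaque.
interface McpConfig {
  mcpServers: Record<string, unknown>;
}

function selectServer(config: McpConfig, requested?: string): string {
  const names = Object.keys(config.mcpServers);
  // An explicit --server value is used as given, if it exists.
  if (requested !== undefined) {
    if (!names.includes(requested)) {
      throw new Error(`Server "${requested}" not found in config`);
    }
    return requested;
  }
  // Rule 1: a single server is selected automatically.
  const [only] = names;
  if (names.length === 1 && only !== undefined) return only;
  // Rule 2: a server named "default-server" is selected automatically.
  if (names.includes("default-server")) return "default-server";
  throw new Error(`Specify a server name; found: ${names.join(", ")}`);
}

console.log(selectServer({ mcpServers: { "my-server": {} } }));
console.log(selectServer({ mcpServers: { "default-server": {}, "other-server": {} } }));
```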
-**Quick Analysis:**
+> **Tip:** You can easily generate this configuration format using the **Server Entry** and **Servers File** buttons in the Inspector UI, as described in the Servers File Export section above.
+You can also set the initial `transport` type, `serverUrl`, `serverCommand`, and `serverArgs` via query params, for example:
+```
+http://localhost:6274/?transport=sse&serverUrl=http://localhost:8787/sse
+http://localhost:6274/?transport=streamable-http&serverUrl=http://localhost:8787/mcp
+http://localhost:6274/?transport=stdio&serverCommand=npx&serverArgs=arg1%20arg2
+```
+You can also set initial config settings via query params, for example:
+```
+http://localhost:6274/?MCP_SERVER_REQUEST_TIMEOUT=60000&MCP_REQUEST_TIMEOUT_RESET_ON_PROGRESS=false&MCP_PROXY_FULL_ADDRESS=http://10.1.1.22:5577
+```
+Note that if both the query param and the corresponding localStorage item are set, the query param will take precedence.
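The precedence rule can be expressed as a small lookup. In the TypeScript sketch below, `resolveSetting` is a hypothetical function and `stored` is a plain object standing in for whatever the inspector keeps in localStorage. The real storage format is not described here, so treat the second lookup as an assumption.

```typescript
// Hypothetical resolver: a query parameter wins over a stored value.
function resolveSetting(
  key: string,
  search: string,
  stored: Record<string, string | undefined>,
): string | undefined {
  const fromQuery = new URLSearchParams(search).get(key);
  // get() returns null when the parameter is absent.
  if (fromQuery !== null) return fromQuery;
  return stored[key];
}

const stored = { MCP_SERVER_REQUEST_TIMEOUT: "300000" };
console.log(resolveSetting("MCP_SERVER_REQUEST_TIMEOUT", "?MCP_SERVER_REQUEST_TIMEOUT=60000", stored));
console.log(resolveSetting("MCP_SERVER_REQUEST_TIMEOUT", "", stored));
```

`URLSearchParams` also decodes percent-encoded values, so `serverArgs=arg1%20arg2` is read back as `arg1 arg2`.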
+### From this repository
+If you're working on the inspector itself:
+Development mode:
 ```bash
-# View overall status
-cat /tmp/inspector-full-assessment-my-server.json | jq '.overallStatus'
+npm run dev
+# To co-develop with the typescript-sdk package (assuming it's cloned in ../typescript-sdk; set MCP_SDK otherwise):
+npm run dev:sdk "cd sdk && npm run examples:simple-server:w"
+# then open http://localhost:3000/mcp as Streamable HTTP in the inspector.
+# To go back to the deployed SDK version:
+#   npm run unlink:sdk && npm i
+```
-# List security vulnerabilities
-cat /tmp/inspector-full-assessment-my-server.json | jq '.modules.security.vulnerabilities'
+> **Note for Windows users:**
+> On Windows, use the following command instead:
+>
+> ```bash
+> npm run dev:windows
+> ```
-# Check broken tools
-cat /tmp/inspector-full-assessment-my-server.json | jq '.modules.functionality.brokenTools'
+Production mode:
-# Get module scores
-cat /tmp/inspector-full-assessment-my-server.json | jq '.moduleSummary'
+```bash
+npm run build
+npm start
 ```
-### Exit Codes
+### CLI Mode
+CLI mode enables programmatic interaction with MCP servers from the command line, ideal for scripting, automation, and integration with coding assistants. This creates an efficient feedback loop for MCP server development.
 ```bash
-mcp-assess-full --server my-server
-echo $?
-# 0 = PASS (all modules passed)
-# 1 = FAIL (vulnerabilities or failures found)
+mcp-inspector-assess-cli node build/index.js
 ```
----
+The CLI mode supports most operations across tools, resources, and prompts. A few examples:
-## Quality Metrics
+```bash
+# Basic usage
+mcp-inspector-assess-cli node build/index.js
+# With config file
+mcp-inspector-assess-cli --config path/to/config.json --server myserver
+# List available tools
+mcp-inspector-assess-cli node build/index.js --method tools/list
+# Call a specific tool
+mcp-inspector-assess-cli node build/index.js --method tools/call --tool-name mytool --tool-arg key=value --tool-arg another=value2
+# Call a tool with JSON arguments
+mcp-inspector-assess-cli node build/index.js --method tools/call --tool-name mytool --tool-arg 'options={"format": "json", "max_tokens": 100}'
+# List available resources
+mcp-inspector-assess-cli node build/index.js --method resources/list
+# List available prompts
+mcp-inspector-assess-cli node build/index.js --method prompts/list
+# Connect to a remote MCP server (default is SSE transport)
+mcp-inspector-assess-cli https://my-mcp-server.example.com
+# Connect to a remote MCP server (with Streamable HTTP transport)
+mcp-inspector-assess-cli https://my-mcp-server.example.com --transport http --method tools/list
+# Connect to a remote MCP server (with custom headers)
+mcp-inspector-assess-cli https://my-mcp-server.example.com --transport http --method tools/list --header "X-API-Key: your-api-key"
+# Call a tool on a remote server
+mcp-inspector-assess-cli https://my-mcp-server.example.com --method tools/call --tool-name remotetool --tool-arg param=value
+# List resources from a remote server
+mcp-inspector-assess-cli https://my-mcp-server.example.com --method resources/list
+```
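The `--tool-arg` examples above mix plain strings with JSON values. One plausible way to interpret such pairs is sketched below in TypeScript: split on the first `=`, try to parse the value as JSON, and fall back to the raw string. This is an assumption made for illustration, and `parseToolArg` is a hypothetical name. Consult the CLI source for the actual parsing rules.

```typescript
// Hypothetical parser for "key=value" pairs such as those passed to --tool-arg.
function parseToolArg(pair: string): [string, unknown] {
  // Split on the first "=" only, since a JSON value may contain "=" itself.
  const idx = pair.indexOf("=");
  if (idx <= 0) throw new Error(`Expected key=value, got "${pair}"`);
  const key = pair.slice(0, idx);
  const raw = pair.slice(idx + 1);
  try {
    // Objects, arrays, numbers, booleans, and null parse as JSON.
    return [key, JSON.parse(raw)];
  } catch {
    // Anything else is kept as a plain string.
    return [key, raw];
  }
}

console.log(parseToolArg("key=value"));
console.log(parseToolArg('options={"format": "json", "max_tokens": 100}'));
```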
-- **Test Coverage**: ~1560 tests passing across 66 test suites
-- **Assessment Module Tests**: 291+ tests validating assessment enhancements
-- **Code Quality**: Production TypeScript types, proper error handling
-- **Upstream Sync**: Up-to-date with v0.18.0
+### Security Testing: Pure Behavior Detection
-**Run tests:**
+The inspector uses **pure behavior-based detection** for security assessment, analyzing tool responses to identify actual code execution vs safe data handling. This approach works on any MCP server without requiring special security metadata.
+**How It Works**:
 ```bash
-npm test                         # All ~1560 tests
-npm test -- assessment           # Assessment module tests
-npm test -- SecurityAssessor     # Security tests
+# Run security assessment against any MCP server
+npm run assess -- --server myserver --config config.json
 ```
----
+**Detection Strategy**:
-## Documentation
+1. **Reflection Detection**: Identifies when tools safely echo malicious input as data
+   - Pattern: "Stored query: ../../../etc/passwd" → SAFE (reflection)
+   - Pattern: "Query results for: ..." → SAFE (search results)
-### Quick Start
+2. **Execution Evidence**: Detects actual code execution
+   - Pattern: Response contains "root:x:0:0" → VULNERABLE (file accessed)
+   - Pattern: Response contains "total 42 drwx" → VULNERABLE (directory listed)
-| Document                                               | Purpose                        |
-| ------------------------------------------------------ | ------------------------------ |
-| [CLI Assessment Guide](docs/CLI_ASSESSMENT_GUIDE.md)   | Complete CLI modes and options |
-| [Architecture & Value](docs/ARCHITECTURE_AND_VALUE.md) | What this provides and why     |
+3. **Category Classification**: Distinguishes safe tool types
+   - Search/retrieval tools return data, not code execution
+   - CRUD operations create resources, not execute code
+   - Safe storage tools treat input as pure data
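The first two steps of the strategy can be illustrated with a toy classifier in TypeScript. It uses only the two example evidence strings listed above. `classifyResponse` and `EXECUTION_EVIDENCE` are hypothetical names, and the package's actual patterns and logic in `SecurityAssessor.ts` are not reproduced here.

```typescript
type Verdict = "SAFE" | "VULNERABLE";

// Example execution-evidence patterns taken from the list above (illustrative).
const EXECUTION_EVIDENCE: RegExp[] = [
  /root:x:0:0/, // contents of /etc/passwd
  /total \d+\s+drwx/, // output of a directory listing
];

function classifyResponse(payload: string, response: string): Verdict {
  // Reflection: remove verbatim echoes of the payload so that echoed input
  // alone never counts as evidence of execution.
  const withoutEcho = payload.length > 0 ? response.split(payload).join("") : response;
  // Execution evidence: whatever remains is checked against known outputs.
  return EXECUTION_EVIDENCE.some((pattern) => pattern.test(withoutEcho))
    ? "VULNERABLE"
    : "SAFE";
}

console.log(classifyResponse("../../../etc/passwd", "Stored query: ../../../etc/passwd"));
console.log(classifyResponse("../../../etc/passwd", "root:x:0:0:root:/root:/bin/bash"));
```

The key design point is that the verdict depends on what the response contains beyond the echoed input, not on the tool's name or on the payload itself.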
-### API & Integration
+**Validation with Testbed**:
-| Document                                                 | Purpose                      |
-| -------------------------------------------------------- | ---------------------------- |
-| [Programmatic API Guide](docs/PROGRAMMATIC_API_GUIDE.md) | AssessmentOrchestrator usage |
-| [API Reference](docs/API_REFERENCE.md)                   | Complete API documentation   |
-| [Integration Guide](docs/INTEGRATION_GUIDE.md)           | CI/CD, multi-server patterns |
+The inspector has been validated against purpose-built testbed servers with ground-truth labeled tools:
-### Assessment Details
+```bash
+# Test against broken-mcp testbed (10 vulnerable + 6 safe tools)
+npm run assess -- --server broken-mcp --config testbed.json
-| Document                                                       | Purpose                              |
-| -------------------------------------------------------------- | ------------------------------------ |
-| [Assessment Catalog](docs/ASSESSMENT_CATALOG.md)               | Complete assessment module reference |
-| [Security Patterns Catalog](docs/SECURITY_PATTERNS_CATALOG.md) | Comprehensive attack patterns        |
-| [Testbed Setup Guide](docs/TESTBED_SETUP_GUIDE.md)             | A/B validation                       |
+# Results: 20 vulnerabilities detected, 0 false positives (100% precision)
+```
-### Advanced Topics
+**Why Behavior Detection Matters**:
-| Document                                                             | Purpose                           |
-| -------------------------------------------------------------------- | --------------------------------- |
-| [Architecture Detection Guide](docs/ARCHITECTURE_DETECTION_GUIDE.md) | Server infrastructure analysis    |
-| [Behavior Inference Guide](docs/BEHAVIOR_INFERENCE_GUIDE.md)         | Tool behavior classification      |
-| [Performance Tuning Guide](docs/PERFORMANCE_TUNING_GUIDE.md)         | Assessment execution optimization |
+Real-world MCP servers don't provide security metadata - the inspector must detect vulnerabilities by analyzing actual tool behavior. Testbed validation proves this approach works reliably.
-For complete documentation, see [docs/README.md](docs/README.md).
+**For Inspector Developers**:
----
+When modifying detection logic, validate against the testbed:
-## Evidence & Validation
+```bash
+# Before changes: Record baseline
+npm run assess -- --server broken-mcp --output /tmp/baseline.json
-All performance claims are backed by implementation analysis.
+# After changes: Verify no regressions
+npm run assess -- --server broken-mcp --output /tmp/after.json
-| Claim                             | Evidence                                                                          |
-| --------------------------------- | --------------------------------------------------------------------------------- |
-| Progressive complexity (2 levels) | [TestScenarioEngine.ts](client/src/services/assessment/TestScenarioEngine.ts)     |
-| Comprehensive security patterns   | [securityPatterns.ts](client/src/lib/securityPatterns.ts)                         |
-| Zero false positives              | [SecurityAssessor.ts](client/src/services/assessment/modules/SecurityAssessor.ts) |
+# Expected: 0 false positives on safe tools
+cat /tmp/after.json | jq '[.security.promptInjectionTests[] | select(.toolName | startswith("safe_")) | select(.vulnerable == true)] | length'
+# Output: 0
+```
----
+See [docs/mcp_vulnerability_testbed.md](docs/mcp_vulnerability_testbed.md) for detailed validation results and testbed usage guide.
-## Contributing
+### UI Mode vs CLI Mode: When to Use Each
-We welcome contributions! See [PROJECT_STATUS.md](PROJECT_STATUS.md) for current development status.
+| Use Case                 | UI Mode                                                                   | CLI Mode                                                                                                                                             |
+| ------------------------ | ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Server Development**   | Visual interface for interactive testing and debugging during development | Scriptable commands for quick testing and continuous integration; creates feedback loops with AI coding assistants like Cursor for rapid development |
+| **Resource Exploration** | Interactive browser with hierarchical navigation and JSON visualization   | Programmatic listing and reading for automation and scripting                                                                                        |
+| **Tool Testing**         | Form-based parameter input with real-time response visualization          | Command-line tool execution with JSON output for scripting                                                                                           |
+| **Prompt Engineering**   | Interactive sampling with streaming responses and visual comparison       | Batch processing of prompts with machine-readable output                                                                                             |
+| **Debugging**            | Request history, visualized errors, and real-time notifications           | Direct JSON output for log analysis and integration with other tools                                                                                 |
+| **Automation**           | N/A                                                                       | Ideal for CI/CD pipelines, batch processing, and integration with coding assistants                                                                  |
+| **Learning MCP**         | Rich visual interface helps new users understand server capabilities      | Simplified commands for focused learning of specific endpoints                                                                                       |
-**Areas of interest:**
+## Tool Input Validation Guidelines
-- Additional security patterns
-- Performance optimizations
-- CI/CD integration examples
-- New assessment modules
+When implementing or modifying tool input parameter handling in the Inspector:
-**Repository**: https://github.com/triepod-ai/inspector-assessment
+- **Omit optional fields with empty values** - When processing form inputs, omit empty strings or null values for optional parameters, UNLESS the field has an explicit default value in the schema that matches the current value
+- **Preserve explicit default values** - If a field schema contains an explicit default (e.g., `default: null`), and the current value matches that default, include it in the request. This is a meaningful value the tool expects
+- **Always include required fields** - Preserve required field values even when empty, allowing the MCP server to validate and return appropriate error messages
+- **Defer deep validation to the server** - Implement basic field presence checking in the Inspector client, but rely on the MCP server for parameter validation according to its schema
----
+These guidelines maintain clean parameter passing and proper separation of concerns between the Inspector client and MCP servers.
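As a minimal sketch of the first three guidelines, the TypeScript function below filters form values before a request is sent. The types and the name `buildToolArguments` are hypothetical and simplified to flat fields. The Inspector's real handling of nested schemas is not shown.

```typescript
type FormValue = string | number | boolean | null;

// Simplified, hypothetical view of a tool input schema.
interface FieldSchema {
  default?: FormValue;
}
interface ToolInputSchema {
  properties: Record<string, FieldSchema | undefined>;
  required?: string[];
}

function buildToolArguments(
  schema: ToolInputSchema,
  values: Record<string, FormValue>,
): Record<string, FormValue> {
  const required = new Set(schema.required ?? []);
  const args: Record<string, FormValue> = {};
  for (const [name, value] of Object.entries(values)) {
    const field: FieldSchema = schema.properties[name] ?? {};
    const isEmpty = value === "" || value === null;
    // An explicit default (even `default: null`) that matches is meaningful.
    const matchesDefault = "default" in field && field.default === value;
    // Required fields are always sent so the server can validate them.
    if (required.has(name) || !isEmpty || matchesDefault) {
      args[name] = value;
    }
  }
  return args;
}

const schema: ToolInputSchema = {
  properties: { query: {}, limit: {}, cursor: { default: null }, note: {} },
  required: ["query"],
};
console.log(buildToolArguments(schema, { query: "", limit: "", cursor: null, note: "hi" }));
```

Here `query` is kept although empty because it is required, `limit` is dropped as an empty optional field, and `cursor` is kept because `null` matches its explicit default.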
-## Links
+## Evidence & Validation
-- **npm Package**: https://www.npmjs.com/package/@bryan-thompson/inspector-assessment
-- **GitHub Repository**: https://github.com/triepod-ai/inspector-assessment
-- **Issues**: https://github.com/triepod-ai/inspector-assessment/issues
-- **MCP Documentation**: https://modelcontextprotocol.io
-- **Changelog**: [CHANGELOG.md](CHANGELOG.md)
+All performance claims in this README are backed by implementation analysis and documented methodology. We maintain transparency about what has been measured versus estimated.
----
+**📋 Complete Validation Report**: See [CLAIMS_VALIDATION.md](CLAIMS_VALIDATION.md) for detailed evidence supporting every claim made in this README.
-## License
+### Validated Claims
-This project is licensed under the MIT License—see the [LICENSE](LICENSE) file for details.
+| Claim                                     | Evidence                                                                                                                                                                    | Type       |
+| ----------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- |
+| Progressive complexity testing (2 levels) | Implementation in [TestScenarioEngine.ts](client/src/services/assessment/TestScenarioEngine.ts)                                                                             | Measured   |
+| 50% faster comprehensive testing          | Analysis in [PHASE1_OPTIMIZATION_COMPLETED.md](docs/PHASE1_OPTIMIZATION_COMPLETED.md) and [COMPREHENSIVE_TESTING_ANALYSIS.md](docs/COMPREHENSIVE_TESTING_ANALYSIS.md)       | Measured   |
+| 8 backend security patterns               | Implementation in [securityPatterns.ts](client/src/lib/securityPatterns.ts) - focused on API security, not LLM behaviors                                                    | Measured   |
+| Zero false positives in security testing  | Context-aware reflection detection in [SecurityAssessor.ts](client/src/services/assessment/modules/SecurityAssessor.ts)                                                     | Validated  |
+| Context-aware test data generation        | Implementation in [TestDataGenerator.ts](client/src/services/assessment/TestDataGenerator.ts)                                                                               | Measured   |
+| MCP error code recognition                | Implementation in [ResponseValidator.ts](client/src/services/assessment/ResponseValidator.ts)                                                                               | Measured   |
+| 80% reduction in false positives          | Analysis in [FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md](docs/FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md#key-problems-addressed)                                    | Estimated  |
+| Business logic error detection            | Implementation in [ResponseValidator.ts](client/src/services/assessment/ResponseValidator.ts) and [PHASE2_OPTIMIZATION_COMPLETED.md](docs/PHASE2_OPTIMIZATION_COMPLETED.md) | Measured   |
+| Conditional boundary testing optimization | Implementation in [TestDataGenerator.ts](client/src/services/assessment/TestDataGenerator.ts) and [PHASE2_OPTIMIZATION_COMPLETED.md](docs/PHASE2_OPTIMIZATION_COMPLETED.md) | Measured   |
+| Taskmanager case study results            | Methodology validation in [ASSESSMENT_METHODOLOGY.md](docs/ASSESSMENT_METHODOLOGY.md)                                                                                       | Case Study |
----
+### Supporting Documentation
-<a id="about-this-fork"></a>
+- **Project Status**: [PROJECT_STATUS.md](PROJECT_STATUS.md) - Current status, recent changes, and development roadmap
+- **Implementation Details**: [FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md](docs/FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md)
+- **Assessment Methodology**: [ASSESSMENT_METHODOLOGY.md](docs/ASSESSMENT_METHODOLOGY.md)
+- **Testing Comparison**: [TESTING_COMPARISON_EXAMPLE.md](docs/TESTING_COMPARISON_EXAMPLE.md)
+- **Error Handling Validation**: [ERROR_HANDLING_VALIDATION_SUMMARY.md](ERROR_HANDLING_VALIDATION_SUMMARY.md)
+- **Optimization Documentation**:
+  - [COMPREHENSIVE_TESTING_ANALYSIS.md](docs/COMPREHENSIVE_TESTING_ANALYSIS.md) - Performance analysis and optimization opportunities
+  - [COMPREHENSIVE_TESTING_OPTIMIZATION_PLAN.md](docs/COMPREHENSIVE_TESTING_OPTIMIZATION_PLAN.md) - 4-phase optimization roadmap
+  - [PHASE1_OPTIMIZATION_COMPLETED.md](docs/PHASE1_OPTIMIZATION_COMPLETED.md) - Progressive complexity optimization (50% faster)
+  - [PHASE2_OPTIMIZATION_COMPLETED.md](docs/PHASE2_OPTIMIZATION_COMPLETED.md) - Business logic error detection and boundary testing
-## Appendix: Fork History & Acknowledgments
+### Reproducibility
-This is an enhanced fork of [Anthropic's MCP Inspector](https://github.com/modelcontextprotocol/inspector) with significantly expanded assessment capabilities.
+All enhancements can be verified by:
-| Repository    | URL                                                |
-| ------------- | -------------------------------------------------- |
-| **Original**  | https://github.com/modelcontextprotocol/inspector  |
-| **This Fork** | https://github.com/triepod-ai/inspector-assessment |
+1. Examining the source code in `client/src/services/assessment/`
+2. Running the test suites in `client/src/services/__tests__/`
+3. Reviewing the methodology documentation in `docs/`
+4. Testing against your own MCP servers using the assessment features
-**Note**: If you want the official Anthropic inspector without assessment features, use:
+## Contributing & Citing This Work
-```bash
-npx @modelcontextprotocol/inspector
+### For Researchers and Developers
+If you use our enhanced MCP Inspector in your research, testing, or MCP server development, please cite this work:
+```
+MCP Inspector - Enhanced Assessment Fork
+https://github.com/triepod-ai/inspector-assessment
+Enhancements: Advanced assessment methodology, progressive complexity testing,
+business logic error detection, and comprehensive security validation.
+Based on Anthropic's MCP Inspector: https://github.com/modelcontextprotocol/inspector
 ```
-### What We Added
+### Documentation
-We built a comprehensive assessment framework on top of the original inspector, transforming it from a debugging tool into a full validation suite. Key additions:
+- **Project Status & Recent Changes**: [PROJECT_STATUS.md](PROJECT_STATUS.md)
+- **Comprehensive Assessment Methodology**: [docs/ASSESSMENT_METHODOLOGY.md](docs/ASSESSMENT_METHODOLOGY.md)
+- **Functionality Test Enhancements**: [docs/FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md](docs/FUNCTIONALITY_TEST_ENHANCEMENTS_IMPLEMENTED.md)
+- **Original MCP Inspector Documentation**: https://modelcontextprotocol.io/docs/tools/inspector
-**18 Assessment Modules** covering functionality, security, compliance (16 active + 2 opt-in)
+### Contributing
-- **Pure Behavior-Based Detection** analyzing responses, not tool names
-- **Zero False Positives** through context-aware reflection detection
-- **CLI-First Workflow** with three specialized commands
+We welcome contributions to our enhanced assessment capabilities! See [PROJECT_STATUS.md](PROJECT_STATUS.md) for current development status and roadmap.
-### Base Inspector Features
+**Areas of particular interest**:
-For documentation on the underlying inspector UI and operational features (Docker, authentication, configuration, transports), see:
+- Additional security injection patterns
+- More sophisticated business logic detection
+- Performance profiling enhancements
+- Integration with CI/CD pipelines
+- Additional assessment visualizations
-- [Base Inspector Guide](docs/BASE_INSPECTOR_GUIDE.md)
-- [Fork History](docs/FORK_HISTORY.md)
-- [Upstream Sync Workflow](docs/UPSTREAM_SYNC_WORKFLOW.md)
+Please submit issues and pull requests to our repository: https://github.com/triepod-ai/inspector-assessment
 ### Acknowledgments
 This project builds upon the excellent foundation provided by Anthropic's MCP Inspector team. We're grateful for their work on the original inspector and the MCP protocol specification.
+## Links
+- **npm Package**: https://www.npmjs.com/package/@bryan-thompson/inspector-assessment
+- **GitHub Repository**: https://github.com/triepod-ai/inspector-assessment
+- **Original MCP Inspector**: https://github.com/modelcontextprotocol/inspector
+- **Issues & Bug Reports**: https://github.com/triepod-ai/inspector-assessment/issues
+- **MCP Documentation**: https://modelcontextprotocol.io
+- **Publishing Guide**: [PUBLISHING_GUIDE.md](PUBLISHING_GUIDE.md)
+- **Changelog**: [CHANGELOG.md](CHANGELOG.md)
+## License
+This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.