PyPI - awslabs.cloudwatch-appsignals-mcp-server - Versions diffs - 0.1.9__tar.gz → 0.1.11__tar.gz - Mend

awslabs.cloudwatch-appsignals-mcp-server 0.1.9__tar.gz → 0.1.11__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/Dockerfile RENAMED

@@ -13,7 +13,7 @@
 # limitations under the License.
 # dependabot should continue to update this to the latest hash.
-FROM public.ecr.aws/docker/library/python:3.13.5-alpine3.21@sha256:c9a09c45a4bcc618c7f7128585b8dd0d41d0c31a8a107db4c8255ffe0b69375d AS uv
+FROM public.ecr.aws/docker/library/python:3.13-alpine@sha256:070342a0cc1011532c0e69972cce2bbc6cc633eba294bae1d12abea8bd05303b AS uv
 # Install the project into `/app`
 WORKDIR /app
@@ -61,7 +61,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
 # Make the directory just in case it doesn't exist
 RUN mkdir -p /root/.local
-FROM public.ecr.aws/docker/library/python:3.13.5-alpine3.21@sha256:c9a09c45a4bcc618c7f7128585b8dd0d41d0c31a8a107db4c8255ffe0b69375d
+FROM public.ecr.aws/docker/library/python:3.13-alpine@sha256:070342a0cc1011532c0e69972cce2bbc6cc633eba294bae1d12abea8bd05303b
 # Place executables in the environment at the front of the path and include other binaries
 ENV PATH="/app/.venv/bin:$PATH" \

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: awslabs.cloudwatch-appsignals-mcp-server
-Version: 0.1.9
+Version: 0.1.11
 Summary: An AWS Labs Model Context Protocol (MCP) server for AWS Application Signals
 Project-URL: Homepage, https://awslabs.github.io/mcp/
 Project-URL: Documentation, https://awslabs.github.io/mcp/servers/cloudwatch-appsignals-mcp-server/
@@ -181,7 +181,35 @@ FILTER attributes.aws.local.service = "payment-service" and attributes.aws.local
 - `duration > 5` - Find slow requests (over 5 seconds)
 - `annotation[aws.local.operation]="GET /api/orders"` - Filter by specific operation
-#### 12. **`list_slis`** - Legacy SLI Status Report (Specialized Tool)
+#### 12. **`analyze_canary_failures`** - Comprehensive Canary Failure Analysis
+**Deep dive into CloudWatch Synthetics canary failures with root cause identification**
+- Comprehensive canary failure analysis with deep dive into issues
+- Analyze historical patterns and specific incident details
+- Get comprehensive artifact analysis including logs, screenshots, and HAR files
+- Receive actionable recommendations based on AWS debugging methodology
+- Correlate canary failures with Application Signals telemetry data
+- Identify performance degradation and availability issues across service dependencies
+**Key Features:**
+- **Failure Pattern Analysis**: Identifies recurring failure modes and temporal patterns
+- **Artifact Deep Dive**: Analyzes canary logs, screenshots, and network traces for root causes
+- **Service Correlation**: Links canary failures to upstream/downstream service issues using Application Signals
+- **Performance Insights**: Detects latency spikes, fault rates, and connection issues
+- **Actionable Remediation**: Provides specific steps based on AWS operational best practices
+- **IAM Analysis**: Validates IAM roles and permissions for common canary access issues
+- **Backend Service Integration**: Correlates canary failures with backend service errors and exceptions
+**Common Use Cases:**
+- Incident Response: Rapid diagnosis of canary failures during outages
+- Performance Investigation: Understanding latency and availability degradation
+- Dependency Analysis: Identifying which services are causing canary failures
+- Historical Trending: Analyzing failure patterns over time for proactive improvements
+- Root Cause Analysis: Deep dive into specific failure scenarios with full context
+- Infrastructure Issues: Diagnose S3 access, VPC connectivity, and browser target problems
+- Backend Service Debugging: Identify application code issues affecting canary success
+#### 13. **`list_slis`** - Legacy SLI Status Report (Specialized Tool)
 **Use `audit_services()` as the PRIMARY tool for service auditing**
 - Basic report showing summary counts (total, healthy, breached, insufficient data)
@@ -526,6 +554,68 @@ Found 8 services being monitored:
 3. Review payment-api-service timeout configurations
 ```
+### Example 6: Canary Failure Analysis and Root Cause Investigation
+```
+User: "My Pet Clinic canaries are failing. Can you help me diagnose the issues?"
+Assistant: I'll analyze your Pet Clinic canaries to identify the root causes of failures.
+[Uses analyze_canary_failures for comprehensive canary analysis]
+analyze_canary_failures(canary_name="pc-visit-vet")
+analyze_canary_failures(canary_name="pc-add-visit")
+analyze_canary_failures(canary_name="webapp-erorrpagecanary")
+🔍 CANARY FAILURE ANALYSIS RESULTS:
+🔴 CRITICAL ISSUES IDENTIFIED:
+**pc-visit-vet canary:**
+• Root Cause: S3 bucket access issue
+• Error Pattern: Exit status 127, "No such file or directory"
+• Failure Count: 5 consecutive failures
+• IAM Analysis: ✅ Role exists but S3 bucket ARN patterns incorrect in policies
+**pc-add-visit canary:**
+• Root Cause: Selector timeout + backend service errors
+• Error Pattern: 30000ms timeout waiting for UI element + MissingFormatArgumentException
+• Backend Issue: Format specifier '% o' error in BedrockRuntimeV1Service.invokeTitanModel()
+• Performance: 34 second average response time, 0% success rate
+**webapp-erorrpagecanary:**
+• Root Cause: Browser target close during selector wait
+• Error Pattern: "Target closed" waiting for `#jsError` selector
+• Failure Count: 5 consecutive failures with 60000ms connection timeouts
+🔍 BACKEND SERVICE CORRELATION:
+• MissingFormatArgumentException detected in Pet Clinic backend
+• Location: org.springframework.samples.petclinic.customers.aws.BedrockRuntimeV1Service.invokeTitanModel (line 75)
+• Impact: Affects multiple canaries testing Pet Clinic functionality
+• 20% fault rate on GET /api/customer/diagnose/owners/{ownerId}/pets/{petId}
+🛠️ RECOMMENDED ACTIONS:
+**Immediate (Critical):**
+1. Fix S3 bucket ARN patterns in pc-visit-vet IAM policy
+2. Fix format string bug in BedrockRuntimeV1Service: change '% o' to '%s' or correct format
+3. Add VPC permissions to canary IAM roles if Lambda runs in VPC
+**Infrastructure (High Priority):**
+4. Investigate browser target stability issues (webapp-erorrpagecanary)
+5. Review canary timeout configurations - consider increasing from 30s to 60s
+6. Implement circuit breaker pattern for external service dependencies
+**Monitoring (Medium Priority):**
+7. Add Application Signals monitoring for canary success rates
+8. Set up alerts for consecutive canary failures (>3 failures)
+9. Implement canary health dashboard with real-time status
+🎯 EXPECTED OUTCOMES:
+• S3 access fix: Immediate resolution of pc-visit-vet failures
+• Backend service fix: 80%+ improvement in Pet Clinic canary success rates
+• Infrastructure improvements: Reduced browser target close errors
+• Enhanced monitoring: Proactive failure detection and faster resolution
+```
 ## Recommended Workflows
 ### 🎯 Primary Audit Workflow (Most Common)

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/README.md RENAMED

@@ -151,7 +151,35 @@ FILTER attributes.aws.local.service = "payment-service" and attributes.aws.local
 - `duration > 5` - Find slow requests (over 5 seconds)
 - `annotation[aws.local.operation]="GET /api/orders"` - Filter by specific operation
-#### 12. **`list_slis`** - Legacy SLI Status Report (Specialized Tool)
+#### 12. **`analyze_canary_failures`** - Comprehensive Canary Failure Analysis
+**Deep dive into CloudWatch Synthetics canary failures with root cause identification**
+- Comprehensive canary failure analysis with deep dive into issues
+- Analyze historical patterns and specific incident details
+- Get comprehensive artifact analysis including logs, screenshots, and HAR files
+- Receive actionable recommendations based on AWS debugging methodology
+- Correlate canary failures with Application Signals telemetry data
+- Identify performance degradation and availability issues across service dependencies
+**Key Features:**
+- **Failure Pattern Analysis**: Identifies recurring failure modes and temporal patterns
+- **Artifact Deep Dive**: Analyzes canary logs, screenshots, and network traces for root causes
+- **Service Correlation**: Links canary failures to upstream/downstream service issues using Application Signals
+- **Performance Insights**: Detects latency spikes, fault rates, and connection issues
+- **Actionable Remediation**: Provides specific steps based on AWS operational best practices
+- **IAM Analysis**: Validates IAM roles and permissions for common canary access issues
+- **Backend Service Integration**: Correlates canary failures with backend service errors and exceptions
+**Common Use Cases:**
+- Incident Response: Rapid diagnosis of canary failures during outages
+- Performance Investigation: Understanding latency and availability degradation
+- Dependency Analysis: Identifying which services are causing canary failures
+- Historical Trending: Analyzing failure patterns over time for proactive improvements
+- Root Cause Analysis: Deep dive into specific failure scenarios with full context
+- Infrastructure Issues: Diagnose S3 access, VPC connectivity, and browser target problems
+- Backend Service Debugging: Identify application code issues affecting canary success
+#### 13. **`list_slis`** - Legacy SLI Status Report (Specialized Tool)
 **Use `audit_services()` as the PRIMARY tool for service auditing**
 - Basic report showing summary counts (total, healthy, breached, insufficient data)
@@ -496,6 +524,68 @@ Found 8 services being monitored:
 3. Review payment-api-service timeout configurations
 ```
+### Example 6: Canary Failure Analysis and Root Cause Investigation
+```
+User: "My Pet Clinic canaries are failing. Can you help me diagnose the issues?"
+Assistant: I'll analyze your Pet Clinic canaries to identify the root causes of failures.
+[Uses analyze_canary_failures for comprehensive canary analysis]
+analyze_canary_failures(canary_name="pc-visit-vet")
+analyze_canary_failures(canary_name="pc-add-visit")
+analyze_canary_failures(canary_name="webapp-erorrpagecanary")
+🔍 CANARY FAILURE ANALYSIS RESULTS:
+🔴 CRITICAL ISSUES IDENTIFIED:
+**pc-visit-vet canary:**
+• Root Cause: S3 bucket access issue
+• Error Pattern: Exit status 127, "No such file or directory"
+• Failure Count: 5 consecutive failures
+• IAM Analysis: ✅ Role exists but S3 bucket ARN patterns incorrect in policies
+**pc-add-visit canary:**
+• Root Cause: Selector timeout + backend service errors
+• Error Pattern: 30000ms timeout waiting for UI element + MissingFormatArgumentException
+• Backend Issue: Format specifier '% o' error in BedrockRuntimeV1Service.invokeTitanModel()
+• Performance: 34 second average response time, 0% success rate
+**webapp-erorrpagecanary:**
+• Root Cause: Browser target close during selector wait
+• Error Pattern: "Target closed" waiting for `#jsError` selector
+• Failure Count: 5 consecutive failures with 60000ms connection timeouts
+🔍 BACKEND SERVICE CORRELATION:
+• MissingFormatArgumentException detected in Pet Clinic backend
+• Location: org.springframework.samples.petclinic.customers.aws.BedrockRuntimeV1Service.invokeTitanModel (line 75)
+• Impact: Affects multiple canaries testing Pet Clinic functionality
+• 20% fault rate on GET /api/customer/diagnose/owners/{ownerId}/pets/{petId}
+🛠️ RECOMMENDED ACTIONS:
+**Immediate (Critical):**
+1. Fix S3 bucket ARN patterns in pc-visit-vet IAM policy
+2. Fix format string bug in BedrockRuntimeV1Service: change '% o' to '%s' or correct format
+3. Add VPC permissions to canary IAM roles if Lambda runs in VPC
+**Infrastructure (High Priority):**
+4. Investigate browser target stability issues (webapp-erorrpagecanary)
+5. Review canary timeout configurations - consider increasing from 30s to 60s
+6. Implement circuit breaker pattern for external service dependencies
+**Monitoring (Medium Priority):**
+7. Add Application Signals monitoring for canary success rates
+8. Set up alerts for consecutive canary failures (>3 failures)
+9. Implement canary health dashboard with real-time status
+🎯 EXPECTED OUTCOMES:
+• S3 access fix: Immediate resolution of pc-visit-vet failures
+• Backend service fix: 80%+ improvement in Pet Clinic canary success rates
+• Infrastructure improvements: Reduced browser target close errors
+• Enhanced monitoring: Proactive failure detection and faster resolution
+```
 ## Recommended Workflows
 ### 🎯 Primary Audit Workflow (Most Common)

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/awslabs/cloudwatch_appsignals_mcp_server/__init__.py RENAMED

@@ -14,4 +14,4 @@
 """AWS Application Signals MCP Server."""
-__version__ = '0.1.9'
+__version__ = '0.1.11'

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/awslabs/cloudwatch_appsignals_mcp_server/audit_utils.py RENAMED

@@ -654,7 +654,12 @@ def expand_service_operation_wildcard_patterns(
                                 # Check if this operation has the required metric type
                                 metric_refs = operation.get('MetricReferences', [])
                                 has_metric_type = any(
-                                    ref.get('MetricType', '') == metric_type for ref in metric_refs
+                                    ref.get('MetricType', '') == metric_type
+                                    or (
+                                        metric_type == 'Availability'
+                                        and ref.get('MetricType', '') == 'Fault'
+                                    )
+                                    for ref in metric_refs
                                 )
                                 if has_metric_type:
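
Reviewer note on the hunk above: the predicate now treats a requested `Availability` metric type as satisfied by an operation that exposes a `Fault` metric reference. The following is a minimal standalone sketch of that check, not the package's actual function. The helper name `has_metric_type` as a function and the sample `operation` record are assumptions made for illustration; only the `MetricReferences` and `MetricType` keys come from the diff.

```python
def has_metric_type(metric_refs, metric_type):
    """Return True if any metric reference satisfies the requested type.

    Sketch of the 0.1.11 predicate: an 'Availability' request is also
    satisfied by a reference whose MetricType is 'Fault'.
    """
    return any(
        ref.get('MetricType', '') == metric_type
        or (metric_type == 'Availability' and ref.get('MetricType', '') == 'Fault')
        for ref in metric_refs
    )


# Hypothetical operation record, shaped like the 'MetricReferences' list in the hunk
operation = {'MetricReferences': [{'MetricType': 'Latency'}, {'MetricType': 'Fault'}]}
refs = operation.get('MetricReferences', [])

print(has_metric_type(refs, 'Availability'))  # True: matched through the 'Fault' reference
print(has_metric_type(refs, 'Fault'))         # True: exact match
print(has_metric_type(refs, 'Error'))         # False: no matching reference
```

The fallback is one-directional: a `Fault` request is not satisfied by an `Availability` reference.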

{awslabs_cloudwatch_appsignals_mcp_server-0.1.9 → awslabs_cloudwatch_appsignals_mcp_server-0.1.11}/awslabs/cloudwatch_appsignals_mcp_server/aws_clients.py RENAMED

@@ -35,6 +35,7 @@ def _initialize_aws_clients():
     logs_endpoint = os.environ.get('MCP_LOGS_ENDPOINT')
     cloudwatch_endpoint = os.environ.get('MCP_CLOUDWATCH_ENDPOINT')
     xray_endpoint = os.environ.get('MCP_XRAY_ENDPOINT')
+    synthetics_endpoint = os.environ.get('MCP_SYNTHETICS_ENDPOINT')
     # Log endpoint overrides
     if appsignals_endpoint:
@@ -45,6 +46,8 @@ def _initialize_aws_clients():
         logger.debug(f'Using CloudWatch endpoint override: {cloudwatch_endpoint}')
     if xray_endpoint:
         logger.debug(f'Using X-Ray endpoint override: {xray_endpoint}')
+    if synthetics_endpoint:
+        logger.debug(f'Using Synthetics endpoint override: {synthetics_endpoint}')
     # Check for AWS_PROFILE environment variable
     if aws_profile := os.environ.get('AWS_PROFILE'):
@@ -59,6 +62,11 @@ def _initialize_aws_clients():
         )
         cloudwatch = session.client('cloudwatch', config=config, endpoint_url=cloudwatch_endpoint)
         xray = session.client('xray', config=config, endpoint_url=xray_endpoint)
+        synthetics = session.client('synthetics', config=config, endpoint_url=synthetics_endpoint)
+        s3 = session.client('s3', config=config)
+        iam = session.client('iam', config=config)
+        lambda_client = session.client('lambda', config=config)
+        sts = session.client('sts', config=config)
     else:
         logs = boto3.client(
             'logs', region_name=AWS_REGION, config=config, endpoint_url=logs_endpoint
@@ -75,14 +83,32 @@ def _initialize_aws_clients():
         xray = boto3.client(
             'xray', region_name=AWS_REGION, config=config, endpoint_url=xray_endpoint
         )
+        # Additional clients for canary functionality
+        synthetics = boto3.client(
+            'synthetics', region_name=AWS_REGION, config=config, endpoint_url=synthetics_endpoint
+        )
+        s3 = boto3.client('s3', region_name=AWS_REGION, config=config)
+        iam = boto3.client('iam', region_name=AWS_REGION, config=config)
+        lambda_client = boto3.client('lambda', region_name=AWS_REGION, config=config)
+        sts = boto3.client('sts', region_name=AWS_REGION, config=config)
     logger.debug('AWS clients initialized successfully')
-    return logs, appsignals, cloudwatch, xray
+    return logs, appsignals, cloudwatch, xray, synthetics, s3, iam, lambda_client, sts
 # Initialize clients at module level
 try:
-    logs_client, appsignals_client, cloudwatch_client, xray_client = _initialize_aws_clients()
+    (
+        logs_client,
+        appsignals_client,
+        cloudwatch_client,
+        xray_client,
+        synthetics_client,
+        s3_client,
+        iam_client,
+        lambda_client,
+        sts_client,
+    ) = _initialize_aws_clients()
 except Exception as e:
     logger.error(f'Failed to initialize AWS clients: {str(e)}')
     raise
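
Reviewer note on the hunk above: `_initialize_aws_clients()` now returns nine values instead of four. The module-level assignment is updated in the same diff, so the package itself stays consistent, but any other code that unpacks the result into four names would fail. The sketch below illustrates that with stand-in functions; `init_clients_v0_1_9` and `init_clients_v0_1_11` are hypothetical names, and plain strings stand in for the boto3 clients.

```python
def init_clients_v0_1_9():
    # Stand-in for the 0.1.9 return shape: four clients
    return 'logs', 'appsignals', 'cloudwatch', 'xray'


def init_clients_v0_1_11():
    # Stand-in for the 0.1.11 return shape: nine clients, in the order shown in the diff
    return (
        'logs', 'appsignals', 'cloudwatch', 'xray',
        'synthetics', 's3', 'iam', 'lambda', 'sts',
    )


# Four-name unpacking matches the old shape
logs, appsignals, cloudwatch, xray = init_clients_v0_1_9()

# The same unpacking against the new shape raises ValueError
unpack_error = None
try:
    logs, appsignals, cloudwatch, xray = init_clients_v0_1_11()
except ValueError as exc:
    unpack_error = str(exc)
    print(f'Unpacking failed: {unpack_error}')

# Starred unpacking tolerates the additional trailing clients
logs, appsignals, cloudwatch, xray, *extra = init_clients_v0_1_11()
print(extra)  # the five clients added for canary functionality
```

Because the function is underscore-prefixed, it is presumably private to the package, so this mainly concerns tests or forks that call it directly.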
