@gonzih/skills-devops 1.1.0 → 1.2.0

package/package.json CHANGED
@@ -1 +1 @@
- {"name":"@gonzih/skills-devops","version":"1.1.0","description":"DevOps skills for Claude Code","type":"module","scripts":{"postinstall":"node install.js"},"files":["skills/","install.js","README.md"],"keywords":["claude","mcp","skills","devops"],"license":"MIT"}
+ {"name":"@gonzih/skills-devops","version":"1.2.0","description":"DevOps skills for Claude Code","type":"module","scripts":{"postinstall":"node install.js"},"files":["skills/","install.js","README.md"],"keywords":["claude","mcp","skills","devops"],"license":"MIT"}
@@ -20,6 +20,12 @@ Analyze a set of incoming monitoring alerts, determine severity and urgency, ide
  4. Identify the root cause alert and downstream effects
  5. Output a triage summary: what to act on now, what to monitor, what to silence
 
+ ## Live Data Sources
+ - **Prometheus alerting rule examples**: `https://github.com/samber/awesome-prometheus-alerts` — community-maintained library of Prometheus alerting rules by category (infra, databases, Kubernetes, etc.)
+ - **AWS status page API**: `https://health.aws.amazon.com/health/status` — current AWS service health; programmatic access via AWS Health API (`aws health describe-events`)
+ - **GCP status page API**: `https://status.cloud.google.com/incidents.json` — machine-readable GCP incident feed
+ - **Azure status API**: `https://azure.status.microsoft/api/v2/status.json` — Azure service health JSON feed
+
  ## Example
  User: "I have 47 alerts firing: disk full on db-01, high latency on api-gateway, 5xx spike on checkout"
  → Identifies disk-full on db-01 as root cause driving the cascade, recommends clearing disk space first, provides ordered action plan.
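As a rough illustration of how the machine-readable GCP incident feed added in this hunk could support triage, the sketch below filters a decoded feed for incidents that still look open. The field names (`end`, `service_name`, `severity`, `external_desc`) and the sample records are assumptions made for illustration, not something the package defines; verify them against the live feed before relying on them.

```python
import json
import urllib.request

GCP_INCIDENTS_URL = "https://status.cloud.google.com/incidents.json"


def fetch_incidents(url=GCP_INCIDENTS_URL, timeout=10):
    """Download and decode the incident feed (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def open_incidents(incidents):
    """Return incidents with no end timestamp, i.e. presumed ongoing.

    Assumption: a missing or empty "end" field means the incident
    has not been resolved yet.
    """
    return [inc for inc in incidents if not inc.get("end")]


def summarize(incident):
    """Build a one-line summary from the assumed field names."""
    service = incident.get("service_name", "unknown service")
    severity = incident.get("severity", "unknown")
    desc = incident.get("external_desc", "").strip()
    return f"[{severity}] {service}: {desc}"


# Hypothetical sample records shaped like the feed, used instead of a live call.
SAMPLE = [
    {
        "service_name": "Cloud SQL",
        "severity": "high",
        "external_desc": "Elevated error rates",
        "begin": "2024-01-01T14:32:00+00:00",
    },
    {
        "service_name": "Cloud Storage",
        "severity": "low",
        "external_desc": "Delayed metrics",
        "begin": "2024-01-01T09:00:00+00:00",
        "end": "2024-01-01T10:15:00+00:00",
    },
]

ongoing = open_incidents(SAMPLE)
summaries = [summarize(inc) for inc in ongoing]
```

Swapping `SAMPLE` for `fetch_incidents()` would run the same filter against the live feed. Comparing provider-side incidents with firing alerts is one way to tell an upstream outage apart from a local root cause.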
@@ -29,6 +29,11 @@ When the pipeline deploys infrastructure with Pulumi, check for:
  - **Config missing**: stack config not committed or secret not set — run `pulumi config set <key>`
  - **Passphrase prompt**: `PULUMI_CONFIG_PASSPHRASE` env var missing for self-managed backends
 
+ ## Live Data Sources
+ - **GitHub Actions API**: `https://api.github.com/repos/{owner}/{repo}/actions/runs` — fetch recent workflow run status, logs, and failure reasons
+ - **Docker Hub vulnerability feeds**: `https://hub.docker.com/v2/repositories/{namespace}/{repo}/tags` — check image tag availability; cross-reference CVEs via Docker Scout or Snyk
+ - **npm audit patterns**: Run `npm audit --json` for dependency vulnerability data; common failure patterns include `EAUDITNOPJSON`, peer conflict errors, and lock file drift
+
  ## Example
  User: "My GitHub Actions deploy job is failing with 'Error: Unable to locate executable file: docker'"
  → Diagnose missing Docker setup step, suggest `docker/setup-buildx-action`, apply to workflow YAML.
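To make the GitHub Actions entry added in this hunk more concrete, here is a minimal sketch of listing failed runs from the workflow runs endpoint. It assumes the response is a JSON object with a `workflow_runs` array whose items carry `status` and `conclusion`; the helper names, the `token` parameter, and the sample payload are hypothetical. The run list itself reports status and conclusion only, so logs would have to come from a separate per-run request.

```python
import json
import urllib.request


def runs_url(owner, repo):
    """Build the workflow runs URL for a repository."""
    return f"https://api.github.com/repos/{owner}/{repo}/actions/runs"


def fetch_runs(owner, repo, token=None, timeout=10):
    """Fetch recent workflow runs (requires network access)."""
    req = urllib.request.Request(
        runs_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        # A token is needed for private repositories and raises rate limits.
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def failed_runs(payload):
    """Return completed runs whose conclusion is "failure"."""
    return [
        run
        for run in payload.get("workflow_runs", [])
        if run.get("status") == "completed" and run.get("conclusion") == "failure"
    ]


# Hypothetical payload shaped like the API response, used instead of a live call.
SAMPLE_PAYLOAD = {
    "workflow_runs": [
        {"id": 1, "name": "deploy", "status": "completed", "conclusion": "failure"},
        {"id": 2, "name": "deploy", "status": "completed", "conclusion": "success"},
        {"id": 3, "name": "test", "status": "in_progress", "conclusion": None},
    ]
}

failed_ids = [run["id"] for run in failed_runs(SAMPLE_PAYLOAD)]
```

Calling `failed_runs(fetch_runs("<owner>", "<repo>"))` would apply the same filter to live data.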
@@ -27,6 +27,11 @@ When the incident involves infrastructure managed by Pulumi:
  2. Re-deploy the previous version: check out the corresponding commit and run `pulumi up`, or use `pulumi up --target-replace <urn>` for surgical replacement of a single resource
  3. Use `pulumi refresh` after rollback to confirm state matches real infrastructure
 
+ ## Live Data Sources
+ - **AWS Service Health Dashboard API**: `https://health.aws.amazon.com/health/status` — real-time AWS service events; use `aws health describe-events --filter eventTypeCategories=issue` for programmatic querying
+ - **GCP status API**: `https://status.cloud.google.com/incidents.json` — current and historical GCP incidents in JSON format
+ - **PagerDuty webhook patterns**: Inbound webhooks deliver `incident.trigger`, `incident.acknowledge`, `incident.resolve` payloads — use these to auto-populate incident timelines and sync state
+
  ## Example
  User: "Production database is down, started 14:32 UTC, ~10k users affected"
  → Declares SEV1, drafts initial stakeholder update, starts timeline, prompts for on-call contacts, and guides through mitigation steps to resolution and postmortem.
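The PagerDuty entry added in this hunk suggests turning webhook events into an incident timeline. A minimal sketch of that idea follows. The payload shape is an assumption: each message is taken to be a dict with an `event` name, an ISO 8601 `created_on` timestamp, and a nested `incident` object with an `id`. The incident ID and timestamps in the sample are made up.

```python
from datetime import datetime, timezone

# Map the event names listed in the diff to timeline labels.
EVENT_LABELS = {
    "incident.trigger": "triggered",
    "incident.acknowledge": "acknowledged",
    "incident.resolve": "resolved",
}


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing "Z", and convert to UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00")).astimezone(timezone.utc)


def build_timeline(messages):
    """Turn webhook messages into chronologically ordered timeline lines."""
    entries = []
    for msg in messages:
        label = EVENT_LABELS.get(msg.get("event"))
        if label is None:
            continue  # ignore event types the timeline does not track
        incident_id = msg.get("incident", {}).get("id", "unknown")
        entries.append((parse_ts(msg["created_on"]), label, incident_id))
    entries.sort(key=lambda entry: entry[0])
    return [f"{ts:%H:%M} UTC {label} ({iid})" for ts, label, iid in entries]


# Hypothetical messages, deliberately out of order to show the sorting.
SAMPLE_MESSAGES = [
    {"event": "incident.resolve", "created_on": "2024-01-01T15:10:00Z", "incident": {"id": "PABC123"}},
    {"event": "incident.trigger", "created_on": "2024-01-01T14:32:00Z", "incident": {"id": "PABC123"}},
    {"event": "incident.acknowledge", "created_on": "2024-01-01T14:35:00Z", "incident": {"id": "PABC123"}},
    {"event": "incident.annotate", "created_on": "2024-01-01T14:40:00Z", "incident": {"id": "PABC123"}},
]

timeline = build_timeline(SAMPLE_MESSAGES)
```

Sorting by timestamp rather than arrival order matters here, since webhook deliveries may be retried or arrive out of sequence.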
@@ -26,6 +26,11 @@ When the runbook covers an infrastructure change managed by Pulumi, include:
  - Link to the stack in Pulumi Cloud (e.g. `https://app.pulumi.com/<org>/<project>/<stack>`)
  - Rollback step using `pulumi stack history` to identify the previous deployment and re-run it
 
+ ## Live Data Sources
+ - **AWS Runbook templates**: `https://docs.aws.amazon.com/systems-manager/latest/userguide/automation-documents.html` — official AWS Systems Manager Automation runbook library
+ - **PagerDuty runbook community**: `https://community.pagerduty.com` — real-world runbook examples shared by practitioners
+ - **SRE Workbook public patterns**: `https://sre.google/workbook/table-of-contents/` — Google SRE Workbook chapters on on-call and operational procedures
+
  ## Example
  User: "Write a runbook for restarting the payment-service in Kubernetes"
  → Produces a runbook covering health checks, drain, rolling restart, verification, and rollback with `kubectl` commands.