npm - @sentry/junior-datadog - Versions diffs - 0.39.0 → 0.40.0 - Mend

@sentry/junior-datadog 0.39.0 → 0.40.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/README.md +29 -12
package/package.json +1 -1
package/plugin.yaml +49 -26
package/skills/datadog/SKILL.md +25 -23
package/skills/datadog/references/api-surface.md +43 -45
package/skills/datadog/references/common-use-cases.md +40 -35
package/skills/datadog/references/query-syntax.md +54 -36
package/skills/datadog/references/troubleshooting-workarounds.md +21 -12

package/README.md CHANGED

@@ -1,11 +1,6 @@
 # @sentry/junior-datadog
-> [!WARNING]
-> **This plugin does not currently work.** Datadog's hosted MCP server requires OAuth Dynamic Client Registration (DCR, [RFC 7591](https://www.rfc-editor.org/rfc/rfc7591)) for third-party clients like Junior, and DCR is locked down on Datadog's side. Until Datadog exposes DCR (or an equivalent registration path) on `mcp.datadoghq.com`, Junior cannot complete the OAuth handshake and every Datadog tool call will fail.
->
-> The package is kept in-tree so the integration is ready to ship the moment Datadog unblocks DCR. Do not add it to a production deployment in the meantime.
-`@sentry/junior-datadog` adds read-only Datadog telemetry workflows to Junior through Datadog's hosted MCP server.
+`@sentry/junior-datadog` adds read-only Datadog telemetry workflows to Junior through Datadog's Pup CLI.
 Install it alongside `@sentry/junior`:
@@ -21,13 +16,35 @@ juniorNitro({
 });
 ```
-This package does not use `DD_API_KEY`, `DD_APP_KEY`, or a shared workspace integration. Each user connects their own Datadog account the first time Junior calls a Datadog MCP tool. Junior sends the OAuth link privately and resumes the thread automatically after the user authorizes.
+Set Datadog credentials in the Junior deployment environment:
+```bash
+DATADOG_API_KEY=...
+DATADOG_APP_KEY=...
+DATADOG_SITE=datadoghq.com # optional; defaults to US1
+```
+Use `DATADOG_API_KEY`, `DATADOG_APP_KEY`, and `DATADOG_SITE` in the Junior deployment environment. The plugin maps those host-side `DATADOG_*` values to Datadog API headers and Pup's sandbox `DD_*` env values.
+The real API and application keys stay host-side. Junior injects them into matching Datadog API requests as `DD-API-KEY` and `DD-APPLICATION-KEY` headers; the sandbox only receives non-secret placeholder values so Pup can perform its normal auth checks.
-Junior intentionally keeps this package read-only by limiting the MCP tool surface to search, fetch, and log analytics tools. The plugin does not expose notebook writes, monitor edits, or other mutating Datadog tools.
+Junior keeps this package read-only by setting Pup's read-only mode and by guiding the skill to use `pup --read-only --agent` commands. The plugin is intended for searches, fetches, and analytics across logs, metrics, traces/spans, monitors, incidents, dashboards, hosts, services, and RUM.
 ## Datadog site
-The packaged manifest defaults to the US1 endpoint (`mcp.datadoghq.com`) and enables the `core`, `apm`, and `error-tracking` toolsets. Teams on other Datadog sites (US3, US5, EU, AP1, AP2, GovCloud) set `DATADOG_SITE` in their Junior deployment env to their site host (e.g. `us5.datadoghq.com`, `datadoghq.eu`, `ddog-gov.com`). No code changes or plugin copy needed. See the [Datadog plugin docs](https://junior.sentry.dev/extend/datadog-plugin/) for the full site table.
+The packaged manifest defaults to the US1 API endpoint. Teams on other Datadog sites set `DATADOG_SITE` in their Junior deployment env to their site host. Setting deployment `DD_SITE` alone has no effect.
+| Datadog site | `DATADOG_SITE` value                 |
+| ------------ | ------------------------------------ |
+| US1          | _unset_ (default) or `datadoghq.com` |
+| US3          | `us3.datadoghq.com`                  |
+| US5          | `us5.datadoghq.com`                  |
+| EU           | `datadoghq.eu`                       |
+| AP1          | `ap1.datadoghq.com`                  |
+| AP2          | `ap2.datadoghq.com`                  |
+| GovCloud     | `ddog-gov.com`                       |
+The packaged API allowlist covers those standard Datadog sites. Custom or staging Datadog domains require a manifest change so the sandbox network header transform is allowed for that host.
 ## Optional channel defaults
@@ -42,8 +59,8 @@ These defaults are optional fallbacks. If a user names a different env or servic
 ## Auth model
-- Datadog MCP requires user-based OAuth (OAuth 2.1 + PKCE) and does not accept shared bearer tokens here.
-- This package is not suitable for fully headless or unattended automation.
-- Users can disconnect from Junior App Home with `Unlink`, or by asking Junior to disconnect Datadog.
+- This package uses deployment-level Datadog API and application keys, not per-user OAuth.
+- Use a Datadog application key with the smallest read scopes/role that covers the telemetry users need.
+- Real key values never enter the sandbox env, files, or command arguments.
 Full setup guide: https://junior.sentry.dev/extend/datadog-plugin/
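
The site handling described in this README diff can be illustrated with a minimal Python sketch. It assumes the `api.<site>` host form given in the `plugin.yaml` comment and the standard hosts from its `api-domains` list; the helper name `resolve_api_host` is hypothetical and is not part of Junior or Pup.

```python
# Standard Datadog API hosts, copied from the api-domains list in plugin.yaml.
ALLOWED_API_HOSTS = frozenset({
    "api.datadoghq.com",
    "api.us3.datadoghq.com",
    "api.us5.datadoghq.com",
    "api.ap1.datadoghq.com",
    "api.ap2.datadoghq.com",
    "api.datadoghq.eu",
    "api.ddog-gov.com",
})

DEFAULT_SITE = "datadoghq.com"  # US1 default from the manifest


def resolve_api_host(deploy_env):
    """Hypothetical helper: derive the API host from DATADOG_SITE only.

    A deployment-level DD_SITE is deliberately ignored, mirroring the
    README note that setting DD_SITE alone has no effect.
    """
    site = (deploy_env.get("DATADOG_SITE") or DEFAULT_SITE).strip()
    host = f"api.{site}"
    if host not in ALLOWED_API_HOSTS:
        # Custom or staging domains would need a manifest change first.
        raise ValueError(f"{host} is not in the packaged API allowlist")
    return host
```

For example, `resolve_api_host({"DATADOG_SITE": "datadoghq.eu"})` yields `api.datadoghq.eu`, while an environment that only sets `DD_SITE` falls back to the US1 host.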

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@sentry/junior-datadog",
-  "version": "0.39.0",
+  "version": "0.40.0",
   "private": false,
   "publishConfig": {
     "access": "public"

package/plugin.yaml CHANGED

@@ -1,36 +1,59 @@
 name: datadog
-description: Query Datadog telemetry (logs, metrics, traces, monitors, incidents, dashboards) via Datadog's hosted MCP server
+description: Query Datadog telemetry (logs, metrics, traces, monitors, incidents, dashboards) with Datadog's Pup CLI
 config-keys:
   - env
   - service
-# Datadog orgs are region-pinned. The MCP hostname must match the customer's
-# Datadog site. Non-US1 operators set DATADOG_SITE to their site host (e.g.
-# `us5.datadoghq.com`, `datadoghq.eu`, `ap1.datadoghq.com`, `ddog-gov.com`).
-# US1 operators can leave DATADOG_SITE unset and the default applies.
+capabilities:
+  - api
+# Datadog orgs are region-pinned. Pup routes requests to api.${DATADOG_SITE}.
+# Deployment env vars use DATADOG_* names; Pup receives DD_* command env.
+# Non-US1 operators set DATADOG_SITE to their site host (e.g. us5.datadoghq.com,
+# datadoghq.eu, ap1.datadoghq.com, ddog-gov.com). US1 operators can leave
+# DATADOG_SITE unset and the default applies.
 env-vars:
+  DATADOG_API_KEY:
+  DATADOG_APP_KEY:
   DATADOG_SITE:
     default: datadoghq.com
-mcp:
-  url: https://mcp.${DATADOG_SITE}/api/unstable/mcp-server/mcp?toolsets=core,apm,error-tracking
-  allowed-tools:
-    - analyze_datadog_logs
-    - get_datadog_incident
-    - get_datadog_metric
-    - get_datadog_metric_context
-    - get_datadog_notebook
-    - get_datadog_trace
-    - search_datadog_dashboards
-    - search_datadog_events
-    - search_datadog_hosts
-    - search_datadog_incidents
-    - search_datadog_logs
-    - search_datadog_metrics
-    - search_datadog_monitors
-    - search_datadog_notebooks
-    - search_datadog_rum_events
-    - search_datadog_service_dependencies
-    - search_datadog_services
-    - search_datadog_spans
+api-domains:
+  - api.datadoghq.com
+  - api.us3.datadoghq.com
+  - api.us5.datadoghq.com
+  - api.ap1.datadoghq.com
+  - api.ap2.datadoghq.com
+  - api.datadoghq.eu
+  - api.ddog-gov.com
+api-headers:
+  DD-API-KEY: ${DATADOG_API_KEY}
+  DD-APPLICATION-KEY: ${DATADOG_APP_KEY}
+command-env:
+  DD_API_KEY: host_managed_credential
+  DD_APP_KEY: host_managed_credential
+  DD_SITE: ${DATADOG_SITE}
+  DD_READ_ONLY: "1"
+  FORCE_AGENT_MODE: "1"
+runtime-postinstall:
+  - cmd: bash
+    args:
+      - -lc
+      - |
+        set -euo pipefail
+        version=0.58.5
+        archive="pup_${version}_Linux_x86_64.tar.gz"
+        url="https://github.com/DataDog/pup/releases/download/v${version}/${archive}"
+        sha256="9543d968a6bd3b00da7ef20053717494beba7962e6cea01368d82857c8ea926b"
+        tmp="$(mktemp -d)"
+        trap 'rm -rf "$tmp"' EXIT
+        curl -fsSL "$url" -o "$tmp/$archive"
+        echo "${sha256}  $tmp/$archive" | sha256sum -c -
+        tar -xzf "$tmp/$archive" -C "$tmp"
+        mkdir -p /vercel/sandbox/.junior/bin
+        install -m 0755 "$tmp/pup" /vercel/sandbox/.junior/bin/pup
+        pup --version
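
To make the split between `api-headers` and `command-env` in this manifest concrete, here is a rough Python sketch of how the `${...}` placeholders might be expanded. How Junior actually resolves the manifest is not shown in this diff, so the `expand` helper and the use of `string.Template` are assumptions for illustration only; the keys and values are copied from the manifest.

```python
from string import Template

# Values copied from the 0.40.0 manifest.
API_HEADERS = {
    "DD-API-KEY": "${DATADOG_API_KEY}",
    "DD-APPLICATION-KEY": "${DATADOG_APP_KEY}",
}
COMMAND_ENV = {
    "DD_API_KEY": "host_managed_credential",
    "DD_APP_KEY": "host_managed_credential",
    "DD_SITE": "${DATADOG_SITE}",
    "DD_READ_ONLY": "1",
    "FORCE_AGENT_MODE": "1",
}
ENV_DEFAULTS = {"DATADOG_SITE": "datadoghq.com"}


def expand(templates, deploy_env):
    """Hypothetical expansion step: fill ${NAME} from deployment env plus defaults."""
    values = {**ENV_DEFAULTS, **deploy_env}
    # substitute() raises KeyError if a referenced variable is missing.
    return {key: Template(value).substitute(values) for key, value in templates.items()}


deploy_env = {"DATADOG_API_KEY": "real-api-key", "DATADOG_APP_KEY": "real-app-key"}
host_headers = expand(API_HEADERS, deploy_env)   # stays host-side
sandbox_env = expand(COMMAND_ENV, deploy_env)    # what Pup would see
```

Under these assumptions the real key values appear only in `host_headers`; `sandbox_env` carries the placeholder string and the resolved site.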

package/skills/datadog/SKILL.md CHANGED

@@ -1,11 +1,11 @@
 ---
 name: datadog
-description: Query live Datadog telemetry (logs, metrics, traces, spans, monitors, incidents, dashboards, services, hosts) through Datadog's hosted MCP server. Use when users ask to investigate production behavior in Datadog — searching logs, checking monitor status, inspecting traces or spans, looking up incidents, finding services, or correlating metrics. Do not use it for Sentry issues, repository/source-code work, or ticketing.
+description: Query live Datadog telemetry (logs, metrics, traces, spans, monitors, incidents, dashboards, services, hosts) through Datadog's Pup CLI. Use when users ask to investigate production behavior in Datadog, including searching logs, checking monitor status, inspecting traces or spans, looking up incidents, finding services, or correlating metrics. Do not use it for Sentry issues, repository/source-code work, or ticketing.
 ---
 # Datadog Operations
-Use this skill for Datadog observability investigations.
+Use this skill for read-only Datadog observability investigations.
 ## Reference loading
@@ -25,41 +25,43 @@ Load references conditionally based on the request:
 - Prefer explicit env, service, host, monitor/incident IDs, trace IDs, or Datadog URLs when the user provides them.
 - When the user did not specify a scope, treat `datadog.env` and `datadog.service` conversation config as optional defaults. Explicit user input always wins over config.
 - Only set or change `datadog.env` and `datadog.service` when the user explicitly asks to store a default for this conversation or channel.
-- If the request refers to an earlier telemetry item indirectly (an incident, trace, or monitor already mentioned in the thread), inspect the current thread for the existing ID or URL before asking the user to restate it.
+- If the request refers to an earlier telemetry item indirectly, inspect the current thread for the existing ID or URL before asking the user to restate it.
 - Ask one concise follow-up only when a search is genuinely under-specified, for example when the user asks about "errors" with no env, service, or time window hint and the thread has no prior context.
-2. Use the active Datadog tools:
-- Start narrow: pick the single most direct tool for the request before reaching for broader search.
-  - Known incident ID → `get_datadog_incident`
-  - Known trace ID → `get_datadog_trace`
-  - Known notebook ID → `get_datadog_notebook`
-  - Known metric name → `get_datadog_metric` (and `get_datadog_metric_context` when the user wants available tags or dimensions)
-- For exploratory questions, prefer one `search_datadog_*` call with a tight query, then one follow-up fetch if needed.
-- For "what is the current error rate / log volume / top offenders" style questions, prefer `analyze_datadog_logs` (SQL-style aggregation) over pulling raw log pages back through `search_datadog_logs`.
-- For service-topology questions ("what calls checkout?", "what does the payment API depend on?"), prefer `search_datadog_service_dependencies` over manually stitching spans together.
-- Use `search_datadog_monitors` for "is this alerting?" or "what is monitor X doing?"; use `search_datadog_incidents` / `get_datadog_incident` for incident context.
-- Use `search_datadog_rum_events` only when the user asks about real-user / browser telemetry, not for backend issues.
+2. Use Pup:
+- Run Datadog commands with `pup --read-only --agent ...`. The plugin also sets read-only/agent env vars, but include the flags so command transcripts show the intended mode.
+- If you are unsure about a command or flag, inspect Pup's schema with `pup --read-only --agent agent schema --compact` or the relevant `pup --read-only --agent <group> --help` output before guessing.
+- Start narrow: pick the single most direct command for the request before broader search.
+  - Known incident ID: `pup --read-only --agent incidents get <incident_id>`
+  - Known monitor ID: `pup --read-only --agent monitors get <monitor_id>`
+  - Known notebook ID: `pup --read-only --agent notebooks get <notebook_id>`
+  - Known metric name: `pup --read-only --agent metrics query --query="avg:<metric>{...}" --from="15m" --to="now"`; use `metrics metadata get` or `metrics tags list` when the user wants available tags or dimensions.
+- For exploratory questions, prefer one focused Pup search/list/aggregate command, then one follow-up fetch if needed.
+- For "current error rate / log volume / top offenders" questions, prefer `pup logs aggregate` over pulling raw log pages back through `pup logs search`.
+- For service-topology questions ("what calls checkout?", "what does the payment API depend on?"), prefer `pup apm dependencies list` or `pup apm flow-map` over stitching spans together manually.
+- Use `pup monitors search` or `pup monitors list` for "is this alerting?" and `pup incidents list` / `pup incidents get` for incident context.
+- Use RUM commands only when the user asks about real-user / browser telemetry, not for backend issues.
 3. Bound every query:
 - Always constrain time windows. Default to the last 15 minutes for "right now" questions and the last 24 hours for retrospective questions; otherwise use the window the user named.
 - Always include `env:` when `datadog.env` is set or the user named an env.
-- Always include `service:` when the user named a service or `datadog.service` is set and the tool is service-scoped.
-- Cap result size. Prefer the default or small page sizes; do not page through thousands of logs when an aggregate tool answers the question.
+- Always include `service:` when the user named a service or `datadog.service` is set and the command is service-scoped.
+- Cap result size. Prefer the default or small page sizes; do not page through thousands of logs when an aggregate command answers the question.
 4. Report the result:
 - Return the concrete answer first (counts, status, incident severity, trace timing, top offenders), then a short evidence block.
-- Include Datadog deep links (e.g. `https://app.datadoghq.com/logs?query=...`, `https://app.datadoghq.com/apm/trace/<id>`, `https://app.datadoghq.com/incidents/<id>`) so Slack users can click through.
-- Preserve interesting spans, log lines, or metric values inline only when they are the evidence for the answer. Do not dump raw tool output.
-- Keep routine tool chatter silent. Do not narrate each MCP search or fetch step.
+- Include Datadog deep links when Pup returns them or when you can construct a stable app link from an ID. Do not fabricate links from incomplete identifiers.
+- Preserve interesting spans, log lines, or metric values inline only when they are evidence for the answer. Do not dump raw command output.
+- Keep routine tool chatter silent. Do not narrate every Pup search or fetch step.
 ## Guardrails
-- Read-only only in this skill. Do not create, edit, mute, or resolve monitors, incidents, notebooks, dashboards, SLOs, or feature flags — the plugin intentionally does not expose those tools.
+- Read-only only in this skill. Do not create, edit, mute, delete, import, submit, or resolve monitors, incidents, notebooks, dashboards, SLOs, metrics, API keys, RUM resources, or other Datadog objects.
 - Log, RUM, APM, and incident payloads can contain PII or sensitive customer data. Quote only the minimum needed to answer the question. Do not paste full raw log bodies or span payloads when a summary plus a deep link is enough.
-- If a Datadog tool returns a generic `403`, `permission denied`, or similar, stop and tell the user the current Datadog connection could not access the requested resource. Do not guess at missing RBAC scopes.
+- If Pup returns `403`, `permission denied`, or similar, stop and tell the user the Datadog credentials could not access the requested resource. Do not guess at missing RBAC scopes.
 - If Datadog responds with `429 Too Many Requests`, wait briefly and retry the same query once. If it still fails, report the throttle and stop.
-- For large traces that the server marks as truncated, report that fact; do not pretend the shown spans are complete.
+- For large traces or span responses that are incomplete, report that fact; do not pretend the shown spans are complete.
 - Do not use this skill for Sentry issues, Linear/GitHub ticketing, or source-code investigation. Hand those off to the matching skill.
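
The `403` and `429` guardrails above amount to a small decision procedure, sketched here in Python. The skill itself is prose guidance for an agent, so this is only an illustration: `run_query` is a hypothetical callable returning a `(status, body)` pair, and the two-second wait is an assumed value for "wait briefly".

```python
import time


def query_with_guardrails(run_query, wait_seconds=2.0, sleep=time.sleep):
    """Run a query, retrying once on 429 and stopping on 403.

    run_query is a hypothetical zero-argument callable that returns
    (http_status, body). Returns an (outcome, body) pair.
    """
    status, body = run_query()
    if status == 429:
        # Throttled: wait briefly and retry the same query exactly once.
        sleep(wait_seconds)
        status, body = run_query()
        if status == 429:
            return ("throttled", None)
    if status == 403:
        # Credentials cannot access the resource; do not guess at RBAC scopes.
        return ("forbidden", None)
    if 200 <= status < 300:
        return ("ok", body)
    return ("error", body)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.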

package/skills/datadog/references/api-surface.md CHANGED

@@ -4,56 +4,54 @@ Use this reference for any Datadog operation.
 ## Provider surface
-The packaged plugin points at Datadog's hosted remote MCP server and enables the `core`, `apm`, and `error-tracking` toolsets. Tool exposure is intentionally limited to the read-oriented surface below.
-### Tools exposed in this skill
-| Tool                                  | Intent                                                                              |
-| ------------------------------------- | ----------------------------------------------------------------------------------- |
-| `search_datadog_logs`                 | Search raw log events by filter (service, host, env, status, query, time window).   |
-| `analyze_datadog_logs`                | SQL-style aggregation over logs for counts, group-bys, top-N, and numeric analysis. |
-| `search_datadog_events`               | Datadog Events API: deployments, infra changes, alerts, status events.              |
-| `search_datadog_metrics`              | List available metrics by name pattern, tag, or service.                            |
-| `get_datadog_metric`                  | Query a specific metric time series over a time window.                             |
-| `get_datadog_metric_context`          | Fetch metadata and available tag dimensions for a metric.                           |
-| `search_datadog_spans`                | Search APM spans by service, operation, tags, time, error state.                    |
-| `get_datadog_trace`                   | Fetch a full trace by trace ID.                                                     |
-| `search_datadog_services`             | List services from the Software Catalog with ownership and tag metadata.            |
-| `search_datadog_service_dependencies` | Upstream/downstream service map for a service, or services owned by a team.         |
-| `search_datadog_hosts`                | List monitored hosts with tags and health state.                                    |
-| `search_datadog_monitors`             | List monitors, their statuses, and alert conditions.                                |
-| `search_datadog_incidents`            | List incidents with severity, state, and metadata.                                  |
-| `get_datadog_incident`                | Retrieve a specific incident by ID (timeline detail may be absent).                 |
-| `search_datadog_dashboards`           | List available dashboards.                                                          |
-| `search_datadog_notebooks`            | List Datadog notebooks by author, tag, or content.                                  |
-| `get_datadog_notebook`                | Fetch a notebook by ID.                                                             |
-| `search_datadog_rum_events`           | Search Datadog RUM (Real User Monitoring) events for browser / frontend issues.     |
-### Tools intentionally not exposed
-- Notebook mutations (`create_datadog_notebook`, `edit_datadog_notebook`).
-- Monitor, SLO, or incident mutations.
-- Feature-flag, DBM, and security toolsets (the packaged URL does not request them).
+The packaged plugin installs Datadog's `pup` CLI and configures it for agent-mode, read-only Datadog API access. Pup defaults to JSON output, which is the right format for analysis.
+Run commands as `pup --read-only --agent ...`. If a command surface is unclear, inspect `pup --read-only --agent agent schema --compact` or `pup --read-only --agent <group> --help` before guessing.
+### Read-oriented commands
+| Need                    | Pup command pattern                                                                                                                  |
+| ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
+| Raw logs                | `pup --read-only --agent logs search --query="service:checkout env:prod status:error" --from="15m" --limit=20`                       |
+| Log aggregation         | `pup --read-only --agent logs aggregate --query="service:checkout env:prod" --compute=count --group-by=status`                       |
+| Metrics                 | `pup --read-only --agent metrics list`, `metrics search`, `metrics query`, `metrics metadata get`, `metrics tags list`               |
+| Spans / traces          | `pup --read-only --agent traces search --query="service:checkout status:error" --from="15m" --limit=20`                              |
+| Span aggregation        | `pup --read-only --agent traces aggregate --query="service:checkout" --compute="percentile(@duration, 95)" --group-by=resource_name` |
+| APM services            | `pup --read-only --agent apm services list --env prod`, `apm services stats --env prod`                                              |
+| Service dependencies    | `pup --read-only --agent apm dependencies list --env prod` or `apm flow-map --query="service:checkout"`                              |
+| Monitors                | `pup --read-only --agent monitors search --query="service:checkout"`, `monitors list --tags=service:checkout`, `monitors get <id>`   |
+| Incidents               | `pup --read-only --agent incidents list --query="state:active" --limit=20`, `incidents get <id>`                                     |
+| Hosts                   | `pup --read-only --agent infrastructure hosts list --filter="env:prod" --count=50`, `infrastructure hosts get <host>`                |
+| Dashboards              | `pup --read-only --agent dashboards list`, `dashboards get <id>`, `dashboards url <id>`                                              |
+| Notebooks               | `pup --read-only --agent notebooks list`, `notebooks get <id>`                                                                       |
+| RUM events and sessions | `pup --read-only --agent rum events --query='@type:error'`, `rum aggregate`, `rum sessions search`                                   |
+### Commands to avoid
+Do not run write commands, even with `--read-only` present:
+- `create`, `update`, `delete`, `import`, `submit`, `cancel`, `mute`, `resolve`, or any command that writes a JSON file to Datadog.
+- API key, app key, user, org policy, security, SLO, dashboard, monitor, incident, notebook, RUM metric, retention filter, playlist, or workflow mutations.
 If a user asks for a mutation, stop and explain that this skill is read-only.
 ## Operation patterns
-| Intent                                           | Minimum tool pattern                                                                                                                               |
-| ------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------- |
-| "Why is service X failing right now?"            | `search_datadog_monitors` + `analyze_datadog_logs` (top error counts by status or message) + optionally `get_datadog_trace` for one failing trace. |
-| "Show me errors for service X in the last hour." | `analyze_datadog_logs` for counts/top-N first; only fall back to `search_datadog_logs` if the user asked for specific log lines.                   |
-| "What is the status of monitor X?"               | `search_datadog_monitors` with the monitor name/tag, then cite state + last transition time.                                                       |
-| "Tell me about incident INC-123."                | `get_datadog_incident` directly. Only fall back to `search_datadog_incidents` if no ID is known.                                                   |
-| "What depends on the checkout service?"          | `search_datadog_service_dependencies` scoped to that service.                                                                                      |
-| "How did this trace spend its time?"             | `get_datadog_trace` by ID; cite the slowest spans.                                                                                                 |
-| "What tag values are valid for this metric?"     | `get_datadog_metric_context` before `get_datadog_metric`.                                                                                          |
-| "Which hosts are unhealthy?"                     | `search_datadog_hosts` filtered by health/tags.                                                                                                    |
-| "Find slow page loads."                          | `search_datadog_rum_events` with a page/speed filter.                                                                                              |
+| Intent                                           | Minimum command pattern                                                                                                 |
+| ------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------- |
+| "Why is service X failing right now?"            | `monitors search/list` + `logs aggregate` for top errors + optionally `traces search` for representative failing spans. |
+| "Show me errors for service X in the last hour." | `logs aggregate` for counts/top-N first; only use `logs search` if the user asked for specific log lines.               |
+| "What is the status of monitor X?"               | `monitors search --query=...` or `monitors get <id>`, then cite state and last transition if present.                   |
+| "Tell me about incident INC-123."                | `incidents get <id>` directly. Only fall back to `incidents list --query=...` if no ID is known.                        |
+| "What depends on checkout?"                      | `apm dependencies list --env <env>` or `apm flow-map --query="service:checkout" --env <env>`.                           |
+| "How did this trace spend its time?"             | `traces search --query="trace_id:<id>"`; cite slowest/error spans. Pup exposes span search, not a guaranteed full tree. |
+| "What tag values are valid for this metric?"     | `metrics metadata get <metric>` and `metrics tags list <metric> --from=... --to=...` before `metrics query`.            |
+| "Which hosts are unhealthy?"                     | `infrastructure hosts list --filter=...` with env/service/role filters.                                                 |
+| "Find slow page loads."                          | `rum aggregate` or `rum events` with RUM facets and a bounded time window.                                              |
 ## Content expectations
-- Translate Slack-thread wording into stable observability language (env, service, status, span, monitor, incident, host).
-- Preserve material URLs present in the conversation (Sentry, GitHub, dashboards, prior Datadog links) when they add evidence.
-- Include Datadog deep links (`https://app.datadoghq.com/...`) with the answer so users can click through.
-- Label assumptions clearly when the thread leaves important details uncertain (chosen env, chosen time window, chosen service).
+- Translate Slack-thread wording into stable observability language: env, service, status, span, monitor, incident, host.
+- Preserve material URLs present in the conversation when they add evidence.
+- Include Datadog deep links when Pup returns them or when a stable ID-specific link is obvious.
+- Label assumptions clearly when the thread leaves important details uncertain: chosen env, chosen time window, chosen service.
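
The "Commands to avoid" rule in this reference could be enforced by a wrapper along the following lines. This is a minimal Python sketch, not part of the plugin: `build_pup_argv` is a hypothetical helper, the verb list is copied from the reference, and only bare argument tokens are checked (a verb embedded in a `--query=...` value is not inspected).

```python
# Write verbs listed under "Commands to avoid" in api-surface.md.
WRITE_VERBS = frozenset({
    "create", "update", "delete", "import",
    "submit", "cancel", "mute", "resolve",
})


def build_pup_argv(*args):
    """Hypothetical helper: prefix read-only/agent flags and reject write verbs."""
    blocked = sorted(WRITE_VERBS.intersection(args))
    if blocked:
        raise ValueError(f"read-only skill: refusing write verb(s) {blocked}")
    return ["pup", "--read-only", "--agent", *args]
```

For instance, `build_pup_argv("incidents", "get", "<incident_id>")` returns an argv list suitable for `subprocess.run`, while `build_pup_argv("monitors", "mute", "<monitor_id>")` raises before anything is executed.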

package/skills/datadog/references/common-use-cases.md CHANGED

@@ -5,77 +5,82 @@ Use these patterns to shape concrete Datadog requests.
 ## 1. Triage "service X is failing right now"
 - Default the time window to the last 15 minutes unless the user gave a different one.
-- Constrain by `service:` and `env:` (explicit user input wins; fall back to `datadog.service` / `datadog.env`).
-- `search_datadog_monitors` for `service:<x>` first — a firing monitor usually names the failure mode.
-- Then `analyze_datadog_logs` to aggregate by status/level/message to find the top error shape.
-- If the user asks "why", fetch one representative failing trace with `get_datadog_trace` or `search_datadog_spans` filtered to `service:<x> status:error`.
-- Report monitor state, top error, and one failing trace link — not a dump.
+- Constrain by `service:` and `env:`. Explicit user input wins; fall back to `datadog.service` / `datadog.env`.
+- Run `pup --read-only --agent monitors search --query="service:<x>"` or `monitors list --tags=service:<x>,env:<env>` first; a firing monitor usually names the failure mode.
+- Then run `pup --read-only --agent logs aggregate --query="service:<x> env:<env>" --from="15m" --to="now" --compute=count --group-by=status` or group by an error facet such as `@error.kind`.
+- If the user asks "why", search representative failing spans with `pup --read-only --agent traces search --query="service:<x> env:<env> status:error" --from="15m" --limit=20`.
+- Report monitor state, top error, and one representative trace/span link when available.
 ## 2. "Is this monitor alerting?"
-- Use `search_datadog_monitors` with the monitor name, tag, or ID.
-- Report state (`OK`, `Warn`, `Alert`, `No Data`), last transition, and the monitor link.
-- If the monitor is in `No Data`, note that explicitly — it is not the same as healthy.
+- If the user gave a monitor ID, run `pup --read-only --agent monitors get <id>`.
+- Otherwise run `pup --read-only --agent monitors search --query="<name or tag>"` or `monitors list --name="<name>" --tags=...`.
+- Report state (`OK`, `Warn`, `Alert`, `No Data`), last transition if present, and the monitor link.
+- If the monitor is in `No Data`, note that explicitly; it is not the same as healthy.
 ## 3. "Tell me about incident INC-123" or "What is the status of the Redis incident?"
-- If the user named the incident ID, go straight to `get_datadog_incident`.
-- If only a topic was named, use `search_datadog_incidents` filtered by active/severity and scan for a match in the thread's time window.
-- Report severity, state, owner, and link to the incident.
-- Note that incident timeline detail may be absent from the MCP response; do not fabricate timeline entries.
+- If the user named the incident ID, run `pup --read-only --agent incidents get <id>`.
+- If only a topic was named, run `pup --read-only --agent incidents list --query="state:active <topic>" --limit=20` and scan for a match in the thread's time window.
+- Report severity, state, owner/team if present, and link to the incident.
+- Do not fabricate timeline entries if Pup does not return them.
 ## 4. Log search with a specific query
-- Default to `search_datadog_logs` only when the user explicitly wants raw log lines.
-- Constrain with `service:`, `env:`, `status:`, `host:`, or `@<faceted_field>:` as appropriate (see `query-syntax.md`).
+- Use `pup --read-only --agent logs search` only when the user explicitly wants raw log lines.
+- Constrain with `service:`, `env:`, `status:`, `host:`, or `@<faceted_field>:` as appropriate.
 - Cap page size and time window to avoid huge responses.
-- Report a short summary plus a Datadog logs deep link. Quote only the minimum log content.
+- Report a short summary plus a Datadog logs deep link when available. Quote only the minimum log content.
 ## 5. "What are the top errors for service X right now?"
-- Prefer `analyze_datadog_logs` with a SQL-style `GROUP BY status` or `GROUP BY @http.status_code` / `GROUP BY @error.kind`.
+- Prefer `pup --read-only --agent logs aggregate --query="service:<x> env:<env> status:error" --compute=count --group-by=@error.kind --limit=10`.
+- Use `--group-by=@http.status_code`, `status`, `service`, `host`, or another facet when it better matches the question.
 - Report the top 3-5 buckets with counts, not an exhaustive table.
-- Include the aggregated query link so the user can open the same view in Datadog.
 ## 6. Trace inspection by ID
-- Use `get_datadog_trace` with the trace ID.
-- Cite the top 3 slowest or error-tagged spans (service, operation, duration, error state).
-- If the server marks the trace as truncated, say so — some spans are not present.
+- Pup exposes span search. Use `pup --read-only --agent traces search --query="trace_id:<id>" --from=<window> --to=<window> --limit=100`.
+- Cite the top 3 slowest or error-tagged spans: service, resource/operation, duration, error state.
+- If the returned spans look partial, say so. Do not claim a complete trace tree unless the output proves it.
 ## 7. Span search for a known error pattern
-- Use `search_datadog_spans` with explicit filters like `service:<x> status:error resource_name:"..."` and a bounded time window.
-- Report span counts plus the most illustrative span's trace link.
+- Use `pup --read-only --agent traces search --query='service:<x> env:<env> status:error resource_name:"..."' --from=... --to=...`.
+- For counts or latency buckets, use `pup --read-only --agent traces aggregate --query="service:<x> env:<env>" --compute=count --group-by=resource_name`.
+- Report counts plus the most illustrative span's trace link when available.
 ## 8. Service topology lookup
-- Use `search_datadog_service_dependencies` to answer "what calls X?" or "what does X depend on?" or "what does team Y own?".
-- Return the dependency list with service names and link back to the Service Catalog page.
+- Use `pup --read-only --agent apm dependencies list --env <env> --from=... --to=...` to answer dependency questions.
+- Use `pup --read-only --agent apm flow-map --query="service:<x>" --env <env> --from=... --to=...` when the question is centered on one service.
+- Return the dependency list with service names and a Service Catalog/APM link when available.
 ## 9. Metric lookup
-- Use `search_datadog_metrics` when the user is unsure of the metric name.
-- Once the metric name is known, use `get_datadog_metric` with the time window and tag filters.
-- Use `get_datadog_metric_context` before querying if the user wants to know which tags (`env`, `service`, `host`, ...) are usable.
-- Report headline numbers (current, peak, delta) plus a metric explorer link.
+- Use `pup --read-only --agent metrics search --query="<pattern>"` or `metrics list --filter="<pattern>"` when the user is unsure of the metric name.
+- Once the metric name is known, use `pup --read-only --agent metrics query --query="avg:<metric>{env:<env>,service:<service>}" --from=... --to=...`.
+- Use `pup --read-only --agent metrics metadata get <metric>` and `metrics tags list <metric> --from=... --to=...` before querying if the user wants valid tags.
+- Report headline numbers: current, peak, delta, or bucketed values as appropriate.
 ## 10. Host health
-- Use `search_datadog_hosts` filtered by tag, role, or `down:true`.
-- Return counts, the list of unhealthy hosts (names + tags), and a host map link.
+- Use `pup --read-only --agent infrastructure hosts list --filter="env:<env> <role-or-service>" --count=50`.
+- Use `pup --read-only --agent infrastructure hosts get <hostname>` for a specific host.
+- Return counts, unhealthy host names/tags, and a host map link when available.
 ## 11. RUM / frontend slowness
-- Use `search_datadog_rum_events` only when the user asked about end-user / browser experience.
+- Use `pup --read-only --agent rum aggregate` for top views/errors and `rum events` only when the user needs example events.
+- Use `pup --read-only --agent rum sessions search` for session questions.
 - Constrain to `@type:error`, slow page loads, or specific views; bound the time window.
-- Do not use RUM for backend errors — those live in logs/APM.
+- Do not use RUM for backend errors; those live in logs/APM.
 ## 12. Dashboards and notebooks
-- `search_datadog_dashboards` to list dashboards by topic, team, or tag — useful for "do we already have a dashboard for X?".
-- `search_datadog_notebooks` + `get_datadog_notebook` for reading existing investigation notebooks.
-- This skill does not create or edit dashboards or notebooks. If the user asks, stop and say so.
+- `pup --read-only --agent dashboards list` and `dashboards get <id>` are useful for "do we already have a dashboard for X?".
+- `pup --read-only --agent notebooks list` and `notebooks get <id>` are for reading investigation notebooks.
+- This skill does not create or edit dashboards or notebooks.
 ## 13. Storing channel defaults

package/skills/datadog/references/query-syntax.md CHANGED

@@ -1,22 +1,22 @@
 # Query Syntax
-Use this reference when forming Datadog log queries, span queries, and log analytics (`analyze_datadog_logs`) SQL.
+Use this reference when forming Datadog log queries, span queries, RUM queries, and Pup aggregate commands.
 ## Log search query syntax
 Datadog log search queries are tag-and-facet based. Core building blocks:
-| Form               | Meaning                                                              |
-| ------------------ | -------------------------------------------------------------------- |
-| `service:<name>`   | Reserved attribute — service emitting the log.                       |
-| `env:<name>`       | Reserved attribute — deployment environment tag.                     |
-| `host:<name>`      | Reserved attribute — emitting host.                                  |
-| `status:<level>`   | Log level: `error`, `warn`, `info`, `debug`, etc.                    |
-| `source:<name>`    | Log source integration (e.g. `nginx`, `python`).                     |
-| `@<field>:<value>` | Faceted attribute (custom JSON field), e.g. `@http.status_code:500`. |
-| `"some phrase"`    | Free-text phrase search.                                             |
-| `AND`, `OR`, `-`   | Boolean ops; `-` negates. Default operator between terms is `AND`.   |
-| `(a OR b) AND c`   | Parenthesized boolean expression.                                    |
+| Form               | Meaning                                                             |
+| ------------------ | ------------------------------------------------------------------- |
+| `service:<name>`   | Reserved attribute: service emitting the log.                       |
+| `env:<name>`       | Reserved attribute: deployment environment tag.                     |
+| `host:<name>`      | Reserved attribute: emitting host.                                  |
+| `status:<level>`   | Log level: `error`, `warn`, `info`, `debug`, etc.                   |
+| `source:<name>`    | Log source integration, for example `nginx` or `python`.            |
+| `@<field>:<value>` | Faceted attribute: custom JSON field, e.g. `@http.status_code:500`. |
+| `"some phrase"`    | Free-text phrase search.                                            |
+| `AND`, `OR`, `-`   | Boolean ops; `-` negates. Default operator between terms is `AND`.  |
+| `(a OR b) AND c`   | Parenthesized boolean expression.                                   |
 Common examples:
@@ -31,47 +31,65 @@ Tips:
 - `status` and `@http.status_code` are different. `status` is the log level; `@http.status_code` is the HTTP response code.
 - Reserved attributes (`service`, `env`, `host`, `status`, `source`) do not take the `@` prefix. Custom fields do.
+## Pup log commands
+- Raw logs: `pup --read-only --agent logs search --query="service:checkout env:prod status:error" --from="15m" --to="now" --limit=20`
+- Alternate v2 listing: `pup --read-only --agent logs list --query="service:checkout env:prod" --from="1h" --limit=20`
+- Aggregation: `pup --read-only --agent logs aggregate --query="service:checkout env:prod status:error" --compute=count --group-by=@error.kind --limit=10`
+`logs aggregate` options to prefer for analytics:
+- `--compute=count` for volume.
+- `--compute="avg(@duration)"`, `sum(...)`, `min(...)`, `max(...)`, or `percentile(@duration, 95)` for numeric fields.
+- `--group-by=status`, `service`, `host`, `@http.status_code`, `@error.kind`, or another facet.
+- `--limit=10` unless the user needs more.
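As a rough sketch of how the preferred `logs aggregate` options combine, the helper below assembles the command string with the defaults listed above (`--compute=count`, `--group-by=@error.kind`, `--limit=10`). `logs_aggregate_cmd` is a hypothetical name, and the function only prints the command; it does not call Pup.

```shell
# Hypothetical helper: print a `logs aggregate` command (dry run).
# $1 = log query, $2 = facet to group by (default @error.kind), $3 = limit (default 10)
logs_aggregate_cmd() {
  query=$1
  group_by=${2:-@error.kind}
  limit=${3:-10}
  printf 'pup --read-only --agent logs aggregate --query="%s" --compute=count --group-by=%s --limit=%s\n' \
    "$query" "$group_by" "$limit"
}

logs_aggregate_cmd "service:checkout env:prod status:error"
logs_aggregate_cmd "service:checkout env:prod" @http.status_code 5
```

Keeping the limit as a defaulted third argument makes the "unless the user needs more" case an explicit choice at the call site.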
 ## Span / APM search
 APM span search shares the same query language, plus a few APM-specific attributes:
-| Attribute          | Meaning                                    |
-| ------------------ | ------------------------------------------ |
-| `service:<name>`   | Service emitting the span.                 |
-| `env:<name>`       | Deployment environment tag.                |
-| `operation_name:X` | Span operation name (e.g. `http.request`). |
-| `resource_name:X`  | Endpoint or handler.                       |
-| `status:error`     | Span is marked as an error.                |
-| `duration:>500ms`  | Range filter on span duration.             |
+| Attribute          | Meaning                                   |
+| ------------------ | ----------------------------------------- |
+| `service:<name>`   | Service emitting the span.                |
+| `env:<name>`       | Deployment environment tag.               |
+| `operation_name:X` | Span operation name, e.g. `http.request`. |
+| `resource_name:X`  | Endpoint or handler.                      |
+| `status:error`     | Span is marked as an error.               |
+| `@duration:>...`   | Duration filter in nanoseconds.           |
+Commands:
+- `pup --read-only --agent traces search --query="service:checkout env:prod status:error" --from="15m" --limit=20`
+- `pup --read-only --agent traces aggregate --query="service:checkout env:prod" --compute="percentile(@duration, 95)" --group-by=resource_name`
+- For a trace ID, use `traces search --query="trace_id:<id>"` with a window that brackets the trace. Pup returns matching spans; do not assume it returned a complete tree unless the output proves it.
+## RUM queries
-## `analyze_datadog_logs` SQL
+Use RUM only for browser/user-experience questions:
-`analyze_datadog_logs` takes SQL-like aggregations over the same log data. Prefer it for counts, top-N, group-bys, and time-bucketed analytics instead of paging raw logs.
+- `pup --read-only --agent rum events --query='@type:error @application.name:"Web"' --from="1h" --limit=20`
+- `pup --read-only --agent rum aggregate --query='@type:view' --compute="percentile(@view.loading_time, 95)" --group-by=@view.name`
+- `pup --read-only --agent rum sessions search --query='@session.type:user' --from="1h" --limit=20`
-Conventions:
+## Metric queries
-- Wrap log query filters in a `WHERE` clause using the same log-search query syntax (quoted as a string).
-- Use `COUNT(*)` for volume, `COUNT(DISTINCT <field>)` for unique cardinality.
-- `GROUP BY` faceted fields (without `@` in the SQL form — the tool's schema specifies how to reference them; follow the tool's input schema exactly).
-- Cap with `ORDER BY ... DESC LIMIT N` — top 5-10 is usually enough.
+Datadog metric query strings follow the usual metric explorer shape:
-Example intents (shape — not a literal string; call the tool with the input schema it advertises):
+- `avg:system.cpu.user{env:prod,service:checkout}`
+- `sum:trace.http.request.errors{env:prod,service:checkout}.as_count()`
+- `p95:trace.http.request.duration{env:prod,service:checkout}`
-- Top 10 services by error count in the last hour.
-- HTTP 5xx count by status code in the last 15 minutes, grouped by `@http.status_code`.
-- Log volume by `host` over the last hour to spot a noisy emitter.
+Use `metrics search` or `metrics list` to find names, `metrics metadata get` for metadata, and `metrics tags list` for tag dimensions before querying when needed.
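The metric query shape shown above is plain string assembly. A minimal sketch, assuming only `env` and `service` tags are needed; `metric_query` is a hypothetical helper name and nothing here talks to Datadog.

```shell
# Hypothetical helper: build "<aggregator>:<metric>{env:<env>,service:<service>}".
# $1 = aggregator (avg, sum, ...), $2 = metric name, $3 = env, $4 = service
metric_query() {
  printf '%s:%s{env:%s,service:%s}\n' "$1" "$2" "$3" "$4"
}

metric_query avg system.cpu.user prod checkout
```

The resulting string is what would be passed as the `--query` value of `metrics query`, together with an explicit time window.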
 ## Time windows
 - For "right now" questions, default to the last 15 minutes.
 - For "what happened earlier today" questions, default to the last 24 hours.
 - For incident-linked questions, prefer a window that brackets the incident `created` time.
-- Always include a time window — unbounded queries are slow and easy to misinterpret.
+- Always include a time window. Unbounded queries are slow and easy to misinterpret.
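The default windows above could be encoded as a small lookup. This is a sketch under assumptions: `default_window` and its `now` / `today` keys are hypothetical, and `24h` is an assumed spelling chosen by analogy with the `15m` and `1h` values used elsewhere in this reference.

```shell
# Hypothetical lookup: map a question kind to a default relative window.
default_window() {
  case "$1" in
    now)   echo 15m ;;   # "right now" questions
    today) echo 24h ;;   # "earlier today" questions (assumed spelling)
    *)
      # Incident-linked questions need the incident's created time, so no default here.
      echo "no default window for: $1" >&2
      return 1
      ;;
  esac
}

default_window now
default_window today
```

Returning a non-zero status for unknown kinds forces the caller to pick a window deliberately instead of running an unbounded query.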
 ## What to cite back
-- The exact query string used (`service:checkout env:prod status:error`) — users often want to click through.
-- A Datadog deep link that encodes the same filter:
-  - `https://app.datadoghq.com/logs?query=<url-encoded-query>&from_ts=<ms>&to_ts=<ms>`
-  - `https://app.datadoghq.com/apm/traces?query=<url-encoded-query>`
+- The exact query string used, for example `service:checkout env:prod status:error`.
 - The time window you used.
+- A Datadog deep link when Pup returns one or when a stable ID-specific app link is available.

package/skills/datadog/references/troubleshooting-workarounds.md CHANGED

@@ -1,17 +1,23 @@
 # Troubleshooting and Workarounds
-Use this reference when Datadog MCP calls fail or return unexpected results.
+Use this reference when Pup commands fail or return unexpected results.
 ## Permission and scope errors
-- A Datadog API returning `403 Forbidden` or `permission denied` means the user's Datadog role cannot read that resource (metrics, APM, incidents, RUM, etc.).
-- Stop and tell the user the current Datadog connection could not access the requested data. Suggest they verify their Datadog role/team.
+- A `403 Forbidden` or `permission denied` response means the configured Datadog API/application keys cannot read that resource: metrics, APM, incidents, RUM, and so on.
+- Stop and tell the user the current Datadog integration could not access the requested data. Suggest the operator verify the Datadog application key scopes/role.
 - Do not guess specific missing permission names unless Datadog explicitly named one in the error.
 - Do not loop retrying a 403.
+## Authentication errors
+- A `401 Unauthorized`, `missing API key`, or `missing application key` error usually means `DATADOG_API_KEY` or `DATADOG_APP_KEY` is missing from the Junior deployment env, or the key was revoked.
+- Pup receives placeholder env values in the sandbox so that it still issues HTTP requests; the host then injects the real `DD-API-KEY` and `DD-APPLICATION-KEY` headers for Datadog API domains.
+- Do not ask the user to paste keys into Slack or the sandbox. Tell the operator to fix the deployment env and retry.
 ## Rate limits
-- Datadog throttles the unstable MCP endpoint. A `429 Too Many Requests` response is expected under load.
+- Datadog API endpoints can return `429 Too Many Requests`.
 - Retry the same query once after a short wait.
 - If it fails again, report the throttle and stop. Do not fall back to larger scans that will throttle harder.
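The retry-once rule above can be sketched as a generic wrapper. Everything here is illustrative: `run_with_one_retry` and `RETRY_WAIT` are hypothetical names, and matching the text `429` in the command's output is a crude assumption about how a throttle is reported, not documented Pup behavior.

```shell
# Hypothetical wrapper: run a command; retry exactly once if the failure looks like a 429.
# RETRY_WAIT (seconds, default 5) is an illustrative knob, not a Pup setting.
# Usage: run_with_one_retry pup --read-only --agent logs search --query="service:checkout env:prod status:error" --from="15m" --limit=20
run_with_one_retry() {
  out=$("$@" 2>&1) && { printf '%s\n' "$out"; return 0; }
  case "$out" in
    *429*)
      # Looks throttled: wait briefly, then try the same command one more time.
      sleep "${RETRY_WAIT:-5}"
      out=$("$@" 2>&1) && { printf '%s\n' "$out"; return 0; }
      printf 'failed after one retry: %s\n' "$out" >&2
      return 1
      ;;
  esac
  # Any other failure (for example a 403) is reported without retrying.
  printf '%s\n' "$out" >&2
  return 1
}
```

The wrapper re-runs the identical command and never widens the query, which matches the advice not to fall back to larger scans.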
@@ -19,21 +25,24 @@ Use this reference when Datadog MCP calls fail or return unexpected results.
 - Double-check that `env:` and `service:` match real values. Datadog tag values are case-sensitive.
 - Widen the time window before widening the filter. Many "no results" cases are just too narrow a window.
-- If searching logs with `@<field>:value`, confirm the field exists as a facet; custom log attributes must be facetized in Datadog to be searchable.
-- If an expected monitor or incident is missing, the user's account may not have access to that workspace or team.
+- If searching logs or RUM with `@<field>:value`, confirm the field exists as a facet.
+- If an expected monitor or incident is missing, the application key may not have access to that team/resource.
 ## Too many results / large payloads
-- Prefer `analyze_datadog_logs` with `GROUP BY` + `LIMIT` over paging raw logs.
-- For traces marked truncated by the server, say so in the reply. Do not pretend the shown spans are complete.
+- Prefer `pup --read-only --agent logs aggregate` or `traces aggregate` with `--group-by` + `--limit` over paging raw events.
+- For span/trace responses that look partial, say so in the reply. Do not pretend the shown spans are complete.
 - Quote only the minimum log / span / metric content needed as evidence. Link to Datadog for the rest.
 ## Multiple Datadog sites
-- The packaged plugin defaults to the US1 endpoint (`mcp.datadoghq.com`). The manifest declares `DATADOG_SITE` in its `env-vars` block with a default of `datadoghq.com` and references it from `mcp.url` as `${DATADOG_SITE}`, so non-US1 operators (US3, US5, EU, AP1, AP2, GovCloud) set `DATADOG_SITE` in their Junior deployment env to their site host (e.g. `us5.datadoghq.com`, `datadoghq.eu`, `ddog-gov.com`). Users hitting auth failures against the wrong regional endpoint should have the operator confirm `DATADOG_SITE` is set correctly.
-- If the user's Datadog account lives on a different site than the deployment is configured for, advise the operator to update the `DATADOG_SITE` environment variable. Do not try to work around this silently inside a turn.
+- The packaged plugin defaults to US1 (`datadoghq.com`) and sets Pup's `DD_SITE` from the manifest `DATADOG_SITE` env var.
+- Non-US1 operators set `DATADOG_SITE` in their Junior deployment env to their site host, for example `us5.datadoghq.com`, `datadoghq.eu`, or `ddog-gov.com`.
+- Setting deployment `DD_SITE` alone has no effect; the plugin owns Pup's sandbox `DD_SITE` through `DATADOG_SITE`.
+- The packaged plugin allows the standard Datadog API hosts for US1, US3, US5, EU, AP1, AP2, and GovCloud. A custom or staging Datadog domain needs a manifest change so the API domain allowlist matches.
+- If the user's Datadog account lives on a different site than the deployment is configured for, advise the operator to update `DATADOG_SITE`. Do not try to work around this silently inside a turn.
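The site rules above amount to "`DD_SITE` follows `DATADOG_SITE`, defaulting to US1". A minimal sketch of that mapping follows; the function names are hypothetical, and the list of standard site hosts comes from Datadog's public site list rather than from the plugin manifest, so treat it as an assumption about what the allowlist covers.

```shell
# Sketch: derive the site Pup should use from DATADOG_SITE, defaulting to US1.
# A deployment-level DD_SITE is deliberately not consulted.
resolve_dd_site() {
  printf '%s\n' "${DATADOG_SITE:-datadoghq.com}"
}

# Assumption: these are the standard site hosts for US1, US3, US5, EU, AP1, AP2, GovCloud.
is_standard_site() {
  case "$1" in
    datadoghq.com|us3.datadoghq.com|us5.datadoghq.com|datadoghq.eu|ap1.datadoghq.com|ap2.datadoghq.com|ddog-gov.com)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

site=$(resolve_dd_site)
if is_standard_site "$site"; then
  echo "DD_SITE=$site"
else
  echo "DD_SITE=$site (non-standard host: the manifest allowlist would need a change)"
fi
```

A check like this is something an operator might run before deploying; inside a turn, the guidance remains to report the mismatch rather than work around it.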
 ## Read-only scope
-- This skill intentionally exposes only read-oriented Datadog tools.
-- If the user asks to create a notebook, edit a monitor, mute an alert, or resolve an incident, stop and tell them those actions are not in scope. Do not attempt to approximate the mutation from read tools.
+- This skill intentionally uses only read-oriented Pup commands.
+- If the user asks to create a notebook, edit a monitor, mute an alert, submit a metric, or resolve an incident, stop and tell them those actions are not in scope.