PyPI - holmesgpt - Versions diffs - 0.14.0a0__py3-none-any.whl → 0.14.1__py3-none-any.whl - Mend

holmesgpt 0.14.0a0__py3-none-any.whl → 0.14.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of holmesgpt might be problematic.

Files changed (82)

holmes/__init__.py +1 -1
holmes/clients/robusta_client.py +15 -4
holmes/common/env_vars.py +8 -1
holmes/config.py +66 -139
holmes/core/investigation.py +1 -2
holmes/core/llm.py +295 -52
holmes/core/models.py +2 -0
holmes/core/safeguards.py +4 -4
holmes/core/supabase_dal.py +14 -8
holmes/core/tool_calling_llm.py +110 -102
holmes/core/tools.py +260 -25
holmes/core/tools_utils/data_types.py +81 -0
holmes/core/tools_utils/tool_context_window_limiter.py +33 -0
holmes/core/tools_utils/tool_executor.py +2 -2
holmes/core/toolset_manager.py +150 -3
holmes/core/transformers/__init__.py +23 -0
holmes/core/transformers/base.py +62 -0
holmes/core/transformers/llm_summarize.py +174 -0
holmes/core/transformers/registry.py +122 -0
holmes/core/transformers/transformer.py +31 -0
holmes/main.py +5 -0
holmes/plugins/prompts/_fetch_logs.jinja2 +10 -1
holmes/plugins/toolsets/aks-node-health.yaml +46 -0
holmes/plugins/toolsets/aks.yaml +64 -0
holmes/plugins/toolsets/atlas_mongodb/mongodb_atlas.py +17 -15
holmes/plugins/toolsets/azure_sql/tools/analyze_connection_failures.py +8 -4
holmes/plugins/toolsets/azure_sql/tools/analyze_database_connections.py +7 -3
holmes/plugins/toolsets/azure_sql/tools/analyze_database_health_status.py +3 -3
holmes/plugins/toolsets/azure_sql/tools/analyze_database_performance.py +3 -3
holmes/plugins/toolsets/azure_sql/tools/analyze_database_storage.py +7 -3
holmes/plugins/toolsets/azure_sql/tools/get_active_alerts.py +4 -4
holmes/plugins/toolsets/azure_sql/tools/get_slow_queries.py +7 -3
holmes/plugins/toolsets/azure_sql/tools/get_top_cpu_queries.py +7 -3
holmes/plugins/toolsets/azure_sql/tools/get_top_data_io_queries.py +7 -3
holmes/plugins/toolsets/azure_sql/tools/get_top_log_io_queries.py +7 -3
holmes/plugins/toolsets/bash/bash_toolset.py +6 -6
holmes/plugins/toolsets/bash/common/bash.py +7 -7
holmes/plugins/toolsets/coralogix/toolset_coralogix_logs.py +5 -3
holmes/plugins/toolsets/datadog/datadog_api.py +490 -24
holmes/plugins/toolsets/datadog/datadog_logs_instructions.jinja2 +21 -10
holmes/plugins/toolsets/datadog/toolset_datadog_general.py +344 -205
holmes/plugins/toolsets/datadog/toolset_datadog_logs.py +189 -17
holmes/plugins/toolsets/datadog/toolset_datadog_metrics.py +95 -30
holmes/plugins/toolsets/datadog/toolset_datadog_rds.py +10 -10
holmes/plugins/toolsets/datadog/toolset_datadog_traces.py +20 -20
holmes/plugins/toolsets/git.py +21 -21
holmes/plugins/toolsets/grafana/common.py +2 -2
holmes/plugins/toolsets/grafana/toolset_grafana.py +4 -4
holmes/plugins/toolsets/grafana/toolset_grafana_loki.py +5 -4
holmes/plugins/toolsets/grafana/toolset_grafana_tempo.jinja2 +123 -23
holmes/plugins/toolsets/grafana/toolset_grafana_tempo.py +165 -307
holmes/plugins/toolsets/internet/internet.py +3 -3
holmes/plugins/toolsets/internet/notion.py +3 -3
holmes/plugins/toolsets/investigator/core_investigation.py +3 -3
holmes/plugins/toolsets/kafka.py +18 -18
holmes/plugins/toolsets/kubernetes.yaml +58 -0
holmes/plugins/toolsets/kubernetes_logs.py +6 -6
holmes/plugins/toolsets/kubernetes_logs.yaml +32 -0
holmes/plugins/toolsets/logging_utils/logging_api.py +1 -1
holmes/plugins/toolsets/mcp/toolset_mcp.py +4 -4
holmes/plugins/toolsets/newrelic.py +5 -5
holmes/plugins/toolsets/opensearch/opensearch.py +5 -5
holmes/plugins/toolsets/opensearch/opensearch_logs.py +7 -7
holmes/plugins/toolsets/opensearch/opensearch_traces.py +10 -10
holmes/plugins/toolsets/prometheus/prometheus.py +841 -351
holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2 +39 -2
holmes/plugins/toolsets/prometheus/utils.py +28 -0
holmes/plugins/toolsets/rabbitmq/toolset_rabbitmq.py +6 -4
holmes/plugins/toolsets/robusta/robusta.py +10 -10
holmes/plugins/toolsets/runbook/runbook_fetcher.py +4 -4
holmes/plugins/toolsets/servicenow/servicenow.py +6 -6
holmes/plugins/toolsets/utils.py +88 -0
holmes/utils/config_utils.py +91 -0
holmes/utils/env.py +7 -0
holmes/utils/holmes_status.py +2 -1
holmes/utils/sentry_helper.py +41 -0
holmes/utils/stream.py +9 -0
{holmesgpt-0.14.0a0.dist-info → holmesgpt-0.14.1.dist-info}/METADATA +10 -14
{holmesgpt-0.14.0a0.dist-info → holmesgpt-0.14.1.dist-info}/RECORD +82 -72
{holmesgpt-0.14.0a0.dist-info → holmesgpt-0.14.1.dist-info}/LICENSE.txt +0 -0
{holmesgpt-0.14.0a0.dist-info → holmesgpt-0.14.1.dist-info}/WHEEL +0 -0
{holmesgpt-0.14.0a0.dist-info → holmesgpt-0.14.1.dist-info}/entry_points.txt +0 -0

holmes/plugins/toolsets/grafana/toolset_grafana_tempo.jinja2 CHANGED

@@ -5,43 +5,142 @@ Assume every application provides tempo traces.
 ## API Endpoints and Tool Mapping
 1. **Trace Search** (GET /api/search)
-   - `search_traces_by_query`: Use with 'q' parameter for TraceQL queries
-   - `search_traces_by_tags`: Use with 'tags' parameter for logfmt queries
+   - `tempo_search_traces_by_query`: Use with 'q' parameter for TraceQL queries
+   - `tempo_search_traces_by_tags`: Use with 'tags' parameter for logfmt queries
 2. **Trace Details** (GET /api/v2/traces/{trace_id})
-   - `query_trace_by_id`: Retrieve full trace data
+   - `tempo_query_trace_by_id`: Retrieve full trace data
 3. **Tag Discovery**
-   - `search_tag_names` (GET /api/v2/search/tags): List available tags
-   - `search_tag_values` (GET /api/v2/search/tag/{tag}/values): Get values for a tag
+   - `tempo_search_tag_names` (GET /api/v2/search/tags): List available tags
+   - `tempo_search_tag_values` (GET /api/v2/search/tag/{tag}/values): Get values for a tag
 4. **TraceQL Metrics**
-   - `query_metrics_instant` (GET /api/metrics/query): Single value computation
-   - `query_metrics_range` (GET /api/metrics/query_range): Time series data
+   - `tempo_query_metrics_instant` (GET /api/metrics/query): Single value computation
+   - `tempo_query_metrics_range` (GET /api/metrics/query_range): Time series data
 ## Usage Workflow
 ### 1. Discovering Available Data
 Start by understanding what tags and values exist:
-- Use `search_tag_names` to discover available tags
-- Use `search_tag_values` to see all values for a specific tag (e.g., service names)
+- Use `tempo_search_tag_names` to discover available tags
+- Use `tempo_search_tag_values` to see all values for a specific tag (e.g., service names)
 ### 2. Searching for Traces
 **TraceQL Search (recommended):**
-Use `search_traces_by_query` with TraceQL syntax for powerful filtering:
-- Find errors: `{span.http.status_code>=400}`
-- Service traces: `{resource.service.name="api"}`
-- Slow traces: `{duration>100ms}`
-- Complex queries: `{resource.service.name="api" && span.http.status_code=500 && duration>1s}`
+Use `tempo_search_traces_by_query` with TraceQL syntax for powerful filtering.
+**TraceQL Capabilities:**
+TraceQL can select traces based on the following:
+- **Span and resource attributes** - Filter by any attribute on spans or resources
+- **Timing and duration** - Filter by trace/span duration
+- **Basic aggregates** - Use aggregate functions to compute values across spans
+**Supported Aggregate Functions:**
+- `count()` - Count the number of spans matching the criteria
+- `avg(attribute)` - Calculate average of a numeric attribute across spans
+- `min(attribute)` - Find minimum value of a numeric attribute
+- `max(attribute)` - Find maximum value of a numeric attribute
+- `sum(attribute)` - Sum values of a numeric attribute across spans
+**Aggregate Function Usage:**
+Aggregates are used with the pipe operator `|` to filter traces based on computed values across their spans.
+**Aggregate Examples:**
+- `{ span.http.status_code = 200 } | count() > 3` - Find traces with more than 3 spans having HTTP 200 status
+- `{ } | sum(span.bytesProcessed) > 1000000000` - Find traces where total processed bytes exceed 1 GB
+- `{ status = error } | by(resource.service.name) | count() > 1` - Find services with more than 1 error
+**Select Function:**
+- `{ status = error } | select(span.http.status_code, span.http.url)` - Select specific attributes from error spans
+**TraceQL Query Structure:**
+TraceQL queries follow the pattern: `{span-selectors} | aggregate`
+**TraceQL Query Examples (from official docs):**
+1. **Find traces of a specific operation:**
+   ```
+   {resource.service.name = "frontend" && name = "POST /api/orders"}
+   ```
+   ```
+   {
+     resource.service.namespace = "ecommerce" &&
+     resource.service.name = "frontend" &&
+     resource.deployment.environment = "production" &&
+     name = "POST /api/orders"
+   }
+   ```
+2. **Find traces with a particular outcome:**
+   ```
+   {
+     resource.service.name="frontend" &&
+     name = "POST /api/orders" &&
+     status = error
+   }
+   ```
+   ```
+   {
+     resource.service.name="frontend" &&
+     name = "POST /api/orders" &&
+     span.http.status_code >= 500
+   }
+   ```
+3. **Find traces with a particular behavior:**
+   ```
+   {span.service.name="frontend" && name = "GET /api/products/{id}"} && {span.db.system="postgresql"}
+   ```
+4. **Find traces across environments:**
+   ```
+   { resource.deployment.environment = "production" } && { resource.deployment.environment = "staging" }
+   ```
+5. **Structural operators (advanced):**
+   ```
+   { resource.service.name="frontend" } >> { status = error }  # Frontend spans followed by errors
+   { } !< { resource.service.name = "productcatalogservice" }  # Traces without productcatalog as child
+   { resource.service.name = "productcatalogservice" } ~ { resource.service.name="frontend" }  # Sibling spans
+   ```
+6. **Additional operator examples:**
+   ```
+   { span.http.method = "GET" && status = ok } && { span.http.method = "DELETE" && status != ok }  # && for multiple conditions
+   ```
+   ```
+   { resource.deployment.environment =~ "prod-.*" && span.http.status_code = 200 }  # =~ regex match
+   { span.http.method =~ "DELETE|GET" }  # Regex match multiple values
+   { trace:rootName !~ ".*perf.*" }  # !~ negated regex
+   { resource.cloud.region = "us-east-1" } || { resource.cloud.region = "us-west-1" }  # || OR operator
+   ```
+   ```
+   { span.http.status_code >= 400 && span.http.status_code < 500 }  # Client errors (4xx)
+   { span.http.url = "/path/of/api" } >> { span.db.name = "db-shard-001" }  # >> descendant
+   { span.http.status_code = 200 } | select(resource.service.name)  # Select specific attributes
+   ```
+**Common Attributes to Query:**
+- `resource.service.name` - Service name
+- `resource.k8s.*` - Kubernetes metadata (pod.name, namespace.name, deployment.name, etc.)
+- `span.http.*` - HTTP attributes (status_code, method, route, url, etc.)
+- `name` - Span name
+- `status` - Span status (error, ok)
+- `duration` - Span duration
+- `kind` - Span kind (server, client, producer, consumer, internal)
 **Tag-based Search (legacy):**
-Use `search_traces_by_tags` with logfmt format when you need min/max duration filters:
-- Example: `resource.service.name="api" http.status_code="500"`
+Use `tempo_search_traces_by_tags` with logfmt format when you need min/max duration filters:
+- Example: `service.name="api" http.status_code="500"`
 - Supports `min_duration` and `max_duration` parameters
 ### 3. Analyzing Specific Traces
 When you have trace IDs from search results:
-- Use `query_trace_by_id` to get full trace details
+- Use `tempo_query_trace_by_id` to get full trace details
 - Examine spans for errors, slow operations, and bottlenecks
 ### 4. Computing Metrics from Traces
@@ -115,26 +214,26 @@ TraceQL metrics parse your traces in aggregate to provide RED (Rate, Error, Dura
    ```
 10. **Using topk modifier** - Find top 10 endpoints by request rate:
-    ```
-    { resource.service.name = "foo" } | rate() by (span.http.url) | topk(10)
-    ```
+   ```
+   { resource.service.name = "foo" } | rate() by (span.http.url) | topk(10)
+   ```
 **Choosing Between Instant and Range Queries:**
-**Instant Metrics** (`query_metrics_instant`) - Returns a single aggregated value for the entire time range. Use this when:
+**Instant Metrics** (`tempo_query_metrics_instant`) - Returns a single aggregated value for the entire time range. Use this when:
 - You need a total count or sum across the whole period
 - You want a single metric value (e.g., total error count, average latency)
 - You don't need to see how the metric changes over time
 - You're computing a KPI or summary statistic
-**Time Series Metrics** (`query_metrics_range`) - Returns values at regular intervals controlled by the 'step' parameter. Use this when:
+**Time Series Metrics** (`tempo_query_metrics_range`) - Returns values at regular intervals controlled by the 'step' parameter. Use this when:
 - You need to graph metrics over time or analyze trends
 - You want to see patterns, spikes, or changes in metrics
 - You're troubleshooting time-based issues
 - You need to correlate metrics with specific time periods
 ## Special workflow for performance issues
-When investigating performance issues in kubernetes via traces, call fetch_tempo_traces_comparative_sample. This tool provides comprehensive analysis for identifying patterns.
+When investigating performance issues in kubernetes via traces, call tempo_fetch_traces_comparative_sample. This tool provides comprehensive analysis for identifying patterns.
 ## Important Notes
 - TraceQL is the modern query language - prefer it over tag-based search
@@ -145,3 +244,4 @@ When investigating performance issues in kubernetes via traces, call fetch_tempo
 - Use time filters (start/end) to improve query performance
 - To get information about Kubernetes resources try these first: resource.service.name, resource.k8s.pod.name, resource.k8s.namespace.name, resource.k8s.deployment.name, resource.k8s.node.name, resource.k8s.container.name
 - TraceQL and TraceQL metrics language are complex. If you get empty data, try to simplify your query and try again!
+- IMPORTANT: TraceQL is not the same as 'TraceQL metrics' - Make sure you use the correct syntax and functions
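
The most visible change in this template is that every Tempo tool name gains a `tempo_` prefix (and `fetch_tempo_traces_comparative_sample` becomes `tempo_fetch_traces_comparative_sample`). For anyone updating custom prompts or runbooks that mention the old names, the renames can be captured in a small lookup table. This is a minimal sketch built only from the removed and added lines above: `TEMPO_TOOL_RENAMES` and `migrate_tool_name` are hypothetical names, not part of holmesgpt, and it assumes the template text reflects the tool names actually registered by the toolset.

```python
# Old -> new Tempo tool names, copied from the removed/added lines of the
# toolset_grafana_tempo.jinja2 diff. This table is illustrative only and is
# not shipped with holmesgpt.
TEMPO_TOOL_RENAMES = {
    "search_traces_by_query": "tempo_search_traces_by_query",
    "search_traces_by_tags": "tempo_search_traces_by_tags",
    "query_trace_by_id": "tempo_query_trace_by_id",
    "search_tag_names": "tempo_search_tag_names",
    "search_tag_values": "tempo_search_tag_values",
    "query_metrics_instant": "tempo_query_metrics_instant",
    "query_metrics_range": "tempo_query_metrics_range",
    "fetch_tempo_traces_comparative_sample": "tempo_fetch_traces_comparative_sample",
}


def migrate_tool_name(name: str) -> str:
    """Return the 0.14.1 name for a 0.14.0a0 Tempo tool name.

    Hypothetical helper: names that are not in the table (including names
    that already carry the new prefix) are returned unchanged.
    """
    return TEMPO_TOOL_RENAMES.get(name, name)
```

Because unknown names pass through unchanged, the helper is safe to apply repeatedly, for example `migrate_tool_name(migrate_tool_name("query_trace_by_id"))` still yields `tempo_query_trace_by_id`.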
