hawkeye-mcp-server 1.0.0
- package/CHANGELOG.md +123 -0
- package/INSTALLATION.md +734 -0
- package/LICENSE +21 -0
- package/README.md +289 -0
- package/SPECIFICATION.md +1073 -0
- package/USAGE.md +849 -0
- package/build/config/config.d.ts +58 -0
- package/build/config/config.js +100 -0
- package/build/config/config.js.map +1 -0
- package/build/index.d.ts +6 -0
- package/build/index.js +138 -0
- package/build/index.js.map +1 -0
- package/build/services/auth.service.d.ts +34 -0
- package/build/services/auth.service.js +96 -0
- package/build/services/auth.service.js.map +1 -0
- package/build/services/project.service.d.ts +50 -0
- package/build/services/project.service.js +136 -0
- package/build/services/project.service.js.map +1 -0
- package/build/services/session.service.d.ts +68 -0
- package/build/services/session.service.js +357 -0
- package/build/services/session.service.js.map +1 -0
- package/build/tools/continue-investigation.d.ts +10 -0
- package/build/tools/continue-investigation.js +84 -0
- package/build/tools/continue-investigation.js.map +1 -0
- package/build/tools/get-incident-report.d.ts +10 -0
- package/build/tools/get-incident-report.js +62 -0
- package/build/tools/get-incident-report.js.map +1 -0
- package/build/tools/get-session-report.d.ts +25 -0
- package/build/tools/get-session-report.js +46 -0
- package/build/tools/get-session-report.js.map +1 -0
- package/build/tools/get-session-summary.d.ts +22 -0
- package/build/tools/get-session-summary.js +41 -0
- package/build/tools/get-session-summary.js.map +1 -0
- package/build/tools/get-status.d.ts +10 -0
- package/build/tools/get-status.js +129 -0
- package/build/tools/get-status.js.map +1 -0
- package/build/tools/index.d.ts +29 -0
- package/build/tools/index.js +349 -0
- package/build/tools/index.js.map +1 -0
- package/build/tools/inspect-session.d.ts +28 -0
- package/build/tools/inspect-session.js +51 -0
- package/build/tools/inspect-session.js.map +1 -0
- package/build/tools/investigate-alert.d.ts +10 -0
- package/build/tools/investigate-alert.js +122 -0
- package/build/tools/investigate-alert.js.map +1 -0
- package/build/tools/list-sessions.d.ts +49 -0
- package/build/tools/list-sessions.js +79 -0
- package/build/tools/list-sessions.js.map +1 -0
- package/build/types/errors.d.ts +61 -0
- package/build/types/errors.js +76 -0
- package/build/types/errors.js.map +1 -0
- package/build/types/hawkeye.d.ts +238 -0
- package/build/types/hawkeye.js +8 -0
- package/build/types/hawkeye.js.map +1 -0
- package/build/types/mcp.d.ts +125 -0
- package/build/types/mcp.js +6 -0
- package/build/types/mcp.js.map +1 -0
- package/build/utils/errors.d.ts +20 -0
- package/build/utils/errors.js +125 -0
- package/build/utils/errors.js.map +1 -0
- package/build/utils/http-client.d.ts +51 -0
- package/build/utils/http-client.js +133 -0
- package/build/utils/http-client.js.map +1 -0
- package/build/utils/logger.d.ts +35 -0
- package/build/utils/logger.js +77 -0
- package/build/utils/logger.js.map +1 -0
- package/build/utils/validation.d.ts +134 -0
- package/build/utils/validation.js +68 -0
- package/build/utils/validation.js.map +1 -0
- package/package.json +66 -0
package/USAGE.md
ADDED
@@ -0,0 +1,849 @@
# Usage Guide

A complete guide to using the Hawkeye MCP Server effectively, with practical examples and workflows.

## Table of Contents

- [Basic Concepts](#basic-concepts)
- [Getting Started](#getting-started)
- [Common Workflows](#common-workflows)
- [Available Tools Reference](#available-tools-reference)
- [Advanced Usage](#advanced-usage)
- [Tips & Best Practices](#tips--best-practices)
- [Examples Library](#examples-library)
- [Frequently Asked Questions](#frequently-asked-questions)
- [Next Steps](#next-steps)

---

## Basic Concepts

### What is MCP?

**Model Context Protocol (MCP)** is a standard protocol that allows AI agents to use external tools and services. Think of it like plugins for your AI assistant.

### How Does Hawkeye MCP Work?

```
You (via AI Agent) → MCP Server → Hawkeye API → Investigation Results
↑_________________________________________________↓
(Results displayed in your AI agent)
```

1. **You ask** your AI agent a question about incidents
2. **AI agent decides** which Hawkeye tool to use
3. **MCP server** translates the request and calls the Hawkeye API
4. **Hawkeye** performs the investigation
5. **Results** are returned to you through the AI agent

### Key Terminology

- **Session**: An investigation or conversation with Hawkeye
- **Investigation**: The process of analyzing an incident/alert
- **RCA**: Root Cause Analysis, the result of an investigation
- **Project**: A Hawkeye project containing your cloud environment data
- **Alert/Incident**: A cloud issue that needs investigation
- **Chain of Thoughts**: The step-by-step reasoning Hawkeye used
- **Prompt Cycle**: One round of question/answer in an investigation

---

## Getting Started

### Your First Investigation

Let's start with the simplest workflow:

**Step 1: List Your Projects**

```
You: "What Hawkeye projects do I have access to?"

Agent: *Calls hawkeye_list_projects*

Response:
- HTM-Azure (74acc801-3428-4292-8a74-ae16ecb71c24)
- HTM-GCP (13541fe6-b849-4d56-b620-54fd584f1300)
- Datadog_AWS (a6c01691-2fc2-4065-9e32-8c8f1d3d19f5)
```

**Step 2: List Recent Sessions**

```
You: "Show me recent investigations from the HTM-Azure project"

Agent: *Calls hawkeye_list_sessions with project_uuid*

Response:
- Incident: 217 - ImagePullBackOff issue (2025-09-12)
- Incident: 215 - ImagePullBackOff issue (2025-09-12)
- Incident: 541 - Testing 8910 (2025-07-19)
...
```

**Step 3: Inspect a Session**

```
You: "Show me the full details of session 68c3663ff7afd98ca1b1ebe2"

Agent: *Calls hawkeye_inspect_session*

Response: Complete investigation with chain of thoughts, sources, and findings
```

---

## Common Workflows

### Workflow 1: Finding Uninvestigated Incidents

One of the most common use cases is finding incidents that haven't been analyzed yet.

**Quick Method:**

```
You: "Show me all uninvestigated incidents in HTM-Azure project"

Agent: *Calls hawkeye_list_sessions with:*
{
  "project_uuid": "74acc801-3428-4292-8a74-ae16ecb71c24",
  "only_uninvestigated": true
}

Response:
Found 15 uninvestigated incidents:
1. Incident: 217 - ImagePullBackOff (P4, Created: 2025-09-12)
2. Incident: 215 - ImagePullBackOff (P4, Created: 2025-09-12)
...
```

**With Date Filter:**

```
You: "Show me uninvestigated incidents from the last 7 days"

Agent: *Adds date_from filter*
{
  "only_uninvestigated": true,
  "date_from": "2025-11-12"
}
```
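
The date filter above depends on turning a phrase like "the last 7 days" into a concrete `date_from` value. The following is a minimal sketch of that arithmetic, not the server's own code: the helper `uninvestigatedSince` and the type `UninvestigatedQuery` are hypothetical names, and the `YYYY-MM-DD` (UTC) format is an assumption based on the example value shown above.

```typescript
// Argument names come from the example above; the helper itself is hypothetical.
interface UninvestigatedQuery {
  only_uninvestigated: boolean;
  date_from: string; // assumed format: YYYY-MM-DD (UTC), matching the example value
}

function uninvestigatedSince(days: number, now: Date = new Date()): UninvestigatedQuery {
  const msPerDay = 24 * 60 * 60 * 1000;
  const from = new Date(now.getTime() - days * msPerDay);
  return {
    only_uninvestigated: true,
    date_from: from.toISOString().slice(0, 10), // keep only the calendar date
  };
}

// With "today" pinned to 2025-11-19, a 7-day window starts on 2025-11-12.
const lastWeekQuery = uninvestigatedSince(7, new Date("2025-11-19T00:00:00Z"));
console.log(JSON.stringify(lastWeekQuery, null, 2));
```
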

**With Priority Focus (Manual):**

Priority filtering isn't built in yet, so ask the agent to filter the results:

```
You: "Show me uninvestigated P1 incidents"

Agent: *Gets all uninvestigated, then filters for P1 in results*
```
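
The manual priority filter amounts to a client-side pass over the listed sessions. The sketch below illustrates the idea under stated assumptions: the `SessionSummary` shape and its `title` and `priority` fields are hypothetical, since this guide does not show the actual response schema, and the second sample entry is made up for illustration.

```typescript
// Hypothetical shape of one listed session; the real field names may differ.
interface SessionSummary {
  title: string;
  priority: string; // e.g. "P1" to "P4", as shown in the listings above
}

// Keep only the sessions whose priority is in the wanted list.
function filterByPriority(sessions: SessionSummary[], wanted: string[]): SessionSummary[] {
  return sessions.filter((session) => wanted.includes(session.priority));
}

const listedSessions: SessionSummary[] = [
  { title: "Incident: 217 - ImagePullBackOff", priority: "P4" },
  { title: "Incident: 301 - Example outage", priority: "P1" }, // made-up entry
];

const p1Only = filterByPriority(listedSessions, ["P1"]);
console.log(p1Only.map((session) => session.title));
```
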

---

### Workflow 2: Investigating a Specific Alert

When you have an alert ID and want to investigate it:

**Standard Investigation:**

```
You: "Investigate alert /subscriptions/cb3bc010-0fc7-476a-8577-03c3e45e2296/resourcegroups/test123/providers/microsoft.insights/components/test123-insights/providers/Microsoft.AlertsManagement/alerts/08f9c803-6bff-4030-bbe8-300a993af000"

Agent: *Calls hawkeye_investigate_alert*
{
  "alert_id": "/subscriptions/.../alerts/08f9c803...",
  "wait_for_completion": true
}

Response:
- Checks for existing investigation
- If found: Returns existing RCA
- If not found: Creates new investigation
- Returns: Root cause analysis with recommendations
```

**Quick Investigation (Don't Wait):**

```
You: "Start investigating alert XYZ but don't wait for completion"

Agent: *Sets wait_for_completion: false*

Response:
Investigation started, session_uuid: abc-123
You can check status later with hawkeye_get_investigation_status
```

**Check Status Later:**

```
You: "What's the status of investigation abc-123?"

Agent: *Calls hawkeye_get_investigation_status*

Response: Investigation 85% complete, currently analyzing logs...
```
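
If you script the "start now, check later" pattern yourself, it comes down to a bounded polling loop. The sketch below is illustrative only. `InvestigationProgress`, `ProgressFetcher`, `waitForInvestigation`, and `makeFakeFetcher` are hypothetical names; the fetcher stands in for whatever your MCP client does to call `hawkeye_get_investigation_status`, and the response fields are assumptions. The interval and check limit are arbitrary defaults, not values documented by Hawkeye.

```typescript
// Hypothetical status shape; the real tool response may use different fields.
interface InvestigationProgress {
  done: boolean;
  percentComplete: number;
}

// Stand-in for a hawkeye_get_investigation_status call made by your client.
type ProgressFetcher = (sessionUuid: string) => Promise<InvestigationProgress>;

async function waitForInvestigation(
  sessionUuid: string,
  fetchProgress: ProgressFetcher,
  intervalMs: number = 5000,
  maxChecks: number = 60,
): Promise<InvestigationProgress> {
  for (let check = 1; check <= maxChecks; check++) {
    const progress = await fetchProgress(sessionUuid);
    if (progress.done) return progress;
    if (check < maxChecks) {
      // Pause between checks so the server is not polled in a tight loop.
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Investigation ${sessionUuid} not finished after ${maxChecks} checks`);
}

// Fake fetcher for local experiments: reports completion on the Nth check.
function makeFakeFetcher(completeOnCheck: number): ProgressFetcher {
  let checks = 0;
  return async () => {
    checks += 1;
    const done = checks >= completeOnCheck;
    return { done, percentComplete: done ? 100 : Math.round((checks / completeOnCheck) * 100) };
  };
}

waitForInvestigation("abc-123", makeFakeFetcher(3), 10)
  .then((progress) => console.log("finished:", progress))
  .catch((err) => console.error(err));
```
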

---

### Workflow 3: Deep Diving into an Investigation

When you want to understand how Hawkeye reached its conclusions:

**Get Full Session Details:**

```
You: "Show me everything about session 68c3663ff7afd98ca1b1ebe2"

Agent: *Calls hawkeye_inspect_session*

Response includes:
- Session metadata (creation time, last update)
- All prompt cycles (questions asked)
- Chain of thoughts for each cycle (reasoning steps)
- All data sources consulted
- Follow-up suggestions
```

**Example Output Structure:**

```
Session: 68c3663ff7afd98ca1b1ebe2
Name: Incident: 217
Created: 2025-09-12T00:15:59Z

Prompt Cycle 1:
Status: COMPLETED
Final Answer: "The ImagePullBackOff error is caused by..."

Chain of Thoughts:
1. "Analyzing pod status in namespace neubird-alex"
2. "Checking image registry connectivity"
3. "Reviewing recent deployment changes"

Sources:
- Kubernetes Events (234 events analyzed)
- Pod Logs (15 pods checked)
- Deployment History (last 5 deployments)

Follow-up Suggestions:
- "Would you like me to check if this affects other namespaces?"
- "Should I analyze the image registry configuration?"
```

---

### Workflow 4: Following Up on Investigations

Continue a conversation with Hawkeye about an investigation:

**Basic Follow-up:**

```
You: "In session abc-123, can you dig deeper into the network issues?"

Agent: *Calls hawkeye_continue_investigation*
{
  "session_uuid": "abc-123",
  "follow_up_prompt": "Analyze the network policy issues in more detail",
  "wait_for_completion": true
}

Response: Detailed analysis of network policies, routes, and connectivity
```

**Contextual Follow-up:**

The AI agent can refer back to earlier parts of the conversation:

```
You: "What about the database you mentioned?"

Agent: *Uses context from previous responses*
"Analyzing the database connection issues mentioned in the previous investigation..."
```

---

### Workflow 5: Getting Analytics and Reports

Understand your incident patterns and time saved:

**Organization-Wide Statistics:**

```
You: "Show me our incident statistics and how much time Hawkeye has saved us"

Agent: *Calls hawkeye_get_incident_report*

Response:
Incident Report:
- Time Period: 2025-07-01 to 2025-11-19
- Total Incidents: 342
- Total Investigations: 156
- Average MTTR: 45 minutes
- Time Saved: 234 hours (14,040 minutes)
- Noise Reduction: 67%

By Priority:
- P1: 12 incidents (avg MTTR: 25min, 95% investigated)
- P2: 34 incidents (avg MTTR: 35min, 78% investigated)
- P3: 89 incidents (avg MTTR: 50min, 62% investigated)
- P4: 207 incidents (avg MTTR: 60min, 34% investigated)
```

**Session-Specific Reports:**

```
You: "Get me a summary report for sessions abc-123, def-456, and ghi-789"

Agent: *Calls hawkeye_get_session_report*
{
  "session_uuids": ["abc-123", "def-456", "ghi-789"],
  "project_uuid": "74acc801-3428-4292-8a74-ae16ecb71c24"
}

Response: Individual reports for each session with time-saved metrics
```

**Quality Scores:**

```
You: "How good was the analysis for session abc-123?"

Agent: *Calls hawkeye_get_session_summary*

Response:
Analysis Score:
Accuracy:
- Root Cause Correct: Yes
- Impact Analysis: Yes
- Timeline Accurate: Partial
- Overall Score: 92/100

Completeness:
- Data Sources: 8/10
- Remediation Steps: 9/10
- Prevention Measures: 7/10
- Overall Score: 85/100
```

---

## Available Tools Reference

### 1. hawkeye_list_projects

**Purpose:** List all available Hawkeye projects

**When to use:**
- First-time setup
- When you don't know the project UUID
- To see all available projects

**Example:**
```
"List all my Hawkeye projects"
"What projects do I have access to?"
"Show me available Hawkeye projects"
```

---

### 2. hawkeye_investigate_alert

**Purpose:** Investigate a specific alert by ID

**When to use:**
- You have an alert ID from your monitoring system
- You want to find or create an RCA for a specific alert
- Starting a new investigation

**Parameters:**
- `alert_id` (required): Full alert identifier
- `wait_for_completion` (optional): Whether to wait for results
- `max_wait_seconds` (optional): Max time to wait
- `project_uuid` (optional): Which project to use
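
For readers who assemble tool arguments programmatically, the parameter list above can be written down as a type. This is a sketch under assumptions: the parameter names come from the list, but the TypeScript types are inferred from their descriptions, and `InvestigateAlertArgs` and `buildInvestigateAlertArgs` are hypothetical names rather than exports of this package. No default values are assumed because this guide does not state them.

```typescript
// Parameter names come from the list above; the types are assumptions.
interface InvestigateAlertArgs {
  alert_id: string;
  wait_for_completion?: boolean;
  max_wait_seconds?: number;
  project_uuid?: string;
}

// Hypothetical helper: validates the one required field and merges the optional ones.
function buildInvestigateAlertArgs(
  alertId: string,
  options: Omit<InvestigateAlertArgs, "alert_id"> = {},
): InvestigateAlertArgs {
  if (alertId.trim() === "") {
    throw new Error("alert_id is required");
  }
  return { alert_id: alertId, ...options };
}

// The alert ID is elided exactly as in the examples in this guide.
const fireAndForgetArgs = buildInvestigateAlertArgs("/subscriptions/.../alerts/08f9c803...", {
  wait_for_completion: false,
});
console.log(JSON.stringify(fireAndForgetArgs, null, 2));
```
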

**Example:**
```
"Investigate alert /subscriptions/.../alerts/08f9c803..."
"Analyze this alert: [paste alert ID]"
"Create an RCA for alert XYZ"
```

---

### 3. hawkeye_list_sessions

**Purpose:** List investigation sessions with filtering

**When to use:**
- Finding uninvestigated incidents
- Reviewing recent investigations
- Searching for specific sessions
- Filtering by date or status

**Key Parameters:**
- `only_uninvestigated`: Quick filter for new incidents
- `investigation_status`: Filter by status
- `session_type`: Filter by incident vs manual
- `date_from`/`date_to`: Date range filtering
- `page`/`limit`: Pagination
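
The `page`/`limit` parameters suggest the usual page-by-page loop when you need every matching session. The sketch below shows that loop in isolation. It assumes pages are numbered from 1 and that a page shorter than `limit` marks the end of the results; neither detail is confirmed by this guide. `PageFetcher` and `collectAllPages` are hypothetical names, and the fetcher stands in for a `hawkeye_list_sessions` call.

```typescript
// Hypothetical page fetcher: stands in for a hawkeye_list_sessions call.
type PageFetcher<T> = (page: number, limit: number) => Promise<T[]>;

async function collectAllPages<T>(
  fetchPage: PageFetcher<T>,
  limit: number = 20,
  maxPages: number = 50, // safety cap so a misbehaving fetcher cannot loop forever
): Promise<T[]> {
  const collected: T[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(page, limit);
    collected.push(...batch);
    if (batch.length < limit) break; // assumed: a short page means nothing is left
  }
  return collected;
}

// In-memory stand-in for the server: 45 fake session IDs served in pages.
const demoSessions = Array.from({ length: 45 }, (_, i) => `session-${i + 1}`);
const demoFetcher: PageFetcher<string> = async (page, limit) =>
  demoSessions.slice((page - 1) * limit, page * limit);

collectAllPages(demoFetcher, 20).then((all) => console.log(`collected ${all.length} sessions`));
```
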

**Example:**
```
"Show uninvestigated incidents"
"List sessions from last week"
"Show me recent investigations in HTM-Azure"
"Find all incidents from October"
```

---

### 4. hawkeye_inspect_session

**Purpose:** Get detailed information about a session

**When to use:**
- Deep diving into an investigation
- Understanding Hawkeye's reasoning
- Reviewing data sources used
- Getting follow-up suggestions

**Example:**
```
"Show me session abc-123 in detail"
"What sources did Hawkeye use for session XYZ?"
"Explain how Hawkeye solved incident ABC"
```

---

### 5. hawkeye_continue_investigation

**Purpose:** Ask follow-up questions on an existing investigation

**When to use:**
- Need more details on a specific aspect
- Want to explore a different angle
- Following up on suggestions

**Example:**
```
"In session abc-123, analyze the database connections"
"Can you dig deeper into the network issues from that investigation?"
"Tell me more about the root cause you found"
```

---

### 6. hawkeye_get_investigation_status

**Purpose:** Check progress of an ongoing investigation

**When to use:**
- Investigation started without waiting
- Checking if investigation is complete
- Getting current progress

**Example:**
```
"What's the status of investigation abc-123?"
"Is the investigation complete yet?"
"Check progress on session XYZ"
```

---

### 7. hawkeye_get_session_report

**Purpose:** Get summary reports with time-saved metrics

**When to use:**
- Quick overview of investigations
- Getting time-saved numbers
- Comparing multiple sessions

**Example:**
```
"Get a report for sessions A, B, and C"
"How much time did we save on investigation XYZ?"
"Summarize these three incidents"
```

---

### 8. hawkeye_get_session_summary

**Purpose:** Get quality scores and detailed analysis metrics

**When to use:**
- Evaluating investigation quality
- Understanding accuracy scores
- Getting detailed metrics

**Example:**
```
"How accurate was the analysis for session abc-123?"
"Give me quality scores for investigation XYZ"
"Was this a good investigation?"
```

---

### 9. hawkeye_get_incident_report

**Purpose:** Get organization-wide incident analytics

**When to use:**
- Understanding overall incident patterns
- Getting MTTR metrics
- Calculating time saved
- Reporting to management

**Example:**
```
"Show me our incident statistics"
"How much time has Hawkeye saved us?"
"What's our average MTTR?"
"Give me an overview of all incidents"
```

---

## Advanced Usage

### Working with Multiple Projects

If you manage multiple Hawkeye projects:

**Set Default Project:**

Set `HAWKEYE_DEFAULT_PROJECT_UUID` in your configuration, or specify the project in each request:

```
"List uninvestigated incidents in project 74acc801-3428-4292-8a74-ae16ecb71c24"
```

**Switch Between Projects:**

```
"Show incidents from HTM-Azure"
"Now show me incidents from HTM-GCP"
```

The AI agent will remember project names and UUIDs from earlier in the conversation.

---

### Filtering and Searching

**Date Range Queries:**

```
"Show incidents from last week"
"List investigations between Oct 1 and Oct 15"
"Find sessions from the past 30 days"
```

**Status Filtering:**

```
"Show only uninvestigated incidents"
"List completed investigations"
"Find investigations in progress"
```

**Combining Filters:**

```
"Show uninvestigated P1 incidents from last week in HTM-Azure project, excluding grouped incidents"
```

---

### Batch Operations

**Analyzing Multiple Sessions:**

```
"Get reports for all sessions from last week"
"Analyze these 5 incidents: [list of UUIDs]"
"Compare these three investigations"
```

**Bulk Status Checks:**

```
"Check status of these investigations: abc, def, ghi"
```

---

## Tips & Best Practices

### 1. Be Specific with Project Names

**Good:**
```
"Show uninvestigated incidents in HTM-Azure project"
```

**Less Good:**
```
"Show uninvestigated incidents" # The agent may not know which project you mean
```

### 2. Use Natural Language

The AI agent understands context:

**Good:**
```
You: "List my projects"
Agent: *Shows projects*
You: "Show uninvestigated incidents in the Azure one"
Agent: *Knows you mean HTM-Azure*
```

### 3. Ask for Summaries

Don't get overwhelmed with data:

```
"Give me a summary of the top 5 most critical uninvestigated incidents"
```

### 4. Follow the Investigation Flow

Natural workflow:

1. List uninvestigated incidents
2. Pick one to investigate
3. Review the analysis
4. Ask follow-up questions if needed
5. Move to next incident

### 5. Use Filters Effectively

Start broad, then narrow down:

```
"Show all incidents from last week"
"Now filter for P1 priority"
"Exclude the ones that are already investigated"
```

### 6. Save Session UUIDs

When you find an interesting session:

```
"Save session abc-123 for me to review later"
```

Or keep them in your notes for future reference.

### 7. Leverage Chain of Thoughts

Understanding the reasoning:

```
"Explain step-by-step how you reached this conclusion"
"Show me the chain of thoughts for this investigation"
```

---

## Examples Library

### Example 1: Morning Incident Review

**Scenario:** Start of day, review overnight incidents

```
You: "Show me all uninvestigated incidents from the last 24 hours"

Agent: *Lists 8 new incidents*

You: "Which ones are P1 or P2?"

Agent: *Filters and shows 2 P1 incidents*

You: "Investigate the first P1"

Agent: *Creates investigation and returns RCA*

You: "That makes sense. Investigate the second one too"

Agent: *Investigates second incident*

You: "Create a summary I can send to my team"

Agent: *Summarizes both investigations with action items*
```

### Example 2: Investigating a Production Issue

**Scenario:** Production alert just fired

```
You: "We just got alert /subscriptions/.../alerts/NEW-ALERT-123, investigate it ASAP"

Agent: *Starts investigation immediately*
{
  "wait_for_completion": true,
  "alert_id": "/subscriptions/.../alerts/NEW-ALERT-123"
}

Response: [RCA with root cause and remediation steps]

You: "The root cause mentions database connections. Can you analyze the database more deeply?"

Agent: *Continues investigation focusing on database*

Response: [Detailed database analysis]

You: "What are the recommended fixes?"

Agent: *Provides actionable remediation steps*
```

### Example 3: Weekly Report Generation

**Scenario:** Create weekly incident report

```
You: "Show me all incidents from the past week"

Agent: *Lists incidents from last 7 days*

You: "How many were investigated?"

Agent: "45 total incidents, 32 investigated (71%)"

You: "Give me the incident report with all statistics"

Agent: *Calls hawkeye_get_incident_report*

Response: [Full statistics with MTTR, time saved, etc.]

You: "Create a summary suitable for management review"

Agent: *Formats report with key metrics and highlights*
```

### Example 4: Post-Mortem Analysis

**Scenario:** Analyze a past incident

```
You: "Find the incident about ImagePullBackOff from September 12"

Agent: *Searches and finds session 68c3663ff7afd98ca1b1ebe2*

You: "Show me the complete investigation"

Agent: *Displays full session details*

You: "What data sources were used?"

Agent: "Kubernetes Events (234), Pod Logs (15 pods), Deployment History"

You: "Give me the quality score for this investigation"

Agent: *Shows 92/100 accuracy, 85/100 completeness*

You: "Perfect, export this for the post-mortem document"

Agent: *Formats investigation for documentation*
```

### Example 5: Trend Analysis

**Scenario:** Understanding incident patterns

```
You: "Show me all incidents from October"

Agent: *Lists October incidents*

You: "Group them by type"

Agent: *Groups: 15 database issues, 23 network issues, 8 pod issues*

You: "Focus on the database issues - show me if they're related"

Agent: *Analyzes and finds 12 of 15 share same root cause*

You: "What's the common root cause?"

Agent: "Connection pool exhaustion during peak hours"

You: "Get me the incident report to see the impact"

Agent: *Shows MTTR, time to detect, business impact*
```

---

## Frequently Asked Questions

### Q: Can I investigate multiple alerts at once?

A: Currently, you need to investigate alerts one at a time. However, you can start multiple investigations without waiting for completion:

```
"Start investigating alert A, don't wait"
"Start investigating alert B, don't wait"
"Check status of both investigations"
```
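
In a script, the same "start several, check later" approach can be sketched as one call per alert, all issued before any result is awaited. The names here are hypothetical: `InvestigationStarter` stands in for a `hawkeye_investigate_alert` call made with `wait_for_completion` set to false, and it is assumed (not confirmed by this guide) that such a call resolves to a session UUID.

```typescript
// Hypothetical starter: stands in for hawkeye_investigate_alert with
// wait_for_completion set to false; assumed to resolve to a session UUID.
type InvestigationStarter = (alertId: string) => Promise<string>;

async function startAll(
  alertIds: string[],
  startInvestigation: InvestigationStarter,
): Promise<Record<string, string>> {
  // Issue every start request before awaiting any of them.
  const pairs = await Promise.all(
    alertIds.map(async (alertId): Promise<[string, string]> => [
      alertId,
      await startInvestigation(alertId),
    ]),
  );
  const sessionByAlert: Record<string, string> = {};
  for (const [alertId, sessionUuid] of pairs) {
    sessionByAlert[alertId] = sessionUuid;
  }
  return sessionByAlert;
}

// Fake starter for local experiments; the returned IDs are made up.
const fakeStarter: InvestigationStarter = async (alertId) => `session-for-${alertId}`;

startAll(["alert-A", "alert-B"], fakeStarter).then((sessions) => {
  console.log(JSON.stringify(sessions, null, 2));
});
```

Note that `Promise.all` rejects as soon as any single start request fails, so a script that should tolerate partial failure would need `Promise.allSettled` instead.
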

### Q: How long do investigations take?

A: Typically 30 seconds to 5 minutes, depending on:
- Complexity of the incident
- Amount of data to analyze
- Current system load

### Q: Can I cancel an investigation?

A: Not currently. If you don't wait for completion, you can simply ignore it and start a new one.

### Q: What happens if my session times out?

A: The investigation continues in Hawkeye. You can check its status later using the session UUID.

### Q: Can I re-investigate an alert?

A: Yes, but Hawkeye will find the existing investigation by default. To force a new investigation, ask specifically:

```
"Create a new investigation for alert XYZ (ignore existing ones)"
```

### Q: How do I share an investigation with my team?

A: Get the session details and share the formatted output, or share the Hawkeye web UI link:

```
"Give me a shareable link for session abc-123"
```

---

## Next Steps

- **Explore:** Try different queries and see what works
- **Experiment:** Use the filters and parameters
- **Integrate:** Build this into your incident response workflow
- **Automate:** Consider automating daily uninvestigated incident reviews
- **Feedback:** Let us know what works and what doesn't (support@neubird.ai)

---

**Happy investigating! 🔍**

For technical details, see [SPECIFICATION.md](./SPECIFICATION.md)

For installation help, see [INSTALLATION.md](./INSTALLATION.md)