specweave 0.1.0
- package/INSTALL.md +848 -0
- package/LICENSE +21 -0
- package/README.md +675 -0
- package/SPECWEAVE.md +665 -0
- package/bin/install-agents.sh +57 -0
- package/bin/install-all.sh +49 -0
- package/bin/install-commands.sh +56 -0
- package/bin/install-skills.sh +57 -0
- package/bin/specweave.js +81 -0
- package/dist/adapters/adapter-base.d.ts +50 -0
- package/dist/adapters/adapter-base.d.ts.map +1 -0
- package/dist/adapters/adapter-base.js +146 -0
- package/dist/adapters/adapter-base.js.map +1 -0
- package/dist/adapters/adapter-interface.d.ts +108 -0
- package/dist/adapters/adapter-interface.d.ts.map +1 -0
- package/dist/adapters/adapter-interface.js +9 -0
- package/dist/adapters/adapter-interface.js.map +1 -0
- package/dist/adapters/claude/adapter.d.ts +54 -0
- package/dist/adapters/claude/adapter.d.ts.map +1 -0
- package/dist/adapters/claude/adapter.js +184 -0
- package/dist/adapters/claude/adapter.js.map +1 -0
- package/dist/adapters/copilot/adapter.d.ts +42 -0
- package/dist/adapters/copilot/adapter.d.ts.map +1 -0
- package/dist/adapters/copilot/adapter.js +239 -0
- package/dist/adapters/copilot/adapter.js.map +1 -0
- package/dist/adapters/cursor/adapter.d.ts +42 -0
- package/dist/adapters/cursor/adapter.d.ts.map +1 -0
- package/dist/adapters/cursor/adapter.js +297 -0
- package/dist/adapters/cursor/adapter.js.map +1 -0
- package/dist/adapters/generic/adapter.d.ts +40 -0
- package/dist/adapters/generic/adapter.d.ts.map +1 -0
- package/dist/adapters/generic/adapter.js +155 -0
- package/dist/adapters/generic/adapter.js.map +1 -0
- package/dist/cli/commands/init.d.ts +6 -0
- package/dist/cli/commands/init.d.ts.map +1 -0
- package/dist/cli/commands/init.js +247 -0
- package/dist/cli/commands/init.js.map +1 -0
- package/dist/cli/commands/install.d.ts +7 -0
- package/dist/cli/commands/install.d.ts.map +1 -0
- package/dist/cli/commands/install.js +160 -0
- package/dist/cli/commands/install.js.map +1 -0
- package/dist/cli/commands/list.d.ts +6 -0
- package/dist/cli/commands/list.d.ts.map +1 -0
- package/dist/cli/commands/list.js +154 -0
- package/dist/cli/commands/list.js.map +1 -0
- package/package.json +90 -0
- package/src/adapters/README.md +312 -0
- package/src/adapters/adapter-base.ts +146 -0
- package/src/adapters/adapter-interface.ts +120 -0
- package/src/adapters/claude/README.md +241 -0
- package/src/adapters/claude/adapter.ts +157 -0
- package/src/adapters/copilot/.github/copilot/instructions.md +376 -0
- package/src/adapters/copilot/README.md +200 -0
- package/src/adapters/copilot/adapter.ts +210 -0
- package/src/adapters/cursor/.cursor/context/docs-context.md +62 -0
- package/src/adapters/cursor/.cursor/context/increments-context.md +71 -0
- package/src/adapters/cursor/.cursor/context/strategy-context.md +73 -0
- package/src/adapters/cursor/.cursor/context/tests-context.md +89 -0
- package/src/adapters/cursor/.cursorrules +325 -0
- package/src/adapters/cursor/README.md +243 -0
- package/src/adapters/cursor/adapter.ts +268 -0
- package/src/adapters/generic/README.md +277 -0
- package/src/adapters/generic/SPECWEAVE-MANUAL.md +676 -0
- package/src/adapters/generic/adapter.ts +159 -0
- package/src/adapters/registry.yaml +126 -0
- package/src/agents/architect/AGENT.md +416 -0
- package/src/agents/devops/AGENT.md +1738 -0
- package/src/agents/docs-writer/AGENT.md +239 -0
- package/src/agents/performance/AGENT.md +228 -0
- package/src/agents/pm/AGENT.md +751 -0
- package/src/agents/qa-lead/AGENT.md +150 -0
- package/src/agents/security/AGENT.md +179 -0
- package/src/agents/sre/AGENT.md +582 -0
- package/src/agents/sre/modules/backend-diagnostics.md +481 -0
- package/src/agents/sre/modules/database-diagnostics.md +509 -0
- package/src/agents/sre/modules/infrastructure.md +561 -0
- package/src/agents/sre/modules/monitoring.md +439 -0
- package/src/agents/sre/modules/security-incidents.md +421 -0
- package/src/agents/sre/modules/ui-diagnostics.md +302 -0
- package/src/agents/sre/playbooks/01-high-cpu-usage.md +204 -0
- package/src/agents/sre/playbooks/02-database-deadlock.md +241 -0
- package/src/agents/sre/playbooks/03-memory-leak.md +252 -0
- package/src/agents/sre/playbooks/04-slow-api-response.md +269 -0
- package/src/agents/sre/playbooks/05-ddos-attack.md +293 -0
- package/src/agents/sre/playbooks/06-disk-full.md +314 -0
- package/src/agents/sre/playbooks/07-service-down.md +333 -0
- package/src/agents/sre/playbooks/08-data-corruption.md +337 -0
- package/src/agents/sre/playbooks/09-cascade-failure.md +430 -0
- package/src/agents/sre/playbooks/10-rate-limit-exceeded.md +464 -0
- package/src/agents/sre/scripts/health-check.sh +230 -0
- package/src/agents/sre/scripts/log-analyzer.py +213 -0
- package/src/agents/sre/scripts/metrics-collector.sh +294 -0
- package/src/agents/sre/scripts/trace-analyzer.js +257 -0
- package/src/agents/sre/templates/incident-report.md +249 -0
- package/src/agents/sre/templates/mitigation-plan.md +375 -0
- package/src/agents/sre/templates/post-mortem.md +418 -0
- package/src/agents/sre/templates/runbook-template.md +412 -0
- package/src/agents/tech-lead/AGENT.md +263 -0
- package/src/commands/add-tasks.md +176 -0
- package/src/commands/close-increment.md +347 -0
- package/src/commands/create-increment.md +223 -0
- package/src/commands/create-project.md +528 -0
- package/src/commands/generate-docs.md +623 -0
- package/src/commands/list-increments.md +180 -0
- package/src/commands/review-docs.md +331 -0
- package/src/commands/start-increment.md +139 -0
- package/src/commands/sync-github.md +115 -0
- package/src/commands/validate-increment.md +800 -0
- package/src/hooks/README.md +252 -0
- package/src/hooks/docs-changed.sh +59 -0
- package/src/hooks/human-input-required.sh +55 -0
- package/src/hooks/post-task-completion.sh +57 -0
- package/src/hooks/pre-implementation.sh +47 -0
- package/src/skills/ado-sync/README.md +449 -0
- package/src/skills/ado-sync/SKILL.md +245 -0
- package/src/skills/ado-sync/test-cases/test-1.yaml +9 -0
- package/src/skills/ado-sync/test-cases/test-2.yaml +8 -0
- package/src/skills/ado-sync/test-cases/test-3.yaml +9 -0
- package/src/skills/bmad-method-expert/SKILL.md +628 -0
- package/src/skills/bmad-method-expert/scripts/analyze-project.js +318 -0
- package/src/skills/bmad-method-expert/scripts/check-setup.js +208 -0
- package/src/skills/bmad-method-expert/scripts/generate-template.js +1149 -0
- package/src/skills/bmad-method-expert/scripts/validate-documents.js +340 -0
- package/src/skills/bmad-method-expert/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/bmad-method-expert/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/bmad-method-expert/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/brownfield-analyzer/SKILL.md +523 -0
- package/src/skills/brownfield-analyzer/test-cases/test-1-basic-analysis.yaml +48 -0
- package/src/skills/brownfield-analyzer/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/brownfield-analyzer/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/brownfield-onboarder/SKILL.md +625 -0
- package/src/skills/brownfield-onboarder/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/brownfield-onboarder/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/brownfield-onboarder/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/calendar-system/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/calendar-system/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/calendar-system/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/context-loader/SKILL.md +734 -0
- package/src/skills/context-loader/test-cases/test-1-basic-loading.yaml +39 -0
- package/src/skills/context-loader/test-cases/test-2-token-budget-exceeded.yaml +44 -0
- package/src/skills/context-loader/test-cases/test-3-section-anchors.yaml +45 -0
- package/src/skills/context-optimizer/SKILL.md +618 -0
- package/src/skills/context-optimizer/test-cases/test-1-bug-fix-narrow.yaml +97 -0
- package/src/skills/context-optimizer/test-cases/test-2-feature-focused.yaml +109 -0
- package/src/skills/context-optimizer/test-cases/test-3-architecture-broad.yaml +98 -0
- package/src/skills/cost-optimizer/SKILL.md +190 -0
- package/src/skills/cost-optimizer/test-cases/test-1-basic-comparison.yaml +75 -0
- package/src/skills/cost-optimizer/test-cases/test-2-budget-constraint.yaml +52 -0
- package/src/skills/cost-optimizer/test-cases/test-3-scale-requirement.yaml +63 -0
- package/src/skills/cost-optimizer/test-results/README.md +46 -0
- package/src/skills/design-system-architect/SKILL.md +107 -0
- package/src/skills/design-system-architect/test-cases/test-1-token-structure.yaml +23 -0
- package/src/skills/design-system-architect/test-cases/test-2-component-hierarchy.yaml +24 -0
- package/src/skills/design-system-architect/test-cases/test-3-accessibility-checklist.yaml +23 -0
- package/src/skills/diagrams-architect/SKILL.md +763 -0
- package/src/skills/diagrams-generator/SKILL.md +25 -0
- package/src/skills/diagrams-generator/test-cases/test-1.yaml +9 -0
- package/src/skills/diagrams-generator/test-cases/test-2.yaml +9 -0
- package/src/skills/diagrams-generator/test-cases/test-3.yaml +8 -0
- package/src/skills/docs-updater/README.md +48 -0
- package/src/skills/docs-updater/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/docs-updater/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/docs-updater/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/dotnet-backend/SKILL.md +250 -0
- package/src/skills/e2e-playwright/README.md +506 -0
- package/src/skills/e2e-playwright/SKILL.md +457 -0
- package/src/skills/e2e-playwright/execute.js +373 -0
- package/src/skills/e2e-playwright/lib/utils.js +514 -0
- package/src/skills/e2e-playwright/package.json +33 -0
- package/src/skills/e2e-playwright/test-cases/TC-001-basic-navigation.yaml +54 -0
- package/src/skills/e2e-playwright/test-cases/TC-002-form-interaction.yaml +64 -0
- package/src/skills/e2e-playwright/test-cases/TC-003-specweave-integration.yaml +74 -0
- package/src/skills/e2e-playwright/test-cases/TC-004-accessibility-check.yaml +98 -0
- package/src/skills/figma-designer/SKILL.md +149 -0
- package/src/skills/figma-implementer/SKILL.md +148 -0
- package/src/skills/figma-mcp-connector/SKILL.md +136 -0
- package/src/skills/figma-mcp-connector/test-cases/test-1-read-file-desktop.yaml +22 -0
- package/src/skills/figma-mcp-connector/test-cases/test-2-read-file-framelink.yaml +21 -0
- package/src/skills/figma-mcp-connector/test-cases/test-3-error-handling.yaml +18 -0
- package/src/skills/figma-to-code/SKILL.md +128 -0
- package/src/skills/figma-to-code/test-cases/test-1-token-generation.yaml +29 -0
- package/src/skills/figma-to-code/test-cases/test-2-component-generation.yaml +27 -0
- package/src/skills/figma-to-code/test-cases/test-3-typescript-generation.yaml +28 -0
- package/src/skills/frontend/SKILL.md +177 -0
- package/src/skills/github-sync/SKILL.md +252 -0
- package/src/skills/github-sync/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/github-sync/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/github-sync/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/hetzner-provisioner/README.md +308 -0
- package/src/skills/hetzner-provisioner/SKILL.md +251 -0
- package/src/skills/hetzner-provisioner/test-cases/test-1-basic-provision.yaml +71 -0
- package/src/skills/hetzner-provisioner/test-cases/test-2-postgres-provision.yaml +85 -0
- package/src/skills/hetzner-provisioner/test-cases/test-3-ssl-config.yaml +126 -0
- package/src/skills/hetzner-provisioner/test-results/README.md +259 -0
- package/src/skills/increment-planner/SKILL.md +889 -0
- package/src/skills/increment-planner/scripts/feature-utils.js +250 -0
- package/src/skills/increment-planner/test-cases/test-1-basic-feature.yaml +27 -0
- package/src/skills/increment-planner/test-cases/test-2-complex-feature.yaml +30 -0
- package/src/skills/increment-planner/test-cases/test-3-auto-numbering.yaml +24 -0
- package/src/skills/increment-quality-judge/SKILL.md +566 -0
- package/src/skills/increment-quality-judge/test-cases/test-1-good-spec.yaml +95 -0
- package/src/skills/increment-quality-judge/test-cases/test-2-poor-spec.yaml +108 -0
- package/src/skills/increment-quality-judge/test-cases/test-3-export-suggestions.yaml +87 -0
- package/src/skills/jira-sync/README.md +328 -0
- package/src/skills/jira-sync/SKILL.md +209 -0
- package/src/skills/jira-sync/test-cases/test-1.yaml +9 -0
- package/src/skills/jira-sync/test-cases/test-2.yaml +9 -0
- package/src/skills/jira-sync/test-cases/test-3.yaml +10 -0
- package/src/skills/nextjs/SKILL.md +176 -0
- package/src/skills/nodejs-backend/SKILL.md +181 -0
- package/src/skills/notification-system/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/notification-system/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/notification-system/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/python-backend/SKILL.md +226 -0
- package/src/skills/role-orchestrator/README.md +197 -0
- package/src/skills/role-orchestrator/SKILL.md +1184 -0
- package/src/skills/role-orchestrator/test-cases/test-1-simple-product.yaml +98 -0
- package/src/skills/role-orchestrator/test-cases/test-2-quality-gate-failure.yaml +73 -0
- package/src/skills/role-orchestrator/test-cases/test-3-security-workflow.yaml +121 -0
- package/src/skills/role-orchestrator/test-cases/test-4-parallel-execution.yaml +145 -0
- package/src/skills/role-orchestrator/test-cases/test-5-feedback-loops.yaml +149 -0
- package/src/skills/skill-creator/LICENSE.txt +202 -0
- package/src/skills/skill-creator/SKILL.md +209 -0
- package/src/skills/skill-creator/scripts/init_skill.py +303 -0
- package/src/skills/skill-creator/scripts/package_skill.py +110 -0
- package/src/skills/skill-creator/scripts/quick_validate.py +65 -0
- package/src/skills/skill-creator/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/skill-creator/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/skill-creator/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/skill-router/SKILL.md +497 -0
- package/src/skills/skill-router/test-cases/test-1-basic-routing.yaml +33 -0
- package/src/skills/skill-router/test-cases/test-2-ambiguous-request.yaml +42 -0
- package/src/skills/skill-router/test-cases/test-3-nested-orchestration.yaml +50 -0
- package/src/skills/spec-driven-brainstorming/README.md +264 -0
- package/src/skills/spec-driven-brainstorming/SKILL.md +439 -0
- package/src/skills/spec-driven-brainstorming/test-cases/TC-001-simple-idea-to-design.yaml +148 -0
- package/src/skills/spec-driven-brainstorming/test-cases/TC-002-complex-ultrathink-design.yaml +190 -0
- package/src/skills/spec-driven-brainstorming/test-cases/TC-003-unclear-requirements-socratic.yaml +233 -0
- package/src/skills/spec-driven-debugging/README.md +479 -0
- package/src/skills/spec-driven-debugging/SKILL.md +652 -0
- package/src/skills/spec-driven-debugging/test-cases/TC-001-simple-auth-bug.yaml +212 -0
- package/src/skills/spec-driven-debugging/test-cases/TC-002-race-condition-ultrathink.yaml +461 -0
- package/src/skills/spec-driven-debugging/test-cases/TC-003-brownfield-missing-spec.yaml +366 -0
- package/src/skills/spec-kit-expert/SKILL.md +1012 -0
- package/src/skills/spec-kit-expert/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/spec-kit-expert/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/spec-kit-expert/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/specweave-ado-mapper/SKILL.md +501 -0
- package/src/skills/specweave-detector/SKILL.md +420 -0
- package/src/skills/specweave-detector/test-cases/test-1-basic-detection.yaml +37 -0
- package/src/skills/specweave-detector/test-cases/test-2-missing-config.yaml +37 -0
- package/src/skills/specweave-detector/test-cases/test-3-non-specweave-project.yaml +34 -0
- package/src/skills/specweave-jira-mapper/SKILL.md +500 -0
- package/src/skills/stripe-integrator/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/stripe-integrator/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/stripe-integrator/test-cases/test-3-placeholder.yaml +12 -0
- package/src/skills/task-builder/README.md +90 -0
- package/src/skills/task-builder/test-cases/test-1-placeholder.yaml +12 -0
- package/src/skills/task-builder/test-cases/test-2-placeholder.yaml +12 -0
- package/src/skills/task-builder/test-cases/test-3-placeholder.yaml +12 -0
- package/src/templates/.env.example +144 -0
- package/src/templates/.gitignore.template +81 -0
- package/src/templates/CLAUDE.md.template +383 -0
- package/src/templates/README.md.template +240 -0
- package/src/templates/config.yaml +333 -0
- package/src/templates/docs/README.md +124 -0
- package/src/templates/docs/adr-template.md +118 -0
- package/src/templates/docs/hld-template.md +220 -0
- package/src/templates/docs/lld-template.md +580 -0
- package/src/templates/docs/prd-template.md +132 -0
- package/src/templates/docs/rfc-template.md +229 -0
- package/src/templates/docs/runbook-template.md +298 -0
- package/src/templates/environments/minimal/.env.production +16 -0
- package/src/templates/environments/minimal/README.md +54 -0
- package/src/templates/environments/minimal/deploy-production.yml +52 -0
- package/src/templates/environments/progressive/.env.qa +28 -0
- package/src/templates/environments/progressive/README.md +129 -0
- package/src/templates/environments/progressive/deploy-production.yml +93 -0
- package/src/templates/environments/progressive/deploy-qa.yml +62 -0
- package/src/templates/environments/progressive/deploy-staging.yml +67 -0
- package/src/templates/environments/standard/.env.development +20 -0
- package/src/templates/environments/standard/.env.production +30 -0
- package/src/templates/environments/standard/.env.staging +23 -0
- package/src/templates/environments/standard/README.md +97 -0
- package/src/templates/environments/standard/deploy-production.yml +68 -0
- package/src/templates/environments/standard/deploy-staging.yml +61 -0
- package/src/templates/environments/standard/docker-compose.yml +43 -0
- package/src/templates/increment-metadata-template.yaml +138 -0
@@ -0,0 +1,257 @@
+#!/usr/bin/env node
+
+/**
+ * trace-analyzer.js
+ * Analyze distributed tracing data to identify bottlenecks
+ *
+ * Usage: node trace-analyzer.js <trace-id>
+ *        node trace-analyzer.js <trace-id> --format=json
+ *        node trace-analyzer.js --file=trace.json
+ */
+
+const fs = require('fs');
+const path = require('path');
+
+// Parse arguments
+const args = process.argv.slice(2);
+let traceId = null;
+let traceFile = null;
+let outputFormat = 'text'; // text or json
+
+for (const arg of args) {
+  if (arg.startsWith('--file=')) {
+    traceFile = arg.split('=')[1];
+  } else if (arg.startsWith('--format=')) {
+    outputFormat = arg.split('=')[1];
+  } else if (!arg.startsWith('--')) {
+    traceId = arg;
+  }
+}
+
+// Mock trace data (in production, fetch from APM/tracing system)
+function getMockTraceData(id) {
+  return {
+    traceId: id,
+    rootSpan: {
+      spanId: 'span-1',
+      service: 'frontend',
+      operation: 'GET /dashboard',
+      startTime: 1698345600000,
+      duration: 8250, // ms
+      children: [
+        {
+          spanId: 'span-2',
+          service: 'api',
+          operation: 'GET /api/dashboard',
+          startTime: 1698345600010,
+          duration: 8200,
+          children: [
+            {
+              spanId: 'span-3',
+              service: 'api',
+              operation: 'db.query',
+              startTime: 1698345600020,
+              duration: 7800, // SLOW!
+              tags: {
+                'db.statement': 'SELECT * FROM users WHERE last_login_at > ...',
+                'db.type': 'postgresql',
+              },
+              children: [],
+            },
+            {
+              spanId: 'span-4',
+              service: 'api',
+              operation: 'cache.get',
+              startTime: 1698345608200,
+              duration: 5,
+              children: [],
+            },
+          ],
+        },
+      ],
+    },
+  };
+}
+
+// Load trace from file or mock
+function loadTrace() {
+  if (traceFile) {
+    try {
+      const data = fs.readFileSync(traceFile, 'utf8');
+      return JSON.parse(data);
+    } catch (error) {
+      console.error(`❌ Error loading trace file: ${error.message}`);
+      process.exit(1);
+    }
+  } else if (traceId) {
+    return getMockTraceData(traceId);
+  } else {
+    console.error('Usage: node trace-analyzer.js <trace-id> OR --file=trace.json');
+    process.exit(1);
+  }
+}
+
+// Analyze trace
+function analyzeTrace(trace) {
+  const analysis = {
+    traceId: trace.traceId,
+    totalDuration: trace.rootSpan.duration,
+    rootOperation: trace.rootSpan.operation,
+    spanCount: 0,
+    slowSpans: [],
+    bottlenecks: [],
+    serviceBreakdown: {},
+  };
+
+  // Traverse spans
+  function traverseSpans(span, depth = 0) {
+    analysis.spanCount++;
+
+    // Track service time
+    if (!analysis.serviceBreakdown[span.service]) {
+      analysis.serviceBreakdown[span.service] = {
+        totalTime: 0,
+        calls: 0,
+      };
+    }
+    analysis.serviceBreakdown[span.service].totalTime += span.duration;
+    analysis.serviceBreakdown[span.service].calls++;
+
+    // Identify slow spans (>1s)
+    if (span.duration > 1000) {
+      analysis.slowSpans.push({
+        service: span.service,
+        operation: span.operation,
+        duration: span.duration,
+        percentage: ((span.duration / analysis.totalDuration) * 100).toFixed(1),
+        depth,
+      });
+    }
+
+    // Traverse children
+    if (span.children) {
+      span.children.forEach(child => traverseSpans(child, depth + 1));
+    }
+  }
+
+  traverseSpans(trace.rootSpan);
+
+  // Sort slow spans by duration
+  analysis.slowSpans.sort((a, b) => b.duration - a.duration);
+
+  // Identify bottlenecks (spans taking >50% of total time)
+  analysis.bottlenecks = analysis.slowSpans.filter(
+    span => parseFloat(span.percentage) > 50
+  );
+
+  return analysis;
+}
+
+// Format duration
+function formatDuration(ms) {
+  if (ms < 1000) return `${ms}ms`;
+  return `${(ms / 1000).toFixed(2)}s`;
+}
+
+// Print analysis (text format)
+function printAnalysis(analysis) {
+  console.log('========================================');
+  console.log('DISTRIBUTED TRACE ANALYSIS');
+  console.log('========================================');
+  console.log(`Trace ID: ${analysis.traceId}`);
+  console.log(`Root Operation: ${analysis.rootOperation}`);
+  console.log(`Total Duration: ${formatDuration(analysis.totalDuration)}`);
+  console.log(`Total Spans: ${analysis.spanCount}`);
+  console.log('');
+
+  // Service breakdown
+  console.log('📊 SERVICE BREAKDOWN');
+  console.log('-------------------');
+  console.log(`${'Service'.padEnd(20)} ${'Time'.padEnd(15)} ${'Calls'.padEnd(10)} ${'% of Total'.padEnd(15)}`);
+  console.log('-'.repeat(70));
+
+  for (const [service, data] of Object.entries(analysis.serviceBreakdown)) {
+    const percentage = ((data.totalTime / analysis.totalDuration) * 100).toFixed(1);
+    console.log(
+      `${service.padEnd(20)} ${formatDuration(data.totalTime).padEnd(15)} ${String(data.calls).padEnd(10)} ${percentage}%`
+    );
+  }
+  console.log('');
+
+  // Slow spans
+  if (analysis.slowSpans.length > 0) {
+    console.log(`🐌 SLOW SPANS (>${formatDuration(1000)})`);
+    console.log('-------------------');
+    console.log(`${'Service'.padEnd(15)} ${'Operation'.padEnd(30)} ${'Duration'.padEnd(15)} ${'% of Total'.padEnd(15)}`);
+    console.log('-'.repeat(80));
+
+    for (const span of analysis.slowSpans.slice(0, 10)) {
+      console.log(
+        `${span.service.padEnd(15)} ${span.operation.padEnd(30)} ${formatDuration(span.duration).padEnd(15)} ${span.percentage}%`
+      );
+    }
+    console.log('');
+  }
+
+  // Bottlenecks
+  if (analysis.bottlenecks.length > 0) {
+    console.log('🚨 BOTTLENECKS (>50% of total time)');
+    console.log('-----------------------------------');
+
+    for (const bottleneck of analysis.bottlenecks) {
+      console.log(`⚠️ ${bottleneck.service} - ${bottleneck.operation}`);
+      console.log(` Duration: ${formatDuration(bottleneck.duration)} (${bottleneck.percentage}% of trace)`);
+      console.log('');
+    }
+  }
+
+  // Recommendations
+  console.log('💡 RECOMMENDATIONS');
+  console.log('-----------------');
+
+  if (analysis.bottlenecks.length > 0) {
+    console.log('🔴 CRITICAL: Bottlenecks detected!');
+    for (const bottleneck of analysis.bottlenecks) {
+      console.log(` - Optimize ${bottleneck.service}.${bottleneck.operation} (${bottleneck.percentage}% of trace)`);
+
+      // Specific recommendations based on operation
+      if (bottleneck.operation.includes('db.query')) {
+        console.log(' → Add database index, optimize query, add caching');
+      } else if (bottleneck.operation.includes('http')) {
+        console.log(' → Add timeout, cache response, use async processing');
+      } else if (bottleneck.operation.includes('cache')) {
+        console.log(' → Check cache hit rate, optimize cache key');
+      }
+    }
+  } else if (analysis.slowSpans.length > 0) {
+    console.log('🟡 Some slow spans detected:');
+    for (const span of analysis.slowSpans.slice(0, 3)) {
+      console.log(` - ${span.service}.${span.operation}: ${formatDuration(span.duration)}`);
+    }
+  } else {
+    console.log('✅ No obvious performance issues detected.');
+    console.log(' All spans complete in reasonable time.');
+  }
+
+  console.log('');
+  console.log('Next steps:');
+  console.log(' - Profile slowest spans');
+  console.log(' - Check for N+1 queries, missing indexes');
+  console.log(' - Add caching where appropriate');
+  console.log(' - Review external API timeouts');
+  console.log('');
+}
+
+// Main
+function main() {
+  const trace = loadTrace();
+  const analysis = analyzeTrace(trace);
+
+  if (outputFormat === 'json') {
+    console.log(JSON.stringify(analysis, null, 2));
+  } else {
+    printAnalysis(analysis);
+  }
+}
+
+main();
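Note on the `serviceBreakdown` totals in `analyzeTrace`: every span adds its full (inclusive) duration to its service, so a parent and its children are counted more than once. With the mock trace, `api` accumulates 8200 + 7800 + 5 = 16005 ms against an 8250 ms trace, so the "% of Total" column can exceed 100%. One possible alternative is exclusive ("self") time, sketched below. `selfTimeByService` is a hypothetical helper, not part of the package, and it assumes a span's direct children run one after another; overlapping children are clamped to zero self time rather than modeled.

```javascript
// Hypothetical helper (not in trace-analyzer.js): credit each service only
// with the time a span spends outside its direct children.
function selfTimeByService(span, totals = {}) {
  const children = span.children || [];
  const childTime = children.reduce((sum, child) => sum + child.duration, 0);
  // Clamp at zero: parallel children can sum to more than the parent.
  const selfTime = Math.max(0, span.duration - childTime);
  totals[span.service] = (totals[span.service] || 0) + selfTime;
  for (const child of children) {
    selfTimeByService(child, totals);
  }
  return totals;
}

// Same services and durations as the mock trace in the script.
const rootSpan = {
  service: 'frontend',
  duration: 8250,
  children: [
    {
      service: 'api',
      duration: 8200,
      children: [
        { service: 'api', duration: 7800, children: [] },
        { service: 'api', duration: 5, children: [] },
      ],
    },
  ],
};

const totals = selfTimeByService(rootSpan);
console.log(totals);
```

For the mock trace this attributes 50 ms to `frontend` and 8200 ms to `api`, which together equal the 8250 ms root duration.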
@@ -0,0 +1,249 @@
|
|
|
1
|
+
# Incident Report: [Incident Title]
|
|
2
|
+
|
|
3
|
+
**Date**: YYYY-MM-DD
|
|
4
|
+
**Time Started**: HH:MM UTC
|
|
5
|
+
**Time Resolved**: HH:MM UTC (or "Ongoing")
|
|
6
|
+
**Duration**: X hours Y minutes
|
|
7
|
+
**Severity**: SEV1 / SEV2 / SEV3
|
|
8
|
+
**Status**: Investigating / Mitigating / Resolved
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Summary
|
|
13
|
+
|
|
14
|
+
Brief one-paragraph description of what happened, impact, and current status.
|
|
15
|
+
|
|
16
|
+
**Example**:
|
|
17
|
+
```
|
|
18
|
+
On 2025-10-26 at 14:00 UTC, the API service became unavailable due to database connection pool exhaustion. All users were unable to access the application. The issue was resolved at 14:30 UTC by restarting the database and fixing a connection leak in the payment service. Total downtime: 30 minutes.
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## Impact
|
|
24
|
+
|
|
25
|
+
### Users Affected
|
|
26
|
+
- **Scope**: All users / Partial / Specific region / Specific feature
|
|
27
|
+
- **Count**: X,XXX users (or percentage)
|
|
28
|
+
- **Duration**: HH:MM (how long were they affected)
|
|
29
|
+
|
|
30
|
+
### Services Affected
|
|
31
|
+
- [ ] Frontend/UI
|
|
32
|
+
- [ ] Backend API
|
|
33
|
+
- [ ] Database
|
|
34
|
+
- [ ] Payment processing
|
|
35
|
+
- [ ] Authentication
|
|
36
|
+
- [ ] [Other service]
|
|
37
|
+
|
|
38
|
+
### Business Impact
|
|
39
|
+
- **Revenue Lost**: $X,XXX (if calculable)
|
|
40
|
+
- **SLA Breach**: Yes / No (if applicable)
|
|
41
|
+
- **Customer Complaints**: X tickets/emails
|
|
42
|
+
- **Reputation**: Social media mentions, press coverage
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Timeline
|
|
47
|
+
|
|
48
|
+
Detailed chronological timeline of events with timestamps.
|
|
49
|
+
|
|
50
|
+
| Time (UTC) | Event | Action Taken | By Whom |
|
|
51
|
+
|------------|-------|--------------|---------|
|
|
52
|
+
| 14:00 | First alert: "Database connection pool exhausted" | Alert triggered | Monitoring |
|
|
53
|
+
| 14:02 | On-call engineer paged | Acknowledged alert | SRE (Jane) |
|
|
54
|
+
| 14:05 | Confirmed database connections at max (100/100) | Checked pg_stat_activity | SRE (Jane) |
|
|
55
|
+
| 14:10 | Identified connection leak in payment service | Reviewed application logs | SRE (Jane) |
|
|
56
|
+
| 14:15 | Restarted payment service | systemctl restart payment | SRE (Jane) |
|
|
57
|
+
| 14:20 | Database connections normalized (20/100) | Monitored connections | SRE (Jane) |
|
|
58
|
+
| 14:25 | Health checks passing | Verified /health endpoint | SRE (Jane) |
|
|
59
|
+
| 14:30 | Incident resolved | Declared incident resolved | SRE (Jane) |
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
## Root Cause
|
|
64
|
+
|
|
65
|
+
**What broke**: Payment service had connection leak (connections not released after query)
|
|
66
|
+
|
|
67
|
+
**Why it broke**: Missing `conn.close()` in error handling path
|
|
68
|
+
|
|
69
|
+
**What triggered it**: High payment volume (Black Friday sale)
|
|
70
|
+
|
|
71
|
+
**Contributing factors**:
|
|
72
|
+
- Database connection pool size too small (100 connections)
|
|
73
|
+
- No connection timeout configured
|
|
74
|
+
- No monitoring alert for connection pool usage

---

## Detection

### How We Detected
- [X] Automated monitoring alert
- [ ] User report
- [ ] Internal team noticed
- [ ] External vendor notification

**Alert Details**:
- Alert name: "Database Connection Pool Exhausted"
- Alert triggered at: 14:00 UTC
- Time to detection: <1 minute (automated)
- Time to acknowledgment: 2 minutes

### Detection Quality
- **Good**: Alert fired quickly (<1 min)
- **To Improve**: Need alert BEFORE pool exhausted (at 80% usage)
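
One way to express the early-warning idea is a small threshold check. The sketch below is an assumption, not an existing monitoring rule: the function name `pool_alert_level` and the level names are hypothetical, and only the 80% warning threshold and the 100-connection limit come from this report.

```python
def pool_alert_level(active, max_connections, warn_ratio=0.8):
    """Classify pool usage so a warning fires before the pool is exhausted."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    usage = active / max_connections
    if usage >= 1.0:
        return "critical"  # pool exhausted, as at 14:00 in the timeline
    if usage >= warn_ratio:
        return "warning"   # early warning at 80% usage
    return "ok"


# Values taken from the timeline: 20/100 after the restart, 100/100 at the alert.
levels = [pool_alert_level(n, 100) for n in (20, 85, 100)]
```

Here `levels` evaluates to `["ok", "warning", "critical"]`; in practice the `active` count would come from a query such as `SELECT count(*) FROM pg_stat_activity`.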

---

## Response

### Immediate Actions Taken
1. ✅ Acknowledged alert (14:02)
2. ✅ Checked database connection pool (14:05)
3. ✅ Identified connection leak (14:10)
4. ✅ Restarted payment service (14:15)
5. ✅ Verified resolution (14:30)

### What Worked Well
- Monitoring detected issue quickly
- Clear runbook for connection pool issues
- SRE responded within 2 minutes
- Root cause identified in 10 minutes

### What Could Be Improved
- Connection leak should have been caught in code review
- No automated tests for connection cleanup
- Connection pool too small for Black Friday traffic
- No early warning alert (only alerted when 100% full)

---

## Resolution

### Short-term Fix (Immediate)
- Restarted payment service to release connections
- Manually monitored connection pool for 30 minutes

### Long-term Fix (To Prevent Recurrence)
- [ ] Fix connection leak in payment service code (PRIORITY 1)
- [ ] Add automated test for connection cleanup (PRIORITY 1)
- [ ] Increase connection pool size (100 → 200) (PRIORITY 2)
- [ ] Add connection pool monitoring alert (>80%) (PRIORITY 2)
- [ ] Add connection timeout (30 seconds) (PRIORITY 3)
- [ ] Review all database queries for connection leaks (PRIORITY 3)
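
The cleanup test and the connection timeout from the list above could look roughly like the sketch below. `TinyPool`, `PoolTimeout`, and `failing_payment` are hypothetical names used only to make leaks observable; the real service would exercise its own pool. Only the 30-second default timeout comes from this report.

```python
import queue
from contextlib import contextmanager


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class TinyPool:
    """Hypothetical fixed-size pool; strings stand in for real connections."""

    def __init__(self, size):
        self.size = size
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    @property
    def in_use(self):
        return self.size - self._free.qsize()

    @contextmanager
    def connection(self, timeout=30.0):
        try:
            conn = self._free.get(timeout=timeout)
        except queue.Empty:
            raise PoolTimeout(f"no free connection after {timeout}s") from None
        try:
            yield conn
        finally:
            self._free.put(conn)  # released even if the caller raises


def failing_payment(pool):
    """Simulates a query that fails while holding a connection."""
    with pool.connection() as conn:
        raise RuntimeError(f"simulated query failure on {conn}")
```

A cleanup test would call `failing_payment` more times than the pool has connections and then assert that `pool.in_use` is back to 0; with a leak, the later calls would hit `PoolTimeout` instead.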

---

## Communication

### Internal Communication
- **Incident channel**: #incident-20251026-db-pool
- **Participants**: SRE (Jane), DevOps (John), Manager (Sarah)
- **Updates posted**: Every 10 minutes

### External Communication
- **Status page**: Updated at 14:05, 14:20, 14:30
- **Customer email**: Sent at 15:00 (post-incident)
- **Social media**: Tweet at 14:10 acknowledging issue

**Sample Status Page Update**:
```
[14:05] Investigating: We are currently investigating an issue affecting API availability. Our team is actively working on a resolution.

[14:20] Monitoring: We have identified the issue and implemented a fix. We are monitoring the situation to ensure stability.

[14:30] Resolved: The issue has been resolved. All services are now operating normally. We apologize for the inconvenience.
```

---

## Metrics

### Response Time
- **Time to detect**: <1 minute (excellent)
- **Time to acknowledge**: 2 minutes (good)
- **Time to triage**: 5 minutes (good)
- **Time to identify root cause**: 10 minutes (good)
- **Time to resolution**: 30 minutes (acceptable)
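
These figures follow directly from the timeline timestamps. As a minimal sketch (the dictionary keys and function names are illustrative, and all events are assumed to fall on the same UTC day), they could be derived like this:

```python
from datetime import datetime

# Timestamps copied from the timeline table (UTC).
TIMELINE = {
    "alert": "14:00",
    "acknowledged": "14:02",
    "triaged": "14:05",
    "root_cause_identified": "14:10",
    "resolved": "14:30",
}


def minutes_between(start, end):
    """Whole minutes between two HH:MM timestamps on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def response_metrics(timeline):
    """Minutes from the first alert to each later milestone."""
    start = timeline["alert"]
    return {
        name: minutes_between(start, ts)
        for name, ts in timeline.items()
        if name != "alert"
    }


metrics = response_metrics(TIMELINE)
```

For the timeline in this report, `metrics` contains 2, 5, 10, and 30 minutes for acknowledgment, triage, root cause, and resolution respectively.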

### Availability
- **Uptime target**: 99.9% (43.2 minutes downtime/month)
- **Actual downtime**: 30 minutes
- **SLA breach**: No (within monthly budget)
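
The 43.2-minute figure is the downtime allowed by a 99.9% target over a 30-day month. A short worked calculation, with a hypothetical helper name and the 30-day month stated as an assumption, shows where it comes from and how much budget remains after this incident:

```python
def downtime_budget_minutes(uptime_target, days_in_month=30):
    """Allowed downtime per month for a given uptime target (e.g. 0.999)."""
    minutes_in_month = days_in_month * 24 * 60  # 43,200 for a 30-day month
    return round((1 - uptime_target) * minutes_in_month, 1)


budget = downtime_budget_minutes(0.999)  # 43.2 minutes
actual_downtime = 30                     # minutes, from this incident
remaining = round(budget - actual_downtime, 1)
sla_breached = actual_downtime > budget
```

With these inputs `remaining` is 13.2 minutes and `sla_breached` is `False`, which matches the "No (within monthly budget)" entry, assuming no other downtime in the same month.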

### Error Rate
- **Normal error rate**: 0.1%
- **During incident**: 100% (complete outage)
- **Peak error count**: 10,000 errors

---

## Action Items

| # | Action | Owner | Priority | Due Date | Status |
|---|--------|-------|----------|----------|--------|
| 1 | Fix connection leak in payment service | Dev (Mike) | P1 | 2025-10-27 | Pending |
| 2 | Add automated test for connection cleanup | QA (Lisa) | P1 | 2025-10-27 | Pending |
| 3 | Increase connection pool size (100 → 200) | DBA (Tom) | P2 | 2025-10-28 | Pending |
| 4 | Add connection pool monitoring (>80%) | SRE (Jane) | P2 | 2025-10-28 | Pending |
| 5 | Add connection timeout (30s) | DBA (Tom) | P3 | 2025-10-30 | Pending |
| 6 | Review all queries for connection leaks | Dev (Mike) | P3 | 2025-11-02 | Pending |
| 7 | Load test for Black Friday traffic | DevOps (John) | P3 | 2025-11-10 | Pending |

---

## Lessons Learned

### What Went Well
- ✅ Monitoring detected issue immediately
- ✅ Clear escalation path (on-call responded quickly)
- ✅ Runbook helped identify issue faster
- ✅ Communication was clear and timely

### What Went Wrong
- ❌ Connection leak made it to production (code review miss)
- ❌ No automated test for connection cleanup
- ❌ Connection pool too small for high-traffic event
- ❌ No early warning alert (only alerted at 100%)

### Action Items to Prevent Recurrence
1. **Code Quality**: Add linter rule to check connection cleanup
2. **Testing**: Add integration test for connection pool under load
3. **Monitoring**: Add alert at 80% connection pool usage
4. **Capacity Planning**: Review capacity before high-traffic events
5. **Runbook Update**: Document connection leak troubleshooting

---

## Appendices

### Related Incidents
- [2025-09-15] Database connection pool exhausted (similar issue)
- [2025-08-10] Payment service OOM crash

### Related Documentation
- Runbook: [Connection Pool Issues](../playbooks/connection-pool-exhausted.md)
- Post-mortem: [2025-09-15 Database Incident](../post-mortems/2025-09-15-db-pool.md)
- Code: [Payment Service](https://github.com/example/payment-service)

### Commands Run
```bash
# Check connection pool usage (total sessions)
psql -c "SELECT count(*) FROM pg_stat_activity;"

# List sessions that are not idle (active or idle in transaction)
psql -c "SELECT * FROM pg_stat_activity WHERE state != 'idle';"

# Restart service
systemctl restart payment-service

# Monitor connections (refresh every 5 seconds)
watch -n 5 'psql -c "SELECT count(*) FROM pg_stat_activity"'
```

---

**Report Created By**: Jane (SRE)
**Report Date**: 2025-10-26
**Review Status**: Pending / Reviewed / Approved
**Reviewed By**: [Name, Date]