agentic-qe 1.9.2 → 1.9.4
- package/.claude/helpers/checkpoint-manager.sh +251 -0
- package/.claude/helpers/github-safe.js +106 -0
- package/.claude/helpers/github-setup.sh +28 -0
- package/.claude/helpers/quick-start.sh +19 -0
- package/.claude/helpers/setup-mcp.sh +18 -0
- package/.claude/helpers/standard-checkpoint-hooks.sh +179 -0
- package/CHANGELOG.md +69 -0
- package/README.md +46 -5
- package/config/.env.otel.example +25 -0
- package/config/OTEL-QUICK-REFERENCE.md +137 -0
- package/config/README-OTEL.md +222 -0
- package/config/alerting-rules.yml +518 -0
- package/config/docker-compose.otel.yml +187 -0
- package/config/grafana/dashboards/agentic-qe-overview.json +286 -0
- package/config/grafana/provisioning/dashboards/dashboards.yml +19 -0
- package/config/grafana/provisioning/datasources/datasources.yml +53 -0
- package/config/otel-collector-config.yaml.example +145 -0
- package/config/prometheus.yml.example +106 -0
- package/dist/alerting/AlertManager.d.ts +120 -0
- package/dist/alerting/AlertManager.d.ts.map +1 -0
- package/dist/alerting/AlertManager.js +345 -0
- package/dist/alerting/AlertManager.js.map +1 -0
- package/dist/alerting/FeedbackRouter.d.ts +98 -0
- package/dist/alerting/FeedbackRouter.d.ts.map +1 -0
- package/dist/alerting/FeedbackRouter.js +331 -0
- package/dist/alerting/FeedbackRouter.js.map +1 -0
- package/dist/alerting/StrategyApplicator.d.ts +120 -0
- package/dist/alerting/StrategyApplicator.d.ts.map +1 -0
- package/dist/alerting/StrategyApplicator.js +299 -0
- package/dist/alerting/StrategyApplicator.js.map +1 -0
- package/dist/alerting/index.d.ts +68 -0
- package/dist/alerting/index.d.ts.map +1 -0
- package/dist/alerting/index.js +112 -0
- package/dist/alerting/index.js.map +1 -0
- package/dist/alerting/types.d.ts +118 -0
- package/dist/alerting/types.d.ts.map +1 -0
- package/dist/alerting/types.js +11 -0
- package/dist/alerting/types.js.map +1 -0
- package/dist/cli/init/claude-config.d.ts.map +1 -1
- package/dist/cli/init/claude-config.js +12 -7
- package/dist/cli/init/claude-config.js.map +1 -1
- package/dist/core/memory/IPatternStore.d.ts +209 -0
- package/dist/core/memory/IPatternStore.d.ts.map +1 -0
- package/dist/core/memory/IPatternStore.js +15 -0
- package/dist/core/memory/IPatternStore.js.map +1 -0
- package/dist/core/memory/MigrationTools.d.ts +192 -0
- package/dist/core/memory/MigrationTools.d.ts.map +1 -0
- package/dist/core/memory/MigrationTools.js +615 -0
- package/dist/core/memory/MigrationTools.js.map +1 -0
- package/dist/core/memory/NeuralEnhancement.d.ts +154 -0
- package/dist/core/memory/NeuralEnhancement.d.ts.map +1 -0
- package/dist/core/memory/NeuralEnhancement.js +598 -0
- package/dist/core/memory/NeuralEnhancement.js.map +1 -0
- package/dist/core/memory/PatternStoreFactory.d.ts +143 -0
- package/dist/core/memory/PatternStoreFactory.d.ts.map +1 -0
- package/dist/core/memory/PatternStoreFactory.js +370 -0
- package/dist/core/memory/PatternStoreFactory.js.map +1 -0
- package/dist/core/memory/RealAgentDBAdapter.d.ts +1 -0
- package/dist/core/memory/RealAgentDBAdapter.d.ts.map +1 -1
- package/dist/core/memory/RealAgentDBAdapter.js +28 -20
- package/dist/core/memory/RealAgentDBAdapter.js.map +1 -1
- package/dist/core/memory/RuVectorPatternStore.d.ts +198 -0
- package/dist/core/memory/RuVectorPatternStore.d.ts.map +1 -0
- package/dist/core/memory/RuVectorPatternStore.js +605 -0
- package/dist/core/memory/RuVectorPatternStore.js.map +1 -0
- package/dist/core/memory/SelfHealingMonitor.d.ts +186 -0
- package/dist/core/memory/SelfHealingMonitor.d.ts.map +1 -0
- package/dist/core/memory/SelfHealingMonitor.js +451 -0
- package/dist/core/memory/SelfHealingMonitor.js.map +1 -0
- package/dist/core/memory/SwarmMemoryManager.d.ts +62 -0
- package/dist/core/memory/SwarmMemoryManager.d.ts.map +1 -1
- package/dist/core/memory/SwarmMemoryManager.js +97 -0
- package/dist/core/memory/SwarmMemoryManager.js.map +1 -1
- package/dist/core/memory/index.d.ts +11 -0
- package/dist/core/memory/index.d.ts.map +1 -1
- package/dist/core/memory/index.js +36 -1
- package/dist/core/memory/index.js.map +1 -1
- package/dist/reasoning/RuVectorReasoningAdapter.d.ts +232 -0
- package/dist/reasoning/RuVectorReasoningAdapter.d.ts.map +1 -0
- package/dist/reasoning/RuVectorReasoningAdapter.js +585 -0
- package/dist/reasoning/RuVectorReasoningAdapter.js.map +1 -0
- package/dist/reasoning/index.d.ts +2 -0
- package/dist/reasoning/index.d.ts.map +1 -1
- package/dist/reasoning/index.js +6 -1
- package/dist/reasoning/index.js.map +1 -1
- package/dist/reporting/ResultAggregator.d.ts +107 -0
- package/dist/reporting/ResultAggregator.d.ts.map +1 -0
- package/dist/reporting/ResultAggregator.js +435 -0
- package/dist/reporting/ResultAggregator.js.map +1 -0
- package/dist/reporting/index.d.ts +48 -0
- package/dist/reporting/index.d.ts.map +1 -0
- package/dist/reporting/index.js +154 -0
- package/dist/reporting/index.js.map +1 -0
- package/dist/reporting/reporters/ControlLoopReporter.d.ts +128 -0
- package/dist/reporting/reporters/ControlLoopReporter.d.ts.map +1 -0
- package/dist/reporting/reporters/ControlLoopReporter.js +417 -0
- package/dist/reporting/reporters/ControlLoopReporter.js.map +1 -0
- package/dist/reporting/reporters/HumanReadableReporter.d.ts +140 -0
- package/dist/reporting/reporters/HumanReadableReporter.d.ts.map +1 -0
- package/dist/reporting/reporters/HumanReadableReporter.js +524 -0
- package/dist/reporting/reporters/HumanReadableReporter.js.map +1 -0
- package/dist/reporting/reporters/JSONReporter.d.ts +193 -0
- package/dist/reporting/reporters/JSONReporter.d.ts.map +1 -0
- package/dist/reporting/reporters/JSONReporter.js +324 -0
- package/dist/reporting/reporters/JSONReporter.js.map +1 -0
- package/dist/reporting/reporters/index.d.ts +14 -0
- package/dist/reporting/reporters/index.d.ts.map +1 -0
- package/dist/reporting/reporters/index.js +19 -0
- package/dist/reporting/reporters/index.js.map +1 -0
- package/dist/reporting/types.d.ts +427 -0
- package/dist/reporting/types.d.ts.map +1 -0
- package/dist/reporting/types.js +12 -0
- package/dist/reporting/types.js.map +1 -0
- package/docs/README.md +839 -0
- package/docs/reference/agents.md +412 -0
- package/docs/reference/skills.md +796 -0
- package/docs/reference/usage.md +512 -0
- package/package.json +12 -1
- package/templates/agent-code-execution-template.md +619 -0
- package/templates/aqe.sh +20 -0
package/README.md
CHANGED
@@ -9,11 +9,11 @@
 <img alt="NPM Downloads" src="https://img.shields.io/npm/dw/agentic-qe">
 
 
-**Version 1.9.
+**Version 1.9.4** (Memory & Learning System Fixes) | [Changelog](CHANGELOG.md) | [Issues](https://github.com/proffesor-for-testing/agentic-qe/issues) | [Discussions](https://github.com/proffesor-for-testing/agentic-qe/discussions)
 
 > Agentic test automation with AI learning, real-time visualization, OpenTelemetry observability, persistent event storage, constitutional AI governance, and intelligent model routing.
 
-🎨 **Real-Time Visualization** | 📊 **Interactive Dashboards** | 🧠 **QE Agent Learning** | 💾 **Event Sourcing** | 📋 **Constitution System** | 📚 **
+🎨 **Real-Time Visualization** | 📊 **Interactive Dashboards** | 🧠 **QE Agent Learning** | 💾 **Event Sourcing** | 📋 **Constitution System** | 📚 **38 QE Skills** | 🎯 **Flaky Detection** | 💰 **Multi-Model Router**
 
 </div>
 
@@ -193,7 +193,7 @@ open http://localhost:3000
 - **Performance Testing**: k6, JMeter, Gatling integration
 - **Real-Time Streaming**: Live progress updates for all operations
 
-### 🎓
+### 🎓 38 QE Skills Library (v1.9.0)
 **95%+ coverage of modern QE practices**
 
 <details>
@@ -206,7 +206,7 @@ open http://localhost:3000
 - **Code Quality**: code-review-quality, refactoring-patterns, quality-metrics
 - **Communication**: bug-reporting-excellence, technical-writing, consultancy-practices
 
-**Phase 2: Expanded QE Skills Library (
+**Phase 2: Expanded QE Skills Library (16 skills)**
 - **Testing Methodologies (7)**: regression-testing, shift-left-testing, shift-right-testing, test-design-techniques, mutation-testing, test-data-management, verification-quality
 - **Specialized Testing (9)**: accessibility-testing, mobile-testing, database-testing, contract-testing, chaos-engineering-resilience, compatibility-testing, localization-testing, compliance-testing, visual-testing-advanced
 - **Testing Infrastructure (2)**: test-environment-management, test-reporting-analytics
@@ -214,7 +214,7 @@ open http://localhost:3000
 **Phase 3: Advanced Quality Engineering Skills (4 skills)**
 - **Strategic Testing Methodologies (4)**: six-thinking-hats, brutal-honesty-review, sherlock-review, cicd-pipeline-qe-orchestrator
 
-**Total:
+**Total: 38 QE Skills** - Includes accessibility testing, shift-left/right testing, verification & quality assurance, visual testing advanced, XP practices, and technical writing
 
 </details>
 
@@ -645,6 +645,47 @@ The test generator automatically delegates to subagents for a complete RED-GREEN
 
 ---
 
+## 📝 What's New in v1.9.4
+
+🔧 **Critical Memory & Learning System Fixes** (2025-11-30)
+
+This release delivers critical fixes to the memory, learning, and patterns system. All QE agents now have a fully functional learning system with proper vector embeddings, Q-value reinforcement learning, and persistent pattern storage.
+
+### Key Fixes
+
+- **Vector embeddings now stored correctly**: Fixed `RealAgentDBAdapter.store()` to properly store 384-dimension embeddings as BLOB data
+- **SQL parameter style bug**: Fixed agentdb's `SqlJsDatabase` wrapper to use spread params instead of array params
+- **HNSW index schema mismatch**: Added `pattern_id` generated column for agentdb's HNSWIndex compatibility
+- **Learning experience retrieval**: Added missing getter methods for Q-learning and experience replay
+- **Hooks saving to wrong database**: Fixed all Claude Code hooks to explicitly export `AGENTDB_PATH` so learning data is saved correctly
+- **CI platform compatibility**: Moved ARM64-only ruvector packages to optionalDependencies for x64 CI compatibility
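The first fix in the list above concerns storing 384-dimension embeddings as BLOB data. As a rough illustration only, a vector can be packed into a byte buffer and read back as sketched below. This is not the code of `RealAgentDBAdapter.store()`: the helper names `embeddingToBlob` and `blobToEmbedding` and the little-endian float32 layout are assumptions made for the example.

```javascript
// Hypothetical helpers for illustration; not part of agentic-qe or agentdb.
const EMBEDDING_DIM = 384;  // dimension quoted in the release notes
const BYTES_PER_FLOAT = 4;  // float32

// Pack a numeric vector into a Buffer suitable for a BLOB column.
function embeddingToBlob(vector) {
  if (vector.length !== EMBEDDING_DIM) {
    throw new RangeError(`expected ${EMBEDDING_DIM} values, got ${vector.length}`);
  }
  const blob = Buffer.alloc(EMBEDDING_DIM * BYTES_PER_FLOAT);
  for (let i = 0; i < vector.length; i++) {
    // Assumed layout: consecutive little-endian float32 values.
    blob.writeFloatLE(vector[i], i * BYTES_PER_FLOAT);
  }
  return blob;
}

// Unpack a BLOB produced by embeddingToBlob back into a Float32Array.
function blobToEmbedding(blob) {
  if (blob.length !== EMBEDDING_DIM * BYTES_PER_FLOAT) {
    throw new RangeError(`unexpected BLOB size: ${blob.length} bytes`);
  }
  const vector = new Float32Array(EMBEDDING_DIM);
  for (let i = 0; i < EMBEDDING_DIM; i++) {
    vector[i] = blob.readFloatLE(i * BYTES_PER_FLOAT);
  }
  return vector;
}
```

Checking the length on both sides means a truncated or mis-sized BLOB fails loudly instead of silently yielding a shorter vector.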
+
+### New Features
+
+- **SwarmMemoryManager learning methods**: `getBestAction()`, `getRecentLearningExperiences()`, `getLearningStats()`, and more
+- **Phase 4 Alerting & Reporting**: AlertManager, FeedbackRouter, StrategyApplicator modules
+- **Quality Gate CI workflow**: GitHub Actions integration for automated quality validation
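The notes name `getBestAction()` without showing its signature. As a generic illustration of what "best action" means in tabular Q-learning, here is a minimal sketch. The `QTable` class, its method names, and the `alpha`/`gamma` defaults are assumptions for the example and are unrelated to the actual `SwarmMemoryManager` API.

```javascript
// Hypothetical tabular Q-learning store; not the agentic-qe implementation.
class QTable {
  constructor(alpha = 0.1, gamma = 0.9) {
    this.alpha = alpha;   // learning rate
    this.gamma = gamma;   // discount factor
    this.q = new Map();   // state -> Map(action -> Q-value)
  }

  // Q-value for a state/action pair; unseen pairs default to 0.
  get(state, action) {
    return this.q.get(state)?.get(action) ?? 0;
  }

  // Highest Q-value known for a state (0 if the state is unseen).
  maxValue(state) {
    const actions = this.q.get(state);
    return actions && actions.size > 0 ? Math.max(...actions.values()) : 0;
  }

  // Standard update: Q <- Q + alpha * (reward + gamma * max Q(next) - Q).
  update(state, action, reward, nextState) {
    const old = this.get(state, action);
    const target = reward + this.gamma * this.maxValue(nextState);
    const updated = old + this.alpha * (target - old);
    if (!this.q.has(state)) this.q.set(state, new Map());
    this.q.get(state).set(action, updated);
    return updated;
  }

  // Action with the highest Q-value, or undefined for an unseen state.
  bestAction(state) {
    const actions = this.q.get(state);
    if (!actions) return undefined;
    let best;
    let bestValue = -Infinity;
    for (const [action, value] of actions) {
      if (value > bestValue) {
        best = action;
        bestValue = value;
      }
    }
    return best;
  }
}
```

A persistent store would keep the same state/action/value triples in a database table rather than in memory. The in-memory `Map` here only keeps the sketch self-contained.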
+
+**Upgrade**: `npm install agentic-qe@1.9.4`
+
+---
+
+## 📝 What's New in v1.9.3
+
+📦 **NPM Package Fix** (2025-11-26)
+
+This release fixes missing files in the npm package distribution that caused `aqe init` to fail.
+
+### Key Fixes
+
+- **Added missing `templates/` directory**: Includes `aqe.sh` wrapper script
+- **Added missing `.claude/helpers/` directory**: Includes 6 helper scripts (checkpoint-manager.sh, github-safe.js, etc.)
+- **Added missing `docs/reference/` directory**: Includes reference documentation (agents.md, skills.md, usage.md)
+
+**Upgrade**: `npm install agentic-qe@1.9.3`
+
+---
+
 ## 📝 What's New in v1.9.2
 
 🐛 **Learning Persistence Fix** (2025-11-26)
@@ -0,0 +1,25 @@
+# OTEL Stack Environment Variables
+# Agentic QE Fleet - Issue #71
+#
+# Copy this file to .env.otel and customize as needed
+# Usage: docker-compose -f config/docker-compose.otel.yml --env-file config/.env.otel up -d
+
+# Deployment environment
+DEPLOYMENT_ENVIRONMENT=development
+
+# Grafana credentials (CHANGE IN PRODUCTION!)
+GRAFANA_ADMIN_USER=admin
+GRAFANA_ADMIN_PASSWORD=admin
+
+# OTEL Collector settings
+OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
+OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
+
+# Service configuration
+SERVICE_NAME=agentic-qe-fleet
+SERVICE_NAMESPACE=agentic-qe
+SERVICE_VERSION=1.9.3
+
+# Prometheus retention
+PROMETHEUS_RETENTION_TIME=15d
+PROMETHEUS_RETENTION_SIZE=10GB
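The file above is a plain `KEY=VALUE` list with `#` comments and ships with `admin`/`admin` Grafana credentials. The sketch below shows one way a pre-deployment script could read such a file and flag the default login. `parseEnvText` and `usesDefaultGrafanaLogin` are hypothetical helpers, and the parser is deliberately simplified (no quoting, no inline comments, no variable expansion).

```javascript
// Hypothetical helpers for illustration; not shipped with agentic-qe.

// Parse KEY=VALUE lines, skipping blank lines and '#' comments.
function parseEnvText(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not an assignment; ignore
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// True when both Grafana credentials are still the example defaults.
function usesDefaultGrafanaLogin(vars) {
  return vars.GRAFANA_ADMIN_USER === 'admin' && vars.GRAFANA_ADMIN_PASSWORD === 'admin';
}

// Example input mirroring a few lines of the file above.
const sampleEnv = [
  '# Grafana credentials (CHANGE IN PRODUCTION!)',
  'GRAFANA_ADMIN_USER=admin',
  'GRAFANA_ADMIN_PASSWORD=admin',
  '',
  'PROMETHEUS_RETENTION_TIME=15d',
].join('\n');

const parsedEnv = parseEnvText(sampleEnv);
if (usesDefaultGrafanaLogin(parsedEnv)) {
  console.warn('Grafana is still configured with the default admin/admin login');
}
```

In a real check the text would come from `fs.readFileSync('config/.env.otel', 'utf8')`, which assumes the example file has already been copied to that path.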
@@ -0,0 +1,137 @@
+# OTEL Stack Quick Reference Card
+
+## 🚀 One-Line Start
+
+```bash
+docker-compose -f config/docker-compose.otel.yml up -d
+```
+
+## 🌐 Service URLs
+
+| Service | URL | Login |
+|---------|-----|-------|
+| **Grafana** | http://localhost:3001 | admin/admin |
+| **Prometheus** | http://localhost:9090 | - |
+| **Jaeger** | http://localhost:16686 | - |
+| **OTEL Health** | http://localhost:13133/health | - |
+
+## 📡 Send Telemetry
+
+### OTLP Endpoints
+- gRPC: `localhost:4317`
+- HTTP: `localhost:4318`
+
+### Node.js Example
+```javascript
+const exporter = new OTLPTraceExporter({
+  url: 'http://localhost:4318/v1/traces'
+});
+```
+
+### cURL Test
+```bash
+curl http://localhost:4318/v1/traces \
+  -H "Content-Type: application/json" \
+  -d @trace.json
+```
+
+## 🔍 Quick Checks
+
+```bash
+# Verify all services
+./scripts/verify-otel-stack.sh
+
+# Check health
+curl http://localhost:13133/health # OTEL Collector
+curl http://localhost:9090/-/healthy # Prometheus
+curl http://localhost:14269/ # Jaeger
+curl http://localhost:3001/api/health # Grafana
+
+# View logs
+docker-compose -f config/docker-compose.otel.yml logs -f
+```
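The four `curl` health checks above can also be run from Node in one pass. The sketch below is one possible shape, not the contents of `verify-otel-stack.sh`: the function names are hypothetical, and it assumes Node 18 or later for the global `fetch` and `AbortSignal.timeout`. The URLs are the ones listed in the quick checks.

```javascript
// Health endpoints copied from the quick checks above.
const HEALTH_ENDPOINTS = [
  { name: 'OTEL Collector', url: 'http://localhost:13133/health' },
  { name: 'Prometheus', url: 'http://localhost:9090/-/healthy' },
  { name: 'Jaeger', url: 'http://localhost:14269/' },
  { name: 'Grafana', url: 'http://localhost:3001/api/health' },
];

// Probe one endpoint; a network error, timeout, or non-2xx status counts as unhealthy.
async function probe({ name, url }, timeoutMs = 2000) {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return { name, url, healthy: res.ok };
  } catch {
    return { name, url, healthy: false };
  }
}

// Turn probe results into a short human-readable report.
function formatReport(results) {
  const lines = results.map(r => `${r.healthy ? 'OK' : 'FAIL'} ${r.name} (${r.url})`);
  const failed = results.filter(r => !r.healthy).length;
  lines.push(failed === 0 ? 'All services healthy' : `${failed} service(s) unhealthy`);
  return lines.join('\n');
}

// Probe every endpoint in parallel and return the report text.
async function checkStack() {
  const results = await Promise.all(HEALTH_ENDPOINTS.map(e => probe(e)));
  return formatReport(results);
}
```

Calling `checkStack().then(console.log)` prints the report. Nothing runs at load time, so the block can be pasted into an existing script without side effects.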
+
+## 🛠️ Common Commands
+
+```bash
+# Start
+docker-compose -f config/docker-compose.otel.yml up -d
+
+# Stop
+docker-compose -f config/docker-compose.otel.yml down
+
+# Restart
+docker-compose -f config/docker-compose.otel.yml restart
+
+# View status
+docker-compose -f config/docker-compose.otel.yml ps
+
+# Logs
+docker-compose -f config/docker-compose.otel.yml logs -f [service]
+
+# Remove everything (INCLUDING DATA!)
+docker-compose -f config/docker-compose.otel.yml down -v
+```
+
+## 📊 Port Reference
+
+### OTEL Collector
+- 4317 - OTLP gRPC
+- 4318 - OTLP HTTP
+- 8889 - Prometheus metrics
+- 13133 - Health check
+
+### Prometheus
+- 9090 - Web UI
+
+### Jaeger
+- 16686 - UI
+- 14269 - Metrics
+
+### Grafana
+- 3001 - Web UI
+
+## 🔧 Configuration Files
+
+- **Docker Compose**: `config/docker-compose.otel.yml`
+- **OTEL Collector**: `config/otel-collector-config.yaml.example`
+- **Prometheus**: `config/prometheus.yml.example`
+- **Grafana Datasources**: `config/grafana/provisioning/datasources/datasources.yml`
+- **Grafana Dashboards**: `config/grafana/provisioning/dashboards/dashboards.yml`
+
+## 📚 Documentation
+
+- **Quick Start**: `config/README-OTEL.md`
+- **Full Summary**: `docs/implementation-plans/issue-71-completion-summary.md`
+- **Architecture**: `docs/architecture/otel-stack-architecture.md`
+
+## 🐛 Quick Troubleshooting
+
+| Issue | Solution |
+|-------|----------|
+| Services not starting | Check logs: `docker-compose -f config/docker-compose.otel.yml logs` |
+| Port already in use | Change external port in `docker-compose.otel.yml` |
+| Grafana can't connect | Check datasources: Grafana → Configuration → Data Sources |
+| No metrics in Prometheus | Check targets: http://localhost:9090/targets |
+| No traces in Jaeger | Verify OTLP endpoint: `curl http://localhost:4318` |
+
+## 🎯 Grafana Datasources
+
+Pre-configured and auto-loaded:
+
+1. **Prometheus** (default) - `http://prometheus:9090`
+2. **Jaeger** - `http://jaeger:16686`
+3. **OTEL Collector Metrics** - `http://otel-collector:8889`
+
+## ✅ Verification Checklist
+
+- [ ] All 4 services running: `docker-compose ps`
+- [ ] Health checks passing: `./scripts/verify-otel-stack.sh`
+- [ ] OTLP endpoints accessible: `curl http://localhost:4318`
+- [ ] Prometheus targets green: http://localhost:9090/targets
+- [ ] Grafana datasources connected: Grafana UI → Data Sources
+- [ ] Sample dashboard visible: Grafana → Dashboards
+
+---
+
+**Issue #71 - COMPLETED** ✅
@@ -0,0 +1,222 @@
+# OTEL Observability Stack - Quick Start Guide
+
+Complete observability stack for the Agentic QE Fleet with OpenTelemetry, Prometheus, Jaeger, and Grafana.
+
+## 🚀 Quick Start
+
+### 1. Start the OTEL Stack
+
+```bash
+# Start only the OTEL stack
+docker-compose -f config/docker-compose.otel.yml up -d
+
+# Or combine with the main application
+docker-compose -f docker-compose.yml -f config/docker-compose.otel.yml up -d
+```
+
+### 2. Access the Services
+
+| Service | URL | Purpose |
+|---------|-----|---------|
+| **Grafana** | http://localhost:3001 | Dashboards and visualization |
+| **Prometheus** | http://localhost:9090 | Metrics storage and querying |
+| **Jaeger UI** | http://localhost:16686 | Distributed tracing |
+| **OTEL Collector** | http://localhost:13133/health | Health check |
+
+### 3. Default Credentials
+
+- **Grafana**: `admin` / `admin` (change on first login)
+
+## 📊 Service Endpoints
+
+### OTEL Collector
+- **OTLP gRPC**: `localhost:4317` - Send traces/metrics via gRPC
+- **OTLP HTTP**: `localhost:4318` - Send traces/metrics via HTTP
+- **Prometheus Exporter**: `localhost:8889` - Metrics endpoint
+- **Health Check**: `localhost:13133` - Collector health
+- **pprof**: `localhost:1777` - Performance profiling
+- **zPages**: `localhost:55679` - Debug interface
+
+### Prometheus
+- **Web UI**: `localhost:9090` - Query and explore metrics
+- **API**: `localhost:9090/api/v1/` - Prometheus HTTP API
+
+### Jaeger
+- **UI**: `localhost:16686` - Trace visualization
+- **OTLP gRPC**: `localhost:4327` - Receive traces (forwarded from collector)
+- **Metrics**: `localhost:14269/metrics` - Jaeger metrics
+- **Health**: `localhost:14269/` - Health check
+
+### Grafana
+- **Web UI**: `localhost:3001` - Dashboards and visualization
+- **API**: `localhost:3001/api/` - Grafana HTTP API
+
+## 🔧 Configuration Files
+
+### Required Files (Already Created)
+- `config/docker-compose.otel.yml` - Docker Compose configuration
+- `config/otel-collector-config.yaml.example` - OTEL Collector config
+- `config/prometheus.yml.example` - Prometheus scrape config
+- `config/grafana/provisioning/datasources/datasources.yml` - Grafana datasources
+- `config/grafana/provisioning/dashboards/dashboards.yml` - Dashboard provisioning
+- `config/grafana/dashboards/agentic-qe-overview.json` - Sample dashboard
+
+### Environment Variables (Optional)
+Copy and customize:
+```bash
+cp config/.env.otel.example config/.env.otel
+```
+
+Then use:
+```bash
+docker-compose -f config/docker-compose.otel.yml --env-file config/.env.otel up -d
+```
+
+## 📈 Using the Stack
+
+### Send Telemetry to OTEL Collector
+
+#### Via HTTP (curl example)
+```bash
+curl -X POST http://localhost:4318/v1/traces \
+  -H "Content-Type: application/json" \
+  -d @trace-data.json
+```
+
+#### Via Node.js Application
+```javascript
+const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
+const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
+
+const provider = new NodeTracerProvider();
+provider.addSpanProcessor(
+  new BatchSpanProcessor(
+    new OTLPTraceExporter({
+      url: 'http://localhost:4318/v1/traces'
+    })
+  )
+);
+provider.register();
+```
+
+### Query Metrics in Prometheus
+
+1. Open http://localhost:9090
+2. Try queries:
+   - `aqe_requests_total` - Total requests
+   - `rate(aqe_requests_total[5m])` - Request rate
+   - `histogram_quantile(0.95, rate(aqe_request_duration_bucket[5m]))` - P95 latency
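The P95 query above estimates a quantile from cumulative histogram buckets by interpolating linearly inside the bucket that contains the target rank. The sketch below reproduces that idea in simplified form so the number on a dashboard is easier to interpret. It is not Prometheus source code: the `histogramQuantile` name, the `{ le, count }` bucket shape, and the sample counts are assumptions for the example.

```javascript
// Simplified illustration of quantile estimation from cumulative buckets.
// `buckets` must be sorted by upper bound `le`, hold cumulative counts,
// and end with an le = Infinity bucket (as Prometheus histograms do).
function histogramQuantile(q, buckets) {
  if (q < 0 || q > 1 || buckets.length < 2) return NaN;
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;

  const rank = q * total;
  // First bucket whose cumulative count reaches the target rank.
  const i = buckets.findIndex(b => b.count >= rank);

  // Rank falls in the +Inf bucket: report the highest finite upper bound.
  if (i === buckets.length - 1) return buckets[buckets.length - 2].le;

  const upper = buckets[i].le;
  const lower = i === 0 ? 0 : buckets[i - 1].le;     // assumes values are >= 0
  const below = i === 0 ? 0 : buckets[i - 1].count;  // observations under `lower`
  const inBucket = buckets[i].count - below;
  if (inBucket === 0) return lower;

  // Linear interpolation within the bucket.
  return lower + (upper - lower) * ((rank - below) / inBucket);
}

// Made-up request durations in seconds: 100 observations in total.
const sampleBuckets = [
  { le: 0.1, count: 50 },
  { le: 0.5, count: 90 },
  { le: 1, count: 98 },
  { le: Infinity, count: 100 },
];

console.log(histogramQuantile(0.95, sampleBuckets));
```

For the sample buckets the target rank is 0.95 × 100 = 95, which falls in the (0.5, 1] bucket, so the estimate is 0.5 + 0.5 × (95 − 90) / 8 = 0.8125 seconds. The result can never be more precise than the bucket boundaries, which is why bucket layout matters for latency metrics.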
+
+### View Traces in Jaeger
+
+1. Open http://localhost:16686
+2. Select service: `agentic-qe-fleet`
+3. Click "Find Traces"
+4. Explore trace details and service dependencies
+
+### Create Dashboards in Grafana
+
+1. Open http://localhost:3001
+2. Login with `admin` / `admin`
+3. Navigate to Dashboards → Agentic QE Fleet → Overview
+4. Or create new dashboards using Prometheus and Jaeger datasources
+
+## 🛠️ Management Commands
+
+### View Logs
+```bash
+# All services
+docker-compose -f config/docker-compose.otel.yml logs -f
+
+# Specific service
+docker-compose -f config/docker-compose.otel.yml logs -f otel-collector
+docker-compose -f config/docker-compose.otel.yml logs -f prometheus
+docker-compose -f config/docker-compose.otel.yml logs -f jaeger
+docker-compose -f config/docker-compose.otel.yml logs -f grafana
+```
+
+### Check Service Health
+```bash
+# OTEL Collector
+curl http://localhost:13133/health
+
+# Prometheus
+curl http://localhost:9090/-/healthy
+
+# Jaeger
+curl http://localhost:14269/
+
+# Grafana
+curl http://localhost:3001/api/health
+```
+
+### Stop Services
+```bash
+# Stop OTEL stack
+docker-compose -f config/docker-compose.otel.yml down
+
+# Stop and remove volumes (CAUTION: deletes all data)
+docker-compose -f config/docker-compose.otel.yml down -v
+```
+
+### Restart Services
+```bash
+# Restart all
+docker-compose -f config/docker-compose.otel.yml restart
+
+# Restart specific service
+docker-compose -f config/docker-compose.otel.yml restart otel-collector
+```
+
+## 🔍 Troubleshooting
+
+### OTEL Collector Not Receiving Data
+1. Check collector logs: `docker-compose -f config/docker-compose.otel.yml logs otel-collector`
+2. Verify endpoints: `curl http://localhost:13133/health`
+3. Check application OTLP endpoint: `http://localhost:4318`
+
+### Prometheus Not Scraping Metrics
+1. Check Prometheus targets: http://localhost:9090/targets
+2. Verify OTEL Collector is exposing metrics: `curl http://localhost:8889/metrics`
+3. Check Prometheus config: `docker-compose -f config/docker-compose.otel.yml exec prometheus cat /etc/prometheus/prometheus.yml`
+
+### Jaeger Not Showing Traces
+1. Check Jaeger logs: `docker-compose -f config/docker-compose.otel.yml logs jaeger`
+2. Verify OTEL Collector is forwarding traces (check collector logs)
+3. Ensure application is sending traces to OTLP endpoint
+
+### Grafana Datasources Not Working
+1. Check datasource configuration: Grafana UI → Configuration → Data Sources
+2. Test datasource connection (should show green checkmark)
+3. Verify Prometheus/Jaeger are accessible from Grafana container
+
+### Performance Issues
+1. Adjust OTEL Collector batch size in `otel-collector-config.yaml.example`
+2. Reduce Prometheus scrape interval in `prometheus.yml.example`
+3. Adjust memory limits for services in `docker-compose.otel.yml`
+
+## 📚 Next Steps
+
+1. **Integrate with Application**: Configure your app to send telemetry to OTLP endpoints
+2. **Create Custom Dashboards**: Build Grafana dashboards for your specific metrics
+3. **Set Up Alerting**: Configure Prometheus alerting rules (see Phase 4 docs)
+4. **Production Hardening**:
+   - Change default passwords
+   - Enable TLS/authentication
+   - Configure persistent storage
+   - Set up backup/restore procedures
+
+## 📖 Related Documentation
+
+- [OTEL Stack Architecture](../docs/architecture/otel-stack-architecture.md)
+- [Phase 4 Alerting Implementation Plan](../docs/implementation-plans/phase4-alerting-implementation-plan.md)
+- [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
+- [Prometheus Documentation](https://prometheus.io/docs/)
+- [Jaeger Documentation](https://www.jaegertracing.io/docs/)
+- [Grafana Documentation](https://grafana.com/docs/)
+
+## 🐛 Issue Tracking
+
+This implementation resolves **Issue #71**: Complete OTEL Stack Docker Compose Configuration
+
+For issues or improvements, please file an issue on the repository.