claude-flow-novice 1.6.5 → 1.6.6
- package/.claude-flow-novice/dist/mcp/mcp-server-sdk.js +45 -0
- package/config/.env.example +178 -0
- package/config/DEPLOYMENT_GUIDE.md +692 -0
- package/config/README-CONFIG.md +331 -0
- package/config/coordination-config.sh +327 -0
- package/config/docker/env.development +53 -0
- package/config/docker/env.production +83 -0
- package/config/docker/env.staging +70 -0
- package/config/k8s/configmap-development.yaml +60 -0
- package/config/k8s/configmap-production.yaml +85 -0
- package/config/k8s/configmap-staging.yaml +76 -0
- package/config/k8s/secret-production.yaml +62 -0
- package/config/k8s/secret-staging.yaml +36 -0
- package/package.json +1 -1
- package/scripts/monitoring/dashboards/rate-limiting-dashboard.json +211 -0
- package/scripts/monitoring/quick-test-alerting.sh +118 -0
- package/scripts/monitoring/quick-test-rate-limiting.sh +206 -0
- package/scripts/monitoring/rate-limiting-monitor.sh +380 -0
@@ -0,0 +1,331 @@
# CLI Coordination Configuration

**Phase 1 Sprint 1.3** - Centralized configuration system for CLI coordination

## Overview

The `coordination-config.sh` file provides a centralized configuration system for the CLI coordination infrastructure. All configuration options have sensible defaults optimized for 100-agent production swarms.

## Usage

### Source Configuration

```bash
# Source in your script
source config/coordination-config.sh

# Configuration auto-loads and validates
# All CFN_* environment variables are now available
```

### View Configuration

```bash
# Execute directly to print current configuration
bash config/coordination-config.sh
```

### Override Defaults

```bash
# Override before sourcing
export CFN_MAX_AGENTS=200
export CFN_SHARD_COUNT=32
source config/coordination-config.sh
```
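
Overriding before sourcing works because the script assigns each setting with Bash's `${VAR:-default}` expansion, which keeps a value that is already set and non-empty and otherwise falls back to the default. A minimal sketch of that behaviour, using the hypothetical names `DEMO_SHARD_COUNT` and `resolve_shard_count` rather than the real config file:

```bash
#!/usr/bin/env bash
# Sketch of the ${VAR:-default} pattern.
# DEMO_SHARD_COUNT and resolve_shard_count are hypothetical names for illustration.

resolve_shard_count() {
    # Use the caller's value when set and non-empty, otherwise fall back to 16.
    echo "${DEMO_SHARD_COUNT:-16}"
}

unset DEMO_SHARD_COUNT
default_value="$(resolve_shard_count)"

DEMO_SHARD_COUNT=32
override_value="$(resolve_shard_count)"

echo "default=$default_value override=$override_value"
```

An override exported after the file has been sourced has no effect on values that were already resolved, which is why the order matters.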

## Configuration Options

### Storage Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `CFN_BASE_DIR` | `/dev/shm/cfn` | Base directory for coordination infrastructure |
| `CFN_METRICS_DIR` | `$CFN_BASE_DIR/metrics` | Metrics storage directory |
| `CFN_METRICS_FILE` | `$CFN_METRICS_DIR/cfn-metrics.jsonl` | Metrics data file |
| `CFN_HEALTH_DIR` | `$CFN_BASE_DIR/health` | Health monitoring directory |
| `CFN_HEALTH_FILE` | `$CFN_HEALTH_DIR/agent-health.jsonl` | Health data file |
| `CFN_ALERT_DIR` | `$CFN_BASE_DIR/alerts` | Alert storage directory |
| `CFN_ALERT_FILE` | `$CFN_ALERT_DIR/cfn-alerts.jsonl` | Alert data file |

**Why `/dev/shm`?** In-memory filesystem for minimal I/O latency (<1ms vs 5-20ms for disk).

### Performance Configuration

| Variable | Default | Range | Description |
|----------|---------|-------|-------------|
| `CFN_MAX_AGENTS` | `100` | 1-1000 | Maximum concurrent agents supported |
| `CFN_SHARD_COUNT` | `16` | 1-64 | Number of shards for metrics distribution |
| `CFN_BATCH_SIZE` | `10` | 1-100 | Metrics per batch for processing |

**Tuning Guidelines:**
- **10-50 agents**: `CFN_SHARD_COUNT=8`
- **50-100 agents**: `CFN_SHARD_COUNT=16` (default)
- **100-200 agents**: `CFN_SHARD_COUNT=32`
- **200+ agents**: `CFN_SHARD_COUNT=64`
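
The tuning guidelines can be turned into a small lookup. The sketch below is illustrative only: `suggest_shard_count` is a hypothetical helper, and two details are assumptions because the guidelines leave them open, namely that each tier's upper bound is inclusive and that counts below 10 use the lowest tier.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an agent count to a shard count per the tuning guidelines.
# Assumptions: upper bounds are inclusive; counts below 10 use the 10-50 tier.
suggest_shard_count() {
    local agents="$1"
    if [ "$agents" -le 50 ]; then
        echo 8
    elif [ "$agents" -le 100 ]; then
        echo 16
    elif [ "$agents" -le 200 ]; then
        echo 32
    else
        echo 64
    fi
}

suggest_shard_count 150
```

A caller could export `CFN_SHARD_COUNT="$(suggest_shard_count "$CFN_MAX_AGENTS")"` before sourcing the config file.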

### Timeout Configuration

| Variable | Default | Range | Description |
|----------|---------|-------|-------------|
| `CFN_COORDINATION_TIMEOUT` | `10000` | 100-300000 | Coordination phase timeout (milliseconds) |
| `CFN_HEALTH_TIMEOUT` | `30` | 1-300 | Health check timeout (seconds) |
| `CFN_MESSAGE_TIMEOUT` | `5000` | 100-60000 | Message delivery timeout (milliseconds) |

**Production Recommendations:**
- **LAN deployment**: Keep defaults
- **WAN/cloud deployment**: `CFN_COORDINATION_TIMEOUT=30000`, `CFN_MESSAGE_TIMEOUT=10000`
- **Low-latency requirement**: `CFN_COORDINATION_TIMEOUT=5000`, `CFN_MESSAGE_TIMEOUT=2000`
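
One way to apply these recommendations is a small profile switch that runs before the config file is sourced. This is a sketch under assumptions: `apply_timeout_profile` and the profile names `lan`, `wan` and `low-latency` are hypothetical and are not part of `coordination-config.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: export timeout overrides for a named deployment profile.
apply_timeout_profile() {
    case "$1" in
        lan)
            # Same values as the documented defaults.
            export CFN_COORDINATION_TIMEOUT=10000 CFN_MESSAGE_TIMEOUT=5000
            ;;
        wan)
            export CFN_COORDINATION_TIMEOUT=30000 CFN_MESSAGE_TIMEOUT=10000
            ;;
        low-latency)
            export CFN_COORDINATION_TIMEOUT=5000 CFN_MESSAGE_TIMEOUT=2000
            ;;
        *)
            echo "ERROR: unknown profile '$1'" >&2
            return 1
            ;;
    esac
}

apply_timeout_profile wan
echo "$CFN_COORDINATION_TIMEOUT $CFN_MESSAGE_TIMEOUT"
```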

### Monitoring Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `CFN_METRICS_ENABLED` | `true` | Enable metrics collection |
| `CFN_ALERTING_ENABLED` | `true` | Enable alerting system |
| `CFN_ALERT_INTERVAL` | `30` | Alert check interval (seconds) |

**Performance Impact:**
- Metrics collection: <1% overhead
- Alerting: <0.1% overhead (background monitoring)

### Alerting Thresholds

| Variable | Default | Range | Description |
|----------|---------|-------|-------------|
| `CFN_ALERT_COORD_TIME_MS` | `10000` | 100-60000 | Coordination time threshold (ms) |
| `CFN_ALERT_DELIVERY_RATE` | `90` | 1-100 | Minimum delivery rate (%) |
| `CFN_ALERT_CONSENSUS_SCORE` | `90` | 1-100 | Minimum consensus score (%) |
| `CFN_ALERT_CONFIDENCE_SCORE` | `75` | 1-100 | Minimum confidence score (%) |

**Threshold Tuning:**
- **Critical systems**: Raise the minimums so alerts fire sooner (95% delivery, 95% consensus)
- **Development**: Lower the minimums so fewer alerts fire (85% delivery, 80% consensus)
- **Production (default)**: Balanced thresholds for actionable alerts
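
Because the percentage thresholds are minimums, a natural reading is that an alert fires when an observed value drops below the configured minimum. The sketch below shows that comparison only; `is_below_minimum` and the sample value are hypothetical, and the actual logic in `lib/alerting.sh` is not shown in this document.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed when an observed percentage is below a minimum.
# This is not the logic of lib/alerting.sh; it only illustrates the comparison.
is_below_minimum() {
    local observed="$1" minimum="$2"
    [ "$observed" -lt "$minimum" ]
}

observed_delivery_rate=88   # made-up sample value

for minimum in 95 90 85; do
    if is_below_minimum "$observed_delivery_rate" "$minimum"; then
        echo "minimum ${minimum}%: ALERT (observed ${observed_delivery_rate}%)"
    else
        echo "minimum ${minimum}%: ok (observed ${observed_delivery_rate}%)"
    fi
done
```

With the sample value of 88%, the 95% and 90% minimums raise an alert while the 85% minimum does not.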

### Data Retention Configuration

| Variable | Default | Range (hours) | Description |
|----------|---------|---------------|-------------|
| `CFN_METRICS_RETENTION_HOURS` | `48` | 1-720 | Metrics retention period |
| `CFN_ALERT_RETENTION_HOURS` | `24` | 1-720 | Alert retention period |
| `CFN_HEALTH_RETENTION_HOURS` | `12` | 1-720 | Health check retention period |

**Storage Estimates** (100 agents, default retention):
- Metrics: ~50MB (48h × 100 agents × 10KB/h)
- Alerts: ~5MB (24h × average 5 alerts/h × 40KB)
- Health: ~12MB (12h × 100 agents × 10KB/h)
- **Total**: ~67MB in `/dev/shm`
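
The estimate can be recomputed for other agent counts and retention periods. The sketch below reuses the per-record figures quoted above (10KB per agent-hour, 5 alerts per hour at 40KB each); `estimate_storage_kb` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate storage in KB.
# Args: $1=agents, $2=metrics hours, $3=alert hours, $4=health hours
estimate_storage_kb() {
    local agents="$1" metrics_hours="$2" alert_hours="$3" health_hours="$4"
    local metrics_kb=$(( metrics_hours * agents * 10 ))   # 10KB per agent-hour
    local alerts_kb=$(( alert_hours * 5 * 40 ))           # 5 alerts/h at 40KB each
    local health_kb=$(( health_hours * agents * 10 ))     # 10KB per agent-hour
    echo $(( metrics_kb + alerts_kb + health_kb ))
}

total_kb="$(estimate_storage_kb 100 48 24 12)"
echo "Estimated total: ${total_kb} KB (~$(( total_kb / 1000 )) MB)"
```

For the defaults the unrounded products sum to 64,800KB, slightly under the ~67MB total, which adds up the rounded per-category figures.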

## Validation

Configuration is automatically validated when loaded:

```bash
source config/coordination-config.sh
# Output on success:
# Configuration loaded successfully

# Output on error:
# ERROR: CFN_MAX_AGENTS must be 1-1000, got 2000
# ERROR: Configuration validation failed
```

### Validation Rules

1. **Numeric ranges**: All numeric values validated against min/max
2. **Boolean values**: Must be `true` or `false`
3. **Directory permissions**: Parent directories must exist and be writable
4. **Dependencies**: Directories created automatically if missing

## Functions

### `validate_config()`

Validates all configuration values against defined ranges.

**Returns:** `0` on success, error count on failure

**Example:**
```bash
source config/coordination-config.sh
if validate_config; then
    echo "Configuration valid"
fi
```
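
Since `validate_config` returns the number of failed checks rather than a plain `1`, a caller may want to capture that count. The sketch below uses a stand-in `validate_config` that pretends two checks failed so that it runs on its own; in real use the function comes from `coordination-config.sh`.

```bash
#!/usr/bin/env bash
# Stand-in for the real validate_config so the sketch is self-contained.
# It pretends that two settings failed validation.
validate_config() {
    return 2
}

error_count=0
# "|| error_count=$?" records the return value without tripping "set -e".
validate_config || error_count=$?

if [ "$error_count" -gt 0 ]; then
    echo "Configuration has $error_count invalid setting(s)" >&2
fi
```

The `||` form matters because the config script enables `set -e`, under which a bare failing call would end the shell before `$?` could be read.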

### `load_config()`

Loads and validates configuration, creates required directories.

**Returns:** `0` on success, `1` on failure

**Example:**
```bash
if load_config; then
    # Configuration loaded successfully
    echo "Ready to start coordination"
fi
```

### `print_config()`

Displays current configuration values in human-readable format.

**Example:**
```bash
source config/coordination-config.sh
print_config
# Outputs formatted configuration table
```

### `init_directories()`

Creates all required directories for coordination infrastructure.

**Returns:** `0` on success, `1` on failure

**Example:**
```bash
if init_directories; then
    echo "Directories initialized"
fi
```

## Environment-Specific Configurations

### Development Environment

```bash
export CFN_MAX_AGENTS=10
export CFN_SHARD_COUNT=4
export CFN_ALERT_DELIVERY_RATE=80
export CFN_METRICS_RETENTION_HOURS=12
source config/coordination-config.sh
```

### Staging Environment

```bash
export CFN_MAX_AGENTS=50
export CFN_SHARD_COUNT=8
export CFN_COORDINATION_TIMEOUT=15000
export CFN_METRICS_RETENTION_HOURS=24
source config/coordination-config.sh
```

### Production Environment (High-Scale)

```bash
export CFN_MAX_AGENTS=200
export CFN_SHARD_COUNT=32
export CFN_ALERT_CONSENSUS_SCORE=95
export CFN_METRICS_RETENTION_HOURS=72
source config/coordination-config.sh
```
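
The three override sets can be kept in one place and selected at run time. This is a sketch under assumptions: `apply_environment_overrides` is a hypothetical helper and `CFN_ENV` is a hypothetical selector variable that the config file does not define.

```bash
#!/usr/bin/env bash
# Hypothetical helper: export the override set for one environment.
# CFN_ENV is an assumed selector variable, not part of coordination-config.sh.
apply_environment_overrides() {
    case "${1:-}" in
        development)
            export CFN_MAX_AGENTS=10 CFN_SHARD_COUNT=4
            export CFN_ALERT_DELIVERY_RATE=80 CFN_METRICS_RETENTION_HOURS=12
            ;;
        staging)
            export CFN_MAX_AGENTS=50 CFN_SHARD_COUNT=8
            export CFN_COORDINATION_TIMEOUT=15000 CFN_METRICS_RETENTION_HOURS=24
            ;;
        production)
            export CFN_MAX_AGENTS=200 CFN_SHARD_COUNT=32
            export CFN_ALERT_CONSENSUS_SCORE=95 CFN_METRICS_RETENTION_HOURS=72
            ;;
        *)
            echo "ERROR: unknown environment '${1:-}'" >&2
            return 1
            ;;
    esac
}

apply_environment_overrides "${CFN_ENV:-development}"
echo "agents=$CFN_MAX_AGENTS shards=$CFN_SHARD_COUNT"
# A real script would now run: source config/coordination-config.sh
```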

## Integration Examples

### Metrics Library Integration

```bash
#!/usr/bin/env bash
source config/coordination-config.sh
source lib/metrics.sh

# Metrics automatically use CFN_METRICS_FILE from config
emit_coordination_time 150 5 "coordination-phase"
```

### Health Monitoring Integration

```bash
#!/usr/bin/env bash
source config/coordination-config.sh
source lib/health.sh

# Health checks automatically use CFN_HEALTH_FILE from config
check_agent_health "agent-1"
```

### Alerting Integration

```bash
#!/usr/bin/env bash
source config/coordination-config.sh
source lib/alerting.sh

# Alerting uses thresholds from config
check_thresholds "$CFN_METRICS_FILE"
```

## Troubleshooting

### Configuration Validation Fails

**Symptom:** `ERROR: Configuration validation failed`

**Solution:**
1. Check error messages for specific failures
2. Verify environment variable values are within ranges
3. Ensure parent directories exist and are writable

### Directory Creation Fails

**Symptom:** `ERROR: Failed to create directory: /dev/shm/cfn`

**Solution:**
1. Verify `/dev/shm` is mounted: `df -h /dev/shm`
2. Check permissions: `ls -ld /dev/shm`
3. Try alternative base directory: `export CFN_BASE_DIR=/tmp/cfn`
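
The fallback in step 3 can be automated by probing candidate locations before sourcing the config. The sketch below applies the same test the validation uses (the parent directory exists and is writable); `pick_base_dir` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first candidate whose parent directory
# exists and is writable; fail if none qualifies.
pick_base_dir() {
    local candidate parent
    for candidate in "$@"; do
        parent="$(dirname "$candidate")"
        if [ -d "$parent" ] && [ -w "$parent" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if base_dir="$(pick_base_dir /dev/shm/cfn /tmp/cfn)"; then
    export CFN_BASE_DIR="$base_dir"
    echo "Using CFN_BASE_DIR=$CFN_BASE_DIR"
else
    echo "ERROR: no writable base directory found" >&2
fi
```

Falling back to `/tmp` trades the in-memory latency of `/dev/shm` for availability, so it suits development more than production.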

### Performance Issues

**Symptom:** High coordination times or low delivery rates

**Solution:**
1. Increase shard count for agent count
2. Increase timeout values for network latency
3. Monitor `/dev/shm` space: `df -h /dev/shm`
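
The space check in step 3 can be scripted so that it runs alongside other monitoring. This sketch assumes a POSIX `df` (the `-P -k` output puts available KB in the fourth column of the second line); `free_space_kb` is a hypothetical helper and the 70000KB floor is an assumed value chosen to sit just above the ~67MB estimate for 100 agents.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the free space, in KB, of the filesystem holding a path.
free_space_kb() {
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

required_kb=70000        # assumed floor, just above the ~67MB estimate
check_dir=/dev/shm
[ -d "$check_dir" ] || check_dir=/tmp   # fall back where /dev/shm does not exist

available_kb="$(free_space_kb "$check_dir")"
if [ "$available_kb" -lt "$required_kb" ]; then
    echo "WARNING: only ${available_kb} KB free in $check_dir" >&2
else
    echo "OK: ${available_kb} KB free in $check_dir"
fi
```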

## Best Practices

1. **Environment Variables First**: Set overrides before sourcing
2. **Validate Early**: Source config at script start to catch errors early
3. **Document Overrides**: Comment why production values differ from defaults
4. **Monitor Retention**: Adjust retention based on `/dev/shm` size
5. **Test Validation**: Run with invalid values to verify validation works

## Migration from Hardcoded Values

### Before (Hardcoded)

```bash
METRICS_FILE="/dev/shm/cfn-metrics.jsonl"
MAX_AGENTS=100
COORD_TIMEOUT=10000
```

### After (Centralized Config)

```bash
source config/coordination-config.sh
# All values available as CFN_* variables
# Numeric defaults match the previous hardcoded values; the metrics file
# default moves to $CFN_METRICS_DIR/cfn-metrics.jsonl (under /dev/shm/cfn/metrics)
# Can override with environment variables
```

## Related Documentation

- [Metrics Infrastructure](../lib/README-METRICS.md)
- [Health Monitoring](../lib/README-HEALTH.md)
- [Alerting System](../lib/README-ALERTING.md)
- [CLI Coordination Architecture](../planning/agent-coordination-v2/METRICS_COLLECTION_ARCHITECTURE.md)

## Version History

- **v1.0.0** (Sprint 1.3): Initial centralized configuration system
  - Environment variable overrides
  - Comprehensive validation
  - Auto-initialization of directories
  - Default values optimized for 100-agent swarm
@@ -0,0 +1,327 @@
#!/usr/bin/env bash
# CLI Coordination Configuration
# Phase 1 Sprint 1.3: Centralized configuration system

# NOTE: when this file is sourced, these options also apply to the calling shell.
set -euo pipefail

# ==============================================================================
# STORAGE CONFIGURATION
# ==============================================================================

# Base directory for coordination infrastructure (default: /dev/shm for performance)
export CFN_BASE_DIR="${CFN_BASE_DIR:-/dev/shm/cfn}"

# Metrics storage
export CFN_METRICS_DIR="${CFN_METRICS_DIR:-$CFN_BASE_DIR/metrics}"
export CFN_METRICS_FILE="${CFN_METRICS_FILE:-$CFN_METRICS_DIR/cfn-metrics.jsonl}"

# Health monitoring storage
export CFN_HEALTH_DIR="${CFN_HEALTH_DIR:-$CFN_BASE_DIR/health}"
export CFN_HEALTH_FILE="${CFN_HEALTH_FILE:-$CFN_HEALTH_DIR/agent-health.jsonl}"

# Alert storage
export CFN_ALERT_DIR="${CFN_ALERT_DIR:-$CFN_BASE_DIR/alerts}"
export CFN_ALERT_FILE="${CFN_ALERT_FILE:-$CFN_ALERT_DIR/cfn-alerts.jsonl}"

# ==============================================================================
# PERFORMANCE CONFIGURATION
# ==============================================================================

# Maximum concurrent agents (default: 100 for production swarms)
export CFN_MAX_AGENTS="${CFN_MAX_AGENTS:-100}"

# Shard count for metrics distribution (default: 16 for optimal load distribution)
export CFN_SHARD_COUNT="${CFN_SHARD_COUNT:-16}"

# Batch size for metric processing (default: 10 metrics per batch)
export CFN_BATCH_SIZE="${CFN_BATCH_SIZE:-10}"

# ==============================================================================
# TIMEOUT CONFIGURATION (milliseconds unless noted)
# ==============================================================================

# Coordination phase timeout in milliseconds (default: 10 seconds)
export CFN_COORDINATION_TIMEOUT="${CFN_COORDINATION_TIMEOUT:-10000}"

# Health check timeout in seconds, not milliseconds (default: 30 seconds)
export CFN_HEALTH_TIMEOUT="${CFN_HEALTH_TIMEOUT:-30}"

# Message delivery timeout in milliseconds (default: 5 seconds)
export CFN_MESSAGE_TIMEOUT="${CFN_MESSAGE_TIMEOUT:-5000}"

# ==============================================================================
# MONITORING CONFIGURATION
# ==============================================================================

# Enable metrics collection (default: true)
export CFN_METRICS_ENABLED="${CFN_METRICS_ENABLED:-true}"

# Enable alerting system (default: true)
export CFN_ALERTING_ENABLED="${CFN_ALERTING_ENABLED:-true}"

# Alert check interval in seconds (default: 30 seconds)
export CFN_ALERT_INTERVAL="${CFN_ALERT_INTERVAL:-30}"

# ==============================================================================
# ALERTING THRESHOLDS
# ==============================================================================

# Coordination time alert threshold in milliseconds (default: 10000ms = 10s)
export CFN_ALERT_COORD_TIME_MS="${CFN_ALERT_COORD_TIME_MS:-10000}"

# Delivery rate alert threshold percentage (default: 90%)
export CFN_ALERT_DELIVERY_RATE="${CFN_ALERT_DELIVERY_RATE:-90}"

# Consensus score alert threshold percentage (default: 90%)
export CFN_ALERT_CONSENSUS_SCORE="${CFN_ALERT_CONSENSUS_SCORE:-90}"

# Confidence score alert threshold percentage (default: 75%)
export CFN_ALERT_CONFIDENCE_SCORE="${CFN_ALERT_CONFIDENCE_SCORE:-75}"

# ==============================================================================
# DATA RETENTION CONFIGURATION
# ==============================================================================

# Metrics retention in hours (default: 48 hours)
export CFN_METRICS_RETENTION_HOURS="${CFN_METRICS_RETENTION_HOURS:-48}"

# Alert retention in hours (default: 24 hours)
export CFN_ALERT_RETENTION_HOURS="${CFN_ALERT_RETENTION_HOURS:-24}"

# Health check retention in hours (default: 12 hours)
export CFN_HEALTH_RETENTION_HOURS="${CFN_HEALTH_RETENTION_HOURS:-12}"

# ==============================================================================
# VALIDATION FUNCTIONS
# ==============================================================================

# validate_numeric_range - Validate numeric value within range
# Args: $1=value, $2=min, $3=max, $4=name
validate_numeric_range() {
    local value="$1"
    local min="$2"
    local max="$3"
    local name="$4"

    if ! [[ "$value" =~ ^[0-9]+$ ]]; then
        echo "ERROR: $name must be numeric, got '$value'" >&2
        return 1
    fi

    if [ "$value" -lt "$min" ] || [ "$value" -gt "$max" ]; then
        echo "ERROR: $name must be $min-$max, got $value" >&2
        return 1
    fi

    return 0
}

# validate_boolean - Validate boolean value
# Args: $1=value, $2=name
validate_boolean() {
    local value="$1"
    local name="$2"

    if [[ "$value" != "true" && "$value" != "false" ]]; then
        echo "ERROR: $name must be 'true' or 'false', got '$value'" >&2
        return 1
    fi

    return 0
}

# validate_directory_writable - Validate directory is writable
# Args: $1=path, $2=name
validate_directory_writable() {
    local path="$1"
    local name="$2"
    local parent_dir
    parent_dir="$(dirname "$path")"

    if [ ! -d "$parent_dir" ]; then
        echo "ERROR: Parent directory of $name does not exist: $parent_dir" >&2
        return 1
    fi

    if [ ! -w "$parent_dir" ]; then
        echo "ERROR: Cannot write to parent directory of $name: $parent_dir" >&2
        return 1
    fi

    return 0
}

# validate_config - Validate all configuration values
validate_config() {
    local errors=0

    # Validate performance configuration
    if ! validate_numeric_range "$CFN_MAX_AGENTS" 1 1000 "CFN_MAX_AGENTS"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_SHARD_COUNT" 1 64 "CFN_SHARD_COUNT"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_BATCH_SIZE" 1 100 "CFN_BATCH_SIZE"; then
        errors=$((errors + 1))
    fi

    # Validate timeout configuration
    if ! validate_numeric_range "$CFN_COORDINATION_TIMEOUT" 100 300000 "CFN_COORDINATION_TIMEOUT"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_HEALTH_TIMEOUT" 1 300 "CFN_HEALTH_TIMEOUT"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_MESSAGE_TIMEOUT" 100 60000 "CFN_MESSAGE_TIMEOUT"; then
        errors=$((errors + 1))
    fi

    # Validate monitoring configuration
    if ! validate_boolean "$CFN_METRICS_ENABLED" "CFN_METRICS_ENABLED"; then
        errors=$((errors + 1))
    fi

    if ! validate_boolean "$CFN_ALERTING_ENABLED" "CFN_ALERTING_ENABLED"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_ALERT_INTERVAL" 1 3600 "CFN_ALERT_INTERVAL"; then
        errors=$((errors + 1))
    fi

    # Validate threshold configuration
    if ! validate_numeric_range "$CFN_ALERT_COORD_TIME_MS" 100 60000 "CFN_ALERT_COORD_TIME_MS"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_ALERT_DELIVERY_RATE" 1 100 "CFN_ALERT_DELIVERY_RATE"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_ALERT_CONSENSUS_SCORE" 1 100 "CFN_ALERT_CONSENSUS_SCORE"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_ALERT_CONFIDENCE_SCORE" 1 100 "CFN_ALERT_CONFIDENCE_SCORE"; then
        errors=$((errors + 1))
    fi

    # Validate retention configuration
    if ! validate_numeric_range "$CFN_METRICS_RETENTION_HOURS" 1 720 "CFN_METRICS_RETENTION_HOURS"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_ALERT_RETENTION_HOURS" 1 720 "CFN_ALERT_RETENTION_HOURS"; then
        errors=$((errors + 1))
    fi

    if ! validate_numeric_range "$CFN_HEALTH_RETENTION_HOURS" 1 720 "CFN_HEALTH_RETENTION_HOURS"; then
        errors=$((errors + 1))
    fi

    # Validate directory permissions
    if ! validate_directory_writable "$CFN_BASE_DIR" "CFN_BASE_DIR"; then
        errors=$((errors + 1))
    fi

    return $errors
}

# ==============================================================================
# INITIALIZATION FUNCTIONS
# ==============================================================================

# init_directories - Create required directories
init_directories() {
    # Keep the loop variable out of the caller's scope when this file is sourced
    local dir

    mkdir -p "$CFN_BASE_DIR" "$CFN_METRICS_DIR" "$CFN_HEALTH_DIR" "$CFN_ALERT_DIR" 2>/dev/null || true

    # Verify directories were created
    for dir in "$CFN_BASE_DIR" "$CFN_METRICS_DIR" "$CFN_HEALTH_DIR" "$CFN_ALERT_DIR"; do
        if [ ! -d "$dir" ]; then
            echo "ERROR: Failed to create directory: $dir" >&2
            return 1
        fi
    done

    return 0
}

# load_config - Load and validate configuration
load_config() {
    # Validate configuration
    if ! validate_config; then
        echo "ERROR: Configuration validation failed" >&2
        return 1
    fi

    # Initialize directories
    if ! init_directories; then
        echo "ERROR: Failed to initialize directories" >&2
        return 1
    fi

    echo "Configuration loaded successfully" >&2
    return 0
}

# print_config - Display current configuration
print_config() {
    cat <<EOF
CLI Coordination Configuration:

STORAGE:
  Base Directory: $CFN_BASE_DIR
  Metrics Directory: $CFN_METRICS_DIR
  Health Directory: $CFN_HEALTH_DIR
  Alert Directory: $CFN_ALERT_DIR

PERFORMANCE:
  Max Agents: $CFN_MAX_AGENTS
  Shard Count: $CFN_SHARD_COUNT
  Batch Size: $CFN_BATCH_SIZE

TIMEOUTS (ms):
  Coordination: $CFN_COORDINATION_TIMEOUT
  Health Check: $CFN_HEALTH_TIMEOUT (seconds)
  Message Delivery: $CFN_MESSAGE_TIMEOUT

MONITORING:
  Metrics Enabled: $CFN_METRICS_ENABLED
  Alerting Enabled: $CFN_ALERTING_ENABLED
  Alert Interval: $CFN_ALERT_INTERVAL seconds

THRESHOLDS:
  Coordination Time: ${CFN_ALERT_COORD_TIME_MS}ms
  Delivery Rate: ${CFN_ALERT_DELIVERY_RATE}%
  Consensus Score: ${CFN_ALERT_CONSENSUS_SCORE}%
  Confidence Score: ${CFN_ALERT_CONFIDENCE_SCORE}%

RETENTION (hours):
  Metrics: $CFN_METRICS_RETENTION_HOURS
  Alerts: $CFN_ALERT_RETENTION_HOURS
  Health Checks: $CFN_HEALTH_RETENTION_HOURS
EOF
}

# ==============================================================================
# MAIN EXECUTION
# ==============================================================================

# Print the configuration when executed directly; auto-load it when sourced
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
    # Script executed directly - print configuration
    if load_config; then
        print_config
        exit 0
    else
        exit 1
    fi
else
    # Script sourced - auto-load configuration
    load_config
fi
@@ -0,0 +1,53 @@
# Docker Development Environment
# Used by: docker-compose.dev.yml

# Environment
NODE_ENV=development

# Agent Configuration (Development: 10 agents)
CFN_MAX_AGENTS=10
CFN_SHARD_COUNT=4
CFN_METRICS_ENABLED=true
CFN_ALERTING_ENABLED=false
CFN_LOG_LEVEL=debug
CFN_CONSENSUS_THRESHOLD=0.90
CFN_ALERT_COORD_TIME_MS=3000

# Memory and Storage
CFN_BASE_DIR=/tmp/cfn
CFN_AGENT_MEMORY_LIMIT_MB=100
CFN_TOTAL_MEMORY_LIMIT_MB=2048

# Performance
CFN_AGENT_TIMEOUT_MS=30000
CFN_CONSENSUS_TIMEOUT_MS=60000
CFN_SWARM_INIT_TIMEOUT_MS=10000
CFN_ENABLE_CACHING=true
CFN_CACHE_TTL_SECONDS=300
CFN_MAX_CONCURRENT_OPERATIONS=10

# Monitoring
CFN_METRICS_COLLECTION_INTERVAL_MS=5000
CFN_METRICS_RETENTION_HOURS=24
CFN_TRACK_COORDINATION_TIME=true
CFN_TRACK_MEMORY_USAGE=true
CFN_TRACK_AGENT_LIFECYCLE=true

# Security (Development: Disabled)
CFN_ENABLE_AGENT_AUTH=false
CFN_ENABLE_TLS=false
CFN_ENABLE_RATE_LIMITING=false

# MCP Server
CFN_MCP_SERVER_ENABLED=true
CFN_MCP_SERVER_PORT=3000

# Testing and Debugging
CFN_TEST_MODE=false
CFN_DEBUG_AGENT_SPAWN=true
CFN_DEBUG_CONSENSUS=true
CFN_DEBUG_MEMORY=true
CFN_VERBOSE_LOGGING=true

# Consensus Algorithm
CFN_CONSENSUS_ALGORITHM=byzantine