cfsa-antigravity 2.0.0 → 2.2.0
- package/README.md +14 -0
- package/package.json +1 -1
- package/template/.agent/instructions/commands.md +8 -32
- package/template/.agent/instructions/example.md +21 -0
- package/template/.agent/instructions/patterns.md +3 -3
- package/template/.agent/instructions/tech-stack.md +71 -23
- package/template/.agent/instructions/workflow.md +12 -1
- package/template/.agent/rules/completion-checklist.md +6 -0
- package/template/.agent/rules/security-first.md +3 -3
- package/template/.agent/rules/vertical-slices.md +1 -1
- package/template/.agent/skill-library/MANIFEST.md +6 -0
- package/template/.agent/skill-library/stack/devops/git-advanced/SKILL.md +972 -0
- package/template/.agent/skill-library/stack/devops/git-workflow/SKILL.md +420 -0
- package/template/.agent/skills/api-versioning/SKILL.md +44 -298
- package/template/.agent/skills/api-versioning/references/typescript.md +157 -0
- package/template/.agent/skills/architecture-mapping/SKILL.md +13 -13
- package/template/.agent/skills/bootstrap-agents/SKILL.md +151 -152
- package/template/.agent/skills/clean-code/SKILL.md +64 -118
- package/template/.agent/skills/clean-code/references/typescript.md +126 -0
- package/template/.agent/skills/database-schema-design/SKILL.md +93 -317
- package/template/.agent/skills/database-schema-design/references/relational.md +228 -0
- package/template/.agent/skills/error-handling-patterns/SKILL.md +62 -557
- package/template/.agent/skills/error-handling-patterns/references/go.md +162 -0
- package/template/.agent/skills/error-handling-patterns/references/python.md +262 -0
- package/template/.agent/skills/error-handling-patterns/references/rust.md +112 -0
- package/template/.agent/skills/error-handling-patterns/references/typescript.md +178 -0
- package/template/.agent/skills/idea-extraction/SKILL.md +322 -224
- package/template/.agent/skills/logging-best-practices/SKILL.md +108 -767
- package/template/.agent/skills/logging-best-practices/references/go.md +49 -0
- package/template/.agent/skills/logging-best-practices/references/python.md +52 -0
- package/template/.agent/skills/logging-best-practices/references/typescript.md +215 -0
- package/template/.agent/skills/migration-management/SKILL.md +127 -311
- package/template/.agent/skills/migration-management/references/relational.md +214 -0
- package/template/.agent/skills/parallel-feature-development/SKILL.md +34 -43
- package/template/.agent/skills/pipeline-rubrics/references/be-rubric.md +1 -1
- package/template/.agent/skills/pipeline-rubrics/references/ia-rubric.md +2 -2
- package/template/.agent/skills/pipeline-rubrics/references/scoring.md +1 -1
- package/template/.agent/skills/pipeline-rubrics/references/vision-rubric.md +2 -1
- package/template/.agent/skills/prd-templates/SKILL.md +23 -6
- package/template/.agent/skills/prd-templates/references/be-spec-template.md +2 -2
- package/template/.agent/skills/prd-templates/references/decomposition-templates.md +2 -2
- package/template/.agent/skills/prd-templates/references/engineering-standards-template.md +2 -0
- package/template/.agent/skills/prd-templates/references/fe-spec-template.md +1 -1
- package/template/.agent/skills/prd-templates/references/fractal-cx-template.md +58 -0
- package/template/.agent/skills/prd-templates/references/fractal-feature-template.md +93 -0
- package/template/.agent/skills/prd-templates/references/fractal-node-index-template.md +55 -0
- package/template/.agent/skills/prd-templates/references/ideation-crosscut-template.md +26 -47
- package/template/.agent/skills/prd-templates/references/ideation-index-template.md +47 -31
- package/template/.agent/skills/prd-templates/references/operational-templates.md +1 -1
- package/template/.agent/skills/prd-templates/references/placeholder-workflow-mapping.md +50 -21
- package/template/.agent/skills/prd-templates/references/skill-loading-protocol.md +32 -0
- package/template/.agent/skills/prd-templates/references/slice-completion-gates.md +29 -0
- package/template/.agent/skills/prd-templates/references/spec-coverage-sweep.md +3 -3
- package/template/.agent/skills/prd-templates/references/tdd-testing-policy.md +39 -0
- package/template/.agent/skills/prd-templates/references/vision-template.md +8 -8
- package/template/.agent/skills/regex-patterns/SKILL.md +122 -540
- package/template/.agent/skills/regex-patterns/references/go.md +44 -0
- package/template/.agent/skills/regex-patterns/references/javascript.md +63 -0
- package/template/.agent/skills/regex-patterns/references/python.md +77 -0
- package/template/.agent/skills/regex-patterns/references/rust.md +43 -0
- package/template/.agent/skills/resolve-ambiguity/SKILL.md +1 -1
- package/template/.agent/skills/session-continuity/SKILL.md +11 -9
- package/template/.agent/skills/session-continuity/protocols/02-progress-generation.md +2 -2
- package/template/.agent/skills/session-continuity/protocols/04-pattern-extraction.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/05-session-close.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/09-parallel-claim.md +1 -1
- package/template/.agent/skills/session-continuity/protocols/10-placeholder-verification-gate.md +57 -78
- package/template/.agent/skills/session-continuity/protocols/11-parallel-synthesis.md +1 -1
- package/template/.agent/skills/spec-writing/SKILL.md +1 -1
- package/template/.agent/skills/tdd-workflow/SKILL.md +94 -317
- package/template/.agent/skills/tdd-workflow/references/typescript.md +231 -0
- package/template/.agent/skills/testing-strategist/SKILL.md +74 -687
- package/template/.agent/skills/testing-strategist/references/typescript.md +328 -0
- package/template/.agent/skills/workflow-automation/SKILL.md +62 -154
- package/template/.agent/skills/workflow-automation/references/inngest.md +88 -0
- package/template/.agent/skills/workflow-automation/references/temporal.md +64 -0
- package/template/.agent/workflows/bootstrap-agents-fill.md +85 -143
- package/template/.agent/workflows/bootstrap-agents-provision.md +90 -107
- package/template/.agent/workflows/create-prd-architecture.md +23 -16
- package/template/.agent/workflows/create-prd-compile.md +11 -12
- package/template/.agent/workflows/create-prd-design-system.md +1 -1
- package/template/.agent/workflows/create-prd-security.md +9 -11
- package/template/.agent/workflows/create-prd-stack.md +10 -4
- package/template/.agent/workflows/create-prd.md +9 -9
- package/template/.agent/workflows/decompose-architecture-structure.md +4 -6
- package/template/.agent/workflows/decompose-architecture-validate.md +18 -1
- package/template/.agent/workflows/decompose-architecture.md +18 -3
- package/template/.agent/workflows/evolve-contract.md +11 -11
- package/template/.agent/workflows/evolve-feature-classify.md +14 -6
- package/template/.agent/workflows/ideate-discover.md +72 -107
- package/template/.agent/workflows/ideate-extract.md +84 -63
- package/template/.agent/workflows/ideate-validate.md +26 -22
- package/template/.agent/workflows/ideate.md +9 -9
- package/template/.agent/workflows/implement-slice-setup.md +25 -23
- package/template/.agent/workflows/implement-slice-tdd.md +73 -89
- package/template/.agent/workflows/implement-slice.md +4 -4
- package/template/.agent/workflows/plan-phase-preflight.md +6 -2
- package/template/.agent/workflows/plan-phase-write.md +6 -8
- package/template/.agent/workflows/remediate-pipeline-assess.md +2 -1
- package/template/.agent/workflows/resolve-ambiguity.md +2 -2
- package/template/.agent/workflows/update-architecture-map.md +22 -5
- package/template/.agent/workflows/validate-phase-quality.md +155 -0
- package/template/.agent/workflows/validate-phase-readiness.md +167 -0
- package/template/.agent/workflows/validate-phase.md +19 -157
- package/template/.agent/workflows/verify-infrastructure.md +10 -10
- package/template/.agent/workflows/write-architecture-spec-design.md +23 -14
- package/template/.agent/workflows/write-be-spec-classify.md +25 -21
- package/template/.agent/workflows/write-be-spec.md +1 -1
- package/template/.agent/workflows/write-fe-spec-classify.md +6 -12
- package/template/.agent/workflows/write-fe-spec-write.md +1 -1
- package/template/AGENTS.md +6 -2
- package/template/GEMINI.md +5 -3
- package/template/docs/README.md +10 -10
- package/template/docs/kit-architecture.md +126 -33
- package/template/docs/plans/ideation/README.md +8 -3
- package/template/.agent/skills/prd-templates/references/ideation-domain-template.md +0 -55
|
@@ -20,689 +20,145 @@ Comprehensive guide to implementing structured, secure, and performant logging a
|
|
|
20
20
|
- Debugging production issues
|
|
21
21
|
- Compliance with logging regulations
|
|
22
22
|
|
|
23
|
-
##
|
|
24
|
-
|
|
25
|
-
### 1. **Log Levels**
|
|
26
|
-
|
|
27
|
-
#### Standard Log Levels
|
|
28
|
-
```typescript
|
|
29
|
-
// logger.ts
|
|
30
|
-
enum LogLevel {
|
|
31
|
-
DEBUG = 0, // Detailed information for debugging
|
|
32
|
-
INFO = 1, // General informational messages
|
|
33
|
-
WARN = 2, // Warning messages, potentially harmful
|
|
34
|
-
ERROR = 3, // Error messages, application can continue
|
|
35
|
-
FATAL = 4 // Critical errors, application must stop
|
|
36
|
-
}
|
|
+## Stack-Specific References
|
|
37
24
|
|
|
38
|
-
|
|
39
|
-
constructor(private minLevel: LogLevel = LogLevel.INFO) {}
|
|
40
|
-
|
|
41
|
-
debug(message: string, context?: object) {
|
|
42
|
-
if (this.minLevel <= LogLevel.DEBUG) {
|
|
43
|
-
this.log(LogLevel.DEBUG, message, context);
|
|
44
|
-
}
|
|
45
|
-
}
|
|
46
|
-
|
|
47
|
-
info(message: string, context?: object) {
|
|
48
|
-
if (this.minLevel <= LogLevel.INFO) {
|
|
49
|
-
this.log(LogLevel.INFO, message, context);
|
|
50
|
-
}
|
|
51
|
-
}
|
|
52
|
-
|
|
53
|
-
warn(message: string, context?: object) {
|
|
54
|
-
if (this.minLevel <= LogLevel.WARN) {
|
|
55
|
-
this.log(LogLevel.WARN, message, context);
|
|
56
|
-
}
|
|
57
|
-
}
|
|
58
|
-
|
|
59
|
-
error(message: string, error?: Error, context?: object) {
|
|
60
|
-
if (this.minLevel <= LogLevel.ERROR) {
|
|
61
|
-
this.log(LogLevel.ERROR, message, {
|
|
62
|
-
...context,
|
|
63
|
-
error: {
|
|
64
|
-
message: error?.message,
|
|
65
|
-
stack: error?.stack,
|
|
66
|
-
name: error?.name
|
|
67
|
-
}
|
|
68
|
-
});
|
|
69
|
-
}
|
|
70
|
-
}
|
|
71
|
-
|
|
72
|
-
fatal(message: string, error?: Error, context?: object) {
|
|
73
|
-
this.log(LogLevel.FATAL, message, {
|
|
74
|
-
...context,
|
|
75
|
-
error: {
|
|
76
|
-
message: error?.message,
|
|
77
|
-
stack: error?.stack,
|
|
78
|
-
name: error?.name
|
|
79
|
-
}
|
|
80
|
-
});
|
|
81
|
-
process.exit(1);
|
|
82
|
-
}
|
|
83
|
-
|
|
84
|
-
private log(level: LogLevel, message: string, context?: object) {
|
|
85
|
-
const logEntry = {
|
|
86
|
-
timestamp: new Date().toISOString(),
|
|
87
|
-
level: LogLevel[level],
|
|
88
|
-
message,
|
|
89
|
-
...context
|
|
90
|
-
};
|
|
91
|
-
console.log(JSON.stringify(logEntry));
|
|
92
|
-
}
|
|
93
|
-
}
|
|
+After reading the methodology below, read the reference matching your surface's Languages column:
|
|
94
26
|
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
+| Language | Reference |
+|----------|-----------|
+| TypeScript / JavaScript | `references/typescript.md` |
+| Python | `references/python.md` |
+| Go | `references/go.md` |
|
|
99
32
|
|
|
100
|
-
|
|
101
|
-
logger.info('User logged in', { userId: '123' });
|
|
102
|
-
logger.warn('Rate limit approaching', { userId: '123', count: 95 });
|
|
103
|
-
logger.error('Database connection failed', dbError, { query: 'SELECT ...' });
|
|
104
|
-
```
|
|
+---
|
|
105
34
|
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
#### Node.js with Winston
|
|
109
|
-
```typescript
|
|
110
|
-
// winston-logger.ts
|
|
111
|
-
import winston from 'winston';
|
|
112
|
-
|
|
113
|
-
const logger = winston.createLogger({
|
|
114
|
-
level: process.env.LOG_LEVEL || 'info',
|
|
115
|
-
format: winston.format.combine(
|
|
116
|
-
winston.format.timestamp(),
|
|
117
|
-
winston.format.errors({ stack: true }),
|
|
118
|
-
winston.format.json()
|
|
119
|
-
),
|
|
120
|
-
defaultMeta: {
|
|
121
|
-
service: 'user-service',
|
|
122
|
-
environment: process.env.NODE_ENV
|
|
123
|
-
},
|
|
124
|
-
transports: [
|
|
125
|
-
// Write to console
|
|
126
|
-
new winston.transports.Console({
|
|
127
|
-
format: winston.format.combine(
|
|
128
|
-
winston.format.colorize(),
|
|
129
|
-
winston.format.simple()
|
|
130
|
-
)
|
|
131
|
-
}),
|
|
132
|
-
// Write to file
|
|
133
|
-
new winston.transports.File({
|
|
134
|
-
filename: 'logs/error.log',
|
|
135
|
-
level: 'error',
|
|
136
|
-
maxsize: 5242880, // 5MB
|
|
137
|
-
maxFiles: 5
|
|
138
|
-
}),
|
|
139
|
-
new winston.transports.File({
|
|
140
|
-
filename: 'logs/combined.log',
|
|
141
|
-
maxsize: 5242880,
|
|
142
|
-
maxFiles: 5
|
|
143
|
-
})
|
|
144
|
-
]
|
|
145
|
-
});
|
|
146
|
-
|
|
147
|
-
// Usage
|
|
148
|
-
logger.info('User created', {
|
|
149
|
-
userId: user.id,
|
|
150
|
-
email: user.email,
|
|
151
|
-
requestId: req.id
|
|
152
|
-
});
|
|
153
|
-
|
|
154
|
-
logger.error('Payment processing failed', {
|
|
155
|
-
error: error.message,
|
|
156
|
-
stack: error.stack,
|
|
157
|
-
orderId: order.id,
|
|
158
|
-
amount: order.total,
|
|
159
|
-
userId: user.id
|
|
160
|
-
});
|
|
161
|
-
```
|
|
+## 1. Log Levels
|
|
162
36
|
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
structlog.configure(
|
|
171
|
-
processors=[
|
|
172
|
-
structlog.stdlib.filter_by_level,
|
|
173
|
-
structlog.stdlib.add_logger_name,
|
|
174
|
-
structlog.stdlib.add_log_level,
|
|
175
|
-
structlog.stdlib.PositionalArgumentsFormatter(),
|
|
176
|
-
structlog.processors.TimeStamper(fmt="iso"),
|
|
177
|
-
structlog.processors.StackInfoRenderer(),
|
|
178
|
-
structlog.processors.format_exc_info,
|
|
179
|
-
structlog.processors.UnicodeDecoder(),
|
|
180
|
-
structlog.processors.JSONRenderer()
|
|
181
|
-
],
|
|
182
|
-
context_class=dict,
|
|
183
|
-
logger_factory=structlog.stdlib.LoggerFactory(),
|
|
184
|
-
cache_logger_on_first_use=True,
|
|
185
|
-
)
|
|
186
|
-
|
|
187
|
-
logger = structlog.get_logger()
|
|
188
|
-
|
|
189
|
-
# Usage
|
|
190
|
-
logger.info("user_created",
|
|
191
|
-
user_id=user.id,
|
|
192
|
-
email=user.email,
|
|
193
|
-
request_id=request.id
|
|
194
|
-
)
|
|
195
|
-
|
|
196
|
-
logger.error("payment_failed",
|
|
197
|
-
error=str(error),
|
|
198
|
-
order_id=order.id,
|
|
199
|
-
amount=order.total,
|
|
200
|
-
user_id=user.id
|
|
201
|
-
)
|
|
202
|
-
```
|
|
+| Level | When to Use | Production Default |
+|-------|------------|-------------------|
+| **DEBUG** | Detailed info for debugging — request payloads, intermediate values | OFF |
+| **INFO** | General operational events — user actions, transactions, startup | ON |
+| **WARN** | Potentially harmful — rate limits approaching, retry, deprecations | ON |
+| **ERROR** | Failures the app can recover from — failed request, DB timeout | ON |
+| **FATAL** | Critical failures — the app must stop | ON (triggers alerts) |
|
|
203
44
|
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
|
|
208
|
-
|
|
209
|
-
import (
|
|
210
|
-
"go.uber.org/zap"
|
|
211
|
-
"go.uber.org/zap"
|
|
212
|
-
)
|
|
213
|
-
|
|
214
|
-
func main() {
|
|
215
|
-
// Production config (JSON)
|
|
216
|
-
logger, _ := zap.NewProduction()
|
|
217
|
-
defer logger.Sync()
|
|
218
|
-
|
|
219
|
-
// Development config (human-readable)
|
|
220
|
-
// logger, _ := zap.NewDevelopment()
|
|
221
|
-
|
|
222
|
-
logger.Info("User created",
|
|
223
|
-
zap.String("userId", user.ID),
|
|
224
|
-
zap.String("email", user.Email),
|
|
225
|
-
zap.String("requestId", req.ID),
|
|
226
|
-
)
|
|
227
|
-
|
|
228
|
-
logger.Error("Payment processing failed",
|
|
229
|
-
zap.Error(err),
|
|
230
|
-
zap.String("orderId", order.ID),
|
|
231
|
-
zap.Float64("amount", order.Total),
|
|
232
|
-
zap.String("userId", user.ID),
|
|
233
|
-
)
|
|
234
|
-
|
|
235
|
-
// Sugared logger for less structured logs
|
|
236
|
-
sugar := logger.Sugar()
|
|
237
|
-
sugar.Infow("User login",
|
|
238
|
-
"userId", user.ID,
|
|
239
|
-
"ip", req.IP,
|
|
240
|
-
)
|
|
241
|
-
}
|
|
242
|
-
```
|
|
+**Environment rules:**
+- Development: DEBUG and above
+- Staging: INFO and above
+- Production: INFO and above (DEBUG only via feature flag for specific modules)
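The level table and environment rules added above can be read as a small filtering function. The following is a minimal sketch, not the kit's implementation: the names `shouldLog`, `MIN_LEVEL`, and `debugModules` are hypothetical, and `debugModules` merely stands in for whatever feature-flag mechanism a project uses.

```typescript
// Hypothetical names throughout; the level order and per-environment minimums
// follow the Log Levels table and the environment rules.
type Level = "DEBUG" | "INFO" | "WARN" | "ERROR" | "FATAL";
type Environment = "development" | "staging" | "production";

const LEVEL_ORDER: Record<Level, number> = { DEBUG: 0, INFO: 1, WARN: 2, ERROR: 3, FATAL: 4 };

const MIN_LEVEL: Record<Environment, Level> = {
  development: "DEBUG",
  staging: "INFO",
  production: "INFO",
};

// Returns true when a message at `level` should be emitted in `env`.
// `debugModules` is an assumed stand-in for a feature flag that re-enables
// DEBUG output for specific modules.
function shouldLog(
  level: Level,
  env: Environment,
  moduleName?: string,
  debugModules: ReadonlySet<string> = new Set<string>(),
): boolean {
  if (level === "DEBUG" && moduleName !== undefined && debugModules.has(moduleName)) {
    return true;
  }
  return LEVEL_ORDER[level] >= LEVEL_ORDER[MIN_LEVEL[env]];
}
```

With this shape, `shouldLog("DEBUG", "production")` is false, while passing a module name that is present in `debugModules` lets DEBUG through for that module only.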
|
|
243
49
|
|
|
244
|
-
|
|
245
|
-
|
|
246
|
-
#### Request Context Middleware
|
|
247
|
-
```typescript
|
|
248
|
-
// request-logger.ts
|
|
249
|
-
import { v4 as uuidv4 } from 'uuid';
|
|
250
|
-
import { AsyncLocalStorage } from 'async_hooks';
|
|
251
|
-
|
|
252
|
-
const asyncLocalStorage = new AsyncLocalStorage();
|
|
253
|
-
|
|
254
|
-
// Middleware to add request context
|
|
255
|
-
export function requestLogger(req, res, next) {
|
|
256
|
-
const requestId = req.headers['x-request-id'] || uuidv4();
|
|
257
|
-
const context = {
|
|
258
|
-
requestId,
|
|
259
|
-
method: req.method,
|
|
260
|
-
path: req.path,
|
|
261
|
-
ip: req.ip,
|
|
262
|
-
userAgent: req.headers['user-agent'],
|
|
263
|
-
userId: req.user?.id
|
|
264
|
-
};
|
|
265
|
-
|
|
266
|
-
asyncLocalStorage.run(context, () => {
|
|
267
|
-
logger.info('Request started', context);
|
|
268
|
-
|
|
269
|
-
// Log response when finished
|
|
270
|
-
res.on('finish', () => {
|
|
271
|
-
logger.info('Request completed', {
|
|
272
|
-
...context,
|
|
273
|
-
statusCode: res.statusCode,
|
|
274
|
-
duration: Date.now() - req.startTime
|
|
275
|
-
});
|
|
276
|
-
});
|
|
277
|
-
|
|
278
|
-
req.startTime = Date.now();
|
|
279
|
-
next();
|
|
280
|
-
});
|
|
281
|
-
}
|
|
+---
|
|
282
51
|
|
|
283
|
-
|
|
284
|
-
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
288
|
-
|
|
289
|
-
|
|
290
|
-
|
|
291
|
-
|
|
292
|
-
|
|
293
|
-
|
|
294
|
-
|
|
295
|
-
|
|
296
|
-
}
|
|
+## 2. Structured Logging (JSON)
+
+All production logs MUST be structured (JSON format), not free-text. Structured logs enable:
+- Machine-parseable log aggregation
+- Field-based search and filtering
+- Dashboards and alerting
+
+**Every log entry must include:**
+- `timestamp` — ISO 8601 format
+- `level` — log level string
+- `message` — human-readable description
+- `service` — service name
+- `environment` — deployment environment
|
|
297
65
|
|
|
298
|
-
|
|
299
|
-
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
|
|
303
|
-
|
|
304
|
-
|
|
305
|
-
|
|
306
|
-
|
|
307
|
-
|
|
308
|
-
|
|
309
|
-
log.error('Failed to fetch user', error, { userId: req.params.id });
|
|
310
|
-
res.status(500).json({ error: 'Internal server error' });
|
|
311
|
-
}
|
|
312
|
-
});
|
|
+**Example output (any language):**
+```json
+{
+  "timestamp": "2024-01-15T10:30:00.000Z",
+  "level": "INFO",
+  "message": "User created",
+  "service": "user-service",
+  "environment": "production",
+  "userId": "abc-123",
+  "requestId": "req-456"
+}
 ```
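One way to guarantee the five required fields is to build every entry through a single helper. This is a sketch under assumptions: `buildLogEntry` and `StructuredEntry` are hypothetical names, and the `now` parameter exists only to make the example deterministic.

```typescript
// Shape of one structured entry: the five required fields plus free-form extras.
interface StructuredEntry {
  timestamp: string;
  level: string;
  message: string;
  service: string;
  environment: string;
  [field: string]: unknown;
}

// Builds an entry; required fields are written after the extras so that
// caller-supplied fields cannot overwrite them.
function buildLogEntry(
  level: string,
  message: string,
  base: { service: string; environment: string },
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): StructuredEntry {
  return {
    ...fields,
    timestamp: now.toISOString(), // ISO 8601
    level,
    message,
    service: base.service,
    environment: base.environment,
  };
}

const entryLine = JSON.stringify(
  buildLogEntry(
    "INFO",
    "User created",
    { service: "user-service", environment: "production" },
    { userId: "abc-123", requestId: "req-456" },
    new Date("2024-01-15T10:30:00.000Z"),
  ),
);
console.log(entryLine);
```

The printed line carries the same fields as the JSON example above, although the key order may differ.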
|
|
314
78
|
|
|
315
|
-
|
|
316
|
-
```typescript
|
|
317
|
-
// correlation-id.ts
|
|
318
|
-
export class CorrelationIdManager {
|
|
319
|
-
private static storage = new AsyncLocalStorage<string>();
|
|
+---
|
|
320
80
|
|
|
321
|
-
|
|
322
|
-
return this.storage.run(correlationId, callback);
|
|
323
|
-
}
|
|
+## 3. Contextual Logging
|
|
324
82
|
|
|
325
|
-
|
|
326
|
-
|
|
327
|
-
|
|
328
|
-
|
|
+### Request Context
+Attach request metadata to every log within a request lifecycle:
+- **Request ID** — unique identifier for correlating logs from one request
+- **Correlation ID** — propagated across service boundaries in distributed systems
+- **User ID** — authenticated user (if available)
+- **HTTP method/path** — what was requested
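The request-context fields listed above could be attached by creating one logging function per request. This is a sketch only: `createRequestLogger`, `RequestLogContext`, and the `sink` callback are hypothetical names, and how the context reaches deeper call sites (explicit passing, async-local storage, and so on) is left open.

```typescript
type LogFields = Record<string, unknown>;
type LogSink = (entry: LogFields) => void;

// The metadata named in the Request Context list.
interface RequestLogContext {
  requestId: string;
  correlationId?: string;
  userId?: string;
  method: string;
  path: string;
}

// Returns a logging function that stamps every entry with the request context.
// Context is spread after the per-call fields so callers cannot override it.
function createRequestLogger(ctx: RequestLogContext, sink: LogSink) {
  return (level: string, message: string, fields: LogFields = {}): void => {
    sink({ ...fields, ...ctx, level, message });
  };
}

// Usage: collect entries in memory instead of writing them anywhere.
const capturedEntries: LogFields[] = [];
const requestLog = createRequestLogger(
  { requestId: "req-456", userId: "abc-123", method: "GET", path: "/api/users/abc-123" },
  (entry) => capturedEntries.push(entry),
);
requestLog("INFO", "Fetching user");
requestLog("WARN", "Cache miss", { cacheKey: "user:abc-123" });
```

Both captured entries carry `requestId`, `userId`, `method`, and `path` without the call sites repeating them.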
|
|
329
89
|
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
const correlationId = req.headers['x-correlation-id'] || uuidv4();
|
|
333
|
-
res.setHeader('x-correlation-id', correlationId);
|
|
334
|
-
|
|
335
|
-
CorrelationIdManager.run(correlationId, () => {
|
|
336
|
-
next();
|
|
337
|
-
});
|
|
338
|
-
});
|
|
339
|
-
|
|
340
|
-
// Enhanced logger
|
|
341
|
-
const enhancedLogger = {
|
|
342
|
-
info: (message: string, meta?: object) =>
|
|
343
|
-
logger.info(message, {
|
|
344
|
-
correlationId: CorrelationIdManager.get(),
|
|
345
|
-
...meta
|
|
346
|
-
})
|
|
347
|
-
};
|
|
348
|
-
```
|
|
+### Correlation IDs
+In distributed systems, propagate a correlation ID via headers (`X-Correlation-Id`) so that logs from multiple services can be traced together.
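Propagation usually means two steps: reuse the incoming `X-Correlation-Id` if the caller sent one, and copy it onto every outgoing request. The sketch below assumes header names arrive lower-cased (as Node's HTTP server delivers them); `resolveCorrelationId`, `withCorrelationId`, and `generateId` are hypothetical helpers, and `generateId` is a placeholder where a real service would more likely use a UUID.

```typescript
const CORRELATION_HEADER = "x-correlation-id";

type HeaderMap = Record<string, string | string[] | undefined>;

// Placeholder ID generator (assumption); not collision-resistant like a UUID.
function generateId(): string {
  return `${Date.now().toString(36)}-${Math.random().toString(36).slice(2, 10)}`;
}

// Reuse the caller's correlation ID when present, otherwise mint a new one.
function resolveCorrelationId(
  incoming: HeaderMap,
  generate: () => string = generateId,
): string {
  const raw = incoming[CORRELATION_HEADER];
  const value = Array.isArray(raw) ? raw[0] : raw;
  return value !== undefined && value.trim() !== "" ? value.trim() : generate();
}

// Copy the ID onto the headers of an outgoing call to another service.
function withCorrelationId(outgoing: HeaderMap, correlationId: string): HeaderMap {
  return { ...outgoing, [CORRELATION_HEADER]: correlationId };
}
```

A request handler would call `resolveCorrelationId` once, include the result in every log entry for that request, and pass it through `withCorrelationId` when calling downstream services.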
|
|
349
92
|
|
|
350
|
-
|
|
351
|
-
|
|
352
|
-
#### Data Sanitization
|
|
353
|
-
```typescript
|
|
354
|
-
// sanitizer.ts
|
|
355
|
-
const SENSITIVE_FIELDS = [
|
|
356
|
-
'password',
|
|
357
|
-
'token',
|
|
358
|
-
'apiKey',
|
|
359
|
-
'ssn',
|
|
360
|
-
'creditCard',
|
|
361
|
-
'email', // depending on regulations
|
|
362
|
-
'phone' // depending on regulations
|
|
363
|
-
];
|
|
364
|
-
|
|
365
|
-
function sanitize(obj: any): any {
|
|
366
|
-
if (typeof obj !== 'object' || obj === null) {
|
|
367
|
-
return obj;
|
|
368
|
-
}
|
|
369
|
-
|
|
370
|
-
if (Array.isArray(obj)) {
|
|
371
|
-
return obj.map(sanitize);
|
|
372
|
-
}
|
|
373
|
-
|
|
374
|
-
const sanitized = {};
|
|
375
|
-
for (const [key, value] of Object.entries(obj)) {
|
|
376
|
-
if (SENSITIVE_FIELDS.some(field =>
|
|
377
|
-
key.toLowerCase().includes(field.toLowerCase())
|
|
378
|
-
)) {
|
|
379
|
-
sanitized[key] = '[REDACTED]';
|
|
380
|
-
} else if (typeof value === 'object') {
|
|
381
|
-
sanitized[key] = sanitize(value);
|
|
382
|
-
} else {
|
|
383
|
-
sanitized[key] = value;
|
|
384
|
-
}
|
|
385
|
-
}
|
|
386
|
-
return sanitized;
|
|
387
|
-
}
|
|
+---
|
|
388
94
|
|
|
389
|
-
|
|
390
|
-
logger.info('User data', sanitize({
|
|
391
|
-
userId: '123',
|
|
392
|
-
email: 'user@example.com', // Will be redacted
|
|
393
|
-
password: 'secret123', // Will be redacted
|
|
394
|
-
name: 'John Doe' // Will be logged
|
|
395
|
-
}));
|
|
396
|
-
|
|
397
|
-
// Output:
|
|
398
|
-
// {
|
|
399
|
-
// "userId": "123",
|
|
400
|
-
// "email": "[REDACTED]",
|
|
401
|
-
// "password": "[REDACTED]",
|
|
402
|
-
// "name": "John Doe"
|
|
403
|
-
// }
|
|
404
|
-
```
|
|
+## 4. PII and Sensitive Data Handling
|
|
405
96
|
|
|
406
|
-
|
|
407
|
-
```typescript
|
|
408
|
-
// masking.ts
|
|
409
|
-
function maskEmail(email: string): string {
|
|
410
|
-
const [local, domain] = email.split('@');
|
|
411
|
-
const maskedLocal = local[0] + '*'.repeat(local.length - 2) + local[local.length - 1];
|
|
412
|
-
return `${maskedLocal}@${domain}`;
|
|
413
|
-
}
|
|
+**CRITICAL:** PII must NEVER appear in plaintext in logs.
|
|
414
98
|
|
|
415
|
-
|
|
416
|
-
|
|
417
|
-
|
|
+### Sensitive Fields (always redact or mask)
+- Passwords, tokens, API keys
+- SSN, credit card numbers
+- Email addresses (depending on regulation)
+- Phone numbers (depending on regulation)
|
|
418
104
|
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
+### Strategies
+| Strategy | When to Use |
+|----------|-------------|
+| **Redaction** — replace with `[REDACTED]` | Passwords, API keys, tokens |
+| **Masking** — partial reveal (`u***r@example.com`) | Email, phone, credit card |
+| **Hashing** — one-way hash | When you need to correlate without revealing |
+| **Omission** — don't log the field at all | When the field serves no diagnostic purpose |
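Three of the strategies in the table (redaction, masking, omission) can be sketched as a per-field policy applied before an entry is logged. Everything here is illustrative: the field-to-strategy map, including the `rawRequestBody` field, is hypothetical, the sanitizer is shallow (nested objects are not walked), and hashing is left out because it would need a cryptographic hash from a library.

```typescript
type PiiStrategy = "redact" | "mask" | "omit";

// Hypothetical policy; a real one would come from your compliance requirements.
const FIELD_STRATEGY = new Map<string, PiiStrategy>([
  ["password", "redact"],
  ["token", "redact"],
  ["apiKey", "redact"],
  ["ssn", "redact"],
  ["email", "mask"],
  ["rawRequestBody", "omit"],
]);

// Keeps the first and last character of the local part:
// user@example.com -> u***r@example.com
function maskEmail(email: string): string {
  const at = email.lastIndexOf("@");
  if (at < 1) return "[REDACTED]"; // not a recognizable address, so fail closed
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  const last = local.length > 1 ? local.charAt(local.length - 1) : "";
  return `${local.charAt(0)}***${last}@${domain}`;
}

// Applies the policy to the top-level fields of one log entry.
function sanitizeFields(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    const strategy = FIELD_STRATEGY.get(key);
    if (strategy === "omit") continue;
    if (strategy === "redact") {
      out[key] = "[REDACTED]";
    } else if (strategy === "mask") {
      out[key] = typeof value === "string" ? maskEmail(value) : "[REDACTED]";
    } else {
      out[key] = value;
    }
  }
  return out;
}
```

An allow-list (log only fields known to be safe) is the stricter alternative to this deny-list shape; which one fits depends on how varied the logged payloads are.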
|
|
422
112
|
|
|
423
|
-
|
|
424
|
-
logger.info('User registered', {
|
|
425
|
-
userId: user.id,
|
|
426
|
-
email: maskEmail(user.email), // u***r@example.com
|
|
427
|
-
phone: maskPhone(user.phone), // ******1234
|
|
428
|
-
creditCard: maskCreditCard(user.card) // ************1234
|
|
429
|
-
});
|
|
430
|
-
```
|
|
+---
|
|
431
114
|
|
|
432
|
-
|
|
433
|
-
|
|
434
|
-
```typescript
|
|
435
|
-
// performance-logger.ts
|
|
436
|
-
class PerformanceLogger {
|
|
437
|
-
private timers = new Map<string, number>();
|
|
438
|
-
|
|
439
|
-
start(operation: string) {
|
|
440
|
-
this.timers.set(operation, Date.now());
|
|
441
|
-
}
|
|
442
|
-
|
|
443
|
-
end(operation: string, metadata?: object) {
|
|
444
|
-
const startTime = this.timers.get(operation);
|
|
445
|
-
if (!startTime) return;
|
|
446
|
-
|
|
447
|
-
const duration = Date.now() - startTime;
|
|
448
|
-
this.timers.delete(operation);
|
|
449
|
-
|
|
450
|
-
logger.info(`Performance: ${operation}`, {
|
|
451
|
-
operation,
|
|
452
|
-
duration,
|
|
453
|
-
durationMs: duration,
|
|
454
|
-
...metadata
|
|
455
|
-
});
|
|
456
|
-
|
|
457
|
-
// Alert if slow
|
|
458
|
-
if (duration > 1000) {
|
|
459
|
-
logger.warn(`Slow operation: ${operation}`, {
|
|
460
|
-
operation,
|
|
461
|
-
duration,
|
|
462
|
-
threshold: 1000,
|
|
463
|
-
...metadata
|
|
464
|
-
});
|
|
465
|
-
}
|
|
466
|
-
}
|
|
467
|
-
|
|
468
|
-
async measure<T>(operation: string, fn: () => Promise<T>, metadata?: object): Promise<T> {
|
|
469
|
-
this.start(operation);
|
|
470
|
-
try {
|
|
471
|
-
return await fn();
|
|
472
|
-
} finally {
|
|
473
|
-
this.end(operation, metadata);
|
|
474
|
-
}
|
|
475
|
-
}
|
|
476
|
-
}
|
|
+## 5. Performance Logging
|
|
477
116
|
|
|
478
|
-
|
|
479
|
-
|
|
+Track operation timing for performance monitoring:
+- **Start timer** before operation
+- **End timer** after operation
+- **Log duration** with context
+- **Alert on threshold** if operation exceeds expected time
|
|
480
122
|
|
|
481
|
-
|
|
482
|
-
|
|
483
|
-
|
|
484
|
-
|
|
+Key operations to time:
+- Database queries
+- External API calls
+- File I/O operations
+- Complex computations
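The start, end, log, and alert steps above fit in one wrapper. This sketch is synchronous to stay short (an async variant would `await` the callback inside the same `try`/`finally`); `timeOperation`, the `sink` callback, and the injectable `clock` are hypothetical, and the 1000 ms default threshold mirrors the value used in the removed example.

```typescript
type TimingSink = (entry: Record<string, unknown>) => void;

// Runs `fn`, then reports how long it took. The entry is emitted at WARN
// when the duration exceeds `thresholdMs`, otherwise at INFO.
function timeOperation<T>(
  operation: string,
  fn: () => T,
  sink: TimingSink,
  thresholdMs = 1000,
  clock: () => number = Date.now,
): T {
  const startedAt = clock(); // start timer
  try {
    return fn();
  } finally {
    const durationMs = clock() - startedAt; // end timer
    sink({
      level: durationMs > thresholdMs ? "WARN" : "INFO",
      message: `Performance: ${operation}`,
      operation,
      durationMs,
      thresholdMs,
    });
  }
}

// Usage: time a small computation and collect the timing entry in memory.
const timingEntries: Record<string, unknown>[] = [];
const rangeSum = timeOperation(
  "sum-range",
  () => {
    let sum = 0;
    for (let i = 1; i <= 1000; i++) sum += i;
    return sum;
  },
  (entry) => timingEntries.push(entry),
);
```

Because the report happens in `finally`, a duration is logged even when the operation throws.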
|
|
485
128
|
|
|
486
|
-
|
|
487
|
-
const result = await perfLogger.measure(
|
|
488
|
-
'complex-operation',
|
|
489
|
-
async () => await processData(),
|
|
490
|
-
{ userId: '123' }
|
|
491
|
-
);
|
|
492
|
-
```
|
|
+---
|
|
493
130
|
|
|
494
|
-
|
|
495
|
-
|
|
496
|
-
#### ELK Stack (Elasticsearch, Logstash, Kibana)
|
|
497
|
-
```yaml
|
|
498
|
-
# docker-compose.yml
|
|
499
|
-
version: '3'
|
|
500
|
-
services:
|
|
501
|
-
elasticsearch:
|
|
502
|
-
image: elasticsearch:8.0.0
|
|
503
|
-
environment:
|
|
504
|
-
- discovery.type=single-node
|
|
505
|
-
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
|
|
506
|
-
ports:
|
|
507
|
-
- "9200:9200"
|
|
508
|
-
|
|
509
|
-
logstash:
|
|
510
|
-
image: logstash:8.0.0
|
|
511
|
-
volumes:
|
|
512
|
-
- ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf
|
|
513
|
-
ports:
|
|
514
|
-
- "5000:5000"
|
|
515
|
-
depends_on:
|
|
516
|
-
- elasticsearch
|
|
517
|
-
|
|
518
|
-
kibana:
|
|
519
|
-
image: kibana:8.0.0
|
|
520
|
-
ports:
|
|
521
|
-
- "5601:5601"
|
|
522
|
-
depends_on:
|
|
523
|
-
- elasticsearch
|
|
524
|
-
```
|
|
+## 6. Centralized Logging
|
|
525
132
|
|
|
526
|
-
|
|
527
|
-
# logstash.conf
|
|
528
|
-
input {
|
|
529
|
-
tcp {
|
|
530
|
-
port => 5000
|
|
531
|
-
codec => json
|
|
532
|
-
}
|
|
533
|
-
}
|
|
+For distributed systems, aggregate logs to a central system:
|
|
534
134
|
|
|
535
|
-
|
|
536
|
-
|
|
537
|
-
|
|
538
|
-
|
|
539
|
-
|
|
540
|
-
|
|
541
|
-
|
|
542
|
-
if [ip] {
|
|
543
|
-
geoip {
|
|
544
|
-
source => "ip"
|
|
545
|
-
}
|
|
546
|
-
}
|
|
547
|
-
}
|
|
+| Tool | Type |
+|------|------|
+| **ELK Stack** | Elasticsearch + Logstash + Kibana (self-hosted) |
+| **Grafana + Loki** | Lightweight log aggregation (self-hosted) |
+| **Datadog** | Cloud monitoring and logging |
+| **AWS CloudWatch** | AWS-native log management |
+| **Splunk** | Enterprise log management |
|
|
548
142
|
|
|
549
|
-
|
|
550
|
-
elasticsearch {
|
|
551
|
-
hosts => ["elasticsearch:9200"]
|
|
552
|
-
index => "app-logs-%{+YYYY.MM.dd}"
|
|
553
|
-
}
|
|
554
|
-
}
|
|
555
|
-
```
|
|
+---
|
|
556
144
|
|
|
557
|
-
|
|
558
|
-
```typescript
|
|
559
|
-
// winston-elk.ts
|
|
560
|
-
import winston from 'winston';
|
|
561
|
-
import 'winston-logstash';
|
|
562
|
-
|
|
563
|
-
const logger = winston.createLogger({
|
|
564
|
-
transports: [
|
|
565
|
-
new winston.transports.Logstash({
|
|
566
|
-
port: 5000,
|
|
567
|
-
host: 'logstash',
|
|
568
|
-
node_name: 'user-service',
|
|
569
|
-
max_connect_retries: -1
|
|
570
|
-
})
|
|
571
|
-
]
|
|
572
|
-
});
|
|
573
|
-
```
|
|
+## 7. Distributed Tracing
|
|
574
146
|
|
|
575
|
-
|
|
576
|
-
|
|
577
|
-
|
|
578
|
-
|
|
579
|
-
import WinstonCloudWatch from 'winston-cloudwatch';
|
|
580
|
-
|
|
581
|
-
const logger = winston.createLogger({
|
|
582
|
-
transports: [
|
|
583
|
-
new WinstonCloudWatch({
|
|
584
|
-
logGroupName: '/aws/lambda/user-service',
|
|
585
|
-
logStreamName: () => {
|
|
586
|
-
const date = new Date().toISOString().split('T')[0];
|
|
587
|
-
return `${date}-${process.env.LAMBDA_VERSION}`;
|
|
588
|
-
},
|
|
589
|
-
awsRegion: 'us-east-1',
|
|
590
|
-
jsonMessage: true
|
|
591
|
-
})
|
|
592
|
-
]
|
|
593
|
-
});
|
|
594
|
-
```
|
|
+For microservice architectures, use OpenTelemetry (or similar) to trace requests across services:
+- Create spans for each operation
+- Propagate trace context via headers
+- Export to Jaeger, Zipkin, or Datadog
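To make "propagate trace context via headers" concrete, the sketch below parses and formats a W3C Trace Context `traceparent` header by hand. That header format is an assumption about the deployment (the text above does not name one), and in practice a tracing SDK would normally handle this step; `parseTraceparent` and `formatTraceparent` are hypothetical helpers shown only to illustrate what travels between services.

```typescript
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

// Version 00 layout: 00-<trace-id>-<parent-id>-<trace-flags>
const TRACEPARENT_PATTERN = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Returns the parsed context, or undefined when the header is malformed.
function parseTraceparent(header: string): TraceContext | undefined {
  const match = TRACEPARENT_PATTERN.exec(header.trim());
  if (match === null) return undefined;
  const [, traceId, spanId, flags] = match;
  if (traceId === undefined || spanId === undefined || flags === undefined) return undefined;
  // All-zero IDs are invalid in the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Builds the header value to send on an outgoing request.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}
```

Logging the `traceId` alongside the correlation ID lets a log line be matched to the corresponding trace in the tracing backend.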
|
|
595
151
|
|
|
596
|
-
|
|
597
|
-
|
|
598
|
-
```typescript
|
|
599
|
-
// tracing.ts
|
|
600
|
-
import opentelemetry from '@opentelemetry/api';
|
|
601
|
-
import { NodeTracerProvider } from '@opentelemetry/node';
|
|
602
|
-
import { SimpleSpanProcessor } from '@opentelemetry/tracing';
|
|
603
|
-
import { JaegerExporter } from '@opentelemetry/exporter-jaeger';
|
|
604
|
-
|
|
605
|
-
// Setup tracer
|
|
606
|
-
const provider = new NodeTracerProvider();
|
|
607
|
-
provider.addSpanProcessor(
|
|
608
|
-
new SimpleSpanProcessor(
|
|
609
|
-
new JaegerExporter({
|
|
610
|
-
serviceName: 'user-service',
|
|
611
|
-
endpoint: 'http://jaeger:14268/api/traces'
|
|
612
|
-
})
|
|
613
|
-
)
|
|
614
|
-
);
|
|
615
|
-
provider.register();
|
|
616
|
-
|
|
617
|
-
const tracer = opentelemetry.trace.getTracer('user-service');
|
|
618
|
-
|
|
619
|
-
// Usage in application
|
|
620
|
-
app.get('/api/users/:id', async (req, res) => {
|
|
621
|
-
const span = tracer.startSpan('get-user', {
|
|
622
|
-
attributes: {
|
|
623
|
-
'http.method': req.method,
|
|
624
|
-
'http.url': req.url,
|
|
625
|
-
'user.id': req.params.id
|
|
626
|
-
}
|
|
627
|
-
});
|
|
628
|
-
|
|
629
|
-
try {
|
|
630
|
-
const user = await fetchUser(req.params.id, span);
|
|
631
|
-
span.setStatus({ code: opentelemetry.SpanStatusCode.OK });
|
|
632
|
-
res.json(user);
|
|
633
|
-
} catch (error) {
|
|
634
|
-
span.setStatus({
|
|
635
|
-
code: opentelemetry.SpanStatusCode.ERROR,
|
|
636
|
-
message: error.message
|
|
637
|
-
});
|
|
638
|
-
res.status(500).json({ error: 'Internal server error' });
|
|
639
|
-
} finally {
|
|
640
|
-
span.end();
|
|
641
|
-
}
|
|
642
|
-
});
|
|
643
|
-
|
|
644
|
-
async function fetchUser(userId: string, parentSpan: Span) {
|
|
645
|
-
const span = tracer.startSpan('database-query', {
|
|
646
|
-
parent: parentSpan,
|
|
647
|
-
attributes: { 'db.statement': 'SELECT * FROM users WHERE id = ?' }
|
|
648
|
-
});
|
|
649
|
-
|
|
650
|
-
try {
|
|
651
|
-
const user = await db.query('SELECT * FROM users WHERE id = ?', [userId]);
|
|
652
|
-
return user;
|
|
653
|
-
} finally {
|
|
654
|
-
span.end();
|
|
655
|
-
}
|
|
656
|
-
}
|
|
657
|
-
```
|
|
152
|
+
---
|
|
658
153
|
|
|
659
|
-
|
|
660
|
-
|
|
661
|
-
|
|
662
|
-
|
|
663
|
-
|
|
664
|
-
|
|
665
|
-
|
|
666
|
-
|
|
667
|
-
) {}
|
|
668
|
-
|
|
669
|
-
info(message: string, meta?: object) {
|
|
670
|
-
if (this.shouldSample()) {
|
|
671
|
-
this.logger.info(message, meta);
|
|
672
|
-
}
|
|
673
|
-
}
|
|
674
|
-
|
|
675
|
-
// Always log warnings and errors
|
|
676
|
-
warn(message: string, meta?: object) {
|
|
677
|
-
this.logger.warn(message, meta);
|
|
678
|
-
}
|
|
679
|
-
|
|
680
|
-
error(message: string, error: Error, meta?: object) {
|
|
681
|
-
this.logger.error(message, error, meta);
|
|
682
|
-
}
|
|
683
|
-
|
|
684
|
-
private shouldSample(): boolean {
|
|
685
|
-
return Math.random() < this.sampleRate;
|
|
686
|
-
}
|
|
687
|
-
|
|
688
|
-
// Sample based on user ID (consistent sampling)
|
|
689
|
-
infoSampled(userId: string, message: string, meta?: object) {
|
|
690
|
-
const hash = this.hashUserId(userId);
|
|
691
|
-
if (hash % 100 < this.sampleRate * 100) {
|
|
692
|
-
this.logger.info(message, { ...meta, sampled: true });
|
|
693
|
-
}
|
|
694
|
-
}
|
|
695
|
-
|
|
696
|
-
private hashUserId(userId: string): number {
|
|
697
|
-
let hash = 0;
|
|
698
|
-
for (let i = 0; i < userId.length; i++) {
|
|
699
|
-
hash = ((hash << 5) - hash) + userId.charCodeAt(i);
|
|
700
|
-
hash |= 0;
|
|
701
|
-
}
|
|
702
|
-
return Math.abs(hash);
|
|
703
|
-
}
|
|
704
|
-
}
|
|
705
|
-
```
|
|
154
|
+
## 8. Log Sampling (High-Volume Services)
|
|
155
|
+
|
|
156
|
+
For high-volume services, sample INFO/DEBUG logs to reduce volume:
|
|
157
|
+
- **Random sampling** — log N% of requests
|
|
158
|
+
- **Consistent sampling** — hash user ID so same user always gets logged (or not)
|
|
159
|
+
- **Always log** WARN and ERROR — never sample these
|
|
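The three sampling rules above can be sketched in a few lines. This is a minimal illustration, not the skill's reference implementation: `SamplingLogger`, `Sink`, `bucket`, and the injectable `random` parameter are all hypothetical names, and the hash is a simple 31-multiplier string hash chosen only because it is deterministic.

```typescript
// Hypothetical sketch: all names here are illustrative assumptions.
type Level = 'info' | 'warn' | 'error';
type Sink = (level: Level, message: string, meta?: object) => void;

// Map an ID to a stable bucket in [0, 100) so the same ID always
// receives the same sampling decision.
function bucket(id: string): number {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    hash = (hash * 31 + id.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % 100;
}

class SamplingLogger {
  private readonly sink: Sink;
  private readonly sampleRate: number; // fraction in [0, 1]
  private readonly random: () => number;

  constructor(sink: Sink, sampleRate: number, random: () => number = Math.random) {
    this.sink = sink;
    this.sampleRate = sampleRate;
    this.random = random;
  }

  // Random sampling: emit roughly sampleRate of all INFO calls.
  info(message: string, meta?: object): void {
    if (this.random() < this.sampleRate) {
      this.sink('info', message, meta);
    }
  }

  // Consistent sampling: a given user is either always logged or never logged.
  infoForUser(userId: string, message: string, meta?: object): void {
    if (bucket(userId) < this.sampleRate * 100) {
      this.sink('info', message, { ...meta, sampled: true });
    }
  }

  // WARN and ERROR bypass sampling entirely.
  warn(message: string, meta?: object): void {
    this.sink('warn', message, meta);
  }

  error(message: string, meta?: object): void {
    this.sink('error', message, meta);
  }
}
```

Passing `random` as a constructor argument is a design choice made here so the random path can be tested deterministically; production code would normally rely on the `Math.random` default.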
160
|
+
|
|
161
|
+
---
|
|
706
162
|
|
|
707
163
|
## Best Practices
|
|
708
164
|
|
|
@@ -723,7 +179,7 @@ class SamplingLogger {
|
|
|
723
179
|
|
|
724
180
|
### ❌ DON'T
|
|
725
181
|
- Log passwords, tokens, or sensitive data
|
|
726
|
-
- Use console.log in production
|
|
182
|
+
- Use print/console.log in production
|
|
727
183
|
- Log at DEBUG level in production by default
|
|
728
184
|
- Log inside tight loops (use sampling)
|
|
729
185
|
- Include PII without anonymization
|
|
@@ -734,118 +190,3 @@ class SamplingLogger {
|
|
|
734
190
|
- Log binary data or large objects
|
|
735
191
|
- Use string concatenation (use structured fields)
|
|
736
192
|
- Log every single request in high-volume APIs
|
|
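Two of the DON'T items above (logging sensitive data, and building messages by string concatenation) can be illustrated together. The sketch below is an assumption-laden example rather than any specific library's API: `redact`, `logLine`, and `SENSITIVE_KEYS` are hypothetical names, and the key list simply reuses field names such as `password` and `token` as placeholders.

```typescript
// Hypothetical sketch: helper names and the key list are illustrative assumptions.
const SENSITIVE_KEYS = new Set(['password', 'token', 'ssn', 'creditCard']);

// Replace the values of sensitive top-level keys before they reach the log.
function redact(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    out[key] = SENSITIVE_KEYS.has(key) ? '[REDACTED]' : value;
  }
  return out;
}

// Emit one JSON object per entry instead of concatenating values into the message.
// `level` and `message` are written last so caller fields cannot overwrite them.
function logLine(level: string, message: string, fields: Record<string, unknown> = {}): string {
  return JSON.stringify({ ...redact(fields), level, message });
}

// Instead of: 'User ' + userId + ' logged in with token ' + token
const line = logLine('info', 'user login', { userId: 'u-123', token: 'abc' });
console.log(line);
```

Only top-level keys are checked here; nested objects would need a recursive walk, which is left out to keep the sketch short.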
737
|
-
|
|
738
|
-
## Common Patterns
|
|
739
|
-
|
|
740
|
-
### Pattern 1: Error Boundary Logging
|
|
741
|
-
```typescript
|
|
742
|
-
class ErrorBoundary {
|
|
743
|
-
static async handle(fn: () => Promise<void>) {
|
|
744
|
-
try {
|
|
745
|
-
await fn();
|
|
746
|
-
} catch (error) {
|
|
747
|
-
logger.error('Unhandled error', error, {
|
|
748
|
-
function: fn.name,
|
|
749
|
-
stack: error.stack
|
|
750
|
-
});
|
|
751
|
-
throw error;
|
|
752
|
-
}
|
|
753
|
-
}
|
|
754
|
-
}
|
|
755
|
-
```
|
|
756
|
-
|
|
757
|
-
### Pattern 2: Audit Logging
|
|
758
|
-
```typescript
|
|
759
|
-
function auditLog(action: string, resource: string) {
|
|
760
|
-
return function(target: any, propertyKey: string, descriptor: PropertyDescriptor) {
|
|
761
|
-
const originalMethod = descriptor.value;
|
|
762
|
-
|
|
763
|
-
descriptor.value = async function(...args: any[]) {
|
|
764
|
-
const result = await originalMethod.apply(this, args);
|
|
765
|
-
|
|
766
|
-
logger.info('Audit', {
|
|
767
|
-
action,
|
|
768
|
-
resource,
|
|
769
|
-
userId: this.userId,
|
|
770
|
-
timestamp: new Date().toISOString(),
|
|
771
|
-
result: sanitize(result)
|
|
772
|
-
});
|
|
773
|
-
|
|
774
|
-
return result;
|
|
775
|
-
};
|
|
776
|
-
|
|
777
|
-
return descriptor;
|
|
778
|
-
};
|
|
779
|
-
}
|
|
780
|
-
|
|
781
|
-
// Usage
|
|
782
|
-
class UserService {
|
|
783
|
-
@auditLog('DELETE', 'user')
|
|
784
|
-
async deleteUser(userId: string) {
|
|
785
|
-
// ...
|
|
786
|
-
}
|
|
787
|
-
}
|
|
788
|
-
```
|
|
789
|
-
|
|
790
|
-
## Tools & Resources
|
|
791
|
-
|
|
792
|
-
- **Winston**: Versatile Node.js logger
|
|
793
|
-
- **Pino**: Fast JSON logger for Node.js
|
|
794
|
-
- **structlog**: Structured logging for Python
|
|
795
|
-
- **zap**: Fast structured logging for Go
|
|
796
|
-
- **Logback**: Java logging framework
|
|
797
|
-
- **ELK Stack**: Elasticsearch, Logstash, Kibana
|
|
798
|
-
- **Splunk**: Enterprise log management
|
|
799
|
-
- **Datadog**: Cloud monitoring and logging
|
|
800
|
-
- **CloudWatch**: AWS log management
|
|
801
|
-
- **Jaeger**: Distributed tracing
|
|
802
|
-
|
|
803
|
-
## Observability Architecture Interview
|
|
804
|
-
|
|
805
|
-
This interview runs during `/create-prd-security` §7.5. All 5 decisions must be confirmed before the security section is complete.
|
|
806
|
-
|
|
807
|
-
### Decision 1 — Logging Strategy
|
|
808
|
-
|
|
809
|
-
- **Logging library name** — the specific library (e.g., Pino, Winston, structlog, zap).
|
|
810
|
-
- **Structured JSON in production** — yes or no.
|
|
811
|
-
- **Log levels per environment** — dev: debug, staging: info, prod: warn.
|
|
812
|
-
- **PII field names that are never logged** — enumerate explicitly (e.g., `password`, `ssn`, `creditCard`, `token`).
|
|
813
|
-
- **Log destination** — stdout, file, cloud service — name it.
|
|
814
|
-
|
|
815
|
-
**Bootstrap fire:** When logging is confirmed, always fire `/bootstrap-agents OBSERVABILITY=structured-logging` first to provision baseline logging guidance. If the confirmed library or stack maps to an additional observability tool (e.g., Datadog, OpenTelemetry, Pino), also fire `/bootstrap-agents OBSERVABILITY=[tool-specific value]`.
|
|
816
|
-
|
|
817
|
-
### Decision 2 — Tracing Strategy
|
|
818
|
-
|
|
819
|
-
- **Which service boundaries are traced** — name the services or layers where trace spans are created.
|
|
820
|
-
- **Sampling rate per environment** — e.g., dev: 100%, staging: 50%, prod: 10%.
|
|
821
|
-
- **Trace ID propagation to API clients** — header name used to propagate trace IDs (e.g., `X-Trace-Id`, `traceparent`).
|
|
822
|
-
|
|
823
|
-
**Bootstrap fire:** If a specific tracing tool is confirmed, invoke `/bootstrap-agents OBSERVABILITY=[confirmed value]`.
|
|
824
|
-
|
|
825
|
-
### Decision 3 — Alerting Thresholds
|
|
826
|
-
|
|
827
|
-
- **Error rate percentage that triggers alert** — e.g., 5% of requests returning 5xx in a 5-minute window.
|
|
828
|
-
- **Latency threshold (ms) + duration before alert** — e.g., p95 > 500ms for 3 consecutive minutes.
|
|
829
|
-
- **Queue depth warning level** — e.g., background job queue exceeds 1000 items.
|
|
830
|
-
- **Delivery mechanism** — PagerDuty, Slack, email — name it.
|
|
831
|
-
|
|
832
|
-
**Bootstrap fire:** If a specific monitoring tool is confirmed, invoke `/bootstrap-agents MONITORING=[confirmed value]`.
|
|
833
|
-
|
|
834
|
-
### Decision 4 — Launch Dashboards
|
|
835
|
-
|
|
836
|
-
- **Minimum required panels** — name each panel (e.g., request rate, error rate, p50/p95/p99 latency, active connections, queue depth, CPU/memory utilization).
|
|
837
|
-
- **Tool** — Grafana, Datadog, CloudWatch — name it.
|
|
838
|
-
- **Dashboard owner** — role, not person (e.g., "on-call engineer", "platform team lead").
|
|
839
|
-
|
|
840
|
-
### Decision 5 — Retention
|
|
841
|
-
|
|
842
|
-
- **Log retention duration** — e.g., 30 days hot, 90 days cold.
|
|
843
|
-
- **Trace retention duration** — e.g., 7 days.
|
|
844
|
-
- **Compliance alignment** — if applicable (e.g., SOC2 requires 1 year of audit logs).
|
|
845
|
-
|
|
846
|
-
### User Presentation Prompts
|
|
847
|
-
|
|
848
|
-
Present these two questions to the user for confirmation:
|
|
849
|
-
|
|
850
|
-
1. "Are these logging levels and PII exclusions correct for your compliance requirements?"
|
|
851
|
-
2. "Are the alerting thresholds appropriate for your expected traffic?"
|