start-vibing-stacks 2.6.0 → 2.7.0
- package/dist/index.js +16 -2
- package/dist/migrate.d.ts +27 -0
- package/dist/migrate.js +217 -0
- package/dist/setup.js +10 -0
- package/package.json +1 -1
- package/stacks/_shared/agents/claude-md-compactor.md +1 -0
- package/stacks/_shared/agents/commit-manager.md +1 -0
- package/stacks/_shared/agents/documenter.md +1 -0
- package/stacks/_shared/agents/domain-updater.md +1 -0
- package/stacks/_shared/agents/research-web.md +1 -0
- package/stacks/_shared/agents/security-auditor.md +168 -0
- package/stacks/_shared/agents/tester.md +1 -0
- package/stacks/_shared/hooks/final-check.ts +205 -0
- package/stacks/_shared/hooks/stop-validator.ts +77 -1
- package/stacks/_shared/skills/accessibility-wcag22/SKILL.md +284 -0
- package/stacks/_shared/skills/ci-pipelines/SKILL.md +166 -0
- package/stacks/_shared/skills/codebase-knowledge/SKILL.md +5 -0
- package/stacks/_shared/skills/database-migrations/SKILL.md +256 -0
- package/stacks/_shared/skills/debugging-patterns/SKILL.md +5 -0
- package/stacks/_shared/skills/docker-patterns/SKILL.md +5 -0
- package/stacks/_shared/skills/docs-tracker/SKILL.md +5 -0
- package/stacks/_shared/skills/error-handling/SKILL.md +335 -0
- package/stacks/_shared/skills/final-check/SKILL.md +74 -37
- package/stacks/_shared/skills/git-workflow/SKILL.md +5 -0
- package/stacks/_shared/skills/hook-development/SKILL.md +5 -0
- package/stacks/_shared/skills/observability/SKILL.md +351 -0
- package/stacks/_shared/skills/performance-patterns/SKILL.md +5 -0
- package/stacks/_shared/skills/playwright-automation/SKILL.md +5 -0
- package/stacks/_shared/skills/quality-gate/SKILL.md +5 -0
- package/stacks/_shared/skills/research-cache/SKILL.md +5 -0
- package/stacks/_shared/skills/secrets-management/SKILL.md +245 -0
- package/stacks/_shared/skills/security-baseline/SKILL.md +202 -0
- package/stacks/_shared/skills/test-coverage/SKILL.md +5 -0
- package/stacks/_shared/skills/ui-ux-audit/SKILL.md +5 -0
- package/stacks/frontend/react/skills/preline-ui/SKILL.md +5 -0
- package/stacks/frontend/react/skills/react-patterns/SKILL.md +5 -0
- package/stacks/frontend/react/skills/react-standards/SKILL.md +5 -0
- package/stacks/frontend/react/skills/react-ui-patterns/SKILL.md +5 -0
- package/stacks/frontend/react/skills/shadcn-ui/SKILL.md +5 -0
- package/stacks/frontend/react/skills/tailwind-patterns/SKILL.md +5 -0
- package/stacks/frontend/react/skills/zod-validation/SKILL.md +5 -0
- package/stacks/frontend/react-inertia/skills/inertia-react/SKILL.md +5 -0
- package/stacks/frontend/react-inertia/skills/react-standards/SKILL.md +5 -0
- package/stacks/nodejs/skills/api-security-node/SKILL.md +275 -0
- package/stacks/nodejs/skills/bun-runtime/SKILL.md +5 -0
- package/stacks/nodejs/skills/mongoose-patterns/SKILL.md +5 -0
- package/stacks/nodejs/skills/nextjs-app-router/SKILL.md +5 -0
- package/stacks/nodejs/skills/trpc-api/SKILL.md +5 -0
- package/stacks/nodejs/skills/typescript-strict/SKILL.md +5 -0
- package/stacks/nodejs/stack.json +2 -1
- package/stacks/nodejs/workflows/ci.yml +90 -0
- package/stacks/nodejs/workflows/security.yml +45 -0
- package/stacks/php/skills/api-design/SKILL.md +5 -0
- package/stacks/php/skills/api-security/SKILL.md +5 -0
- package/stacks/php/skills/composer-workflow/SKILL.md +5 -0
- package/stacks/php/skills/external-api-patterns/SKILL.md +5 -0
- package/stacks/php/skills/inertia-react/SKILL.md +5 -0
- package/stacks/php/skills/laravel-inertia-i18n/SKILL.md +5 -0
- package/stacks/php/skills/laravel-octane/SKILL.md +5 -0
- package/stacks/php/skills/laravel-patterns/SKILL.md +5 -0
- package/stacks/php/skills/mariadb-octane/SKILL.md +5 -0
- package/stacks/php/skills/php-patterns/SKILL.md +5 -0
- package/stacks/php/skills/phpstan-analysis/SKILL.md +5 -0
- package/stacks/php/skills/phpunit-testing/SKILL.md +5 -0
- package/stacks/php/skills/security-scan-php/SKILL.md +5 -0
- package/stacks/php/workflows/ci.yml +106 -0
- package/stacks/php/workflows/security.yml +36 -0
- package/stacks/python/skills/api-security-python/SKILL.md +312 -0
- package/stacks/python/skills/async-patterns/SKILL.md +5 -0
- package/stacks/python/skills/django-patterns/SKILL.md +5 -0
- package/stacks/python/skills/fastapi-patterns/SKILL.md +5 -0
- package/stacks/python/skills/pydantic-validation/SKILL.md +5 -0
- package/stacks/python/skills/pytest-testing/SKILL.md +5 -0
- package/stacks/python/skills/python-patterns/SKILL.md +5 -0
- package/stacks/python/skills/python-performance/SKILL.md +5 -0
- package/stacks/python/skills/scripting-automation/SKILL.md +5 -0
- package/stacks/python/stack.json +2 -1
- package/stacks/python/workflows/ci.yml +76 -0
- package/stacks/python/workflows/security.yml +56 -0
package/stacks/_shared/skills/observability/SKILL.md
@@ -0,0 +1,351 @@
---
name: observability
version: 1.0.0
description: Structured logging, correlation IDs, OpenTelemetry tracing, error tracking (Sentry), metrics, and PII redaction. Invoke when adding logging, instrumentation, or debugging production issues.
---

# Observability — Logs, Traces, Metrics

**ALWAYS invoke when adding logging, instrumentation, error tracking, or analyzing production issues.**

> Three pillars: **Logs** (what happened) + **Traces** (where time was spent) + **Metrics** (how much / how often).
> One signal: **correlation/trace IDs** that thread through all three.

---

## 1. Structured Logging — Mandatory

Logs are JSON, not text. Free-form strings can't be queried, aggregated, or alerted on.

### Required fields per log line

| Field | Source |
|---|---|
| `timestamp` | ISO-8601, UTC |
| `level` | `trace` / `debug` / `info` / `warn` / `error` / `fatal` |
| `msg` | Short human description |
| `service` | Service name |
| `env` | `production` / `staging` / `development` |
| `trace_id` | OpenTelemetry trace id (W3C `traceparent`) |
| `span_id` | Current span |
| `request_id` | Inbound HTTP request id (mirror to `x-request-id` header) |
| `user_id` | Authenticated user (hash if PII concerns) — **never** email/full name |
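
To make the field list concrete, here is a minimal sketch of a formatter that emits one JSON object per record using only the Python standard library. The names `JsonFormatter`, `LEVELS`, and `render` are hypothetical helpers invented for this illustration, and the hard-coded `service="api"` / `env="development"` values are assumptions; a real setup would more likely use pino or structlog as shown in the following sections.

```python
import json
import logging
from datetime import datetime, timezone

# stdlib level names that differ from the table's vocabulary
LEVELS = {"WARNING": "warn", "CRITICAL": "fatal"}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the required fields."""

    def __init__(self, service: str, env: str) -> None:
        super().__init__()
        self.service = service
        self.env = env

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": LEVELS.get(record.levelname, record.levelname.lower()),
            "msg": record.getMessage(),
            "service": self.service,
            "env": self.env,
        }
        # Correlation fields are optional on the record; emit them only when set.
        for field in ("trace_id", "span_id", "request_id", "user_id"):
            value = getattr(record, field, None)
            if value is not None:
                payload[field] = value
        return json.dumps(payload)


def render(message: str, **context: str) -> str:
    """Format one INFO record without touching global logging state."""
    record = logging.LogRecord("api", logging.INFO, "app.py", 0, message, (), None)
    for key, value in context.items():
        setattr(record, key, value)
    return JsonFormatter(service="api", env="development").format(record)
```

Calling `render("user signed up", request_id="req-1")` returns a single JSON line; fields that were never bound (such as `trace_id` here) are omitted instead of being logged as null.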

### Node.js — pino
```ts
// lib/logger.ts
import pino from 'pino';
import { randomUUID } from 'crypto';

export const logger = pino({
  level: process.env['LOG_LEVEL'] ?? 'info',
  base: {
    service: 'api',
    env: process.env['NODE_ENV'],
    pid: process.pid,
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  redact: {
    paths: [
      'req.headers.authorization',
      'req.headers.cookie',
      'req.body.password',
      'req.body.token',
      '*.password',
      '*.creditCard',
      '*.ssn',
    ],
    censor: '[REDACTED]',
  },
});

// HTTP middleware: attach request_id + child logger to req
export function requestLogger(req, res, next) {
  // Capture the start time here; nothing else sets one on `req`.
  const startedAt = Date.now();
  const requestId = req.headers['x-request-id'] ?? randomUUID();
  res.setHeader('x-request-id', requestId);
  req.log = logger.child({ request_id: requestId, method: req.method, path: req.path });
  req.log.info({ event: 'request.start' });
  res.on('finish', () => {
    req.log.info({ event: 'request.end', status: res.statusCode, duration_ms: Date.now() - startedAt });
  });
  next();
}
```

### Python — structlog
```python
import structlog, logging, sys, uuid

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso", utc=True),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    logger_factory=structlog.PrintLoggerFactory(file=sys.stdout),
)

log = structlog.get_logger()

# FastAPI middleware (assumes `app = FastAPI()` is defined elsewhere)
@app.middleware("http")
async def request_logger(request, call_next):
    # Clear first: context from an earlier request must never leak in,
    # even if that request raised before it could clean up.
    structlog.contextvars.clear_contextvars()
    request_id = request.headers.get("x-request-id", str(uuid.uuid4()))
    structlog.contextvars.bind_contextvars(
        request_id=request_id, method=request.method, path=request.url.path
    )
    log.info("request.start")
    response = await call_next(request)
    response.headers["x-request-id"] = request_id
    log.info("request.end", status=response.status_code)
    return response
```

### PHP — Monolog
```php
// config/logging.php
'channels' => [
    'json' => [
        'driver' => 'monolog',
        'handler' => Monolog\Handler\StreamHandler::class,
        'with' => ['stream' => 'php://stdout'],
        'formatter' => Monolog\Formatter\JsonFormatter::class,
        'processors' => [
            Monolog\Processor\WebProcessor::class,
            Monolog\Processor\UidProcessor::class,
        ],
    ],
],
```

---

## 2. Log Levels — Use Them Right

| Level | When |
|---|---|
| `trace` | Verbose diagnostics, off in prod |
| `debug` | Development helper, off in prod by default |
| `info` | Business events: signup, payment, login, job completed |
| `warn` | Recoverable anomaly: retry, deprecated path, fallback used |
| `error` | Operation failed for one user/request, system continues |
| `fatal` | Cannot continue, process exits |

Default prod level: `info`. Errors should always page or alert. `debug` flooding logs is a cost issue.

---

## 3. PII Redaction — Mandatory

Never log:
- Passwords, hashes, tokens, cookies, `Authorization` headers
- Full credit card / IBAN / SSN / passport
- Plaintext email if your jurisdiction (GDPR/LGPD) treats it as PII without legitimate basis
- Full request body or full response body without filtering

Redact at the logger level (so it can't be bypassed by a forgetful caller):

```ts
// pino redact paths (above)
// or wrap manually:
function safeBody(body: any) {
  const out = { ...body };
  for (const k of ['password', 'token', 'cookie', 'authorization']) {
    if (k in out) out[k] = '[REDACTED]';
  }
  return out;
}
```

For email: log a hash of the email or the first 2 chars + domain (`jo***@example.com`). Document the policy.
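
Both email options can be sketched in a few lines of standard-library Python. `mask_email` and `hash_email` are hypothetical helper names, and the keyed hash (HMAC-SHA256 with a server-side key) is an assumption of this sketch: the text above only says "a hash", and a keyed hash is one way to keep the pseudonym from being reversed with a dictionary of known addresses.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Keep the first two characters of the local part plus the full domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local or not domain:
        return "[REDACTED]"  # not email-shaped, so reveal nothing
    return f"{local[:2]}***@{domain}"


def hash_email(email: str, key: bytes) -> str:
    """Stable pseudonym: the same address always maps to the same hex digest."""
    normalized = email.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
```

The masked form stays readable for support staff, while the hashed form lets log lines for one user be joined without storing the address itself.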

---

## 4. Distributed Tracing — OpenTelemetry

Tracing answers "where did the request spend time" across services.
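
Across service boundaries the trace identity travels in the W3C `traceparent` HTTP header, which the auto-instrumentation in the following subsections reads and writes for you. As a rough illustration of what is on the wire, the sketch below parses a version-`00` header by hand; `TraceContext` and `parse_traceparent` are hypothetical names, and production code should rely on the OpenTelemetry propagators instead.

```python
import re
from typing import NamedTuple, Optional

# version 00: "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>"
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Return the ids carried by a traceparent header, or None if malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero ids are defined as invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceContext(trace_id, span_id, bool(int(flags, 16) & 0x01))
```

The `trace_id` returned here is the value that belongs in the `trace_id` log field, which is what lets a log line be matched to its trace.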

### Node.js (auto-instrumentation)
```ts
// instrumentation.ts — load BEFORE any other import
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { Resource } from '@opentelemetry/resources';
import { ATTR_SERVICE_NAME } from '@opentelemetry/semantic-conventions';

new NodeSDK({
  // On @opentelemetry/resources 2.x the Resource class is no longer exported;
  // use resourceFromAttributes({ [ATTR_SERVICE_NAME]: 'api' }) there instead.
  resource: new Resource({ [ATTR_SERVICE_NAME]: 'api' }),
  // No explicit `url`: the exporter reads OTEL_EXPORTER_OTLP_ENDPOINT itself and
  // appends /v1/traces. A `url` passed here would be used verbatim, without that path.
  traceExporter: new OTLPTraceExporter(),
  instrumentations: [getNodeAutoInstrumentations()],
}).start();
```

Run: `node --import ./instrumentation.js dist/index.js`

### Python (FastAPI)
```python
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor
from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor

FastAPIInstrumentor.instrument_app(app)
SQLAlchemyInstrumentor().instrument(engine=engine)
HTTPXClientInstrumentor().instrument()
```

### Manual span — when you want to time business logic
```ts
import { SpanStatusCode, trace } from '@opentelemetry/api';
const tracer = trace.getTracer('billing');

await tracer.startActiveSpan('charge_customer', async (span) => {
  try {
    span.setAttribute('customer.id', customerId);
    const charge = await stripe.charges.create({ amount, customer: customerId });
    span.setAttribute('charge.id', charge.id);
    return charge;
  } catch (err) {
    span.recordException(err as Error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw err;
  } finally {
    span.end();
  }
});
```

---

## 5. Error Tracking — Sentry

Pair with structured logging. Sentry is for **alertable** errors with full stack + breadcrumbs; logs are for **everything**.

### Node.js
```ts
import * as Sentry from '@sentry/node';

Sentry.init({
  dsn: process.env['SENTRY_DSN'],
  environment: process.env['NODE_ENV'],
  tracesSampleRate: 0.1,
  // Only has an effect when the profiling integration is also registered.
  profilesSampleRate: 0.1,
  beforeSend(event) {
    // Defense in depth — strip if logger redaction missed it
    if (event.request?.cookies) delete event.request.cookies;
    if (event.request?.headers?.['authorization']) {
      event.request.headers['authorization'] = '[REDACTED]';
    }
    return event;
  },
});
```

### Python
```python
import os

import sentry_sdk
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.scrubber import EventScrubber, DEFAULT_DENYLIST

sentry_sdk.init(
    dsn=os.environ["SENTRY_DSN"],
    environment=os.environ["APP_ENV"],
    traces_sample_rate=0.1,
    integrations=[FastApiIntegration()],
    event_scrubber=EventScrubber(denylist=DEFAULT_DENYLIST + ["jwt", "session"]),
    send_default_pii=False,
)
```

---

## 6. Metrics

Track **RED** (Rate / Errors / Duration) on every endpoint and **USE** (Utilization / Saturation / Errors) on every resource.

OpenTelemetry metrics + Prometheus exporter, or vendor SDK (Datadog, New Relic).

```ts
import { metrics } from '@opentelemetry/api';
const meter = metrics.getMeter('api');

const httpDuration = meter.createHistogram('http.server.duration', { unit: 'ms' });
const ordersCreated = meter.createCounter('orders.created');

httpDuration.record(durationMs, { method, route, status: String(statusCode) });
ordersCreated.add(1, { plan: order.plan });
```

Cardinality rule: never tag with user_id, email, or unbounded values — explodes time-series storage. Use `route`, `status`, `plan`, etc.
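
The arithmetic behind the cardinality rule is a plain product: in the worst case a metric creates one time series per combination of label values. The sketch below uses a hypothetical helper, `series_count`, and illustrative label counts (5 methods, 40 routes, 8 status codes, 100,000 users) that are assumptions, not measurements.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


bounded = series_count({"method": 5, "route": 40, "status": 8})
with_user_id = series_count({"method": 5, "route": 40, "status": 8, "user_id": 100_000})

print(bounded)       # 1600
print(with_user_id)  # 160000000
```

Adding a single unbounded label multiplies the series count by the size of that label's value set, which is why `user_id` belongs in logs and traces and not in metric attributes.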

---

## 7. Health Checks

Two endpoints, distinct semantics:

```ts
app.get('/healthz', (_, res) => res.json({ ok: true }));   // liveness — am I running?

app.get('/readyz', async (_, res) => {                     // readiness — can I serve?
  const checks = await Promise.allSettled([
    db.raw('SELECT 1'),
    redis.ping(),
  ]);
  const allOk = checks.every(c => c.status === 'fulfilled');
  res.status(allOk ? 200 : 503).json({ ok: allOk, checks });
});
```

Kubernetes uses `livenessProbe` → `/healthz` (restart on fail) and `readinessProbe` → `/readyz` (remove from LB on fail).
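
For Python services the same readiness logic can be sketched with `asyncio.gather(..., return_exceptions=True)`, which plays the role of `Promise.allSettled`. The function name `readiness`, the per-check timeout, and the choice to report only booleans (so that exception text never reaches the response) are assumptions of this sketch; each dependency check is passed in as a zero-argument coroutine function.

```python
import asyncio
from typing import Awaitable, Callable

Check = Callable[[], Awaitable[object]]


async def readiness(checks: dict[str, Check], timeout_s: float = 2.0) -> tuple[int, dict]:
    """Run every dependency check concurrently and map the outcome to 200 or 503."""
    names = list(checks)
    results = await asyncio.gather(
        *(asyncio.wait_for(checks[name](), timeout_s) for name in names),
        return_exceptions=True,  # a failing check must not cancel the others
    )
    # Expose pass/fail only; error details belong in the logs.
    status = {name: not isinstance(result, BaseException) for name, result in zip(names, results)}
    all_ok = all(status.values())
    return (200 if all_ok else 503), {"ok": all_ok, "checks": status}
```

A web handler for `/readyz` would await `readiness({...})` and return the tuple as status code and JSON body.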

---

## 8. Audit Log — Separate Stream

Security-relevant events go to a dedicated, append-only stream:

- Logins (success + failure)
- Authorization denials
- Permission/role changes
- Admin actions (impersonation, data exports, deletes)
- Payment events
- Data access (when required by compliance: HIPAA, SOC 2)

Different retention (often longer), different access (security team), different integrity guarantees (immutable, hash-chained).
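
"Hash-chained" means every entry stores the hash of its predecessor, so editing or deleting an earlier entry invalidates everything after it. The following is a minimal in-memory sketch of that idea; `append_entry`, `verify_chain`, and the all-zero `GENESIS` value are hypothetical, and a real audit stream would also need durable append-only storage and protection of the latest hash.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited, removed, or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Verification only proves internal consistency; an attacker who can rewrite the whole stream can rebuild the chain, which is one reason the audit stream gets separate access controls.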

---

## Pre-Commit Checklist

- [ ] Logger configured with redact paths for password/token/cookie/authorization
- [ ] All endpoints emit `request.start` / `request.end` with `request_id`
- [ ] Errors include stack trace and `trace_id`
- [ ] No `console.log` / `print` / `dd()` left in code (use the logger)
- [ ] No PII in info-level logs
- [ ] Sentry initialized with `beforeSend` scrubber
- [ ] Tracing initialized at process start (before app code)
- [ ] `/healthz` + `/readyz` endpoints exist

## FORBIDDEN

| Pattern | Reason |
|---|---|
| `console.log(req.body)` | Leaks passwords, tokens |
| Logger without redaction | Future caller will leak |
| Unbounded label cardinality (user_id as metric tag) | Cost explosion |
| Same severity for everything (all `info`) | No alerting signal |
| Tracing in dev only | Prod is where you need it |
| Sentry without scrubber | PII in error reports |
| Audit log in same stream as app log | Tamper risk, retention conflict |

## See Also

- `secrets-management` — what NEVER to log
- `error-handling` — Result types and error taxonomy
- `security-baseline` — A09: Security Logging Failures

package/stacks/_shared/skills/secrets-management/SKILL.md
@@ -0,0 +1,245 @@
---
name: secrets-management
version: 1.0.0
description: Environment variable hygiene, secret detection, rotation, and secret-store patterns. Invoke whenever .env, secrets, API keys, or env-var-reading code are touched.
---

# Secrets Management

**ALWAYS invoke when touching `.env`, `process.env`, `os.environ`, secrets, API keys, or auth credentials.**

## Core Rules

1. **No secrets in code, ever.** Not in tests, not in fixtures, not in comments.
2. **No secrets in logs.** Redact before logging.
3. **No secrets in URLs / query strings.** Headers or body only — URLs land in proxy logs.
4. **No secrets in error messages** returned to clients.
5. **`.env` is in `.gitignore`. `.env.example` is committed.** No exceptions.
6. **Rotate after any leak.** Even a suspected one. Exposure lasts until the secret is rotated, no matter when you think the leak began.

---

## File Layout

```
.env                  # gitignored, real values, local dev
.env.example          # committed, placeholders only
.env.production       # NEVER committed; use platform secret store
.env.test             # gitignored if real keys; committed if all-fake
```

`.gitignore`:
```
.env
.env.*
!.env.example
!.env.test.example
```

`.env.example` template:
```bash
# Application
NODE_ENV=development
APP_URL=http://localhost:3000

# Database
DATABASE_URL=postgres://user:pass@localhost:5432/dbname

# Auth
JWT_SECRET=<generate: openssl rand -base64 32>
SESSION_SECRET=<generate: openssl rand -base64 32>

# Third-party (use service prefix to scan with allowlists)
STRIPE_SECRET_KEY=sk_test_...
OPENAI_API_KEY=sk-...
```
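
Where `openssl` is not available, a value of the same shape as `openssl rand -base64 32` can be produced with Python's `secrets` module. This is a small sketch under that assumption; `generate_secret` is a hypothetical helper name.

```python
import base64
import secrets


def generate_secret(num_bytes: int = 32) -> str:
    """Return `num_bytes` of OS-provided randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

With the default of 32 bytes the encoded string is 44 characters long, which also clears a 32-character minimum length check on values such as `JWT_SECRET`.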

---

## Reading Env Vars Safely

### Node.js / TypeScript
```ts
// Validate at boot — fail fast if missing
import { z } from 'zod';

const Env = z.object({
  NODE_ENV: z.enum(['development', 'test', 'production']),
  DATABASE_URL: z.string().url(),
  JWT_SECRET: z.string().min(32),
  STRIPE_SECRET_KEY: z.string().startsWith('sk_'),
});

export const env = Env.parse(process.env);
// Now `env.JWT_SECRET` is typed and guaranteed present
```

Bracket notation only (the compiler enforces this when `noPropertyAccessFromIndexSignature` is enabled in `tsconfig`; `strict` alone does not):
```ts
process.env['JWT_SECRET']  // CORRECT
```

### Python
```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", extra="forbid")
    database_url: str
    jwt_secret: str
    stripe_secret_key: str

settings = Settings()  # raises if missing/invalid
```

### PHP / Laravel
```php
// config/services.php — read once, cached via `php artisan config:cache`
return [
    'stripe' => ['secret' => env('STRIPE_SECRET_KEY')],
];
// Use config('services.stripe.secret') in app code — NEVER env() at runtime
```

---

## Public vs Server-Only Vars

| Stack | Public prefix | Rule |
|---|---|---|
| Next.js | `NEXT_PUBLIC_` | Anything with this prefix is shipped to the browser |
| Vite | `VITE_` | Same — bundled into JS |
| CRA | `REACT_APP_` | Same |

`security-rules.json` enforces: nothing matching `*SECRET|*TOKEN|*PRIVATE|*PASSWORD|*CREDENTIAL` may have a public prefix.
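
The shape of that rule can be illustrated with a short check over variable names. This sketch is not the package's `security-rules.json` logic, whose exact matching is not shown here; it assumes a case-insensitive substring match on the sensitive words, and `exposed_secret_names` is a hypothetical helper.

```python
PUBLIC_PREFIXES = ("NEXT_PUBLIC_", "VITE_", "REACT_APP_")
SENSITIVE_WORDS = ("SECRET", "TOKEN", "PRIVATE", "PASSWORD", "CREDENTIAL")


def exposed_secret_names(names: list[str]) -> list[str]:
    """Return variable names that are browser-exposed and look like secrets."""
    flagged = []
    for name in names:
        upper = name.upper()
        if upper.startswith(PUBLIC_PREFIXES) and any(word in upper for word in SENSITIVE_WORDS):
            flagged.append(name)
    return flagged
```

Running it over the keys of `.env.example` in CI would catch a name such as `NEXT_PUBLIC_STRIPE_SECRET_KEY` before the value is ever bundled.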

---

## Secret Stores (Production)

| Tool | When |
|---|---|
| **Vercel/Netlify env vars** | Simple deployments |
| **AWS Secrets Manager** / **GCP Secret Manager** / **Azure Key Vault** | Cloud-native, IAM-scoped |
| **Doppler** | Multi-environment sync |
| **HashiCorp Vault** | On-prem / self-hosted, dynamic creds |
| **SOPS + age** | Git-encrypted secrets (GitOps) |
| **1Password Connect / op-cli** | Team-shared dev secrets |

**Rule:** `.env.production` is **never** committed. Production secrets live in the platform store and are injected at deploy/runtime.

---

## Detection — Block Before Commit

### gitleaks (recommended)

Install:
```bash
brew install gitleaks    # macOS
# or: docker run --rm -v "$(pwd)":/repo zricethezav/gitleaks:latest detect -s /repo
```

Pre-commit hook:
```bash
gitleaks protect --staged --redact --verbose
# Recent gitleaks releases deprecate `protect`; the replacement form is:
# gitleaks git --pre-commit --staged --redact --verbose
```

CI step:
```yaml
- name: Gitleaks scan
  uses: gitleaks/gitleaks-action@v2
```

### Quick grep (as a fallback)

```bash
# Block obvious patterns in staged diff
git diff --cached -U0 | grep -nEi \
  '(api[_-]?key|secret|token|bearer|password|aws_(access|secret)|private_key)\s*[:=]\s*["'\''][a-zA-Z0-9/+=_-]{16,}'
```
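
The same fallback can be written as a script where a shell one-liner is awkward to maintain, for example on Windows runners. The sketch below reuses the regular expression from the grep command; `find_suspect_lines` is a hypothetical helper, and unlike the grep it deliberately looks only at added lines, which is a design choice of this sketch.

```python
import re

SECRET_PATTERN = re.compile(
    r"""(api[_-]?key|secret|token|bearer|password|aws_(access|secret)|private_key)"""
    r"""\s*[:=]\s*["'][a-zA-Z0-9/+=_-]{16,}""",
    re.IGNORECASE,
)


def find_suspect_lines(diff_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for added diff lines that look like secrets."""
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # "+" marks an added line; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++") and SECRET_PATTERN.search(line):
            hits.append((number, line))
    return hits
```

Feeding it the output of `git diff --cached -U0` and exiting non-zero when the list is non-empty gives the same blocking behaviour as the grep. It shares the grep's limits: it only sees quoted literals of 16 or more characters next to a known keyword.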

---

## Rotation Playbook

When a secret leaks (or is suspected):

1. **Revoke** at the provider immediately. Do not wait for git history rewrites.
2. **Issue new secret**, deploy with new value.
3. **Update audit log** — what leaked, when, who, scope of access.
4. **Scan history**: `gitleaks detect --log-opts="--all"` — find every commit/branch that contained it.
5. **History rewrite is optional** and never replaces rotation. Once the secret is revoked, the copy in git history no longer grants access; run git filter-repo / BFG only as cleanup after rotating.
6. **Notify** — if customer data was reachable with the leaked secret, follow incident-disclosure policy.

Token lifetimes (default targets):
- Access token: ≤ 15 min
- Refresh token: ≤ 7 days, rotated on use
- Service-to-service token: ≤ 90 days, rotated automatically
- Long-lived (e.g. cron API keys): document expiry, calendar reminder

---

## Logging Without Leaks

```ts
// Redact known fields before logging
const REDACT = ['password', 'token', 'authorization', 'cookie', 'secret', 'apiKey', 'creditCard'];

function safeLog(obj: unknown) {
  return JSON.parse(JSON.stringify(obj, (key, val) =>
    REDACT.some(r => key.toLowerCase().includes(r.toLowerCase())) ? '[REDACTED]' : val
  ));
}

logger.info({ event: 'login_attempt', body: safeLog(req.body) });
```

Prefer redaction built into the logging stack: pino's `redact` option (it covers what `pino-noir` was used for), a custom `winston` format, or your APM's redaction. Same for Python: `structlog` processors; PHP: a custom Monolog processor (Monolog core does not ship a `RedactProcessor`).
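
A Python counterpart of `safeLog` can walk nested dicts and lists directly. This is a sketch under the same assumption as the TypeScript version (case-insensitive substring match on key names); `safe_log` and `_is_sensitive` are hypothetical names, and in a structlog setup the same function could be wrapped as a processor.

```python
from typing import Any

REDACT = ("password", "token", "authorization", "cookie", "secret", "apikey", "creditcard")


def _is_sensitive(key: object) -> bool:
    lowered = str(key).lower()
    return any(word in lowered for word in REDACT)


def safe_log(value: Any) -> Any:
    """Return a copy of `value` with sensitive keys masked at every nesting level."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if _is_sensitive(key) else safe_log(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [safe_log(item) for item in value]
    return value  # scalars pass through unchanged
```

The input is never mutated, so the original request body stays usable by the handler after logging.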

See `observability` skill for full structured-logging setup.

---

## Supply-Chain Hygiene

| Stack | Audit |
|---|---|
| Node.js | `npm audit --audit-level=high`, `bun audit`, `pnpm audit` |
| Python | `pip-audit`, `safety check` |
| PHP | `composer audit` |

Lockfile rules:
- **Always commit** `package-lock.json` / `bun.lock` / `pnpm-lock.yaml` / `poetry.lock` / `uv.lock` / `composer.lock`.
- **Renovate** or **Dependabot** for automated PRs.
- Pin major versions; allow minor/patch auto-merge after CI passes.

---

## FORBIDDEN

| Pattern | Reason |
|---|---|
| `.env` committed | Trivial leak |
| Secret in `README.md` / docstring | Leaks via search engines |
| Secret in test fixtures | Leaks via PR diffs / forks |
| Secret in `console.log` / `print` / `Log::info` | Ends up in CloudWatch / Datadog forever |
| Secret in URL query string | Logged by every proxy in the chain |
| Secret in commit message | Cannot delete, history rewrites costly |
| `process.env.SECRET ?? "fallback-value"` with real fallback | Hardcoded backup secret |
| Sharing via Slack/email | Use 1Password / Vault sharing |

## Pre-Commit Checklist

- [ ] `.env` in `.gitignore`
- [ ] `.env.example` updated when new var added
- [ ] gitleaks (or fallback grep) clean
- [ ] No secret in code, tests, fixtures, comments, logs
- [ ] No `NEXT_PUBLIC_*SECRET|*TOKEN|*PRIVATE` patterns
- [ ] Env vars validated at boot (Zod / Pydantic / config())

## See Also

- `security-baseline` — broader OWASP scope
- `observability` — log redaction details
- Stack `api-security-*` — usage patterns