@manifest-cyber/observability-ts 0.2.0 → 0.2.2

package/TRACING_GUIDE.md DELETED
@@ -1,791 +0,0 @@
1
- # Distributed Tracing with @manifest-cyber/observability-ts
2
-
3
- Complete guide to implementing distributed tracing across Manifest Cyber services.
4
-
5
- ## Table of Contents
6
-
7
- - [Quick Start](#quick-start)
8
- - [Core Concepts](#core-concepts)
9
- - [HTTP Tracing](#http-tracing)
10
- - [Message Queue Tracing](#message-queue-tracing)
11
- - [Database Tracing](#database-tracing)
12
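- - [Integration with logger-ts](#integration-with-logger-ts)
- - [Security Considerations](#security-considerations)
- - [Best Practices](#best-practices)
- - [Examples](#examples)
- - [Troubleshooting](#troubleshooting)
- - [Additional Resources](#additional-resources)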
15
-
16
- ---
17
-
18
- ## Quick Start
19
-
20
- ### 1. Installation
21
-
22
- ```bash
23
- npm install @manifest-cyber/observability-ts
24
- npm install @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-grpc
25
- ```
26
-
27
- ### 2. Initialize Tracing
28
-
29
- ```typescript
30
- import { initTracing } from '@manifest-cyber/observability-ts';
31
-
32
- await initTracing({
33
- serviceName: process.env.SERVICE_NAME || 'my-service',
34
- environment: process.env.ENV || 'development',
35
- exporter: {
36
- type: 'otlp-grpc',
37
- endpoint: process.env.OTEL_EXPORTER_OTLP_ENDPOINT || 'http://localhost:4317',
38
- },
39
- sampling: {
40
- rate: process.env.ENV === 'production' ? 0.1 : 1.0,
41
- },
42
- });
43
- ```
44
-
45
- ### 3. Create Spans
46
-
47
- ```typescript
48
- import { withSpan } from '@manifest-cyber/observability-ts';
49
-
50
- const result = await withSpan(
51
- 'operation.name',
52
- async () => {
53
- return await doWork();
54
- },
55
- {
56
- 'custom.attribute': 'value',
57
- },
58
- );
59
- ```
60
-
61
- ---
62
-
63
- ## Core Concepts
64
-
65
- ### Traces, Spans, and Context
66
-
67
- - **Trace**: End-to-end journey of a request through your system
68
- - **Span**: A single operation within a trace (HTTP call, DB query, function execution)
69
- - **Context**: Carries trace information across service boundaries
70
-
71
- ### W3C Trace Context Format
72
-
73
- ```
74
- traceparent: 00-{trace-id}-{span-id}-{flags}
75
- Example: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
76
- ```
77
-
78
- - `00`: Version
79
- - `trace-id`: 32-character hex (128 bits)
80
- - `span-id`: 16-character hex (64 bits)
81
- - `flags`: 2-character hex (8 bits); the lowest bit (`01`) records the sampling decision
82
-
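The field layout above can be checked mechanically. The following is a minimal sketch of a parser for version `00` headers; `parseTraceparent` and the `TraceParent` type are hypothetical names used for illustration and are not part of `@manifest-cyber/observability-ts`.

```typescript
// Hypothetical helper for illustration; not an export of the package.
interface TraceParent {
  version: string;
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// version(2 hex) - trace-id(32 hex) - span-id(16 hex) - flags(2 hex), lowercase only
const TRACEPARENT_RE = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim());
  if (!match) return null;

  const [, version, traceId, spanId, flags] = match;

  // The W3C spec treats version "ff" and all-zero IDs as invalid.
  if (version === 'ff' || /^0+$/.test(traceId) || /^0+$/.test(spanId)) {
    return null;
  }

  // Bit 0 of the flags byte is the "sampled" flag.
  const sampled = (parseInt(flags, 16) & 0x01) === 0x01;
  return { version, traceId, spanId, sampled };
}
```

This sketch deliberately rejects anything that does not match the four-field layout, so it is stricter than the spec's forward-compatibility rules for future versions.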
83
- ---
84
-
85
- ## HTTP Tracing
86
-
87
- ### Client (Outgoing Requests)
88
-
89
- ```typescript
90
- import {
91
- injectTraceContext,
92
- withSpan,
93
- SpanStatusCode,
94
- } from '@manifest-cyber/observability-ts';
95
-
96
- await withSpan(
97
- 'http.client.request',
98
- async () => {
99
- const headers = {};
100
- injectTraceContext(headers);
101
-
102
- const response = await fetch('https://api.example.com/users', {
103
- headers: {
104
- ...headers, // Contains traceparent
105
- 'Content-Type': 'application/json',
106
- },
107
- });
108
-
109
- return response.json();
110
- },
111
- {
112
- 'http.method': 'GET',
113
- 'http.url': 'https://api.example.com/users',
114
- // 'http.status_code' is omitted: it is not known until the response arrives
115
- },
116
- );
117
- ```
118
-
119
- ### Server (Incoming Requests)
120
-
121
- #### Express Middleware
122
-
123
- ```typescript
- import {
-   extractTraceContext,
-   getTracer,
-   SpanStatusCode,
- } from '@manifest-cyber/observability-ts';
- import { context, trace } from '@opentelemetry/api';
-
- app.use((req, res, next) => {
-   // Parent context carried by the incoming traceparent header (if any)
-   const parentCtx = extractTraceContext(req.headers);
-   const tracer = getTracer();
-
-   // Start the server span as a child of the extracted parent context
-   const span = tracer.startSpan(
-     'http.server.request',
-     {
-       attributes: {
-         'http.method': req.method,
-         'http.url': req.url,
-         'http.target': req.path,
-         'http.user_agent': req.headers['user-agent'],
-       },
-     },
-     parentCtx,
-   );
-
-   // Store span for later use
-   req.span = span;
-
-   res.on('finish', () => {
-     span.setAttribute('http.status_code', res.statusCode);
-     span.setStatus({
-       code: res.statusCode >= 500 ? SpanStatusCode.ERROR : SpanStatusCode.OK,
-     });
-     span.end();
-   });
-
-   // Run downstream handlers with the span active so their spans become its children
-   context.with(trace.setSpan(parentCtx, span), () => next());
- });
159
- ```
160
-
161
- ---
162
-
163
- ## Message Queue Tracing
164
-
165
- ### Producer (Send Message)
166
-
167
- #### SQS
168
-
169
- ```typescript
170
- import { createMessageTraceContext, withSpan } from '@manifest-cyber/observability-ts';
171
-
172
- await withSpan(
173
- 'messaging.send',
174
- async () => {
175
- const traceContext = createMessageTraceContext();
176
-
177
- await sqs.sendMessage({
178
- QueueUrl: queueUrl,
179
- MessageBody: JSON.stringify(payload),
180
- MessageAttributes: {
181
- traceparent: {
182
- DataType: 'String',
183
- StringValue: traceContext.traceparent,
184
- },
185
- },
186
- });
187
- },
188
- {
189
- 'messaging.system': 'sqs',
190
- 'messaging.destination': queueName,
191
- 'messaging.operation': 'send',
192
- },
193
- );
194
- ```
195
-
196
- #### RabbitMQ
197
-
198
- ```typescript
199
- import { createMessageTraceContext, withSpan } from '@manifest-cyber/observability-ts';
200
-
201
- await withSpan(
202
- 'messaging.send',
203
- async () => {
204
- const traceContext = createMessageTraceContext();
205
-
206
- channel.sendToQueue(queueName, Buffer.from(JSON.stringify(payload)), {
207
- headers: {
208
- traceparent: traceContext.traceparent,
209
- },
210
- persistent: true,
211
- });
212
- },
213
- {
214
- 'messaging.system': 'rabbitmq',
215
- 'messaging.destination': queueName,
216
- },
217
- );
218
- ```
219
-
220
- ### Consumer (Receive Message)
221
-
222
- #### SQS Consumer
223
-
224
- ```typescript
225
- import { extractMessageTraceContext, withSpan } from '@manifest-cyber/observability-ts';
226
- import { context } from '@opentelemetry/api';
227
-
228
- consumer.on('message', async (message) => {
229
- const ctx = extractMessageTraceContext(message.MessageAttributes);
230
-
231
- if (ctx) {
232
- await context.with(ctx, async () => {
233
- await withSpan(
234
- 'messaging.process',
235
- async () => {
236
- await processMessage(message);
237
- },
238
- {
239
- 'messaging.system': 'sqs',
240
- 'messaging.destination': queueName,
241
- 'messaging.message_id': message.MessageId,
242
- },
243
- );
244
- });
245
- } else {
246
- // No trace context - create new trace
247
- await withSpan('messaging.process', async () => {
248
- await processMessage(message);
249
- });
250
- }
251
- });
252
- ```
253
-
254
- #### RabbitMQ Consumer
255
-
256
- ```typescript
257
- import { extractMessageTraceContext, withSpan } from '@manifest-cyber/observability-ts';
258
- import { context } from '@opentelemetry/api';
259
-
260
- channel.consume(queueName, async (msg) => {
261
- if (!msg) return;
262
-
263
- const ctx = extractMessageTraceContext(msg.properties.headers);
264
-
265
- if (ctx) {
266
- await context.with(ctx, async () => {
267
- await withSpan(
268
- 'messaging.process',
269
- async () => {
270
- await processMessage(msg);
271
- },
272
- {
273
- 'messaging.system': 'rabbitmq',
274
- 'messaging.destination': queueName,
275
- },
276
- );
277
- });
278
- } else {
279
- await processMessage(msg);
280
- }
281
-
282
- channel.ack(msg);
283
- });
284
- ```
285
-
286
- ---
287
-
288
- ## Database Tracing
289
-
290
- ### MongoDB
291
-
292
- ```typescript
293
- import { withSpan } from '@manifest-cyber/observability-ts';
294
-
295
- const users = await withSpan(
296
- 'db.query',
297
- async () => {
298
- return await db.collection('users').find({ role: 'admin' }).toArray();
299
- },
300
- {
301
- 'db.system': 'mongodb',
302
- 'db.name': 'manifest',
303
- 'db.operation': 'find',
304
- 'db.collection': 'users',
305
- },
306
- );
307
- ```
308
-
309
- ### PostgreSQL
310
-
311
- ```typescript
312
- import { withSpan } from '@manifest-cyber/observability-ts';
313
-
314
- const result = await withSpan(
315
- 'db.query',
316
- async () => {
317
- return await pool.query('SELECT * FROM users WHERE role = $1', ['admin']);
318
- },
319
- {
320
- 'db.system': 'postgresql',
321
- 'db.operation': 'SELECT',
322
- 'db.statement': 'SELECT * FROM users WHERE role = $1',
323
- },
324
- );
325
- ```
326
-
327
- ---
328
-
329
- ## Integration with logger-ts
330
-
331
- ### Automatic Trace Correlation
332
-
333
- ```typescript
334
- import { OtelLogger } from '@manifest-cyber/logger-ts';
335
- import {
336
- withSpan,
337
- getCurrentTraceId,
338
- getCurrentSpanId,
339
- } from '@manifest-cyber/observability-ts';
340
-
341
- const baseLogger = getEnvLogger();
342
- const logger = OtelLogger.create(baseLogger, { autoTrace: true });
343
-
344
- await withSpan('process.asset', async () => {
345
- // Logger automatically includes trace_id and span_id
346
- logger.info('Processing asset', {
347
- assetId,
348
- organizationId,
349
- });
350
-
351
- // Or manually add trace info
352
- logger.info('Asset processed', {
353
- assetId,
354
- trace_id: getCurrentTraceId(),
355
- span_id: getCurrentSpanId(),
356
- });
357
- });
358
- ```
359
-
360
- ### Message Consumer with Trace-Aware Logging
361
-
362
- ```typescript
363
- import { createLoggerWithMessageTrace } from '@manifest-cyber/job-common';
364
- import { extractMessageTraceContext, withSpan } from '@manifest-cyber/observability-ts';
365
- import { context } from '@opentelemetry/api';
366
-
367
- consumer.on('message', async (message) => {
368
- const logger = createLoggerWithMessageTrace(message, baseLogger);
369
- const ctx = extractMessageTraceContext(message.MessageAttributes);
370
-
371
- if (ctx) {
372
- await context.with(ctx, async () => {
373
- await withSpan('job.process', async () => {
374
- logger.info('Processing message', { messageId: message.MessageId });
375
- await processMessage(message);
376
- logger.info('Message processed successfully');
377
- });
378
- });
379
- }
380
- });
381
- ```
382
-
383
- ---
384
-
385
- ## Security Considerations
386
-
387
- ### ⚠️ Never Trust Client-Provided Trace Context
388
-
389
- **CRITICAL**: Do NOT extract `traceparent` headers from browser/client requests.
390
-
391
- ```typescript
392
- // ❌ DANGEROUS - Accepting client trace IDs
393
- app.use((req, res, next) => {
394
- extractTraceContext(req.headers); // SECURITY RISK!
395
- next();
396
- });
397
-
398
- // ✅ SECURE - Only extract from trusted services
399
- app.use((req, res, next) => {
400
- const isTrustedService = req.headers['x-internal-service'] === 'true';
401
-
402
- if (isTrustedService && req.headers.traceparent) {
403
- extractTraceContext(req.headers);
404
- }
405
- // Otherwise, OTEL will start a new trace (secure default)
406
-
407
- next();
408
- });
409
- ```
410
-
411
- **Why?** Client-controlled trace IDs can:
412
-
413
- - Pollute trace data with fake/malicious traces
414
- - Attempt to correlate unrelated requests
415
- - Inject arbitrary trace IDs to access other users' traces
416
- - Cause sampling and storage issues
417
-
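One way to act on these risks is to drop trace-propagation headers at the trust boundary before any extraction happens. The sketch below assumes a plain header map and a `trusted` flag decided elsewhere (for example by mutual TLS or a verified service token); `dropUntrustedTraceHeaders` and `HeaderMap` are hypothetical names, not package exports.

```typescript
// Hypothetical helper for illustration; not an export of the package.
type HeaderMap = Record<string, string | string[] | undefined>;

// Headers that carry W3C trace context or baggage between services.
const TRACE_HEADERS = ['traceparent', 'tracestate', 'baggage'];

function dropUntrustedTraceHeaders(headers: HeaderMap, trusted: boolean): HeaderMap {
  // Trusted callers keep their headers; return a copy so the input is never mutated.
  if (trusted) return { ...headers };

  const cleaned: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    // Header names are case-insensitive, so compare in lowercase.
    if (!TRACE_HEADERS.includes(name.toLowerCase())) {
      cleaned[name] = value;
    }
  }
  return cleaned;
}
```

With the headers removed, a later extraction step finds no parent context and a new root trace is started, which matches the secure default described above.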
418
- **Recommended Architecture:**
419
-
420
- ```
421
- Browser (web-app)
422
- └─> NO traceparent header
423
-
424
- manifest-api (Trust Boundary)
425
- └─> Starts NEW trace (root span)
426
-
427
- └─> Injects traceparent for downstream
428
-
429
- SQS/RabbitMQ
430
- └─> Propagates traceparent (trusted)
431
-
432
- job-* services
433
- └─> Extracts traceparent (trusted)
434
-
435
- └─> Continues trace (same trace ID)
436
- ```
437
-
438
- **Implementation:**
439
-
440
- ```typescript
441
- // manifest-api: Only extract from authenticated services
442
- import { extractTraceContext } from '@manifest-cyber/observability-ts';
443
-
444
- export function extractTrustedTraceContext(req, res, next) {
445
- // Check for service-to-service authentication
446
- const serviceAuth = req.headers['x-service-auth'];
447
- const isInternalService = req.headers['x-internal-service'] === 'true';
448
-
449
- if ((serviceAuth || isInternalService) && req.headers.traceparent) {
450
- // Only extract from trusted backend services
451
- extractTraceContext(req.headers);
452
- }
453
- // For browser requests, OTEL auto-instrumentation creates new trace
454
-
455
- next();
456
- }
457
-
458
- app.use(extractTrustedTraceContext);
459
- ```
460
-
461
- ---
462
-
463
- ## Best Practices
464
-
465
- ### 1. Span Naming
466
-
467
- Follow OpenTelemetry semantic conventions:
468
-
469
- ```typescript
470
- // ✅ Good
471
- 'http.client.request';
472
- 'db.query';
473
- 'messaging.process';
474
- 'asset.validate';
475
- 'sbom.parse';
476
-
477
- // ❌ Bad
478
- 'API Call';
479
- 'Database';
480
- 'Process Message';
481
- 'DoWork';
482
- ```
483
-
484
- ### 2. Attribute Naming
485
-
486
- Use standard semantic conventions:
487
-
488
- ```typescript
489
- // ✅ Good - Standard conventions
490
- {
491
- 'http.method': 'POST',
492
- 'http.status_code': 200,
493
- 'db.system': 'mongodb',
494
- 'messaging.system': 'sqs',
495
- }
496
-
497
- // ❌ Bad - Non-standard names
498
- {
499
- 'method': 'POST',
500
- 'status': 200,
501
- 'database': 'mongo',
502
- 'queue': 'sqs',
503
- }
504
- ```
505
-
506
- ### 3. Cardinality Management
507
-
508
- ```typescript
509
- // ✅ Good - Bounded cardinality
510
- span.setAttribute('http.status_code', 200);
511
- span.setAttribute('user.role', 'admin');
512
- span.setAttribute('asset.type', 'sbom');
513
-
514
- // ❌ Bad - Unbounded cardinality
515
- span.setAttribute('user.id', 'user-12345-abcde-67890'); // Too unique
516
- span.setAttribute('request.timestamp', Date.now()); // Too unique
517
- span.setAttribute('request.body', JSON.stringify(body)); // Too large
518
- ```
519
-
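Unbounded values can often be reduced to a bounded form before they are attached to a span. The helpers below are a small sketch of that idea; `payloadSizeBucket` and `truncateAttribute` are hypothetical names, and the bucket boundaries and the 128-character limit are arbitrary choices for illustration.

```typescript
// Hypothetical helpers for illustration; not exports of the package.

// Map a payload size in bytes onto a small, fixed set of labels.
function payloadSizeBucket(bytes: number): string {
  if (bytes < 1_024) return 'lt_1kb';
  if (bytes < 1_048_576) return 'lt_1mb';
  return 'gte_1mb';
}

// Cap the length of a string attribute so a large value cannot bloat a span.
function truncateAttribute(value: string, maxLength = 128): string {
  return value.length <= maxLength ? value : `${value.slice(0, maxLength)}...`;
}
```

Instead of attaching a request body, a span could then carry only its size bucket, which takes one of three values no matter how many distinct requests are traced.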
520
- ### 4. Error Handling
521
-
522
- Always set span status and record exceptions:
523
-
524
- ```typescript
525
- import {
526
- withSpan,
527
- SpanStatusCode,
528
- createSpan,
529
- } from '@manifest-cyber/observability-ts';
530
-
531
- // Automatic error handling
532
- await withSpan('operation', async () => {
533
- await riskyOperation(); // Errors automatically recorded
534
- });
535
-
536
- // Manual error handling
537
- const span = createSpan({ name: 'operation' });
538
- try {
539
- await riskyOperation();
540
- span.setStatus({ code: SpanStatusCode.OK });
541
- } catch (error) {
542
- span.recordException(error);
543
- span.setStatus({
544
- code: SpanStatusCode.ERROR,
545
- message: error.message,
546
- });
547
- throw error;
548
- } finally {
549
- span.end();
550
- }
551
- ```
552
-
553
- ### 5. Sampling Strategy
554
-
555
- ```typescript
556
- // Development: 100% sampling
557
- await initTracing({
558
- sampling: { rate: 1.0 },
559
- });
560
-
561
- // Production: 10% sampling, but always sample errors
562
- await initTracing({
563
- sampling: {
564
- rate: 0.1,
565
- alwaysOnError: true,
566
- },
567
- });
568
- ```
569
-
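A sampling rate is typically applied deterministically to the trace ID, so every service that sees the same trace reaches the same decision. The sketch below illustrates that idea in a simplified form; `shouldSample` is a hypothetical function, and this is not the algorithm used by `initTracing` or by the OpenTelemetry SDK samplers.

```typescript
// Hypothetical helper for illustration; a simplified ratio sampler.
function shouldSample(traceId: string, rate: number): boolean {
  if (rate >= 1) return true;
  if (rate <= 0) return false;

  // Treat the last 8 hex characters (32 bits) of the trace ID as a
  // roughly uniform value in [0, 2^32).
  const bucket = parseInt(traceId.slice(-8), 16);

  // Sample when the value falls in the lowest `rate` fraction of the range.
  return bucket < rate * 0x100000000;
}
```

Because the decision depends only on the trace ID and the rate, two services configured with the same rate either both keep a trace or both drop it, which avoids partially recorded traces.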
570
- ---
571
-
572
- ## Examples
573
-
574
- ### Complete Express API Example
575
-
576
- ```typescript
- import express from 'express';
- import {
-   initTracing,
-   extractTraceContext,
-   createMessageTraceContext,
-   getTracer,
-   withSpan,
-   getCurrentTraceId,
-   SpanStatusCode,
- } from '@manifest-cyber/observability-ts';
- import { OtelLogger } from '@manifest-cyber/logger-ts';
- import { context, trace } from '@opentelemetry/api';
-
- // Initialize tracing
- await initTracing({
-   serviceName: 'manifest-api',
-   exporter: { endpoint: 'http://localhost:4317' },
- });
-
- const app = express();
- // Parse JSON bodies so req.body is populated in the route handler
- app.use(express.json());
-
- const tracer = getTracer();
- // getEnvLogger() is assumed to be the service's existing logger factory
- const baseLogger = getEnvLogger();
-
- // Tracing middleware
- app.use((req, res, next) => {
-   const parentCtx = extractTraceContext(req.headers);
-   const logger = OtelLogger.create(baseLogger, { autoTrace: true });
-
-   // Start the server span as a child of the extracted parent context
-   const span = tracer.startSpan(
-     'http.server.request',
-     {
-       attributes: {
-         'http.method': req.method,
-         'http.url': req.url,
-       },
-     },
-     parentCtx,
-   );
-
-   req.span = span;
-   req.logger = logger;
-
-   res.on('finish', () => {
-     span.setAttribute('http.status_code', res.statusCode);
-     span.setStatus({
-       code: res.statusCode >= 500 ? SpanStatusCode.ERROR : SpanStatusCode.OK,
-     });
-     span.end();
-   });
-
-   // Make the request span active so spans created in route handlers become its children
-   context.with(trace.setSpan(parentCtx, span), () => next());
- });
623
-
624
- // Route handler
625
- app.post('/api/sboms/upload', async (req, res) => {
626
- await withSpan(
627
- 'sbom.upload',
628
- async () => {
629
- req.logger.info('Processing SBOM upload', {
630
- organizationId: req.body.organizationId,
631
- });
632
-
633
- // Process SBOM
634
- const result = await processSBOM(req.body);
635
-
636
- // Send to queue with trace context
637
- const traceContext = createMessageTraceContext();
638
- await sqs.sendMessage({
639
- QueueUrl: 'sbom-process-queue',
640
- MessageBody: JSON.stringify(result),
641
- MessageAttributes: {
642
- traceparent: {
643
- DataType: 'String',
644
- StringValue: traceContext.traceparent,
645
- },
646
- },
647
- });
648
-
649
- res.json({ success: true, traceId: getCurrentTraceId() });
650
- },
651
- {
652
- 'sbom.id': req.body.sbomId,
653
- 'organization.id': req.body.organizationId,
654
- },
655
- );
656
- });
657
-
658
- app.listen(3000);
659
- ```
660
-
661
- ### Complete Job Worker Example
662
-
663
- ```typescript
- import {
-   initTracing,
-   extractMessageTraceContext,
-   withSpan,
- } from '@manifest-cyber/observability-ts';
- import { createLoggerWithMessageTrace } from '@manifest-cyber/job-common';
- import { context } from '@opentelemetry/api';
- import { Consumer } from 'sqs-consumer';
-
- // Initialize tracing
- await initTracing({
-   serviceName: 'job-sbom-process',
-   exporter: { endpoint: 'http://localhost:4317' },
- });
-
- // baseLogger and db are assumed to be set up elsewhere in the service
- const consumer = Consumer.create({
-   queueUrl: 'sbom-process-queue',
-   // SQS only returns message attributes that are requested explicitly
-   messageAttributeNames: ['traceparent'],
-   handleMessage: async (message) => {
-     const logger = createLoggerWithMessageTrace(message, baseLogger);
-     // Without a traceparent, fall back to the active context so the message
-     // is still processed (as a new trace) instead of being silently skipped
-     const ctx =
-       extractMessageTraceContext(message.MessageAttributes) || context.active();
-     // Parse up front so sbom.id is in scope for the span attributes below
-     const sbom = JSON.parse(message.Body);
-
-     await context.with(ctx, async () => {
-       await withSpan(
-         'job.process.sbom',
-         async () => {
-           logger.info('Processing SBOM', {
-             messageId: message.MessageId,
-           });
-
-           // Validate SBOM
-           await withSpan('sbom.validate', async () => {
-             await validateSBOM(sbom);
-           });
-
-           // Store in database
-           await withSpan(
-             'db.insert',
-             async () => {
-               await db.collection('sboms').insertOne(sbom);
-             },
-             {
-               'db.system': 'mongodb',
-               'db.collection': 'sboms',
-             },
-           );
-
-           logger.info('SBOM processed successfully');
-         },
-         {
-           'messaging.message_id': message.MessageId,
-           'sbom.id': sbom.id,
-         },
-       );
-     });
-   },
- });
-
- consumer.start();
725
- ```
726
-
727
- ---
728
-
729
- ## Troubleshooting
730
-
731
- ### Traces Not Appearing
732
-
733
- 1. **Check endpoint reachability**:
734
-
735
- ```bash
736
- curl -v http://localhost:4317
737
- ```
738
-
739
- 2. **Verify initialization**:
740
-
741
- ```typescript
742
- import { isTracingInitialized } from '@manifest-cyber/observability-ts';
743
- console.log('Tracing initialized:', isTracingInitialized());
744
- ```
745
-
746
- 3. **Check Vector/Collector logs** for export errors
747
-
748
- ### Missing Parent-Child Relationships
749
-
750
- Ensure context propagation:
751
-
752
- ```typescript
753
- // ✅ Good - Context is propagated
754
- const ctx = extractTraceContext(headers);
755
- await context.with(ctx, async () => {
756
- await withSpan('child.operation', async () => {
757
- // This span will be a child
758
- });
759
- });
760
-
761
- // ❌ Bad - Context not propagated
762
- const ctx = extractTraceContext(headers);
763
- // Not using context.with()
764
- await withSpan('orphaned.operation', async () => {
765
- // This span won't have a parent
766
- });
767
- ```
768
-
769
- ### High Memory Usage
770
-
771
- Reduce sampling in production:
772
-
773
- ```typescript
774
- await initTracing({
775
- sampling: { rate: 0.1 }, // 10% instead of 100%
776
- });
777
- ```
778
-
779
- ---
780
-
781
- ## Additional Resources
782
-
783
- - [OpenTelemetry Documentation](https://opentelemetry.io/docs/)
784
- - [W3C Trace Context Specification](https://www.w3.org/TR/trace-context/)
785
- - [Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/)
786
- - [TRACING_IMPLEMENTATION_PLAN.md](./TRACING_IMPLEMENTATION_PLAN.md) - Full implementation roadmap
787
-
788
- ---
789
-
790
- **Maintained by**: Manifest Cyber Platform Team
791
- **Last Updated**: November 11, 2025