certlib-log 1.0.0b1__py3-none-any.whl

certlib/log.py ADDED
@@ -0,0 +1,3676 @@
1
+ # Copyright (c) 2026, CERT Polska. All rights reserved.
2
+ #
3
+ # This file's content is free software; you can redistribute and/or
4
+ # modify it under the terms of the *BSD 3-Clause "New" or "Revised"
5
+ # License* (see the `LICENSE.txt` file in the source code repository:
6
+ # https://github.com/CERT-Polska/certlib-log/blob/main/LICENSE.txt).
7
+
8
+
9
+ # (The following module-level docstring contains
10
+ # the **User's Guide** part of the documentation.)
11
+
12
+ """
13
+ ## **Introduction**
14
+
15
+ The primary reason for creating the `certlib.log` library was to
16
+ make it easier to configure *structured logging* across various
17
+ systems created and used by [CERT Polska](https://cert.pl/en/)
18
+ -- in a possibly *consistent way* and with *minimal impact* on
19
+ existing code.
20
+
21
+ However, despite a few opinionated defaults, the library is quite
22
+ versatile, so it may prove useful for a much broader audience of
23
+ developers and system administrators. Apart from the *structured
24
+ logging* stuff, it also offers a few other features...
25
+
26
+ !!! note
27
+
28
+ `certlib.log` uses *only* the Python standard library, i.e.,
29
+ it **does *not* depend on any third-party packages**.
30
+
31
+ ***
32
+
33
+ ### How to Install
34
+
35
+ You can install the `certlib.log` library by running the following
36
+ command (typically, you will do this within a Python [*virtual
37
+ environment*](https://packaging.python.org/en/latest/tutorials/installing-packages/#creating-virtual-environments)):
38
+
39
+ ```bash
40
+ python3 -m pip install certlib.log
41
+ ```
42
+
43
+ The library is compatible with Python 3.10 and all newer versions of
44
+ Python.
45
+
46
+ !!! note
47
+
48
+ The canonical name of the *distribution package* is `certlib-log`
49
+ (with a hyphen), but *pip* and other tools also accept the
50
+ `certlib.log` form (with a dot); the latter may feel more natural,
51
+ as it is also the *importable module*'s name (used in Python code).
52
+
53
+ ***
54
+
55
+ ### TL;DR: How to Quickly Enable *Structured Logging*
56
+
57
+ * Does your program already make use of the standard [`logging`][]
58
+ facilities *and* do you want it to start emitting *structured*
59
+ JSON-serialized log entries? Just make your configuration of logging
60
+ include [`certlib.log.StructuredLogsFormatter`][] as a *formatter*.
61
+ To do that easily, you may want to look at one of these examples
62
+ (please also read the comments included there):
63
+
64
+ * [`logging.config.dictConfig`-Style Configuration Example](#loggingconfigdictconfig-style-configuration-example), or
65
+ * [`logging.config.fileConfig`-Style Configuration Example](#loggingconfigfileconfig-style-configuration-example).
66
+
67
+ * The `system`, `component` and `component_type` keys (which you can
68
+ see in those examples) are intended to be set with the following
69
+ semantics in mind:
70
+
71
+ * `system` -- the name of the *entire system* or *project* your
72
+ script/application is part of;
73
+
74
+ * `component` -- the name of a particular *script* or *application*
75
+ being executed (for a CLI script it should be its *basename*);
76
+
77
+ * `component_type` -- a conventional label of the *type* of that
78
+ script or application, agreed upon in your organization (such
79
+ as: `"web"`, `"parser"`, `"collector"`...).
80
+
81
+ * And that's it! Everything else is optional (but probably worth a try,
82
+ so you might want to read on).
83
+
84
+ ***
85
+
86
+ ### Library Overview
87
+
88
+ The tools provided by `certlib.log` are intended for use with the standard
89
+ [`logging`][] module's toolset. Essentially, they enhance that toolset with
90
+ the following possibilities:
91
+
92
+ * to emit *structured* log entries -- each being a [`dict`][], hereinafter
93
+ referred to as *output data* (serialized in JSON format before actually
94
+ being emitted);
95
+
96
+ * to permanently assign to selected output data keys: *not only* constant
97
+ *defaults*, but also dynamic factories of values, hereinafter referred
98
+ to as *auto-makers*; each *auto-maker* is just an argumentless function
99
+ (or method, or other callable), automatically called to produce a value
100
+ for the respective key -- whenever a new [log record][logging.LogRecord]
101
+ object is created by a logger (which only occurs if the logger is enabled
102
+ for the specified log level), before the log record is processed by any
103
+ *handlers*, *filters* and *formatters*;
104
+
105
+ * to replace the legacy `%`-based style of log message formatting with
106
+ the modern and more convenient `{}`-based one, or (when what you need
107
+ to log is just data) to omit passing the text message altogether; both
108
+ gained by giving a little tweak to each logger method call...
109
+
110
+ While it is possible to use each of these capabilities independently of
111
+ the others, `certlib.log` encourages combining them.
112
+
113
+ The following sections will discuss the two main tools provided by the
114
+ library: [`StructuredLogsFormatter`][] and [`xm`][].
115
+
116
+ ***
117
+
118
+ ## **Tool: `StructuredLogsFormatter`**
119
+
120
+ To make the standard [`logging`][] module's machinery able
121
+ to emit structured log entries (each being a JSON-serialized
122
+ [`dict`][]), you need to configure it to employ an instance
123
+ of [`certlib.log.StructuredLogsFormatter`][] as a *formatter*.
124
+
125
+ !!! note
126
+
127
+ The text directly below shows how to do that in an *imperative* manner.
128
+ You may prefer, however, a more *declarative* approach (especially
129
+ if your program is not just a small script). In that case, please
130
+ check out at least one of the following subsections (but first
131
+ **read everything above them as well!**):
132
+
133
+ * **[`logging.config.dictConfig`-Style Configuration Example](#loggingconfigdictconfig-style-configuration-example)**,
134
+ * **[`logging.config.fileConfig`-Style Configuration Example](#loggingconfigfileconfig-style-configuration-example)**.
135
+
136
+ ***
137
+
138
+ ### Basic Configuration
139
+
140
+ Let us start by creating our [`StructuredLogsFormatter`][] instance
141
+ (obviously, the concrete values used in the following code snippet
142
+ are just sample ones -- to be replaced with values appropriate for
143
+ your program/system):
144
+
145
+ ```python
146
+ import itertools
147
+ import json
148
+ import logging
149
+ import sys
150
+ from certlib.log import StructuredLogsFormatter
151
+
152
+ structured_logs_formatter = StructuredLogsFormatter(
153
+ defaults={
154
+ # Each key in this dict should be an *output data* key.
155
+ # Each value specifies the *default value* for that key.
156
+ # For example:
157
+ "system": "MyOwn",
158
+ "component": "Portal",
159
+ "component_type": "web",
160
+ "example_custom_default": 42,
161
+ },
162
+ auto_makers={
163
+ # Each key in this dict should be an *output data* key.
164
+ # Each value should be either some argumentless callable
165
+ # (function) or a *dotted path* to such a callable. In
166
+ # particular, that callable *may* be the `get()` method
167
+ # of some `ContextVar` instance (see
168
+ # https://docs.python.org/3/library/contextvars.html).
169
+ # For example:
170
+
171
+ # (here: dotted paths pointing to callables)
172
+ "client_ip": "myown.portal.client_ip_context_var.get",
173
+ "example_nano_time": "time.time_ns",
174
+
175
+ # (here: a callable passed directly)
176
+ "example_local_counter": itertools.count(1).__next__,
177
+ },
178
+ # The value of `serializer` should be either a callable (function)
179
+ # that accepts exactly one argument (being a JSON-serializable dict)
180
+ # and returns a str object, or a *dotted path* to such a callable.
181
+ # Note: the following serializer is the default -- so, in fact, you
182
+ # do not need to specify it. But the possibility to define a custom
183
+ # serializer comes in handy when you want to use, e.g., a faster
184
+ # alternative to the standard `json.dumps()` function (or even a
185
+ # tool which serializes data in some other format...).
186
+ serializer=json.dumps,
187
+ )
188
+ ```
189
+
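For illustration, a custom serializer can be as small as the function below. This is only a sketch built on the standard `json` module: the name `compact_json_dumps` is hypothetical (it is not part of `certlib.log`), and the chosen `json.dumps()` options are just one possible preference.

```python
import json


def compact_json_dumps(data: dict) -> str:
    # Accepts exactly one argument (a JSON-serializable dict) and
    # returns a str, which is what the `serializer` contract described
    # in the comments above asks for. Compact separators shorten each
    # log line; `ensure_ascii=False` keeps non-ASCII text readable.
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


print(compact_json_dumps({"system": "MyOwn", "flags": [True, None]}))
```

Such a function could then be passed as `serializer=compact_json_dumps`, or referenced by its dotted path.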
190
+ !!! info "See also"
191
+
192
+ You may also want to look at the *reference documentation* for
193
+ the **[`StructuredLogsFormatter`][]** class.
194
+
195
+ Technically, each of the three keyword arguments accepted by the
196
+ [`StructuredLogsFormatter`][] constructor is *optional*. However,
197
+ when it comes to the **`defaults`** and **`auto_makers`** ones,
198
+ you need to consider that:
199
+
200
+ * It is *required* that each of the following keys appears in *at least
201
+ one* of those two mappings (in `defaults` and/or `auto_makers`):
202
+
203
+ * `"system"` (the name of the *entire system* or *project* your
204
+ script/application is part of; e.g.: `"My Funny System"`,
205
+ [`"MWDB"`](https://github.com/CERT-Polska/mwdb-core),
206
+ [`"n6"`](https://github.com/CERT-Polska/n6), etc.),
207
+
208
+ * `"component"` (the name of a particular *script* or *application*
209
+ being executed; for a CLI script it should be its *basename*),
210
+
211
+ * `"component_type"` (a conventional label of the *type* of that
212
+ script or application, agreed upon in your organization; e.g.:
213
+ `"web"`, `"parser"`, `"collector"`...).
214
+
215
+ * _[maybe TBD: suggest a list of **valid values** of `component_type`?]_
216
+
217
+ !!! note
218
+
219
+ If it is OK for you/your organization that some (or all) of the
220
+ *output data* items listed above will remain unspecified, you can
221
+ provide such a **`defaults`** mapping in which some (or all) of
222
+ the aforementioned keys will be mapped to **[`None`][]** values.
223
+ Then, the said requirement will still be met, even though such
224
+ *void* items will be automatically omitted from the ultimate
225
+ **[`defaults`][StructuredLogsFormatter.defaults]** collection.
226
+
227
+ See also: **[`StructuredLogsFormatter.get_output_keys_required_in_defaults_or_auto_makers`][]**
228
+ (the method which defines the requirement in question).
229
+
230
+ * It is *recommended* (though not enforced) that each of the following
231
+ keys, *if it is relevant* to the particular `component_type`, appears in
232
+ *at least one* of those two mappings (in `defaults` and/or `auto_makers`,
233
+ typically in the latter):
234
+
235
+ * `"client_ip"` (the *real* IP of the client who communicates with us;
236
+ for this information to be reliable, the way it is obtained needs
237
+ to follow best practices specific to the protocol being used; for
238
+ example, when it comes to HTTP, see:
239
+ [https://httptoolkit.com/blog/what-is-x-forwarded-for](https://httptoolkit.com/blog/what-is-x-forwarded-for/));
240
+
241
+ * `"user_id"`, `"request_id"`, etc.
242
+
243
+ * _[TBD: more of the **recommended output data keys** to be suggested here]_.
244
+
245
+ !!! tip
246
+
247
+ If the presence of some *output data* key makes sense only in
248
+ a certain context (e.g., when handling an HTTP request...), just
249
+ make the respective *auto-maker* return **[`None`][]** in any
250
+ other contexts. Such *void* items will be automatically omitted
251
+ from *output data*.
252
+
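The `client_ip` auto-maker used in the examples could be backed by a context variable along the lines of the following sketch. It relies only on the standard `contextvars` module; the variable's location and the `handle_request` helper are hypothetical, shown just to illustrate that `ContextVar.get` is an argumentless callable that returns `None` outside of a request context.

```python
import contextvars

# Hypothetical module-level variable; with `default=None`, calling
# `get()` outside of any request yields None (a *void* value).
client_ip_context_var = contextvars.ContextVar("client_ip", default=None)


def handle_request(client_ip):
    # Hypothetical request handler: set the variable for the duration
    # of the request, then restore the previous state.
    token = client_ip_context_var.set(client_ip)
    try:
        # Anything logged at this point would see the current client IP.
        return client_ip_context_var.get()
    finally:
        client_ip_context_var.reset(token)


seen_during_request = handle_request("192.168.0.123")
seen_outside_request = client_ip_context_var.get()
```

Here `client_ip_context_var.get` is the kind of callable that a dotted path such as `"myown.portal.client_ip_context_var.get"` is meant to point to.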
253
+ The next step is to prepare the *root logger*, and then to add to it
254
+ some handler *with our formatter attached*:
255
+
256
+ ```python
257
+ # (continuing with the previous example)
258
+
259
+ root_logger = logging.getLogger()
260
+ root_logger.setLevel(logging.INFO)
261
+
262
+ stderr_handler = logging.StreamHandler(sys.stderr)
263
+ stderr_handler.name = "stderr"
264
+ stderr_handler.setFormatter(structured_logs_formatter) # <- Our formatter
265
+
266
+ root_logger.addHandler(stderr_handler)
267
+ ```
268
+
269
+ !!! info "See also"
270
+
271
+ You may also want to take a look at the relevant parts of the
272
+ documentation for the standard **[`logging`][]** module.
273
+
274
+ ***
275
+
276
+ ### Basic Usage
277
+
278
+ OK. Once the stuff is configured, let us emit some structured log entries!
279
+
280
+ We can do that in the legacy (standard, yet old-fashioned) manner...
281
+
282
+ ```python
283
+ import datetime as dt
284
+ import logging
285
+ import sys
286
+
287
+ logger = logging.getLogger(__name__)
288
+
289
+ # [...]
290
+
291
+ logger.info("Hello world!")
292
+
293
+ logger.warning("Hello %s!", sys.platform)
294
+
295
+ logger.error(
296
+ "Here we have %x and %r.", sys.maxsize, sys.byteorder,
297
+ extra={
298
+ "example_stuff": [1, "foo", False],
299
+ "other_example_item": {42: dt.datetime.now()},
300
+ },
301
+ exc_info=True,
302
+ )
303
+ ```
304
+
305
+ ...or (*better!*) by making use of the **[`certlib.log.xm`][]** tool:
306
+
307
+ ```python
308
+ import datetime as dt
309
+ import ipaddress
310
+ import logging
311
+ import sys
312
+ from certlib.log import xm
313
+
314
+ logger = logging.getLogger(__name__)
315
+
316
+ # [...]
317
+
318
+ logger.info(xm("Hello world!"))
319
+
320
+ logger.warning(xm("Hello {}!", sys.platform))
321
+
322
+ logger.error(xm(
323
+ "Here we have {:x} and {!r}.", sys.maxsize, sys.byteorder,
324
+ example_stuff=[1, "foo", False],
325
+ other_example_item={42: dt.datetime.now()},
326
+ exc_info=True,
327
+ ))
328
+
329
+ pure_data_dict = {
330
+ 'this': 123,
331
+ 'that': ipaddress.IPv4Address('192.168.0.42'),
332
+ 'there': 'example.com',
333
+ 'then': dt.datetime(2026, 1, 2, 3, 4, 56, tzinfo=dt.timezone.utc),
334
+ }
335
+ logger.info(xm(pure_data_dict)) # <- No text message at all.
336
+
337
+ logger.warning(xm(
338
+ "{who} owns {fract:.2%} of all issues of {title!r} magazine.",
339
+ who="John",
340
+ fract=0.87239,
341
+ title="Bajtek",
342
+ first_issue_date=dt.date(1985, 9, 1),
343
+ ))
344
+ ```
345
+
346
+ !!! info "See also"
347
+
348
+ To learn more about using the **`xm`** tool, see this guide's section
349
+ **[Tool: `xm`](#tool-xm)** (below).
350
+
351
+ Regarding the last `logger.warning(...)` call in the example above, the
352
+ resultant JSON-serialized *output data* dict (i.e., the ultimate content
353
+ of the log entry to be emitted) could look like the following (note that
354
+ the serialized data presented here contains arbitrary example values for
355
+ many keys, and -- just for visual clarity -- we present it here as being
356
+ sorted by key, and with extra newlines/indentation):
357
+
358
+ ```json
359
+ {
360
+ "client_ip": "192.168.0.123",
361
+ "component": "Portal",
362
+ "component_type": "web",
363
+ "example_custom_default": 42,
364
+ "example_local_counter": 6,
365
+ "example_nano_time": 1771629287019638820,
366
+ "first_issue_date": "1985-09-01",
367
+ "fract": 0.87239,
368
+ "func": "<module>",
369
+ "level": "WARNING",
370
+ "levelno": 30,
371
+ "lineno": 253,
372
+ "logger": "myown.portal.example_module",
373
+ "message": "John owns 87.24% of all issues of 'Bajtek' magazine.",
374
+ "message_base": {
375
+ "pattern": "{who} owns {fract:.2%} of all issues of {title!r} magazine."
376
+ },
377
+ "pid": 324485,
378
+ "process_name": "MainProcess",
379
+ "py_ver": "3.14.3.final.0",
380
+ "script_args": [
381
+ "/opt/MyOwn/conf/web/portal.wsgi"
382
+ ],
383
+ "src": "/opt/MyOwn/py/myown/portal/example_module.py",
384
+ "system": "MyOwn",
385
+ "thread_id": 139781835344768,
386
+ "thread_name": "MainThread",
387
+ "timestamp": "2026-02-20 23:14:47.019574Z",
388
+ "title": "Bajtek",
389
+ "who": "John"
390
+ }
391
+ ```
392
+
393
+ !!! note
394
+
395
+ As you can see, the **[`dt.date`][datetime.date]** instance provided
396
+ as **`first_issue_date`**, before becoming an *output data* value,
397
+ was converted to a string -- thanks to an automatic invocation of the
398
+ **[`prepare_value`][StructuredLogsFormatter.prepare_value]** method
399
+ (*before* the actual data serialization).
400
+
401
+ All *output data* values are subject to preparation by that method
402
+ (which processes them differently depending on their types...). By
403
+ extending/overriding it in your **[`StructuredLogsFormatter`][]**
404
+ subclass you can gain full control over that preparation.
405
+
406
+ Nevertheless, you can get by quite well just by sticking with the default
407
+ implementation of that method.
408
+
409
+ ***
410
+
411
+ ### `logging.config.dictConfig`-Style Configuration Example
412
+
413
+ ```python
414
+ import logging.config
415
+
416
+ logging_configuration_dict = {
417
+ "formatters": {
418
+ "structured": {
419
+ "()": "certlib.log.StructuredLogsFormatter",
420
+ "defaults": {
421
+ # Each key in this dict should be an *output data* key.
422
+ # Each value specifies the *default value* for that key.
423
+ # For example:
424
+ "system": "MyOwn",
425
+ "component": "Portal",
426
+ "component_type": "web",
427
+ "example_custom_default": 42
428
+ # ^ Important: by default, each of the "system", "component"
429
+ # and "component_type" keys is *required* to be included
430
+ # either *here* or in "auto_makers" (below). Note: *here*
431
+ # each of them can be assigned None -- if excluding this
432
+ # key from *output data* is OK for you/your organization.
433
+ },
434
+ "auto_makers": {
435
+ # Each key in this dict should be an *output data* key.
436
+ # Each value should be either some argumentless callable
437
+ # (function) or a *dotted path* to such a callable. In
438
+ # particular, that callable *may* be the `get()` method
439
+ # of some `ContextVar` instance (see
440
+ # https://docs.python.org/3/library/contextvars.html).
441
+ # For example:
442
+ "client_ip": "myown.portal.client_ip_context_var.get",
443
+ "example_nano_time": "time.time_ns"
444
+ },
445
+ # The value of "serializer", if specified, should be either
446
+ # a callable (function) which accepts exactly one argument
447
+ # (being a JSON-serializable dict) and returns a str object,
448
+ # or a *dotted path* to such a callable. If "serializer" is
449
+ # not specified, the standard `json.dumps()` function will
450
+ # be used.
451
+ "serializer": "some_package.faster_replacement_for_json_dumps"
452
+ }
453
+ },
454
+ "handlers": {
455
+ "stderr": {
456
+ "class": "logging.StreamHandler",
457
+ "formatter": "structured",
458
+ "stream": "ext://sys.stderr"
459
+ }
460
+ },
461
+ "root": {
462
+ "level": "INFO",
463
+ "handlers": ["stderr"]
464
+ },
465
+ "disable_existing_loggers": False,
466
+ "version": 1
467
+ }
468
+
469
+ logging.config.dictConfig(logging_configuration_dict)
470
+ ```
471
+
472
+ !!! tip
473
+
474
+ Typically, applications load such a configuration dict from some file
475
+ (usually in **[TOML][tomllib]**, **[YAML](https://pypi.org/project/PyYAML/)**
476
+ or **[JSON][json]** format).
477
+
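As a rough sketch of that approach, the snippet below parses a configuration dict from JSON text and applies it. To stay runnable with the standard library alone, it uses a plain standard formatter (named `plain` here, a hypothetical choice); in a real setup the formatter entry would use the `"()": "certlib.log.StructuredLogsFormatter"` form shown above, and the text would be read from a file rather than from an inline string.

```python
import json
import logging
import logging.config

# In a real application this text would be read from a file; the
# content below is a hypothetical, minimal example.
config_text = """
{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "plain": {"format": "%(levelname)s %(name)s %(message)s"}
    },
    "handlers": {
        "stderr": {
            "class": "logging.StreamHandler",
            "formatter": "plain",
            "stream": "ext://sys.stderr"
        }
    },
    "root": {"level": "INFO", "handlers": ["stderr"]}
}
"""

# JSON `false` becomes Python `False`, so the parsed dict can be
# passed to `dictConfig()` as is.
logging_configuration_dict = json.loads(config_text)
logging.config.dictConfig(logging_configuration_dict)
```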
478
+ !!! info "See also"
479
+
480
+ You can learn more about the **[`logging.config.dictConfig`][]**-specific
481
+ configuration dict schema by referring to the **[relevant
482
+ section](https://docs.python.org/3/library/logging.config.html#logging-config-dictschema)**
483
+ of the documentation for the standard `logging.config` module.
484
+
485
+ ***
486
+
487
+ ### `logging.config.fileConfig`-Style Configuration Example
488
+
489
+ ```ini
490
+ [loggers]
491
+ keys = root
492
+
493
+ [handlers]
494
+ keys = stderr
495
+
496
+ [formatters]
497
+ keys = structured
498
+
499
+ [logger_root]
500
+ level = INFO
501
+ handlers = stderr
502
+
503
+ [handler_stderr]
504
+ class = StreamHandler
505
+ formatter = structured
506
+ args = (sys.stderr,)
507
+
508
+ [formatter_structured]
509
+ class = certlib.log.StructuredLogsFormatter
510
+ format = {
511
+     "defaults": {
511
+         # Each key in this dict should be an *output data* key.
512
+         # Each value specifies the *default value* for that key.
513
+         # For example:
514
+         "system": "MyOwn",
515
+         "component": "Portal",
516
+         "component_type": "web",
517
+         "example_custom_default": 42,
518
+         # ^ Important: by default, each of the "system", "component"
519
+         # and "component_type" keys is *required* to be included
520
+         # either *here* or in "auto_makers" (below). Note: *here*
521
+         # each of them can be assigned None -- if excluding this
522
+         # key from *output data* is OK for you/your organization.
523
+     },
524
+     "auto_makers": {
525
+         # Each key in this dict should be an *output data* key.
526
+         # Each value should be a *dotted path* to some argumentless
527
+         # callable (function). In particular, that callable *may* be
528
+         # the `get()` method of some `ContextVar` instance (see
529
+         # https://docs.python.org/3/library/contextvars.html).
530
+         # For example:
531
+         "client_ip": "myown.portal.client_ip_context_var.get",
532
+         "example_nano_time": "time.time_ns",
533
+     },
534
+     # The value of "serializer", if specified, should be a *dotted path*
535
+     # to a callable (function) which accepts exactly one argument (being
536
+     # a JSON-serializable dict) and returns a str object. If "serializer"
537
+     # is not specified, the standard `json.dumps()` function will be used.
538
+     "serializer": "some_package.faster_replacement_for_json_dumps",
539
+     }
541
+ # ^ *Note:* all non-comment and non-blank continuation lines, *including*
542
+ # the one with the closing `}`, *must be indented* (by at least 1 space).
543
+ ```
544
+
545
+ !!! tip
546
+
547
+ If **[`logging.config.fileConfig`][]** is called by your code (rather
548
+ than automatically by some framework/library...), you may want to set
549
+ the **`disable_existing_loggers`** argument to **[`False`][]** (because
550
+ if its default value, **[`True`][]**, is in effect, then some loggers
551
+ created before that call may be turned off, which is usually not what
552
+ you want). This is general advice (not specific to `certlib.log`).
553
+
554
+ !!! info "See also"
555
+
556
+ You can learn more about the **[`logging.config.fileConfig`][]**-specific
557
+ configuration format by referring to the **[relevant
558
+ section](https://docs.python.org/3/library/logging.config.html#logging-config-fileformat)**
559
+ of the documentation for the standard `logging.config` module.
560
+
561
+ ***
562
+
563
+ ## **Tool: `xm`**
564
+
565
+ Essentially, the purpose of [`xm`][] is two-fold:
566
+
567
+ * to make it more convenient to emit *structured log entries* (each being
568
+ representable as a [`dict`][]), especially if a [`StructuredLogsFormatter`][]
569
+ is in use;
570
+
571
+ * if you choose the traditional *text-message-focused style* of logging
572
+ (rather than a *pure-data-focused*, messageless one) -- to easily replace
573
+ the legacy `%`-based log message formatting style with the modern and
574
+ more convenient `{}`-based one (regardless of what formatter is in use).
575
+
576
+ !!! note
577
+
578
+ **[`xm`][]** is just a convenience alias of **[`ExtendedMessage`][]**
579
+ (the latter is the actual name of the class, but the former is
580
+ definitely more handy when you want to log a message or data).
581
+
582
+ Let examples speak...
583
+
584
+ ***
585
+
586
+ ### Dealing with Pure Data
587
+
588
+ Below -- a couple of examples of logging just some data (without the need
589
+ to specify any text message).
590
+
591
+ ```python
592
+ import logging
593
+ from certlib.log import xm
594
+
595
+ logger = logging.getLogger(__name__)
596
+
597
+ # Logging pure data:
598
+ logger.info(xm(
599
+ some_key=["example", "data"],
600
+ another=lambda: 42, # (<- function/method: to be called by formatter)
601
+ yet_another={"abc": 1.0, "qwerty": [True, False]},
602
+ ))
603
+
604
+ # Same as above, but here we pass our data just *as one dict*:
605
+ my_data = {
606
+ "some_key": ["example", "data"],
607
+ "another": lambda: 42, # (<- function/method: to be called by formatter)
608
+ "yet_another": {"abc": 1.0, "qwerty": [True, False]},
609
+ }
610
+ logger.info(xm(my_data))
611
+ ```
612
+
613
+ !!! tip
614
+
615
+ Regarding the `"another"` item in the above examples as well as some
616
+ of the items/arguments that appear in the next subsection's examples:
617
+ if you pass a function/method (also a [**`lambda`**](https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions)
618
+ expression) instead of a plain value, it will be called (at most
619
+ once per `xm` use, by a formatter of any type, not necessarily a
620
+ **`StructuredLogsFormatter`**) to obtain the actual value.
621
+
622
+ !!! note
623
+
624
+ By default, the mechanism is applied *only* if you pass a
625
+ *function* or *method* object -- *not* just an arbitrary
626
+ callable object (as that could lead to inadvertent calls).
627
+
628
+ In practice, this feature is useful if the creation of a certain
629
+ value is costly, so that you would prefer that to be done *only* if
630
+ (and when) the log entry is to be actually formatted and emitted.
631
+
632
+ ***
633
+
634
+ ### Modern Formatting Style
635
+
636
+ Below there are a few examples of traditional *text-message-focused*
637
+ logging, but -- what using the [`xm`][] tool makes possible -- with the
638
+ modern and convenient [`{}`-based style of message formatting](https://docs.python.org/3/library/string.html#format-string-syntax)
639
+ (rather than the legacy, less convenient and less powerful, `%`-based one).
640
+
641
+ ```python
642
+ import datetime as dt
643
+ import logging
644
+ from certlib.log import xm
645
+
646
+ logger = logging.getLogger(__name__)
647
+
648
+ some_name = "foo"
649
+ some_value = "Bar"
650
+
651
+ logger.info(xm(
652
+ "Note: {} is {!r} (in {:%Y-%m})",
653
+ some_name, some_value,
654
+ dt.date.today, # (<- function/method: to be called by formatter)
655
+ ))
656
+ ```
657
+
658
+ The resultant message will be: `"Note: foo is 'Bar' (in 2026-02)"`
659
+ (assuming that, for this particular example, the [`dt.date.today`][datetime.date.today]
660
+ class method would return an instance of [`dt.date`][datetime.date]
661
+ representing a *February 2026* date, e.g., one equal to `dt.date(2026,
662
+ 2, 21)`).
663
+
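The replacement fields in such patterns follow the standard format string syntax linked above. The sketch below uses plain `str.format()` (not `xm` itself) with fixed sample values, only to show what such patterns produce and how they relate to the legacy `%`-based style.

```python
import datetime as dt
import sys

# Legacy %-style and modern {}-style patterns producing the same text.
legacy = "Here we have %x and %r." % (sys.maxsize, sys.byteorder)
modern = "Here we have {:x} and {!r}.".format(sys.maxsize, sys.byteorder)

# The pattern from the example above, with a fixed date for repeatability.
note = "Note: {} is {!r} (in {:%Y-%m})".format(
    "foo", "Bar", dt.date(2026, 2, 21),
)

# Named fields combined with a format spec and a conversion.
owns = "{who} owns {fract:.2%} of all issues of {title!r} magazine.".format(
    who="John", fract=0.87239, title="Bajtek",
)

print(note)
print(owns)
```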
664
+ When the logging system is configured to employ a
664
+ [`StructuredLogsFormatter`][], this means that:
666
+
667
+ * the formatted message will appear in the JSON-serialized *output
668
+ data* as the item: `"message": "Note: foo is 'Bar' (in 2026-02)"`,
669
+ * and the raw message pattern will also be included, like this:
670
+ `"message_base": {"pattern": "Note: {} is {!r} (in {:%Y-%m})"}`.
671
+
672
+ !!! info
673
+
674
+ Please note that when you use **[`xm`][]**, you still benefit from
675
+ the standard mechanism of deferring message formatting until the log
676
+ entry really needs to be emitted (regardless of what formatter is in
677
+ use).
678
+
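The deferral mentioned above is a property of the standard `logging` machinery: a message object is converted with `str()` only when a record is actually formatted. The sketch below demonstrates this with a minimal message class in the spirit of the standard logging cookbook; `BraceMessage` and `ListHandler` are hypothetical names, and this is *not* the implementation of `ExtendedMessage`.

```python
import logging


class BraceMessage:
    """Cookbook-style message object, formatted lazily via str.format()."""

    def __init__(self, pattern, /, *args, **kwargs):
        self.pattern = pattern
        self.args = args
        self.kwargs = kwargs
        self.format_calls = 0  # Counts how many times formatting happened.

    def __str__(self):
        self.format_calls += 1
        return self.pattern.format(*self.args, **self.kwargs)


class ListHandler(logging.Handler):
    """Collects formatted log lines in a list (for demonstration only)."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


demo_logger = logging.getLogger("brace_message_demo")
demo_logger.setLevel(logging.WARNING)
demo_logger.propagate = False
list_handler = ListHandler()
demo_logger.addHandler(list_handler)

skipped = BraceMessage("Debug detail: {!r}", "foo")
emitted = BraceMessage("Note: {} is {val!r}", "foo", val="Bar")

demo_logger.debug(skipped)    # Below the logger's level: never formatted.
demo_logger.warning(emitted)  # Emitted: formatted when the handler runs.
```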
679
+ The code in the next example does the same as above; the only difference
680
+ is that here the *replacement fields* in the message pattern are explicitly
681
+ numbered:
682
+
683
+ ```python
684
+ logger.info(xm(
685
+ "Note: {0} is {1!r} (in {2:%Y-%m})",
686
+ some_name, some_value,
687
+ dt.date.today, # (<- function/method: to be called by formatter)
688
+ ))
689
+ ```
690
+
691
+ Below there is an example similar to the previous two, but with some of the
692
+ replacement fields being *named* (and, therefore, with the corresponding
693
+ *keyword arguments* specifying the values to be interpolated):
694
+
695
+ ```python
696
+ logger.info(xm(
697
+ "Note: {} is {val!r} (in {today:%Y-%m})",
698
+ some_name,
699
+ val=some_value,
700
+ today=dt.date.today, # (<- function/method: to be called by formatter)
701
+ ))
702
+ ```
703
+
704
+ It is worth noting that if a [`StructuredLogsFormatter`][] is in use,
705
+ then any *keyword arguments* (*named* ones) passed to [`xm`][], apart
706
+ from being used to fill in the respective replacement fields, are also
707
+ included as *output data* items. For example, *output data* resulting
708
+ from the `logger.info(...)` call in the last example will contain, among
709
+ others, the following items:
710
+
711
+ * `"message": "Note: foo is 'Bar' (in 2026-02)"`,
712
+ * `"message_base": {"pattern": "Note: {} is {val!r} (in {today:%Y-%m})"}`,
713
+ * `"val": "Bar"`,
714
+ * `"today": "2026-02-21"`.
715
+
716
+ And below there is almost the same call as previously, but with a couple
717
+ of extra keyword arguments (conveying some additional data, unrelated to
718
+ message formatting):
719
+
720
+ ```python
721
+ logger.info(xm(
722
+ "Note: {} is {val!r} (in {today:%Y-%m})",
723
+ some_name,
724
+ val=some_value,
725
+ today=dt.date.today, # (<- function/method: to be called...)
726
+ something=lambda: 123456789, # (<- function/method: to be called...)
727
+ something_more=(1, 2, 3, 4, True, None, {5: [6789, 10]}),
728
+ ))
729
+ ```
730
+
731
+ In this case, the resultant *output data* generated by the
732
+ [`StructuredLogsFormatter`][]'s machinery will contain,
733
+ among others, the following items:
734
+
735
+ * `"message": "Note: foo is 'Bar' (in 2026-02)"`,
736
+ * `"message_base": {"pattern": "Note: {} is {val!r} (in {today:%Y-%m})"}`,
737
+ * `"val": "Bar"`,
738
+ * `"today": "2026-02-21"`,
739
+ * `"something": 123456789`,
740
+ * `"something_more": [1, 2, 3, 4, true, null, {"5": [6789, 10]}]`.
741
+
742
+ The entire resultant JSON-serialized *output data* (i.e., the ultimate
743
+ content of the log entry to be emitted) could look like the following
744
+ (note that the serialized data presented here contains arbitrary example
745
+ values for many keys, and -- just for visual clarity -- we present it
746
+ here as being sorted by key, and with extra newlines/indentation):
747
+
748
+ ```json
749
+ {
750
+ "client_ip": "192.168.0.123",
751
+ "component": "Portal",
752
+ "component_type": "web",
753
+ "example_custom_default": 42,
754
+ "example_local_counter": 4,
755
+ "example_nano_time": 1771631594315719605,
756
+ "func": "<module>",
757
+ "level": "INFO",
758
+ "levelno": 20,
759
+ "lineno": 179,
760
+ "logger": "myown.portal.another_example_module",
761
+ "message": "Note: foo is 'Bar' (in 2026-02)",
762
+ "message_base": {
763
+ "pattern": "Note: {} is {val!r} (in {today:%Y-%m})"
764
+ },
765
+ "pid": 327578,
766
+ "process_name": "MainProcess",
767
+ "py_ver": "3.14.3.final.0",
768
+ "script_args": [
769
+ "/opt/MyOwn/conf/web/portal.wsgi"
770
+ ],
771
+ "something": 123456789,
772
+ "something_more": [
773
+ 1, 2, 3, 4, true, null, {
774
+ "5": [
775
+ 6789, 10
776
+ ]
777
+ }
778
+ ],
779
+ "src": "/opt/MyOwn/py/myown/portal/another_example_module.py",
780
+ "system": "MyOwn",
781
+ "thread_id": 140062429502336,
782
+ "thread_name": "MainThread",
783
+ "timestamp": "2026-02-20 23:53:14.315296Z",
784
+ "today": "2026-02-21",
785
+ "val": "Bar"
786
+ }
787
+ ```
788
+
789
+ !!! info "See also"
790
+
791
+ You may also want to look at the *reference documentation* for the
792
+ **[`ExtendedMessage`][]** class (among other things, you will find
793
+ there information about three special arguments you can also pass
794
+ to **`xm`** -- namely: **`exc_info`**, **`stack_info`** and
795
+ **`stacklevel`**).
796
+
797
+ ***
798
+
799
+ ## **Advanced Topics and Finer Points**
800
+
801
+ ***
802
+
803
+ ### More About `StructuredLogsFormatter` (Including Subclassing)
804
+
805
+ If you have not read the *reference documentation* for the
806
+ [`StructuredLogsFormatter`][] class yet, you are strongly encouraged to
807
+ do so. Among other things, you will find there a list of hook methods
808
+ that can be extended/overridden in your subclasses. Apart from that, the
809
+ documentation in question includes (also in the individual descriptions
810
+ of those hook methods) valuable information about other elements of the
811
+ `StructuredLogsFormatter`'s interface and behavior.
812
+
813
+ ***
814
+
815
+ ### Other Stuff Provided by `certlib.log`
816
+
817
+ Besides [`StructuredLogsFormatter`][] and [`xm`][] ([`ExtendedMessage`][]),
818
+ the `certlib.log` module provides the following public stuff:
819
+
820
+ * the [`make_constant_value_provider`][] function (a minor helper,
821
+ useful when you need to create an *auto-maker* that will always
822
+ return the same value);
823
+
824
+ * the [`register_log_record_attr_auto_maker`][] and
825
+ [`unregister_log_record_attr_auto_maker`][] functions
826
+ (typically, they do not need to be used directly --
827
+ see the *reference documentation* for each of them...);
828
+
829
+ * the [`COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS`][] constant (its value
830
+ is returned by the `StructuredLogsFormatter`'s default implementation of
831
+ the [`get_output_keys_required_in_defaults_or_auto_makers`][.StructuredLogsFormatter.get_output_keys_required_in_defaults_or_auto_makers]
832
+ method);
833
+
834
+ * the [`STANDARD_RECORD_ATTR_TO_OUTPUT_KEY`][] constant
835
+ (defines the standard mapping of [log record attribute
836
+ names](https://docs.python.org/3/library/logging.html#logrecord-attributes)
837
+ to actual *output data* keys; that mapping is used by the
838
+ `StructuredLogsFormatter`'s default implementation of the
839
+ [`make_base_record_attr_to_output_key`][.StructuredLogsFormatter.make_base_record_attr_to_output_key]
840
+ method);
841
+
842
+ * a few [static typing helpers](reference.md#static-typing-helpers)
843
+ (ancillary stuff you do not usually need to pay much attention to).
844
+
845
+ ***
846
+
847
+ ### Roadmap Outline
848
+
849
+ Future ideas under consideration include:
850
+
851
+ * [`StructuredLogsFormatter`][]: add the ability to specify keys related
852
+ to *sensitive* data -- so that values assigned to them in *output data*
853
+ will be automatically masked/anonymized.
854
+
855
+ * [`xm`][]: add dedicated support for [`pattern`][ExtendedMessage.pattern]
856
+ of type [`string.templatelib.Template`][] (available in Python 3.14 and
857
+ newer).
858
+ """
859
+
860
+
861
+ # mypy: disable_error_code = "unused-ignore"
862
+
863
+
864
+ from __future__ import annotations
865
+
866
+ import ast
867
+ import dataclasses
868
+ import datetime as dt
869
+ import decimal
870
+ import enum
871
+ import fractions
872
+ import functools
873
+ import importlib
874
+ import ipaddress
875
+ import itertools
876
+ import json
877
+ import logging
878
+ import os.path
879
+ import reprlib
880
+ import sys
881
+ import threading
882
+ import traceback
883
+ import types
884
+ import uuid
885
+ from collections.abc import (
886
+ Callable,
887
+ Iterator,
888
+ Mapping,
889
+ Sequence,
890
+ Set,
891
+ )
892
+ from inspect import (
893
+ Parameter,
894
+ signature,
895
+ )
896
+ from typing import (
897
+ Any,
898
+ ClassVar,
899
+ Final,
900
+ Literal,
901
+ Protocol,
902
+ TypeAlias,
903
+ TypeVar,
904
+ cast,
905
+ overload,
906
+ )
907
+
908
+
909
+ __all__ = (
910
+ 'COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS',
911
+ 'STANDARD_RECORD_ATTR_TO_OUTPUT_KEY',
912
+
913
+ 'StructuredLogsFormatter',
914
+ 'ExtendedMessage',
915
+ 'xm',
916
+
917
+ 'make_constant_value_provider',
918
+ 'register_log_record_attr_auto_maker',
919
+ 'unregister_log_record_attr_auto_maker',
920
+
921
+ 'ValueProvider',
922
+ 'OutputSerializer',
923
+ 'DottedPath',
924
+ 'KwargsMappingAsLiteralEvaluableString',
925
+ )
926
+
927
+
928
+ #
929
+ # Global constants
930
+ #
931
+
932
+
933
+ COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS: Final[Set[str]] = frozenset({
934
+ # These items are to be provided automatically (at least by default)
935
+ # by the `StructuredLogsFormatter`'s machinery.
936
+ 'py_ver',
937
+ 'script_args',
938
+
939
+ # These items *need* to be provided individually per system/component
940
+ # (in `StructuredLogsFormatter` configuration, or by subclassing...).
941
+ 'system',
942
+ 'component',
943
+ 'component_type',
944
+ })
945
+
946
+
947
+ STANDARD_RECORD_ATTR_TO_OUTPUT_KEY: Final[Mapping[str, str | None]] = types.MappingProxyType({
948
+ 'asctime': 'timestamp',
949
+ 'exc_info': 'exc_info',
950
+ 'exc_text': 'exc_text',
951
+ 'funcName': 'func',
952
+ 'levelname': 'level',
953
+ 'levelno': 'levelno',
954
+ 'lineno': 'lineno',
955
+ 'message': 'message',
956
+ 'msg': 'message_base',
957
+ 'name': 'logger',
958
+ 'pathname': 'src',
959
+ 'process': 'pid',
960
+ 'processName': 'process_name',
961
+ 'stack_info': 'stack_info',
962
+ 'thread': 'thread_id',
963
+ 'threadName': 'thread_name',
964
+ 'taskName': 'async_task_name',
965
+
966
+ # The following log record attributes are
967
+ # to be *discarded* (at least by default):
968
+ 'args': None, # <- Info it conveys is typically redundant (with respect to `message`).
969
+ 'created': None, # <- The `asctime` attribute provides sufficient info.
970
+ 'filename': None, # <- The `pathname` attribute provides sufficient info.
971
+ 'module': None, # <- Redundant and confusing (just `filename` without its suffix).
972
+ 'msecs': None, # <- The `asctime` attribute provides sufficient info.
973
+ 'relativeCreated': None, # <- Confusing and hardly useful. (Uptime in milliseconds? Meh...)
974
+ })
975
+
976
+
977
+ #
978
+ # Actual tools
979
+ #
980
+
981
+
982
+ class StructuredLogsFormatter(logging.Formatter):
983
+
984
+ """
985
+ A subclass of [`logging.Formatter`][] to form structured log entries.
986
+
987
+ !!! tip
988
+
989
+ If the three call signatures defined by the
990
+ **[`StructuredLogsFormatter`][]** constructor seem
991
+ overwhelming, don't worry. In most cases, you will
992
+ only really be interested in the first one (the
993
+ *main* signature). The details are provided below.
994
+
995
+ !!! info "See also"
996
+
997
+ For extra information about **`StructuredLogsFormatter`**,
998
+ including a bunch of usage examples and configuration tips,
999
+ see the **[Tool: `StructuredLogsFormatter`](guide.md#certlib.log--tool-structuredlogsformatter)**
1000
+ section of the *User's Guide*.
1001
+
1002
+ **Constructor arguments** (all *keyword-only*, all *optional*):
1003
+
1004
+ * **`defaults`** (a [`dict`][] or other mapping; default: `{}`):
1005
+ maps *output data* keys to values each of which specifies
1006
+ the *default value* for the respective key (see also the
1007
+ [`make_base_defaults`][] method...).
1008
+
1009
+ * **`auto_makers`** (a [`dict`][] or other mapping; default: `{}`):
1010
+ maps *output data* keys to respective *auto-makers* (argumentless
1011
+ factories of *output data* values -- to be automatically called
1012
+ whenever a log entry is prepared). Each *auto-maker* can be
1013
+ specified either directly or as a string being a *dotted path*
1014
+ (*importable dotted name*) that points to an *auto-maker* (see
1015
+ also the [`make_base_auto_makers`][] method...).
1016
+
1017
+ * **`serializer`** (a function or other callable; default: [`json.dumps`][]):
1018
+ a callable that takes one argument being an *output data* [`dict`][]
1019
+ (of type `dict[str, OutputValue]`, where [`OutputValue`][] denotes
1020
+ whatever can be returned by the [`prepare_value`][] method) and that
1021
+ returns a string (presumably, a JSON-serialized form of that dict,
1022
+ even though you may decide to use some other serialization format,
1023
+ if this is OK for you/your organization). Alternatively, a string
1024
+ being a *dotted path* (*importable dotted name*) that points to
1025
+ such a function (callable) can be given as the **`serializer`**
1026
+ argument.
1027
+
1028
+ !!! note
1029
+
1030
+ The default implementation of the **[`prepare_value`][]** method
1031
+ always returns [`json.dumps`][]-serializable values.
1032
+
1033
+ !!! warning "Interface restriction"
1034
+
1035
+ The **`serializer`** callable should *not* mutate the argument it
1036
+ takes or anything inside it (regardless of the level of nesting,
1037
+ if any nested data is present). If some data needs to be modified,
1038
+ a completely *new* object should be created. Doing otherwise will
1039
+ result in undefined behavior.
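As an illustration of the restriction above, here is a minimal sketch of a custom serializer that only reads the dict it receives. The function name and the example data are hypothetical; only `json.dumps` from the standard library is used.

```python
import json
from typing import Any


def compact_sorted_serializer(output_data: dict[str, Any]) -> str:
    # Hypothetical serializer: it only reads `output_data`; sorting and
    # compacting are done by `json.dumps()` itself, so nothing is mutated.
    return json.dumps(output_data, sort_keys=True, separators=(",", ":"))


# Hypothetical *output data* dict, used only to exercise the function.
example_output_data = {"message": "Hello", "level": "INFO", "levelno": 20}
snapshot = dict(example_output_data)

serialized = compact_sorted_serializer(example_output_data)
print(serialized)
```

A callable of this shape could presumably be passed as the **`serializer`** argument, since it takes one dict and returns a string.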
1040
+
1041
+ **Alternatively**, a mapping (especially a [`dict`][]) of keyword
1042
+ arguments compatible with the main signature described above, or an
1043
+ [`ast.literal_eval`][]-evaluable string representing such a mapping
1044
+ (`dict`), can be passed to the [`StructuredLogsFormatter`][] constructor
1045
+ as the *first positional argument*.
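The string variant can be pictured with the sketch below. It shows only the `ast.literal_eval` evaluation step, not how the constructor processes the result; the `component` and `component_type` values are made up for illustration.

```python
import ast

# A hypothetical string of the kind that could be given as the first
# positional argument (for example, in a `fileConfig`-style file).
kwargs_as_string = (
    "{'defaults': {'system': 'MyOwn', 'component': 'portal', "
    "'component_type': 'web'}}"
)

# `ast.literal_eval()` accepts only Python literals, so arbitrary
# expressions or function calls in the string would be rejected.
kwargs = ast.literal_eval(kwargs_as_string)
print(kwargs["defaults"]["system"])
```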
1046
+
1047
+ **Moreover**, *extra* arguments that match -- by *position*
1048
+ or by *name* -- any _**non**-keyword-only_ parameters defined
1049
+ by [`logging.Formatter`][] are *accepted but ignored* by the
1050
+ `StructuredLogsFormatter` constructor, *provided that* the
1051
+ value of each (if given) is the respective parameter's default
1052
+ value; that is:
1053
+
1054
+ * the *first* or **`fmt`** argument -- needs to be [`None`][] (except
1055
+ that it is fine if the *first* argument is a mapping or a string
1056
+ representing a mapping, as described above...);
1057
+
1058
+ * the *second* or **`datefmt`** argument -- needs to be [`None`][];
1059
+
1060
+ * the *third* or **`style`** argument -- needs to be the `"%"` string;
1061
+
1062
+ * the *fourth* or **`validate`** argument -- needs to be [`True`][].
1063
+
1064
+ If any of them does not comply, [`TypeError`][] is raised.
1065
+
1066
+ !!! note
1067
+
1068
+ Thanks to the interface extensions described above, you can
1069
+ configure a **`StructuredLogsFormatter`** even if you are
1070
+ using the [`logging.config.fileConfig`][]-specific configuration
1071
+ format (which, despite its limitations, is still quite popular).
1072
+
1073
+ See the **`formatter_structured`** section of the `fileConfig`-style
1074
+ [configuration example](guide.md#certlib.log--loggingconfigfileconfig-style-configuration-example)
1075
+ in the *User's Guide*.
1076
+
1077
+ This class defines the following extendable/overridable hook methods:
1078
+
1079
+ * [`get_output_keys_required_in_defaults_or_auto_makers`][]
1080
+ * [`make_base_defaults`][]
1081
+ * [`make_base_auto_makers`][]
1082
+ * [`make_base_record_attr_to_output_key`][]
1083
+ * [`format_timestamp`][]
1084
+ * [`get_prepared_output_data`][]
1085
+ * [`prepare_value`][]
1086
+ * [`prepare_submapping_key`][]
1087
+ * [`serialize_prepared_output_data`][]
1088
+
1089
+ In some of the individual descriptions of these methods, several other
1090
+ elements of the `StructuredLogsFormatter`'s interface and behavior are
1091
+ also discussed -- in particular, the following instance attributes:
1092
+
1093
+ * [`defaults`][]
1094
+ * [`auto_makers`][]
1095
+ * [`auto_made_record_attr_prefix`][]
1096
+ * [`record_attr_to_output_key`][]
1097
+ * [`serializer`][]
1098
+
1099
+ !!! warning "Interface restriction"
1100
+
1101
+ Once an instance of **`StructuredLogsFormatter`** is initialized,
1102
+ the instance attributes listed above should be treated as
1103
+ _**read-only**_ and _**immutable**_ ones (together with all their
1104
+ contents, regardless of the level of nesting, if any nested data
1105
+ is present). Doing otherwise will result in undefined behavior.
1106
+
1107
+ When it comes to customizing the format of log entry *timestamps*, the
1108
+ related attributes defined by the [`logging.Formatter`][] base class
1109
+ (namely: `converter`, `default_time_format` and `default_msec_format`)
1110
+ are _**ignored**_ by the machinery of `StructuredLogsFormatter`.
1111
+
1112
+ To learn how to actually *customize timestamp formatting*, please
1113
+ refer to the description of the `StructuredLogsFormatter`'s
1114
+ [`format_timestamp`][] method.
1115
+
1116
+ !!! warning "Additional requirement"
1117
+
1118
+ Regarding the initialization of a **`StructuredLogsFormatter`**
1119
+ instance, it is also required that every *output data* key (just
1120
+ *key*, as we are *not* talking about *output data* values here)
1121
+ appearing in any of the mappings listed below -- be a *string*
1122
+ and *not* exceed 200 characters (otherwise, respectively,
1123
+ [`TypeError`][] or [`ValueError`][] will be raised by the
1124
+ constructor). The mappings covered by this requirement are:
1125
+
1126
+ * that returned by the **[`make_base_defaults`][]** method,
1127
+
1128
+ * the **`defaults`** argument to the
1129
+ [constructor][StructuredLogsFormatter] (if actually passed),
1130
+
1131
+ * that returned by the **[`make_base_auto_makers`][]** method,
1132
+
1133
+ * the **`auto_makers`** argument to the
1134
+ [constructor][StructuredLogsFormatter] (if actually passed),
1135
+
1136
+ * that returned by the **[`make_base_record_attr_to_output_key`][]**
1137
+ method (note that the requirement in question applies to *output
1138
+ data* keys -- which, when it comes to this mapping, are its
1139
+ *values*, not its *keys*; and note that this mapping's values
1140
+ are also allowed to be [`None`][]).
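The key requirement above could be expressed roughly as follows. This is a sketch of the documented rule only, with a hypothetical helper name and hypothetical error messages; the library's actual checks may be organized differently.

```python
MAX_OUTPUT_KEY_LENGTH = 200  # The limit stated in the requirement above.


def check_output_key(key: object) -> str:
    # Hypothetical helper mirroring the documented rule: a non-string key
    # is a type error, an overly long string key is a value error.
    if not isinstance(key, str):
        raise TypeError(f"output data key must be a str, got {key!a}")
    if len(key) > MAX_OUTPUT_KEY_LENGTH:
        raise ValueError(
            f"output data key exceeds {MAX_OUTPUT_KEY_LENGTH} characters"
        )
    return key


def classify(key: object) -> str:
    # Small wrapper used only to show which outcome each key produces.
    try:
        check_output_key(key)
    except TypeError:
        return "TypeError"
    except ValueError:
        return "ValueError"
    return "ok"


results = [classify("system"), classify(123), classify("x" * 201)]
print(results)
```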
1141
+ """
1142
+
1143
+ #
1144
+ # Attributes and instance lifecycle (public stuff)
1145
+
1146
+ # * This-class-specific instance attributes:
1147
+
1148
+ defaults: Final[Mapping[str, OutputValue]]
1149
+ auto_makers: Final[Mapping[str, ValueProvider[object]]]
1150
+ auto_made_record_attr_prefix: Final[str]
1151
+ record_attr_to_output_key: Final[Mapping[str, str | None]]
1152
+ serializer: Final[OutputSerializer]
1153
+
1154
+ # * Instance-lifecycle-related stuff:
1155
+
1156
+ @overload
1157
+ def __init__(
1158
+ self, /,
1159
+ *,
1160
+ defaults: Mapping[str, object] | None = None,
1161
+ auto_makers: Mapping[str, ValueProvider[object] | DottedPath] | None = None,
1162
+ serializer: OutputSerializer | DottedPath = json.dumps,
1163
+ ):
1164
+ ...
1165
+
1166
+ @overload
1167
+ # A variant for cases when passing real *keyword arguments* is
1168
+ # impossible (e.g., when using `logging.config.fileConfig()`)
1169
+ def __init__(
1170
+ self,
1171
+
1172
+ # This is required to be a mapping (e.g., a dict) of keyword
1173
+ # arguments compatible with the first `__init__()` signature
1174
+ # variant (declared above), or a string that will result in
1175
+ # such a mapping if evaluated with `ast.literal_eval()`.
1176
+ mapping_of_kwargs_compatible_with_main_signature: (
1177
+ Mapping[Literal['defaults', 'auto_makers', 'serializer'], Any]
1178
+ | KwargsMappingAsLiteralEvaluableString
1179
+ ),
1180
+ /,
1181
+
1182
+ # These three `logging.Formatter`-specific arguments are to
1183
+ # be *accepted and ignored* as long as the value of each is
1184
+ # the respective `logging.Formatter`-specific default value.
1185
+ datefmt: None = None,
1186
+ style: Literal['%'] = '%',
1187
+ validate: Literal[True] = True,
1188
+ ):
1189
+ ...
1190
+
1191
+ @overload
1192
+ # A variant added just for clarity/completeness...
1193
+ def __init__(
1194
+ self, /,
1195
+
1196
+ # These four `logging.Formatter`-specific arguments are to
1197
+ # be *accepted and ignored* as long as the value of each is
1198
+ # the respective `logging.Formatter`-specific default value.
1199
+ fmt: None = None,
1200
+ datefmt: None = None,
1201
+ style: Literal['%'] = '%',
1202
+ validate: Literal[True] = True,
1203
+
1204
+ *,
1205
+ defaults: Mapping[str, object] | None = None,
1206
+ auto_makers: Mapping[str, ValueProvider[object] | DottedPath] | None = None,
1207
+ serializer: OutputSerializer | DottedPath = json.dumps,
1208
+ ):
1209
+ ...
1210
+
1211
+ def __init__(self, /, *args: Any, **kwargs: Any):
1212
+ arguments = self._resolve_init_arguments(args, *args, **kwargs)
1213
+
1214
+ given_defaults = arguments.pop('defaults', None) or {}
1215
+ given_auto_makers = arguments.pop('auto_makers', None) or {}
1216
+ given_serializer = arguments.pop('serializer', json.dumps)
1217
+
1218
+ if arguments:
1219
+ raise TypeError(
1220
+ f'{type(self).__init__.__qualname__}() got unexpected '
1221
+ f'keyword argument(s): {", ".join(map(ascii, arguments))}'
1222
+ )
1223
+
1224
+ super().__init__()
1225
+
1226
+ unfiltered_defaults = self._get_unfiltered_defaults(given_defaults)
1227
+ unprefixed_auto_makers = self._get_unprefixed_auto_makers(given_auto_makers)
1228
+ self._check_output_keys_required_in_defaults_or_auto_makers(
1229
+ unfiltered_defaults,
1230
+ unprefixed_auto_makers,
1231
+ )
1232
+
1233
+ actual_defaults = self._get_actual_defaults(unfiltered_defaults)
1234
+ auto_made_record_attr_prefix = self._get_auto_made_record_attr_prefix()
1235
+ actual_auto_makers = self._get_actual_auto_makers(
1236
+ auto_made_record_attr_prefix,
1237
+ unprefixed_auto_makers,
1238
+ )
1239
+ self.defaults = actual_defaults
1240
+ self.auto_makers = actual_auto_makers
1241
+ self.auto_made_record_attr_prefix = auto_made_record_attr_prefix
1242
+
1243
+ self.record_attr_to_output_key = self._get_record_attr_to_output_key()
1244
+ self.serializer = self._get_actual_serializer(given_serializer)
1245
+
1246
+ for rec_attr, auto_maker in self.auto_makers.items():
1247
+ register_log_record_attr_auto_maker(rec_attr, auto_maker)
1248
+
1249
+ def unregister_auto_makers(self) -> None:
1250
+ """
1251
+ A rarely useful method: you should invoke it on an instance
1252
+ of `StructuredLogsFormatter` *only when* you need to stop
1253
+ using that instance but continue using any `logging` stuff
1254
+ during further program execution (this does not seem to be
1255
+ a common case).
1256
+ """
1257
+ for rec_attr in self.auto_makers.keys():
1258
+ unregister_log_record_attr_auto_maker(rec_attr)
1259
+
1260
+ #
1261
+ # Overridden/extended methods of `logging.Formatter`
1262
+
1263
+ def format(self, record: logging.LogRecord) -> str:
1264
+ """
1265
+ Overrides the [`logging.Formatter`'s
1266
+ implementation][logging.Formatter.format] with
1267
+ a `StructuredLogsFormatter`-specific one.
1268
+
1269
+ In *some* respects, the `StructuredLogsFormatter`'s implementation
1270
+ of this method is similar to the `logging.Formatter`'s original. In
1271
+ particular, it makes use of the [`usesTime`][], [`formatTime`][],
1272
+ [`formatMessage`][] and [`formatException`][logging.Formatter.formatException]
1273
+ methods in a similar way, and assigns values to the same log record
1274
+ attributes: [`message`, `asctime` and `exc_text`](https://docs.python.org/3/library/logging.html#logrecord-attributes),
1275
+ subject to the same conditions (where applicable).
1276
+ However, it differs from the original in the following ways:
1277
+
1278
+ * of the methods mentioned above, the `formatMessage` one is
1279
+ always invoked last (in particular, *after* the log record's
1280
+ `exc_text` attribute is possibly set to a value returned by
1281
+ `formatException`);
1282
+
1283
+ * the string returned by `formatMessage` becomes the return value
1284
+ of *this* method (so this method *never* appends to that string
1285
+ any *formatted traceback* or *formatted stack information*, and
1286
+ it does *not* invoke [`formatStack`][logging.Formatter.formatStack]
1287
+ either); that string is supposed to represent the *output data*
1288
+ dict after serialization (therefore, it should already include,
1289
+ among others, any exception/stack information, if such stuff was
1290
+ requested and obtained);
1291
+
1292
+ * regarding how the target value of the log record's `message`
1293
+ attribute is determined: if the `msg` attribute of the given
1294
+ log record is an instance of [`ExtendedMessage`][] ([`xm`][]),
1295
+ then that instance's [`get_message_value`][ExtendedMessage.get_message_value]
1296
+ method is invoked (directly), *instead* of the log record's
1297
+ method [`getMessage`][logging.LogRecord.getMessage].
1298
+ """
1299
+ # (Compare to the source code of `logging.Formatter.format()`...)
1300
+ msg = getattr(record, 'msg', None)
1301
+ if isinstance(msg, ExtendedMessage):
1302
+ if args := getattr(record, 'args', None):
1303
+ args_repr = self._get_record_args_repr(args)
1304
+ raise TypeError(
1305
+ f"the specified log message base is an instance "
1306
+ f"of {ExtendedMessage.__qualname__} ({msg=!a}); "
1307
+ f"in such a case, any positional arguments to "
1308
+ f"format the log message should have been passed "
1309
+ f"to the `{ExtendedMessage.__qualname__}(...)` "
1310
+ f"(or `xm(...)`) call, not to the logger method "
1311
+ f"call itself (*args obtained by it: {args_repr})"
1312
+ )
1313
+ record.message = msg.get_message_value()
1314
+ else:
1315
+ record.message = record.getMessage()
1316
+ if self.usesTime():
1317
+ record.asctime = self.formatTime(record, self.datefmt)
1318
+ if record.exc_info and not record.exc_text:
1319
+ record.exc_text = self.formatException(record.exc_info)
1320
+ return self.formatMessage(record)
1321
+
1322
+ def usesTime(self) -> bool:
1323
+ """
1324
+ Overrides the [`logging.Formatter`][]'s implementation with one
1325
+ that always returns [`True`][].
1326
+ """
1327
+ return True
1328
+
1329
+ def formatTime(
1330
+ self,
1331
+ record: logging.LogRecord,
1332
+ datefmt: str | None = None,
1333
+ ) -> str:
1334
+ """
1335
+ Overrides the [`logging.Formatter`'s
1336
+ implementation][logging.Formatter.formatTime] with one that
1337
+ delegates its entire job to the [`format_timestamp`][] method
1338
+ (a `StructuredLogsFormatter`-specific one), but first checks
1339
+ if the **`datefmt`** argument is [`None`][] (if it is anything
1340
+ else, [`TypeError`][] is raised -- which then, typically, is
1341
+ suppressed and, possibly, printed to [`sys.stderr`][] by
1342
+ [`logging.Handler.handleError`][]...).
1343
+ """
1344
+ if datefmt is not None:
1345
+ slf = StructuredLogsFormatter.__qualname__
1346
+ slf_format_timestamp = f'{StructuredLogsFormatter.format_timestamp.__qualname__}()'
1347
+ slf_formatTime = f'{StructuredLogsFormatter.formatTime.__qualname__}()' # noqa
1348
+ raise TypeError(
1349
+ f"{datefmt=!a}, whereas for a `{__name__}.{slf}`-derived "
1350
+ f"formatter it should be None. To customize timestamp "
1351
+ f"formatting in your logs, instead of trying to set "
1352
+ f"`datefmt` or other `logging.Formatter`-specific stuff "
1353
+ f"(*not* used by the `{slf}`'s machinery!), you should "
1354
+ f"rather extend/override (in your custom subclass) the "
1355
+ f"`{slf_format_timestamp}` method. (Alternatively, instead "
1356
+ f"of that, you might decide to completely override the "
1357
+ f"`{slf_formatTime}` method, providing an implementation "
1358
+ f"which, for example, would use the `logging.Formatter`'s "
1359
+ f"legacy timestamp-formatting-related stuff, without any "
1360
+ f"use of the `{slf_format_timestamp}` method...)"
1361
+ )
1362
+ return self.format_timestamp(record)
1363
+
1364
+ def formatMessage(self, record: logging.LogRecord) -> str:
1365
+ """
1366
+ Overrides the [`logging.Formatter`][]'s implementation with
1367
+ one that:
1368
+
1369
+ * obtains a ready *output data* dict by applying the
1370
+ [`get_prepared_output_data`][] method to the given
1371
+ log record;
1372
+
1373
+ * applies the [`serialize_prepared_output_data`][]
1374
+ method to the obtained *output data* dict, and
1375
+ returns the result.
1376
+
1377
+ !!! note
1378
+
1379
+ The **`formatMessage`** name may be slightly misleading. Let
1380
+ us emphasize that the job of this method is *always* -- also
1381
+ in the case of the original **`logging.Formatter`** class --
1382
+ to format the crux of the _**entire**_ log entry, _**not**_
1383
+ just the value of the log record's **`message`** attribute.
1384
+ Formatting the latter is the job of the log record's
1385
+ **[`getMessage`][logging.LogRecord.getMessage]** method,
1386
+ or -- when the machinery of **`StructuredLogsFormatter`**
1387
+ deals with an **[`ExtendedMessage`][]** (**[`xm`][]**)
1388
+ instance -- of the **[`get_message_value`][ExtendedMessage.get_message_value]**
1389
+ method of that instance.
1390
+ """
1391
+ output_data = self.get_prepared_output_data(record)
1392
+ return self.serialize_prepared_output_data(output_data)
1393
+
1394
+ #
1395
+ # This-class-specific overridable/extendable hooks (+ related constants)
1396
+
1397
+ def get_output_keys_required_in_defaults_or_auto_makers(self) -> Set[str]:
1398
+ """
1399
+ A hook method: extend it in a subclass to impose (more)
1400
+ *output data* keys *required to be specified* for the
1401
+ purpose of setting [`defaults`][] and [`auto_makers`][].
1402
+
1403
+ !!! note
1404
+
1405
+ Obviously, you can also extend/override this method to define
1406
+ *fewer* required keys than by default (perhaps even *none*)
1407
+ if this is OK for you/your organization.
1408
+
1409
+ To be more precise: this method's return value defines the set of
1410
+ keys *required to be included* in *at least one* of the following
1411
+ mappings:
1412
+
1413
+ * that returned by the [`make_base_defaults`][] method,
1414
+
1415
+ * the **`defaults`** argument to the
1416
+ [constructor][StructuredLogsFormatter] (if actually passed),
1417
+
1418
+ * that returned by the [`make_base_auto_makers`][] method,
1419
+
1420
+ * the **`auto_makers`** argument to the
1421
+ [constructor][StructuredLogsFormatter] (if actually passed).
1422
+
1423
+ Whether this requirement is satisfied is checked during the
1424
+ formatter initialization. If the check fails, [`KeyError`][]
1425
+ is raised.
1426
+
1427
+ The default implementation of this method just uses the set of
1428
+ keys defined as [`COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS`][]
1429
+ (here it is worth noting that, because the default implementation
1430
+ of [`make_base_auto_makers`][] already provides *auto-makers*
1431
+ for certain keys, the *only* keys for which it is *required*
1432
+ to specify *default values* or *auto-makers* when invoking the
1433
+ [`StructuredLogsFormatter`][] constructor -- are: `"system"`,
1434
+ `"component"` and `"component_type"`).
1435
+
1436
+ !!! info
1437
+
1438
+ The requirement in question is considered satisfied _**also**_
1439
+ if some (or all) of the required items are provided *only as
1440
+ defaults* (i.e., only by the **[`make_base_defaults`][]**'s
1441
+ result or the **`defaults`** constructor argument) *and* some
1442
+ (or all) of them define such *default values* that -- after
1443
+ being transformed by the **[`prepare_value`][]** method --
1444
+ are *void* values, such as [`None`][] (despite the fact that
1445
+ such *void* values are always *excluded* from the ultimate
1446
+ collection of *default values* -- see the *Related interfaces*
1447
+ note in the **[`make_base_defaults`][]** method's description).
1448
+
1449
+ ```python
1450
+ my_formatter = StructuredLogsFormatter(
1451
+ # This satisfies the requirement:
1452
+ defaults={
1453
+ "system": None,
1454
+ "component": None,
1455
+ "component_type": None,
1456
+ },
1457
+ )
1458
+ ```
1459
+
1460
+ In other words, the interface does not prevent you from
1461
+ effectively omitting from *output data* the keys specified
1462
+ by this method's result, but you must explicitly state that
1463
+ you really want this (so that it can be safely assumed that
1464
+ this is OK for you/your organization).
1465
+ """
1466
+ assert isinstance(COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS, frozenset)
1467
+ return COMMONLY_EXPECTED_NON_STANDARD_OUTPUT_KEYS
1468
+
1469
+ def make_base_defaults(self) -> Mapping[str, object]:
1470
+ """
1471
+ A hook method: extend it in a subclass to define basic *default
1472
+ values* for output.
1473
+
1474
+ Automatically invoked on the formatter initialization. Each key in
1475
+ the resultant mapping needs to be an *output data* key, and each
1476
+ value in that mapping needs to be the desired *default value* for
1477
+ that key.
1478
+
1479
+ The default implementation of this method returns an empty mapping.
1480
+
1481
+ !!! info "Related interfaces"
1482
+
1483
+ For every instance, the **[`defaults`][]** mapping (which is
1484
+ supposed to specify all *default values* for any *output data*
1485
+ to be generated by the instance) is based on this method's
1486
+ result, but is then updated with all items from the **`defaults`**
1487
+ [constructor argument][StructuredLogsFormatter] (if given), and
1488
+ adjusted by applying the **[`prepare_value`][]** method to each
1489
+ value, and -- then -- by deleting each key to which a *void*
1490
+ value is assigned (by *void* value we mean any *falsy* value
1491
+ that is *not equal* to `0`, for example: [`None`][], `""`,
1492
+ `[]` or `{}` -- but _**not:**_ [`False`][], `0`, `0.0`, etc.).
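The notion of a *void* value described above can be sketched as a small predicate. The names `is_void` and `drop_void_items` are hypothetical and are not part of the library's interface; the sketch only restates the rule "falsy, yet not equal to `0`".

```python
def is_void(value: object) -> bool:
    # "Void" as described above: falsy, yet not equal to 0.
    return bool(not value and value != 0)


def drop_void_items(candidates: dict[str, object]) -> dict[str, object]:
    # Builds a *new* dict without the keys whose values are void.
    return {key: val for key, val in candidates.items() if not is_void(val)}


# Hypothetical default values, already in their prepared form.
prepared_defaults = {
    "system": "MyOwn",
    "component": None,
    "tags": [],
    "retries": 0,
    "debug": False,
}
kept = drop_void_items(prepared_defaults)
print(kept)
```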
1493
+ """
1494
+ return {}
1495
+
1496
+ def make_base_auto_makers(self) -> Mapping[str, ValueProvider[object] | DottedPath]:
1497
+ """
1498
+ A hook method: extend it in a subclass to define (more)
1499
+ *auto-makers* for output.
1500
+
1501
+ Automatically invoked on the formatter initialization. Each key
1502
+ in the resultant mapping needs to be an *output data* key, and
1503
+ each value in that mapping needs to be either an *auto-maker*
1504
+ or a *dotted path* (*importable dotted name*) that points to an
1505
+ *auto-maker*. Each *auto-maker* is supposed to be an *argumentless
1506
+ function* (or *callable object* of some other type) returning --
1507
+ whenever it is called -- a candidate for an *output data* value
1508
+ (to be assigned to the respective *output data* key).
1509
+
1510
+ !!! info "See also"
1511
+
1512
+ You may want to learn more about *auto-makers*
1513
+ themselves by referring to the description of the
1514
+ **[`register_log_record_attr_auto_maker`][]** function
1515
+ (the machinery of **`StructuredLogsFormatter`**
1516
+ automatically makes use of that function, as appropriate).
1517
+
1518
+ It should also be noted, given how the default implementation of the
1519
+ [`get_prepared_output_data`][] method works, that every candidate for
1520
+ an *output data* value -- including those produced by *auto-makers*
1521
+ -- will be transformed by applying the [`prepare_value`][] method
1522
+ to it. Furthermore, whenever the result of this transformation
1523
+ is a *void* value (by which we mean any *falsy* value *not equal*
1524
+ to `0`, for example: [`None`][], `""`, `[]` or `{}` -- but
1525
+ _**not**_ [`False`][], `0`, `0.0`, etc.), then the respective
1526
+ key will *not* be included in the *output data* dict (*even*
1527
+ if a *default value* is defined for that key; and *even* if
1528
+ that key was among those included in the set returned by the
1529
+ [`get_output_keys_required_in_defaults_or_auto_makers`][] method
1530
+ when the formatter instance was initialized).
1531
+
1532
+ The default implementation of this method provides a couple
1533
+ of *auto-makers* which acquire some basic information about
1534
+ the execution environment (e.g., the Python version).
1535
+
1536
+ !!! info "Related interfaces"
1537
+
1538
+ For every instance, the **[`auto_makers`][]** mapping
1539
+ (which is supposed to specify all *auto-makers* related
1540
+ to the instance) is based on this method's result, but
1541
+ is then updated with all items from the **`auto_makers`**
1542
+ [constructor argument][StructuredLogsFormatter] (if given),
1543
+ and -- then -- adjusted by prefixing each key with the value
1544
+ of the **[`auto_made_record_attr_prefix`][]** attribute
1545
+ (which is an automatically generated opaque string, created
1546
+ separately for each instance of **`StructuredLogsFormatter`**,
1547
+ guaranteed to be unique within a Python interpreter run).
1548
+
1549
+ (Therefore -- concerning the stuff produced by a particular
1550
+ formatter instance's *auto-makers* -- the respective
1551
+ *output data* items will be obtained by picking those
1552
+ log record object's attributes whose names are prefixed
1553
+ with the formatter's **`auto_made_record_attr_prefix`**
1554
+ and using those names *with that prefix removed* as the
1555
+ corresponding *output data* keys. On the other hand, the
1556
+ formatter will *ignore* any record attribute names prefixed
1557
+ with **`auto_made_record_attr_prefix`** of any *other*
1558
+ formatter instances, as if such attributes did not exist.
1559
+ Thanks to that, more than one **`StructuredLogsFormatter`**
1560
+ can be used -- and they will work independently of each
1561
+             other, each handling only its own *auto-made* stuff.)
1562
+ """
1563
+ return {
1564
+ key: make_constant_value_provider(value)
1565
+ for key, value in (
1566
+ ('py_ver', '.'.join(map(str, (sys.version_info or ())))),
1567
+ ('script_args', tuple(sys.argv or ())),
1568
+ )
1569
+ }
1570
+
1571
+ def make_base_record_attr_to_output_key(self) -> Mapping[str, str | None]:
1572
+ """
1573
+ A hook method: extend it in a subclass to modify the mapping of
1574
+ [log record attribute names](https://docs.python.org/3/library/logging.html#logrecord-attributes)
1575
+ to actual *output data* keys.
1576
+
1577
+ Automatically invoked on the formatter initialization. Each key
1578
+ in the resultant mapping needs to be the name of a (perhaps just
1579
+         hypothetical) log record attribute, and each value in that mapping
1580
+ needs to be either the corresponding *output data* key or [`None`][].
1581
+ In the latter case -- given how the default implementation of the
1582
+ [`get_prepared_output_data`][] method works -- the attribute will
1583
+ always be omitted whenever *output data* is generated (note that
1584
+ this does *not* apply to log record attributes *not included* in
1585
+ the mapping).
1586
+
1587
+ The default implementation of this method returns a mapping that
1588
+ contains all items of [`STANDARD_RECORD_ATTR_TO_OUTPUT_KEY`][].
1589
+ In many cases this will be quite sufficient.
1590
+
1591
+ !!! info "Related interfaces"
1592
+
1593
+ For every instance, the **[`record_attr_to_output_key`][]**
1594
+ mapping (which is supposed to specify the ultimate mapping of
1595
+ log record objects' attribute names to actual keys in *output
1596
+ data*) is based on this method's result, but is then updated
1597
+ with items suitably derived from **[`auto_makers`][]** (to
1598
+ cause that any log record attribute name prefixed with the
1599
+ instance's **[`auto_made_record_attr_prefix`][]** is mapped
1600
+ to an *output data* key being just the unprefixed version of
1601
+ that name; see the *Related interfaces* note in the description
1602
+ of the **[`make_base_auto_makers`][]** method).
1603
+ """
1604
+ return dict(STANDARD_RECORD_ATTR_TO_OUTPUT_KEY)
1605
+
1606
+ def format_timestamp(
1607
+ self,
1608
+ record: logging.LogRecord,
1609
+ *,
1610
+ timezone: dt.tzinfo | None = dt.timezone.utc,
1611
+ timestamp_as_datetime: Callable[[float, dt.tzinfo | None], dt.datetime] = (
1612
+ dt.datetime.fromtimestamp
1613
+ ),
1614
+ utc_offset_to_custom_suffix: Mapping[dt.timedelta | None, str] = types.MappingProxyType({
1615
+ # By default, if the suffix were to be
1616
+ # `+00:00`, we want it to be `Z` instead
1617
+ # (as it means the same but is shorter).
1618
+ dt.timedelta(0): 'Z',
1619
+
1620
+ # By default, if there is no explicit
1621
+ # timezone information, we want to
1622
+ # emphasize this in a visible way.
1623
+ None: ' <UNCERTAIN TIMEZONE>',
1624
+ }),
1625
+ ) -> str:
1626
+ """
1627
+ A hook method: extend/override it in a subclass to modify/redefine
1628
+ how, for each log entry, the *formatted timestamp* (`asctime`) is
1629
+ determined.
1630
+
1631
+ This method is invoked by the [`formatTime`][] method, with a log
1632
+ record (typically, an instance of [`logging.LogRecord`][]) as the
1633
+ sole argument. The log record is expected to have its [`created`
1634
+ attribute](https://docs.python.org/3/library/logging.html#logrecord-attributes)
1635
+ already set to a [`float`][] number representing a Unix timestamp.
1636
+
1637
+ What should be returned by this method is a string (presumably,
1638
+ derived somehow from the aforementioned `created` attribute of
1639
+ the log record) that will later be assigned (by the [`format`][]
1640
+ method) to the log record's `asctime` attribute.
1641
+
1642
+ The default implementation of this method should be sufficient
1643
+ in most cases. It converts the value of the given log record's
1644
+ `created` attribute to a string being an *ISO-8601-compliant*
1645
+ date and time representation, with *microsecond* resolution.
1646
+ If *no optional keyword arguments* are given (which is how
1647
+ this method is invoked by `formatTime`), the resultant time
1648
+ representation is a *UTC* one (with `Z`, rather than `+00:00`,
1649
+ as its suffix), e.g.: `"2026-03-15 13:48:56.726403Z"`.
1650
+
1651
+ !!! info
1652
+
1653
+ The [`logging.Formatter`][]-specific attributes related to
1654
+ timestamp formatting (`converter`, `default_time_format` and
1655
+ `default_msec_format`) are _**ignored**_.
1656
+
1657
+ !!! tip
1658
+
1659
+ When extending this method in a subclass, you may want to make
1660
+ your custom implementation invoke the default one with some
1661
+ keyword arguments specified. In such a case, you may want to
1662
+ reach for their default values defined by the signature of
1663
+ **[`StructuredLogsFormatter.format_timestamp`][]**; if so, refer
1664
+ to the **[`StructuredLogsFormatter.FORMAT_TIMESTAMP_DEFAULT_KWARGS`][]**
1665
+ mapping. For example:
1666
+
1667
+ ```python
1668
+ import datetime as dt
1669
+ import types
1670
+ from certlib.log import StructuredLogsFormatter
1671
+
1672
+ class EstTimezoneOrientedStructuredLogsFormatter(StructuredLogsFormatter):
1673
+
1674
+ UTC_OFFSET_FOR_EST = dt.timedelta(hours=(-5))
1675
+
1676
+ DEFAULT_TIMEZONE = dt.timezone(UTC_OFFSET_FOR_EST)
1677
+ DEFAULT_UTC_OFFSET_TO_CUSTOM_SUFFIX = types.MappingProxyType({
1678
+
1679
+ # Let's use the base class's stuff in a *DRY* manner...
1680
+ **StructuredLogsFormatter.FORMAT_TIMESTAMP_DEFAULT_KWARGS[
1681
+ 'utc_offset_to_custom_suffix'
1682
+ ],
1683
+
1684
+ # ...and extend it with this-class-specific stuff:
1685
+ **{
1686
+ # If the suffix were to be `-05:00`,
1687
+ # we want it to be ` EST` instead.
1688
+ UTC_OFFSET_FOR_EST: ' EST',
1689
+ },
1690
+ })
1691
+
1692
+ def format_timestamp(
1693
+ self,
1694
+ record,
1695
+ *,
1696
+ timezone=DEFAULT_TIMEZONE,
1697
+ utc_offset_to_custom_suffix=DEFAULT_UTC_OFFSET_TO_CUSTOM_SUFFIX,
1698
+ **kwargs,
1699
+ ):
1700
+ return super().format_timestamp(
1701
+ record,
1702
+ timezone=timezone,
1703
+ utc_offset_to_custom_suffix=utc_offset_to_custom_suffix,
1704
+ **kwargs,
1705
+ )
1706
+ ```
1707
+ """
1708
+ dt_timestamp = timestamp_as_datetime(record.created, timezone)
1709
+ custom_suffix = utc_offset_to_custom_suffix.get(dt_timestamp.utcoffset())
1710
+ if custom_suffix is None:
1711
+ return dt_timestamp.isoformat(' ', 'microseconds')
1712
+ dt_without_tzinfo = dt_timestamp.replace(tzinfo=None)
1713
+ return f"{dt_without_tzinfo.isoformat(' ', 'microseconds')}{custom_suffix}"
1714
+
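Review note: the default `format_timestamp` body above can be tried in isolation with the sketch below, which uses only the standard library. The names `format_created` and `SUFFIXES` are hypothetical, introduced just for this illustration; the sketch mirrors the method body but is not part of `certlib.log`.

```python
import datetime as dt

# Same idea as the default `utc_offset_to_custom_suffix` mapping above.
SUFFIXES = {
    dt.timedelta(0): 'Z',
    None: ' <UNCERTAIN TIMEZONE>',
}

def format_created(created, timezone=dt.timezone.utc, suffixes=SUFFIXES):
    """Format a Unix timestamp the way the method body above does."""
    stamp = dt.datetime.fromtimestamp(created, timezone)
    suffix = suffixes.get(stamp.utcoffset())
    if suffix is None:
        # No custom suffix for this offset: keep the regular `+HH:MM` form.
        return stamp.isoformat(' ', 'microseconds')
    # Drop the tzinfo so that `isoformat()` does not emit its own offset.
    return stamp.replace(tzinfo=None).isoformat(' ', 'microseconds') + suffix

utc_text = format_created(0.0)
est_text = format_created(0.0, dt.timezone(dt.timedelta(hours=-5)))
naive_text = format_created(0.0, None)  # local time, no offset known
```

Passing `None` as the timezone yields a naive local datetime, whose `utcoffset()` is `None`; that is why the `None` key of the suffix mapping is the one that applies in that case.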
1715
+ # *Note*: the `type: ignore[...]` comment below prevents *mypy* from
1716
+ # rejecting `Final` nested in `ClassVar` (which is OK in Python 3.13
1717
+ # and newer; and we use `from __future__ import annotations` anyway,
1718
+ # so at runtime we are safe regardless of Python version).
1719
+ FORMAT_TIMESTAMP_DEFAULT_KWARGS: ClassVar[Final[ # type: ignore[valid-type]
1720
+ Mapping[str, Any]
1721
+ ]] = types.MappingProxyType({
1722
+ p.name: p.default
1723
+ for p in signature(format_timestamp).parameters.values()
1724
+ if p.kind is Parameter.KEYWORD_ONLY
1725
+ })
1726
+ """
1727
+ Default values of all [`StructuredLogsFormatter.format_timestamp`][]'s
1728
+ *keyword-only* parameters (this mapping may come in handy when you
1729
+ extend that method in a subclass...).
1730
+ """
1731
+
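Review note: the `FORMAT_TIMESTAMP_DEFAULT_KWARGS` mapping above is built by introspecting the method signature. The sketch below applies the same technique to a made-up function (`render` and `RENDER_DEFAULT_KWARGS` are hypothetical names), to show which parameters end up in the mapping.

```python
import types
from inspect import Parameter, signature

def render(text, *, width=80, fill=' ', **extra):
    """A made-up function with two keyword-only parameters."""
    return text.center(width, fill)

# Collect the defaults of keyword-only parameters only; `text`
# (positional-or-keyword) and `**extra` (var-keyword) are skipped.
RENDER_DEFAULT_KWARGS = types.MappingProxyType({
    p.name: p.default
    for p in signature(render).parameters.values()
    if p.kind is Parameter.KEYWORD_ONLY
})
```

Wrapping the dict in `types.MappingProxyType` gives a read-only view, so code that reuses the defaults cannot alter them by accident.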
1732
+ def get_prepared_output_data(self, record: logging.LogRecord) -> dict[str, OutputValue]:
1733
+ """
1734
+ A hook method: extend/override it in a subclass to modify/redefine
1735
+ how an *output data* dict is obtained from a log record object
1736
+ (which, at least typically, is a [`logging.LogRecord`][] instance).
1737
+
1738
+ !!! warning "Subclass behavior restriction"
1739
+
1740
+ This method should *not* mutate the given log record or any
1741
+ data it carries (regardless of the level of nesting, if any
1742
+ nested data is present). If some data needs to be modified,
1743
+ a completely *new* object should be created. Doing otherwise
1744
+ will result in undefined behavior.
1745
+
1746
+ The default implementation of this method should be sufficient
1747
+ in most cases. To build a new *output data* dict, it digs into
1748
+ the given log record (and if that log record's `msg` attribute is
1749
+ an [`ExtendedMessage`][] instance -- also into that instance...).
1750
+ While doing that, it also looks at the formatter attributes:
1751
+ [`record_attr_to_output_key`][] (when determining *output data*
1752
+ keys; see also: [`make_base_record_attr_to_output_key`][]) and
1753
+ [`defaults`][] (to suitably complement the extracted *output
1754
+ data* with *default items*; see also: [`make_base_defaults`][]),
1755
+         and also makes extensive use of the [`prepare_value`][] method
1756
+ (to ensure that each value in the resultant *output data* dict
1757
+ will be prepared for serialization). To make this description
1758
+ comprehensive, several details -- regarding the resultant *output
1759
+ data* dict's **top-level** *keys* and *values* -- need to be
1760
+ clarified:
1761
+
1762
+ * when those **keys** and **values** are being determined based
1763
+ on the log record's contents, any log record attributes that
1764
+ have been created by *auto-makers* belonging to some *other*
1765
+ instances of `StructuredLogsFormatter` (i.e., *not* belonging
1766
+ to `self`) are *excluded* from consideration, meaning no *output
1767
+ data* items are created from them (for certain low-level details,
1768
+ see the *Related interfaces* note in the description of the
1769
+ [`make_base_auto_makers`][] method...);
1770
+
1771
+ * all existing log record attributes whose names are mapped in
1772
+ [`record_attr_to_output_key`][] to some *output data* **keys**
1773
+ -- are being *included* in the *output data* dict; this applies,
1774
+ in particular, to any log record attributes that have been
1775
+ created by *auto-makers* belonging to *this* (`self`) instance
1776
+ of `StructuredLogsFormatter` (see the *Related interfaces* note
1777
+ in the description of the [`make_base_record_attr_to_output_key`][]
1778
+ method...);
1779
+
1780
+ * all existing log record attributes whose names are mapped
1781
+ in `record_attr_to_output_key` to [`None`][] -- are being
1782
+ *excluded*;
1783
+
1784
+ * all existing log record attributes *not* created by any
1785
+ *auto-maker* and *not* included in `record_attr_to_output_key`
1786
+ -- are being *included* (!) in the *output data* dict, using
1787
+ each record attribute name as the corresponding *output data*
1788
+           **key** (as if it were mapped in `record_attr_to_output_key` to
1789
+ itself);
1790
+
1791
+ * *only* **keys** that are instances of [`str`][] are ever included
1792
+ (meaning that any non-string keys, even if they appeared at some
1793
+ stage of processing, are always *excluded*), and every key is
1794
+ *truncated* to a maximum length of 200 characters (if it was
1795
+ longer); compare this with the treatment of *nested keys* (see
1796
+ the description of the [`prepare_submapping_key`][] method...);
1797
+
1798
+ * when it comes to transforming every **value** by applying the
1799
+ aforementioned `prepare_value` method to it, if the result of
1800
+ this transformation turns out to be a *void* value (by which
1801
+ we mean any *falsy* value *not equal* to `0`, for example:
1802
+ [`None`][], `""`, `[]` or `{}` -- but _**not:**_ [`False`][],
1803
+ `0`, `0.0`, etc.), then the respective **key** is *excluded*
1804
+ (*even* if it should be included according to any other rule
1805
+ described above; and *even* if some *default value* is defined
1806
+ for that key!); note that *nested* values, even if *void*, are
1807
+ *never* subject to such an *exclusion* (at least if the default
1808
+ implementation of `prepare_value` is used);
1809
+
1810
+ * potential *item collisions* (which might occur, for example,
1811
+ when some **key** is present *both* in the `ExtendedMessage`'s
1812
+ [`data`][ExtendedMessage.data] mapping *and* among other data
1813
+ obtained from the log record's content, and the **value** to be
1814
+ assigned to that key varies depending on which of those two
1815
+ sources of information is checked) -- are avoided by suffixing
1816
+ problematic keys with one or more underscore character(s), as
1817
+ needed to prevent key duplication; such cases are expected to
1818
+ be rare.
1819
+
1820
+ !!! note
1821
+
1822
+ The said key truncation occurs *before* the said key
1823
+ deduplication -- so it is possible, although very rare
1824
+ in practice, that appending underscore(s) to certain keys
1825
+ (as described above) will result in some keys ending up
1826
+ a little longer than 200 characters.
1827
+ """
1828
+ output_data: dict[str, OutputValue] = {}
1829
+ actual_defaults = dict(self.defaults)
1830
+ handle_output_item = functools.partial(
1831
+ self._handle_output_item,
1832
+ self._DESIRED_MAX_KEY_LENGTH,
1833
+ self.prepare_value,
1834
+ actual_defaults,
1835
+ output_data,
1836
+ )
1837
+
1838
+ xm_instance = getattr(record, 'msg', None)
1839
+ if isinstance(xm_instance, ExtendedMessage):
1840
+ self._extract_output_from_xm(record, xm_instance, handle_output_item)
1841
+ else:
1842
+ xm_instance = None
1843
+
1844
+ self._extract_output_from_record(record, xm_instance, handle_output_item)
1845
+
1846
+ for key, value_prepared in actual_defaults.items():
1847
+ output_data.setdefault(key, value_prepared)
1848
+
1849
+ return output_data
1850
+
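Review note: the top-level key rules described in the docstring above (string keys only, truncation to 200 characters, then underscore suffixing on collisions) can be pictured with the sketch below. The helper `add_item` is hypothetical; the internal `_handle_output_item` method is not shown in this part of the diff, so this is only one reading of the documented behavior, and it ignores *void* values and defaults.

```python
MAX_KEY_LENGTH = 200  # the limit stated in the docstring above

def add_item(output, key, value):
    """Hypothetical helper: store `value` under a safe top-level key."""
    if not isinstance(key, str):
        return False                    # non-string keys are excluded
    key = key[:MAX_KEY_LENGTH]          # truncation happens first...
    while key in output and output[key] != value:
        key += '_'                      # ...then deduplication
    output[key] = value
    return True

out = {}
add_item(out, 'msg', 'from record')
add_item(out, 'msg', 'from extended message')  # collides: stored as 'msg_'
add_item(out, 'msg', 'from record')            # same value: no new key
add_item(out, 'x' * 250, 1)                    # truncated to 200 chars
add_item(out, 3, 'ignored')                    # not a str: skipped
```

Because the suffix is appended after truncation, a deduplicated key can end up slightly longer than 200 characters, as the note in the docstring points out.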
1851
+ def prepare_value(
1852
+ self,
1853
+ value: object,
1854
+ *,
1855
+ to_str_types: tuple[type, ...] = (
1856
+ dt.date, dt.datetime, dt.time,
1857
+ decimal.Decimal, enum.Enum, fractions.Fraction,
1858
+ ipaddress.IPv4Address, ipaddress.IPv4Interface, ipaddress.IPv4Network,
1859
+ ipaddress.IPv6Address, ipaddress.IPv6Interface, ipaddress.IPv6Network,
1860
+ uuid.UUID,
1861
+ ),
1862
+ pass_thru_types: tuple[type, ...] = (str, int, float, bool, type(None)),
1863
+ exclude_from_seq_types: tuple[type, ...] = (str, bytes, bytearray),
1864
+ is_dataclass: Callable[[object], bool] = dataclasses.is_dataclass,
1865
+ dataclass_as_dict: Callable[[Any], dict[str, Any]] = dataclasses.asdict,
1866
+ last_resort: Callable[[object], str] = repr,
1867
+ **kwargs: Any,
1868
+ ) -> OutputValue:
1869
+ """
1870
+ A hook method: extend/override it in a subclass to modify/redefine
1871
+ how every *value* in an *output data* dict is prepared before the
1872
+ actual data serialization.
1873
+
1874
+ !!! warning "Subclass behavior restriction"
1875
+
1876
+ This method should *not* mutate its argument or anything inside
1877
+ it (regardless of the level of nesting, if any nested data
1878
+ is present). If some data needs to be modified, a completely
1879
+ *new* object should be created. Doing otherwise will result
1880
+ in undefined behavior.
1881
+
1882
+ The default implementation of this method should be sufficient
1883
+         in most cases. It converts any *value* (even one that is
1884
+ deeply nested inside sequences/mappings -- thanks to recursive
1885
+ calls, always passing all keyword arguments from the parent
1886
+ call...) to a form that can be serialized with [`json.dumps`][]
1887
+ (and which is -- hopefully -- short yet still readable, especially
1888
+ regarding instances of such types as: [*exceptions*][BaseException],
1889
+ [*dataclasses*][], typical [*named tuples*][collections.namedtuple],
1890
+ [`enum.Enum`][], [`uuid.UUID`][] as well as the essential types
1891
+ from the [`datetime`][] and [`ipaddress`][] modules). When it
1892
+ comes to preparing any *keys* contained in a *value* which is
1893
+ a mapping (e.g., a [`dict`][]) -- see the
1894
+ [`prepare_submapping_key`][] method...
1895
+
1896
+ !!! tip
1897
+
1898
+ When extending this method in a subclass, you may want to make
1899
+ your custom implementation invoke the default one with some
1900
+ keyword arguments specified. In such a case, you may want to
1901
+ reach for their default values defined by the signature of
1902
+ **[`StructuredLogsFormatter.prepare_value`][]**; if so, refer to
1903
+ the **[`StructuredLogsFormatter.PREPARE_VALUE_DEFAULT_KWARGS`][]**
1904
+ mapping. For example:
1905
+
1906
+ ```python
1907
+ import array, pprint
1908
+ import attrs # <- 3rd party package used just in this example
1909
+ from certlib.log import StructuredLogsFormatter
1910
+
1911
+ _BASE_KWARGS = StructuredLogsFormatter.PREPARE_VALUE_DEFAULT_KWARGS
1912
+
1913
+ class MyEnhancedStructuredLogsFormatter(StructuredLogsFormatter):
1914
+
1915
+ @staticmethod
1916
+ def default_is_dataclass(obj):
1917
+ base_is_dataclass = _BASE_KWARGS['is_dataclass']
1918
+ return base_is_dataclass(obj) or attrs.has(type(obj))
1919
+
1920
+ @staticmethod
1921
+ def default_dataclass_as_dict(obj):
1922
+ base_is_dataclass = _BASE_KWARGS['is_dataclass']
1923
+ base_dataclass_as_dict = _BASE_KWARGS['dataclass_as_dict']
1924
+ return (base_dataclass_as_dict(obj) if base_is_dataclass(obj)
1925
+ else attrs.asdict(obj))
1926
+
1927
+ def prepare_value(
1928
+ self,
1929
+ value,
1930
+ *,
1931
+ exclude_from_seq_types = (
1932
+ *_BASE_KWARGS['exclude_from_seq_types'],
1933
+ memoryview,
1934
+ array.array,
1935
+ ),
1936
+ is_dataclass=default_is_dataclass,
1937
+ dataclass_as_dict=default_dataclass_as_dict,
1938
+ last_resort=pprint.pformat,
1939
+ **kwargs,
1940
+ ):
1941
+ return super().prepare_value(
1942
+ value,
1943
+ exclude_from_seq_types=exclude_from_seq_types,
1944
+ is_dataclass=is_dataclass,
1945
+ dataclass_as_dict=dataclass_as_dict,
1946
+ last_resort=last_resort,
1947
+ **kwargs,
1948
+ )
1949
+ ```
1950
+ """
1951
+ if isinstance(value, to_str_types):
1952
+ return str(value)
1953
+
1954
+ if isinstance(value, pass_thru_types):
1955
+ return value
1956
+
1957
+ kwargs.update(
1958
+ to_str_types=to_str_types,
1959
+ pass_thru_types=pass_thru_types,
1960
+ exclude_from_seq_types=exclude_from_seq_types,
1961
+ is_dataclass=is_dataclass,
1962
+ dataclass_as_dict=dataclass_as_dict,
1963
+ last_resort=last_resort,
1964
+ )
1965
+
1966
+ if isinstance(value, Mapping):
1967
+ # Any *mapping* => convert it to a *dict*.
1968
+ prepare_key = self.prepare_submapping_key
1969
+ prepare_value = self.prepare_value
1970
+ return {
1971
+ prepare_key(key): prepare_value(val, **kwargs)
1972
+ for key, val in value.items()
1973
+ }
1974
+
1975
+ if isinstance(value, type):
1976
+ # A runtime *type* (*class*) => convert it to a *str*...
1977
+ module = getattr(value, '__module__', '<unknown module>')
1978
+ qualname = getattr(value, '__qualname__', '<unknown type>')
1979
+ full_qualified_type_name = (
1980
+ qualname if module == 'builtins'
1981
+ else f'{module}.{qualname}'
1982
+ )
1983
+ return self.prepare_value(full_qualified_type_name, **kwargs)
1984
+
1985
+ if isinstance(value, BaseException):
1986
+ # An *exception instance* => convert it to a *dict* of the
1987
+ # crucial exception's components (type, arguments, etc.).
1988
+ exc_components = {
1989
+ key: val
1990
+ for key, val in (
1991
+ ('exc_type', type(value)),
1992
+ ('args', getattr(value, 'args', None)),
1993
+ ('dict', getattr(value, '__dict__', None)),
1994
+ )
1995
+ if val
1996
+ }
1997
+ return self.prepare_value(exc_components, **kwargs)
1998
+
1999
+ if isinstance(value, Sequence) and not isinstance(value, exclude_from_seq_types):
2000
+ seq = cast(Sequence[object], value)
2001
+
2002
+ if (len(seq) == 3
2003
+ and seq[0] is type(seq[1])
2004
+ and isinstance(seq[1], BaseException)
2005
+ and (seq[2] is None
2006
+ or type(seq[2]).__name__ == 'traceback')):
2007
+ # A sequence of 3 items: exception type, that type's
2008
+ # instance and traceback (or None) => treat it as if
2009
+ # it was just the *exception instance*...
2010
+ return self.prepare_value(seq[1], **kwargs)
2011
+
2012
+ if (isinstance(seq, tuple)
2013
+ and callable(dict_from_this := getattr(seq, '_asdict', None))):
2014
+ # A tuple (presumably, a *named tuple*) with an
2015
+ # `_asdict()` method => try to use that method
2016
+ # to convert this tuple (presumably, to a *dict*).
2017
+ try:
2018
+ d = dict_from_this()
2019
+ except TypeError:
2020
+ pass
2021
+ else:
2022
+ return self.prepare_value(d, **kwargs)
2023
+
2024
+ # Some other sequence => convert it to a *list*.
2025
+ prepare_value = self.prepare_value
2026
+ return [
2027
+ prepare_value(val, **kwargs)
2028
+ for val in seq
2029
+ ]
2030
+
2031
+ if is_dataclass(value):
2032
+ # A *dataclass instance* (we're sure it's not a type, see the
2033
+ # type-dedicated check earlier...) => convert it to a *dict*.
2034
+ return self.prepare_value(dataclass_as_dict(value), **kwargs)
2035
+
2036
+ # Any other object...
2037
+ return last_resort(value)
2038
+
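Review note: a condensed, standalone rendition of the dispatch order used by `prepare_value` above, meant for experimenting outside the library. The function `prepare` and the constants are hypothetical names, several branches are simplified (no keyword arguments, no `(type, exception, traceback)` triple handling), and only a subset of the stringified types is listed.

```python
import dataclasses
import datetime as dt
import decimal
import json
from collections import namedtuple
from collections.abc import Mapping, Sequence

TO_STR_TYPES = (dt.date, dt.time, decimal.Decimal)  # dt.datetime is a dt.date
PASS_THRU_TYPES = (str, int, float, bool, type(None))

def prepare(value):
    if isinstance(value, TO_STR_TYPES):
        return str(value)
    if isinstance(value, PASS_THRU_TYPES):
        return value
    if isinstance(value, Mapping):
        return {str(k)[:200]: prepare(v) for k, v in value.items()}
    if isinstance(value, type):
        module, qualname = value.__module__, value.__qualname__
        return qualname if module == 'builtins' else f'{module}.{qualname}'
    if isinstance(value, BaseException):
        parts = {
            'exc_type': type(value),
            'args': value.args,
            'dict': getattr(value, '__dict__', None),
        }
        return prepare({k: v for k, v in parts.items() if v})
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes, bytearray)):
        as_dict = getattr(value, '_asdict', None)
        if isinstance(value, tuple) and callable(as_dict):
            return prepare(as_dict())       # presumably a named tuple
        return [prepare(v) for v in value]
    if dataclasses.is_dataclass(value):
        return prepare(dataclasses.asdict(value))
    return repr(value)                      # last resort

Point = namedtuple('Point', 'x y')

@dataclasses.dataclass
class Job:
    name: str
    started: dt.date

prepared = prepare({
    'err': ValueError('bad', 3),
    'pt': Point(1, 2),
    'job': Job('sync', dt.date(2026, 3, 15)),
    'cls': int,
    7: b'raw',
})
serialized = json.dumps(prepared)
```

The order of the checks matters: exceptions and named tuples are tested before the generic sequence and `repr` fallbacks, so they become small dicts rather than opaque strings.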
2039
+ # *Note*: the `type: ignore[...]` comment below prevents *mypy* from
2040
+ # rejecting `Final` nested in `ClassVar` (which is OK in Python 3.13
2041
+ # and newer; and we use `from __future__ import annotations` anyway,
2042
+ # so at runtime we are safe regardless of Python version).
2043
+ PREPARE_VALUE_DEFAULT_KWARGS: ClassVar[Final[ # type: ignore[valid-type]
2044
+ Mapping[str, Any]
2045
+ ]] = types.MappingProxyType({
2046
+ p.name: p.default
2047
+ for p in signature(prepare_value).parameters.values()
2048
+ if p.kind is Parameter.KEYWORD_ONLY
2049
+ })
2050
+ """
2051
+ Default values of all [`StructuredLogsFormatter.prepare_value`][]'s
2052
+ *keyword-only* parameters (this mapping may come in handy when you
2053
+ extend that method in a subclass...).
2054
+ """
2055
+
2056
+ def prepare_submapping_key(self, key: object) -> str:
2057
+ """
2058
+ A hook method: extend/override it in a subclass to modify/redefine
2059
+ how to prepare, before the actual data serialization, every *key*
2060
+ in every mapping (e.g., in a [`dict`][]) being a *value* inside an
2061
+ *output data* dict (possibly deeply nested within it).
2062
+
2063
+ The default implementation of this method should be sufficient in
2064
+ most cases. It applies [`str`][] to the given key (converting it
2065
+ to a string if it was not one already) and truncates the result to
2066
+ a maximum length of 200 characters (if longer).
2067
+
2068
+ !!! note
2069
+
2070
+ **[`prepare_value`][]** is what invokes this method -- so (let
2071
+ us stress that!) this method is *not* applied to top-level
2072
+ keys in the *output data* dict, but *is* applied to *each key*
2073
+ in every mapping that **`prepare_value`** takes as an input
2074
+ *value* (also, in every dict created by **`prepare_value`**
2075
+ as a result of converting an *exception*, *named tuple* or
2076
+ *dataclass* instance...). All of this is true for the default
2077
+ implementation of **`prepare_value`**. It is recommended
2078
+ (yet not enforced) that any custom implementations of the
2079
+ **`prepare_value`** method make use of *this* method in a
2080
+ similar way.
2081
+ """
2082
+ key_str = str(key)
2083
+ if len(key_str) > self._DESIRED_MAX_KEY_LENGTH:
2084
+ key_str = key_str[:self._DESIRED_MAX_KEY_LENGTH]
2085
+ return key_str
2086
+
2087
+ def serialize_prepared_output_data(self, output_data: dict[str, OutputValue]) -> str:
2088
+ """
2089
+ A hook method: extend/override it in a subclass to modify/redefine
2090
+ the *output data serialization* procedure.
2091
+
2092
+ !!! warning "Subclass behavior restriction"
2093
+
2094
+ This method should *not* mutate the given *output data* dict
2095
+ or anything inside it (regardless of the level of nesting, if
2096
+ any nested data is present). If some data needs to be modified,
2097
+ a completely *new* object should be created. Doing otherwise
2098
+ will result in undefined behavior.
2099
+
2100
+ The default implementation of this method should be sufficient in
2101
+ most cases. It just applies the [`serializer`][] callable to the
2102
+ given *output data* dict, and returns the result.
2103
+
2104
+ !!! info "Related interfaces"
2105
+
2106
+ By default, the **[`serializer`][]** attribute is set to the
2107
+ standard [`json.dumps`][] function, but this can be changed
2108
+ by specifying the **`serializer`** argument when invoking the
2109
+ **[`StructuredLogsFormatter`][]** constructor.
2110
+ """
2111
+ return self.serializer(output_data)
2112
+
2113
+ #
2114
+ # Internals (should not be used or extended/overridden outside this module!)
2115
+
2116
+ _DESIRED_MAX_KEY_LENGTH: Final[int] = 200
2117
+ _COMMON_PART_OF_PER_FORMATTER_AUTO_MADE_RECORD_ATTR_PREFIX: Final[str] = '_auto-made-for#'
2118
+
2119
+ _auto_made_record_attr_prefix_creation_lock: Final[threading.Lock] = threading.Lock()
2120
+ _auto_made_record_attr_prefix_creation_count: Final[Iterator[int]] = itertools.count(start=1)
2121
+
2122
+ def _resolve_init_arguments(
2123
+ self,
2124
+ raw_positional_args: Sequence[Any],
2125
+ /,
2126
+
2127
+ # (Compare to the signature of `logging.Formatter.__init__()`...)
2128
+ fmt: Mapping[str, Any] | KwargsMappingAsLiteralEvaluableString | None = None,
2129
+ datefmt: None = None,
2130
+ style: Literal['%'] = '%',
2131
+ validate: Literal[True] = True,
2132
+
2133
+ *excessive_positional_args: object,
2134
+ **meaningful_arguments: Any,
2135
+ ) -> dict[str, Any]:
2136
+ if excessive_positional_args:
2137
+ raise TypeError(
2138
+ f'{type(self).__init__.__qualname__}() '
2139
+ f'got excessive positional argument(s): '
2140
+                 f'{", ".join(map(ascii, excessive_positional_args))}'
2141
+ )
2142
+ if fmt is not None:
2143
+ if not raw_positional_args:
2144
+ raise TypeError(
2145
+ f'for {type(self).__init__.__qualname__}(), '
2146
+ f'argument `fmt` is not customizable'
2147
+ )
2148
+ assert fmt is raw_positional_args[0]
2149
+ first_arg = raw_positional_args[0]
2150
+ if isinstance(first_arg, str):
2151
+ try:
2152
+ first_arg = ast.literal_eval(first_arg)
2153
+ except Exception as exc:
2154
+ raise ValueError(
2155
+ f'an error occurred when trying to evaluate (as '
2156
+ f'a Python expression) the string ({first_arg!a}) '
2157
+ f'passed as the first positional argument to '
2158
+ f'{type(self).__init__.__qualname__}() '
2159
+ f'({type(exc).__qualname__}: {exc})'
2160
+ ) from exc
2161
+ if not isinstance(first_arg, Mapping):
2162
+ raise TypeError(
2163
+ f'for {type(self).__init__.__qualname__}(), the first '
2164
+ f'positional argument, if specified and not None, is '
2165
+ f'expected to be a mapping (or an `ast.literal_eval()`'
2166
+ f'-evaluable string representing a mapping), as an '
2167
+ f'alternative means of providing keyword arguments '
2168
+ f'(in contexts when passing real keyword arguments '
2169
+ f'is not possible); got: {first_arg!a} (not a mapping)'
2170
+ )
2171
+ if meaningful_arguments:
2172
+ listing = ', '.join(map(ascii, meaningful_arguments))
2173
+ raise TypeError(
2174
+ f'for {type(self).__init__.__qualname__}(), when '
2175
+ f'you pass a mapping (or an `ast.literal_eval()`-'
2176
+ f'evaluable string representing a mapping) as the '
2177
+ f'first positional argument, as an alternative '
2178
+ f'means of providing keyword arguments, you should '
2179
+ f'not pass real keyword arguments (whereas you did '
2180
+ f'pass some: {listing})'
2181
+ )
2182
+ meaningful_arguments = dict(first_arg)
2183
+ if datefmt is not None:
2184
+ raise TypeError(
2185
+ f'for {type(self).__init__.__qualname__}(), '
2186
+ f'argument `datefmt` is not customizable'
2187
+ )
2188
+ if style != '%':
2189
+ raise TypeError(
2190
+ f'for {type(self).__init__.__qualname__}(), '
2191
+ f'argument `style` is not customizable'
2192
+ )
2193
+ if validate is not True: # noqa
2194
+ raise TypeError(
2195
+ f'for {type(self).__init__.__qualname__}(), '
2196
+ f'argument `validate` is not customizable'
2197
+ )
2198
+ return meaningful_arguments
2199
+
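Review note: the first-positional-argument fallback handled above relies on `ast.literal_eval`, which accepts only Python literals. The sketch below isolates that step; `kwargs_from_first_arg` is a hypothetical helper, and the `'defaults'` payload is just sample data.

```python
import ast
from collections.abc import Mapping

def kwargs_from_first_arg(first_arg):
    """Hypothetical helper: turn a mapping, or its literal string, into kwargs."""
    if isinstance(first_arg, str):
        # Only literals are evaluated; names and calls are rejected.
        first_arg = ast.literal_eval(first_arg)
    if not isinstance(first_arg, Mapping):
        raise TypeError(f'expected a mapping, got: {first_arg!a}')
    return dict(first_arg)

parsed = kwargs_from_first_arg("{'defaults': {'system': 'my-app'}}")
```

Since `ast.literal_eval` never calls functions or looks up names, a string taken from a configuration file cannot run arbitrary code through this path.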
2200
+ def _get_unfiltered_defaults(
2201
+ self,
2202
+ given_defaults: Mapping[str, object],
2203
+ ) -> Mapping[str, OutputValue]:
2204
+ raw_defaults = dict(self.make_base_defaults())
2205
+ raw_defaults.update(given_defaults)
2206
+ return dict(sorted(
2207
+ (self._validate_output_key(key), self.prepare_value(value))
2208
+ for key, value in raw_defaults.items()
2209
+ ))
2210
+
2211
+ def _get_unprefixed_auto_makers(
2212
+ self,
2213
+ given_auto_makers: Mapping[str, ValueProvider[object] | DottedPath],
2214
+ ) -> Mapping[str, ValueProvider[object]]:
2215
+ raw_auto_makers = dict(self.make_base_auto_makers())
2216
+ raw_auto_makers.update(given_auto_makers)
2217
+ return dict(sorted(
2218
+ self._normalize_and_validate_auto_maker_item(key, auto_maker)
2219
+ for key, auto_maker in raw_auto_makers.items()
2220
+ ))
2221
+
2222
+ def _normalize_and_validate_auto_maker_item(
2223
+ self,
2224
+ key: str,
2225
+ auto_maker: ValueProvider[T] | DottedPath,
2226
+ ) -> tuple[str, ValueProvider[T]]:
2227
+ key = self._validate_output_key(key)
2228
+ if isinstance(auto_maker, str):
2229
+ resolved: ValueProvider[T] = _resolve_dotted_path(auto_maker)
2230
+ auto_maker = resolved
2231
+ if not callable(auto_maker):
2232
+ raise TypeError(
2233
+ f'the {key!a} auto-maker does not appear '
2234
+ f'to be a callable object: {auto_maker!a}'
2235
+ )
2236
+ return key, auto_maker
2237
+
2238
+ def _check_output_keys_required_in_defaults_or_auto_makers(
2239
+ self,
2240
+ unfiltered_defaults: Mapping[str, OutputValue],
2241
+ unprefixed_auto_makers: Mapping[str, ValueProvider[object]],
2242
+ ) -> None:
2243
+ provided_keys = unfiltered_defaults.keys() | unprefixed_auto_makers.keys()
2244
+ required_keys = self.get_output_keys_required_in_defaults_or_auto_makers()
2245
+ if missing_keys := (required_keys - provided_keys):
2246
+ missing_keys_listing = ', '.join(map(ascii, sorted(missing_keys)))
2247
+ raise KeyError(
2248
+ f'missing default values or auto-makers '
2249
+ f'for keys: {missing_keys_listing}'
2250
+ )
2251
+
2252
+ def _get_actual_defaults(
2253
+ self,
2254
+ unfiltered_defaults: Mapping[str, OutputValue],
2255
+ ) -> Mapping[str, OutputValue]:
2256
+ return {
2257
+ key: value_prepared
2258
+ for key, value_prepared in unfiltered_defaults.items()
2259
+ # If `value_prepared` is *non-numeric* and, at the same time,
2260
+ # is *falsy* (i.e., is an object which is considered *false*
2261
+ # in a boolean context) => we skip it as a *void* value (that
2262
+ # is, a value assumed to carry *no sufficiently significant*
2263
+ # information).
2264
+ if value_prepared or value_prepared == 0
2265
+ }
2266
+
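Review note: the filter condition `value_prepared or value_prepared == 0` above is what defines a *void* value. A short sketch of which values it drops and which it keeps (`is_void` is a hypothetical name for the negated condition):

```python
def is_void(value):
    """True for falsy values that are not equal to 0."""
    return not (value or value == 0)

candidates = [None, '', [], {}, (), False, 0, 0.0, 'x', [0]]
void = [v for v in candidates if is_void(v)]
kept = [v for v in candidates if not is_void(v)]
```

`False`, `0` and `0.0` all compare equal to `0`, so they survive the filter, while empty containers, empty strings and `None` do not.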
2267
+ def _get_auto_made_record_attr_prefix(self) -> str:
2268
+ with self._auto_made_record_attr_prefix_creation_lock:
2269
+ unique_number = next(self._auto_made_record_attr_prefix_creation_count)
2270
+ return (
2271
+ f'{self._COMMON_PART_OF_PER_FORMATTER_AUTO_MADE_RECORD_ATTR_PREFIX}'
2272
+ f'{unique_number:02}'
2273
+ f':'
2274
+ )
2275
+
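Review note: the per-instance prefix above combines a shared counter with a lock. The module-level sketch below follows the same pattern (`new_prefix` and the underscore-prefixed globals are hypothetical names) and shows how a prefixed record attribute name maps back to its output key.

```python
import itertools
import threading

_COMMON_PART = '_auto-made-for#'
_lock = threading.Lock()
_counter = itertools.count(start=1)

def new_prefix():
    """Return a prefix that is unique within this interpreter run."""
    with _lock:
        number = next(_counter)
    return f'{_COMMON_PART}{number:02}:'

first, second = new_prefix(), new_prefix()

# A record attribute created for the first formatter...
record_attr = first + 'py_ver'
# ...maps back to its output key by stripping that formatter's prefix.
output_key = record_attr.removeprefix(first)
# The second formatter does not recognize the attribute as its own.
owned_by_second = record_attr.startswith(second)
```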
2276
+ def _get_actual_auto_makers(
2277
+ self,
2278
+ auto_made_record_attr_prefix: str,
2279
+ unprefixed_auto_makers: Mapping[str, ValueProvider[object]],
2280
+ ) -> Mapping[str, ValueProvider[object]]:
2281
+ return {
2282
+ auto_made_record_attr_prefix + key: auto_maker
2283
+ for key, auto_maker in unprefixed_auto_makers.items()
2284
+ }
2285
+
2286
+ def _get_record_attr_to_output_key(self) -> Mapping[str, str | None]:
2287
+ base_mapping = self.make_base_record_attr_to_output_key()
2288
+ assert all(
2289
+ rec_attr.startswith(self.auto_made_record_attr_prefix)
2290
+ for rec_attr in self.auto_makers.keys()
2291
+ )
2292
+ return dict(
2293
+ # Note that here any `rec_attr` duplication
2294
+ # (hardly possible!) would cause TypeError.
2295
+ **{
2296
+ rec_attr: (
2297
+ self._validate_output_key(key) if key is not None
2298
+ else None
2299
+ )
2300
+ for rec_attr, key in base_mapping.items()
2301
+ },
2302
+ **{
2303
+ rec_attr: rec_attr.removeprefix(self.auto_made_record_attr_prefix)
2304
+ for rec_attr in self.auto_makers.keys()
2305
+ },
2306
+ )
2307
+
2308
+ def _validate_output_key(self, key: str) -> str:
2309
+ if not isinstance(key, str):
2310
+ raise TypeError(f'{key=!a} is not a str')
2311
+ if len(key) > self._DESIRED_MAX_KEY_LENGTH:
2312
+ raise ValueError(
2313
+ f'{key=!a} is longer than '
2314
+ f'{self._DESIRED_MAX_KEY_LENGTH} characters'
2315
+ )
2316
+ return str(key)
2317
+
2318
+ def _get_actual_serializer(
2319
+ self,
2320
+ given_serializer: OutputSerializer | DottedPath,
2321
+ ) -> OutputSerializer:
2322
+ serializer: OutputSerializer = (
2323
+ _resolve_dotted_path(given_serializer)
2324
+ if isinstance(given_serializer, str)
2325
+ else given_serializer
2326
+ )
2327
+ if not callable(serializer):
2328
+ raise TypeError(
2329
+ f'{serializer=!a} does not appear '
2330
+ f'to be a callable object'
2331
+ )
2332
+ return serializer
2333
+
2334
+ def _get_record_args_repr(self, args: object) -> str:
2335
+ no_seq = (str, bytes, bytearray)
2336
+ return (
2337
+ ', '.join(map(ascii, args))
2338
+ if isinstance(args, Sequence) and not isinstance(args, no_seq)
2339
+ else ascii(args) # In particular, it may be a dict.
2340
+ )
2341
+
2342
+ def _extract_output_from_xm(
2343
+ self,
2344
+ record: logging.LogRecord,
2345
+ xm_instance: ExtendedMessage,
2346
+ handle_output_item: Callable[[object, object], bool],
2347
+ ) -> None:
2348
+ attr_to_key = self.record_attr_to_output_key
2349
+
2350
+ # Extract from the given `ExtendedMessage` instance and handle...
2351
+
2352
+ # * ...its components conveying the information equivalent
2353
+ # to the standard `msg` and `args` log record attributes:
2354
+ msg_key = attr_to_key.get('msg', 'msg')
2355
+ if msg_key is not None:
2356
+ msg_value = xm_instance.get_record_msg_and_args_equivalent_info(
2357
+ pattern_result_key='pattern',
2358
+ args_result_key=attr_to_key.get('args', 'args'),
2359
+ )
2360
+ handle_output_item(msg_key, msg_value)
2361
+
2362
+ # * ...its `exc_info` attribute (if worth including;
2363
+ # and, optionally, an *exception text* derived from
2364
+ # that `exc_info` by using `self.formatException()`):
2365
+ ei_key = attr_to_key.get('exc_info', 'exc_info')
2366
+ etx_key = attr_to_key.get('exc_text', 'exc_text')
2367
+ if ((ei_key is not None or etx_key is not None)
2368
+ and (xm_ei := xm_instance.exc_info)
2369
+ ):
2370
+ if isinstance(xm_ei, BaseException):
2371
+ xm_ei = (type(xm_ei), xm_ei, xm_ei.__traceback__)
2372
+ if self._is_xm_exc_info_significant(xm_ei, record.exc_info):
2373
+ if xm_ei == (None, None, None):
2374
+ xm_ei = etx_key = None
2375
+ if ei_key is not None:
2376
+ ei_added = handle_output_item(ei_key, xm_ei)
2377
+ if not ei_added:
2378
+ etx_key = None
2379
+ if etx_key is not None and isinstance(xm_ei, tuple):
2380
+ etx_value = self.formatException(xm_ei) # noqa
2381
+ handle_output_item(etx_key, etx_value)
2382
+
2383
+ # * ...its `stack_info` attribute (if worth including):
2384
+ si_key = attr_to_key.get('stack_info', 'stack_info')
2385
+ if (si_key is not None
2386
+ and (xm_si := xm_instance.stack_info)
2387
+ and self._is_xm_stack_info_significant(xm_si, record.stack_info)
2388
+ ):
2389
+ handle_output_item(si_key, xm_si)
2390
+
2391
+ # * ...and any *extra data* (stored in its `data` attribute):
2392
+ for key, value in xm_instance.data.items():
2393
+ handle_output_item(key, value)
2394
+
2395
+ @staticmethod
2396
+ def _is_xm_exc_info_significant(xm_ei: Any, rec_ei: Any) -> bool:
2397
+ return (not rec_ei) or (
2398
+ # If `record.exc_info` is *not* a *falsy* object, then the
2399
+ # `ExtendedMessage` instance's `exc_info` attribute -- to be
2400
+ # considered *significant* -- needs to be an *exc info* tuple
2401
+ # (or an exception which we converted to such a tuple) that
2402
+ # is *different* from `record.exc_info`. So, in particular, a
2403
+ # flag value (such as True) is considered *insignificant* in
2404
+ # such a case.
2405
+ xm_ei is not rec_ei # (<- Fast check first)
2406
+ and isinstance(xm_ei, tuple)
2407
+ and xm_ei != rec_ei
2408
+ )
2409
+
2410
+ @staticmethod
2411
+ def _is_xm_stack_info_significant(xm_si: bool | str, rec_si: str | None) -> bool:
2412
+ return (not rec_si) or (
2413
+ # If `record.stack_info` is *not* a *falsy* object, then
2414
+ # the `ExtendedMessage` instance's `stack_info` attribute,
2415
+ # -- to be considered *significant* -- needs to be a `str`
2416
+ # *different* from `record.stack_info`. So a flag value
2417
+ # (True) is considered *insignificant* in such a case.
2418
+ xm_si is not rec_si # (<- Fast check first)
2419
+ and isinstance(xm_si, str)
2420
+ and xm_si != rec_si
2421
+ )
2422
+
2423
+ def _extract_output_from_record(
2424
+ self,
2425
+ record: logging.LogRecord,
2426
+ xm_instance: ExtendedMessage | None,
2427
+ handle_output_item: Callable[[object, object], bool],
2428
+ ) -> None:
2429
+ common_auto_prefix = self._COMMON_PART_OF_PER_FORMATTER_AUTO_MADE_RECORD_ATTR_PREFIX
2430
+ attr_to_key = self.record_attr_to_output_key
2431
+ exc_info_3none = (record.exc_info == (None, None, None))
2432
+
2433
+ for rec_attr, value in record.__dict__.items():
2434
+ if not isinstance(rec_attr, str):
2435
+ # (Rather unlikely, but just in case...)
2436
+ continue
2437
+
2438
+ if rec_attr.startswith(common_auto_prefix):
2439
+ # The encountered record attribute has been created by an
2440
+ # auto-maker registered by *some* `StructuredLogsFormatter`.
2441
+ # Note that, below, `key` will be set to None -- *unless*
2442
+ # that auto-maker has been registered by *this* instance
2443
+ # of `StructuredLogsFormatter`, i.e., by `self` (remember
2444
+ # that `auto_made_record_attr_prefix` is *different* for
2445
+ # each `StructuredLogsFormatter` instance).
2446
+ key = attr_to_key.get(rec_attr)
2447
+ elif (value is xm_instance is not None) and rec_attr == 'msg':
2448
+ # Already handled by `_extract_output_from_xm()`.
2449
+ continue
2450
+ else:
2451
+ if (exc_info_3none and rec_attr in ('exc_info', 'exc_text')
2452
+ or (rec_attr == 'exc_text' and not record.exc_info) # [sic!]
2453
+ ):
2454
+ value = None
2455
+ key = attr_to_key.get(rec_attr, rec_attr)
2456
+
2457
+ if key is not None:
2458
+ handle_output_item(key, value)
2459
+
2460
+ @staticmethod
2461
+ def _handle_output_item(
2462
+ # Shared (output-data-dict-wide) arguments:
2463
+ desired_max_key_length: int,
2464
+ prepare_value: Callable[[object], OutputValue],
2465
+ actual_defaults: dict[str, OutputValue],
2466
+ output_data: dict[str, OutputValue],
2467
+
2468
+ # Individual (per-output-data-item) arguments:
2469
+ key: object,
2470
+ value: object,
2471
+ ) -> bool:
2472
+
2473
+ if not isinstance(key, str):
2474
+ return False
2475
+
2476
+ if len(key) > desired_max_key_length:
2477
+ # Truncate the key (it's a rare case, hopefully).
2478
+ key = key[:desired_max_key_length]
2479
+
2480
+ value_prepared = prepare_value(value)
2481
+ if (not value_prepared) and value_prepared != 0:
2482
+ # If `value_prepared` is *non-numeric* and, at the same time,
2483
+ # is *falsy* (i.e., is an object which is considered *false*
2484
+ # in a boolean context) => we skip it as a *void* value (that
2485
+ # is, a value assumed to carry *no sufficiently significant*
2486
+ # information); then, however, we also prevent the respective
2487
+ # *default value* (if any) from being set.
2488
+ actual_defaults.pop(key, None)
2489
+ return False
2490
+
2491
+ # Finally, set the prepared item.
2492
+ actually_set_value = output_data.setdefault(key, value_prepared)
2493
+ if actually_set_value is value_prepared:
2494
+ return True
2495
+
2496
+ # Wait! Key deduplication may be needed (it's a rare case, hopefully).
2497
+ while actually_set_value != value_prepared:
2498
+ # Note that, in this case, the key length may
2499
+ # become longer than `desired_max_key_length`.
2500
+ key = f'{key}_'
2501
+ actually_set_value = output_data.setdefault(key, value_prepared)
2502
+
2503
+ # (Comparing also identities -- for cases of such
2504
+ # an object that never compares equal to itself.)
2505
+ if actually_set_value is value_prepared:
2506
+ break
2507
+
2508
+ return True
2509
+
2510
+
2511
+ class ExtendedMessage:
2512
+
2513
+ """
2514
+ A tool thanks to which you can:
2515
+
2516
+ * conveniently emit structured log entries, especially if
2517
+ [`StructuredLogsFormatter`][] is in use;
2518
+
2519
+ * use the modern `{}`-based style of log message formatting, or --
2520
+ if you just need to log pure data -- simply omit passing the text
2521
+ message pattern (regardless of what formatter is in use);
2522
+
2523
+ * defer the creation of some values (if it is costly) until the log
2524
+ entry is to be actually emitted (regardless of what formatter is
2525
+ in use).
2526
+
2527
+ There is a convenience alias of this class: **[`xm`][]**. As being
2528
+ very short, it is simply much more ergonomic than the actual class
2529
+ name -- given that this tool is intended to be used every time you
2530
+ log something. For example:
2531
+
2532
+ ```python
2533
+ import datetime, hashlib, ipaddress, logging, sys
2534
+ from certlib.log import xm
2535
+
2536
+ logging.info(xm("Hello {}!", sys.platform))
2537
+ logging.info(xm("Maxsize is {maxsize:x}", maxsize=sys.maxsize))
2538
+ logging.warning(xm(
2539
+ connection_count=42,
2540
+ client_ip=ipaddress.IPv4Address("192.168.0.121"),
2541
+ local_time=datetime.datetime.now(),
2542
+ # Value creation deferred until the log entry is to be emitted:
2543
+ payload_hash=lambda: hashlib.sha256(b'...payload...').hexdigest(),
2544
+ ))
2545
+ some_data_dict = globals()
2546
+ logging.debug(xm(some_data_dict))
2547
+ ```
2548
+
2549
+ !!! info "See also"
2550
+
2551
+ For extra information about **`ExtendedMessage`**,
2552
+ including a bunch of usage examples, see the **[Tool:
2553
+ `xm`](guide.md#certlib.log--tool-xm)** section of the
2554
+ *User's Guide*.
2555
+
2556
+ **Constructor arguments** (all *optional*):
2557
+
2558
+ * _**first positional argument**_ (default: `""`):
2559
+ the text message pattern. Expected to be a string, or any *truthy*
2560
+ object that could be converted to a string by applying [`str`][]
2561
+ to it. The pattern may contain [`{}`-formatting-style *replacement
2562
+ fields*](https://docs.python.org/3/library/string.html#format-string-syntax)
2563
+ (perhaps with a `'!'`-separated *conversion* marker, and/or a
2564
+ `':'`-separated *format spec*). The given object is assigned to
2565
+ the [`pattern`][] attribute intact, unless it is *falsy* -- then
2566
+ it is ignored, and just `""` (empty string) is assigned to that
2567
+ attribute.
2568
+
2569
+ * _**extra positional arguments**_ (if any):
2570
+ positional *args* to format the text message (they need to match
2571
+ positional/numbered *replacement fields* in the text message
2572
+ pattern -- see the *first positional argument* described above).
2573
+ A [`tuple`][] of these arguments is assigned to the [`args`][]
2574
+ attribute.
2575
+
2576
+ * **`exc_info`** (*keyword-only*; default: [`None`][]):
2577
+ its usage and related behavior are nearly identical to those of
2578
+ the same-named argument to `logging.Logger`'s methods (see [the
2579
+ relevant fragment](https://docs.python.org/3/library/logging.html#logging.Logger.debug)
2580
+ of the documentation for the `logging` module). This argument is
2581
+ assigned to the [`exc_info`][] attribute.
2582
+
2583
+ * **`stack_info`** (*keyword-only*; default: [`False`][]):
2584
+ its usage and related behavior are nearly identical to those of
2585
+ the same-named argument to `logging.Logger`'s methods (see [the
2586
+ relevant fragment](https://docs.python.org/3/library/logging.html#logging.Logger.debug)
2587
+ of the documentation for the `logging` module). This argument is
2588
+ assigned to the [`stack_info`][] attribute.
2589
+
2590
+ * **`stacklevel`** (*keyword-only*; default: `1`):
2591
+ its usage and related behavior are nearly identical to those of
2592
+ the same-named argument to `logging.Logger`'s methods (see [the
2593
+ relevant fragment](https://docs.python.org/3/library/logging.html#logging.Logger.debug)
2594
+ of the documentation for the `logging` module). This argument is
2595
+ assigned to the [`stacklevel`][] attribute.
2596
+
2597
+ * _**extra keyword arguments**_ (if any):
2598
+ all of them become *extra data* items -- to be included in the
2599
+ *output data* dict if [`StructuredLogsFormatter`][] is used, or
2600
+ to be appended to the text message (in a form resembling the
2601
+ keyword arguments syntax) if some other formatter is in use. A
2602
+ [`dict`][] of those *extra data* items is always stored as the
2603
+ [`data`][] attribute. Moreover, any items whose names match some
2604
+ *named replacement fields* in the text message pattern (specified
2605
+ as the *first positional argument*) will take part in formatting
2606
+ the actual text message (regardless of what formatter is in use).
2607
+
2608
+ **Alternatively**, a mapping (e.g., a [`dict`][]) of *extra data*
2609
+ items can be passed to the [constructor][ExtendedMessage] as the
2610
+ *first positional argument*. Then any *extra positional or keyword
2611
+ arguments* are forbidden -- except **`exc_info`**, **`stack_info`**
2612
+ and **`stacklevel`**. The effect is the same as passing each of
2613
+ that mapping's items as an *extra keyword argument* (without passing
2614
+ a message pattern as the *first positional argument*). The mapping,
2615
+ after conversion to a `dict`, is assigned to the [`data`][] attribute.
2616
+
2617
+ !!! warning "Interface restriction"
2618
+
2619
+ When it comes to the arguments **`exc_info`**, **`stack_info`**
2620
+ and **`stacklevel`**, they should *not* be included in that
2621
+ mapping; each of them, if to be specified, should *only* be
2622
+ specified as a real keyword argument (putting any of them in
2623
+ that mapping will result in undefined behavior).
2624
+
2625
+ !!! warning "Interface restriction"
2626
+
2627
+ If you pass a **`stack_info`** and/or **`stacklevel`** argument
2628
+ to the **[`ExtendedMessage`][]** (**[`xm`][]**) constructor, you
2629
+ should *not* pass **`stack_info`** or **`stacklevel`** to the
2630
+ related [logger method call](https://docs.python.org/3/library/logging.html#logging.Logger.debug)
2631
+ (doing so will result in undefined behavior).
2632
+
2633
+ ```python
2634
+ # All WRONG (!!!):
2635
+ logger.info(xm('Foo', stack_info=True), stack_info=True)
2636
+ logger.info(xm('Foo', stack_info=True), stacklevel=2)
2637
+ logger.info(xm('Foo', stacklevel=2), stack_info=True)
2638
+ logger.info(xm('Foo', stacklevel=2), stacklevel=2)
2639
+ logger.info(xm('Foo', stack_info=True), stack_info=True, stacklevel=2)
2640
+ logger.info(xm('Foo', stack_info=True, stacklevel=2), stack_info=True)
2641
+ logger.info(xm('Foo', stack_info=True, stacklevel=2), stack_info=True, stacklevel=2)
2642
+ # (and similar...)
2643
+ ```
2644
+
2645
+ ```python
2646
+ # OK:
2647
+ logger.info(xm('Foo', stack_info=True))
2648
+ logger.info(xm('Foo', stacklevel=2))
2649
+ logger.info(xm('Foo', stack_info=True, stacklevel=2))
2650
+
2651
+ # Also OK:
2652
+ logger.info(xm('Foo'), stack_info=True)
2653
+ logger.info(xm('Foo'), stacklevel=2)
2654
+ logger.info(xm('Foo'), stack_info=True, stacklevel=2)
2655
+ ```
2656
+
2657
+ Whenever a formatter (of any type) processes a log record whose `msg`
2658
+ attribute (which typically is just what has been passed to the logger
2659
+ method call as the first positional argument) is an `ExtendedMessage`
2660
+ (`xm`) instance, that instance's method [`get_message_value`][] is
2661
+ invoked: either *directly* -- by the [`StructuredLogsFormatter`][]'s
2662
+ machinery; or *indirectly*, via [`__str__`][] -- by the standard
2663
+ machinery that other formatter types use.
2664
+
2665
+ !!! warning "Interface restriction"
2666
+
2667
+ When you pass an instance of **`ExtendedMessage`**
2668
+ as the first positional argument to a [logger method
2669
+ call](https://docs.python.org/3/library/logging.html#logging.Logger.debug),
2670
+ you should *not* pass to that call any other *positional*
2671
+ arguments (doing so will result in undefined behavior).
2672
+
2673
+ ```python
2674
+ # WRONG (!!!):
2675
+ logger.info(xm('{}, {} and {}'), 'Athos', 'Porthos', 'Aramis')
2676
+ ```
2677
+
2678
+ ```python
2679
+ # OK:
2680
+ logger.info(xm('{}, {} and {}', 'Athos', 'Porthos', 'Aramis'))
2681
+ ```
2682
+
2683
+ !!! note
2684
+
2685
+ If a text message pattern (*not* a mapping) is passed to the
2686
+ **[`ExtendedMessage`][]** (**[`xm`][]**) constructor as the
2687
+ *first positional argument* _**and**_ no *extra positional
2688
+ or keyword arguments* are provided (except **`exc_info`**,
2689
+ **`stack_info`** and **`stacklevel`**) -- that is, if both
2690
+ of the attributes **[`args`][]** and **[`data`][]** of the
2691
+ resultant **`ExtendedMessage`** instance are *empty* -- then
2692
+ the **[`get_message_value`][]** method will *not* attempt to
2693
+ format the text message with [`str.format`][]; instead, it
2694
+ will treat the pattern as an *already formatted* text message.
2695
+
2696
+ ```python
2697
+ logger.info(xm('answer: {}', 42)) # message will be 'answer: 42'
2698
+ logger.info(xm('answer: {}')) # message will be 'answer: {}' [sic!]
2699
+ ```
2700
+
2701
+ This mimics how the standard [`logging`][] machinery handles
2702
+ a log record whose `args` attribute is empty (when only a
2703
+ text message pattern is specified, without any values to be
2704
+ interpolated).
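
The standard-library behavior referred to above can be observed directly. The following is a small sketch that uses only the `logging` module (not `certlib.log`); the variable names are illustrative.

```python
import logging

# With empty `args`, `LogRecord.getMessage()` returns the pattern as is.
plain = logging.makeLogRecord({"msg": "answer: %s", "args": ()})
# With non-empty `args`, the `%`-style interpolation is applied.
filled = logging.makeLogRecord({"msg": "answer: %s", "args": (42,)})

plain_message = plain.getMessage()
filled_message = filled.getMessage()

print(plain_message)
print(filled_message)
```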
2705
+
2706
+ If any *extra positional or keyword arguments* to the [constructor][ExtendedMessage]
2707
+ -- except **`exc_info`**, **`stack_info`** and **`stacklevel`** --
2708
+ or any values included in the *extra data* mapping (if passed to the
2709
+ constructor as the *first positional argument*) are *function* or
2710
+ *method* objects (precisely: instances of any types included in
2711
+ [`ExtendedMessage.recognized_callable_arg_or_data_item_types`][]),
2712
+ then -- as part of processing the `ExtendedMessage` instance by a
2713
+ formatter -- each of those functions/methods will be *called* to
2714
+ obtain the *actual value*, which will then replace (respectively,
2715
+ in [`args`][] or [`data`][]) the called function/method.
2716
+
2717
+ Let us be precise: all those calls-and-replacements will be
2718
+ triggered when *any* of the following methods is invoked on the
2719
+ `ExtendedMessage` instance for the first time: [`get_message_value`][],
2720
+ [`get_record_msg_and_args_equivalent_info`][], [`__str__`][] or
2721
+ [`iter_str_parts`][] (with the proviso that the last one returns an
2722
+ [iterator](https://docs.python.org/3/glossary.html#term-iterator)
2723
+ which, to achieve the effect in question, needs to be iterated over,
2724
+ at least partially). Each of those calls-and-replacements will
2725
+ be made at most *once* per instance of `ExtendedMessage` (see
2726
+ the [`_ensure_callable_args_and_data_items_resolved`][] method's
2727
+ description...).
2728
+
2729
+ Thanks to that mechanism, if the creation of some value is expected
2730
+ to be costly, you can wrap it in a function/method (in particular,
2731
+ in an argumentless `lambda`) to defer that costly operation until
2732
+ the value becomes necessary (which may never happen if, for example,
2733
+ the specified log level is lower than the configured threshold).
2734
+ Such a function/method is expected to take no arguments (therefore,
2735
+ if it is a method, it should already be bound to some instance or
2736
+ class).
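
The idea of deferring costly values can be sketched in isolation as follows. This is only an illustration of the technique described above, not the library's implementation: the `LazyData` class, its `resolve` method and the `CALLABLE_TYPES` tuple are hypothetical names, and the locking strategy is an assumption.

```python
import threading
import types

# Hypothetical stand-in for the recognized callable types.
CALLABLE_TYPES = (
    types.FunctionType,
    types.BuiltinFunctionType,
    types.MethodType,
    types.MethodWrapperType,
)


class LazyData:
    """Holds data items; callables among them are resolved at most once."""

    def __init__(self, **data):
        self.data = data
        self._resolved = False
        self._lock = threading.Lock()

    def resolve(self):
        with self._lock:
            if not self._resolved:
                # Call each recognized callable and keep its result instead.
                self.data = {
                    key: (val() if isinstance(val, CALLABLE_TYPES) else val)
                    for key, val in self.data.items()
                }
                self._resolved = True
        return self.data


calls = []


def costly():
    calls.append("called")
    return "expensive result"


lazy = LazyData(cheap=1, payload=costly)
first = lazy.resolve()   # `costly` is called here...
second = lazy.resolve()  # ...but not again here.
```

If `resolve` is never invoked (for example, because the entry is filtered out before formatting), `costly` is never called at all.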
2737
+
2738
+ !!! warning "Multithreading-related restriction"
2739
+
2740
+ *None* of those functions/methods should acquire any locks
2741
+ that might also be acquired by any code making use of some
2742
+ [`logging`][] stuff (because, in particular, that could result in
2743
+ a [*deadlock*](https://docs.python.org/3/glossary.html#term-deadlock)).
2744
+ """
2745
+
2746
+ __slots__ = (
2747
+ 'pattern',
2748
+ 'args',
2749
+ 'data',
2750
+ 'exc_info',
2751
+ 'stack_info',
2752
+ 'stacklevel',
2753
+
2754
+ '_callable_args_and_data_items_already_resolved',
2755
+ '_callable_args_and_data_items_resolving_lock',
2756
+ '_cached_message',
2757
+ )
2758
+
2759
+ #
2760
+ # Public stuff
2761
+
2762
+ pattern: object
2763
+ args: tuple[object | ValueProvider[object], ...]
2764
+ data: dict[str, object | ValueProvider[object]]
2765
+ exc_info: Any
2766
+ stack_info: bool
2767
+ stacklevel: int
2768
+
2769
+ recognized_callable_arg_or_data_item_types: ClassVar[
2770
+ tuple[type[ValueProvider[object]], ...]
2771
+ ] = (
2772
+ types.FunctionType,
2773
+ types.BuiltinFunctionType,
2774
+ types.MethodType,
2775
+ types.MethodWrapperType,
2776
+ )
2777
+ """
2778
+ For `ExtendedMessage`, this tuple contains the runtime types of
2779
+ *function* and *bound method* objects -- both in the *user-defined*
2780
+ and *built-in* variants (precisely: [`types.FunctionType`][],
2781
+ [`types.BuiltinFunctionType`][], [`types.MethodType`][] and
2782
+ [`types.MethodWrapperType`][]). You can override this attribute in
2783
+ your subclass to redefine the runtime types of values in [`args`][]
2784
+ and [`data`][] that shall be *called* to obtain the *actual values*
2785
+ (see the part of the [`ExtendedMessage`][] constructor's description
2786
+ containing a reference to this attribute...).
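
To get a feel for which objects are instances of the four listed types, here is a standalone check that uses only the standard library. The `is_recognized` helper and the `Greeter` class are hypothetical and exist only for this illustration.

```python
import functools
import types

RECOGNIZED = (
    types.FunctionType,
    types.BuiltinFunctionType,
    types.MethodType,
    types.MethodWrapperType,
)


def is_recognized(value):
    return isinstance(value, RECOGNIZED)


class Greeter:
    def greet(self):
        return "hi"


checks = {
    "lambda": is_recognized(lambda: 1),
    "builtin function": is_recognized(len),
    "bound method": is_recognized(Greeter().greet),
    "method wrapper": is_recognized("abc".__len__),
    # Callable, but *not* instances of any of the four types:
    "partial object": is_recognized(functools.partial(int, "7")),
    "class": is_recognized(dict),
}
print(checks)
```

So, by such an `isinstance` check, a `functools.partial` object or a class would be treated as an ordinary value rather than called.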
2787
+ """
2788
+
2789
+ @overload
2790
+ def __init__(
2791
+ self,
2792
+ pattern: str = '',
2793
+ /,
2794
+ *args: object | ValueProvider[object],
2795
+
2796
+ exc_info: Any = None,
2797
+ stack_info: bool = False,
2798
+ stacklevel: int = 1,
2799
+
2800
+ **data: object | ValueProvider[object],
2801
+ ):
2802
+ ...
2803
+
2804
+ @overload
2805
+ def __init__(
2806
+ self,
2807
+ data: Mapping[str, object | ValueProvider[object]],
2808
+ /,
2809
+ *,
2810
+ exc_info: Any = None,
2811
+ stack_info: bool = False,
2812
+ stacklevel: int = 1,
2813
+ ):
2814
+ ...
2815
+
2816
+ @overload
2817
+ def __init__(
2818
+ self,
2819
+ pattern: object, # (anything convertible to str, but *not* a mapping)
2820
+ /,
2821
+ *args: object | ValueProvider[object],
2822
+
2823
+ exc_info: Any = None,
2824
+ stack_info: bool = False,
2825
+ stacklevel: int = 1,
2826
+
2827
+ **data: object | ValueProvider[object],
2828
+ ):
2829
+ ...
2830
+
2831
+ def __init__(
2832
+ self,
2833
+ first_arg: object | Mapping[str, object | ValueProvider[object]] = '',
2834
+ /,
2835
+ *args: object | ValueProvider[object],
2836
+ exc_info: Any = None,
2837
+ stack_info: bool = False,
2838
+ stacklevel: int = 1,
2839
+ **data: object | ValueProvider[object],
2840
+ ):
2841
+ # Note: it is not necessary to protect the following 4
2842
+ # lines with a lock, because each call to the function
2843
+ # `_ensure_internal_record_hook_is_set_up()` is itself
2844
+ # **thread-safe** and **idempotent**; so any redundant
2845
+ # (even if concurrent) calls to that function are safe.
2846
+ if self._setup_of_record_hooks_still_needs_to_be_done:
2847
+ _ensure_internal_record_hook_is_set_up(self._exc_info_record_hook)
2848
+ _ensure_internal_record_hook_is_set_up(self._stack_stuff_record_hook)
2849
+ self.__class__._setup_of_record_hooks_still_needs_to_be_done = False
2850
+
2851
+ pattern: object
2852
+ if isinstance(first_arg, Mapping):
2853
+ if args or data:
2854
+ raise TypeError(
2855
+ f"{type(self).__qualname__}'s *extra data* items, "
2856
+ f"if any, must be passed to its constructor either "
2857
+ f"by keyword arguments or as a mapping being the "
2858
+ f"only positional argument (not both)"
2859
+ )
2860
+ pattern = ''
2861
+ data = dict(first_arg)
2862
+ else:
2863
+ pattern = first_arg
2864
+
2865
+ self.pattern = pattern or ''
2866
+ self.args = args
2867
+ self.data = data
2868
+ self.exc_info = exc_info
2869
+ self.stack_info = stack_info
2870
+ self.stacklevel = stacklevel
2871
+
2872
+ self._callable_args_and_data_items_already_resolved: bool = False
2873
+ self._callable_args_and_data_items_resolving_lock: threading.Lock | None = None
2874
+ self._cached_message: str | None = None
2875
+
2876
+ def get_message_value(self) -> str:
2877
+ """
2878
+ Automatically invoked by the [`StructuredLogsFormatter`][]'s
2879
+ machinery to obtain a string to be assigned to the log record's
2880
+ [`message` attribute](https://docs.python.org/3/library/logging.html#logrecord-attributes).
2881
+
2882
+ !!! warning "Interface restriction"
2883
+
2884
+ Once this method is invoked on an **`ExtendedMessage`** instance,
2885
+ any attempts (regarding that instance) to replace/mutate any of
2886
+ the objects assigned to the **[`pattern`][]**, **[`args`][]** and
2887
+ **[`data`][]** attributes or anything inside them (regardless of
2888
+ the level of nesting, if any nested data is present) -- are *no
2889
+ loger* allowed. Doing so will result in undefined behavior.
2890
+
2891
+ The default implementation of this method should be sufficient
2892
+ in most cases. It converts [`pattern`][] to a string, and then
2893
+ -- *only* if [`args`][] and/or [`data`][] contain any items --
2894
+ invokes that string's [`format`][str.format] method, passing to
2895
+ it all items of `args` as *positional arguments* and all items
2896
+ of `data` as *keyword arguments*. A string being the result of
2897
+ the above operation(s) is cached (for any further invocations
2898
+ of this method on the same instance) and returned.
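
The formatting rule described above can be sketched as a plain function. This is a simplified illustration (no caching, no resolving of callables); the name `format_message` is hypothetical.

```python
def format_message(pattern, args=(), data=None):
    data = data or {}
    message = str(pattern)
    # `str.format` is applied only if there is anything to interpolate.
    if args or data:
        message = message.format(*args, **data)
    return message


# Keyword items not referenced by the pattern are ignored by `str.format`.
with_data = format_message(
    "Maxsize is {maxsize:x}", data={"maxsize": 255, "unit": "bytes"}
)
with_args = format_message("{}, {} and {}", ("Athos", "Porthos", "Aramis"))
untouched = format_message("answer: {}")  # nothing to interpolate

print(with_data, with_args, untouched, sep="\n")
```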
2899
+
2900
+ !!! warning "Subclass behavior requirement"
2901
+
2902
+ This method should *always* invoke the
2903
+ **[`_ensure_callable_args_and_data_items_resolved`][]**
2904
+ method before starting the actual work (the default
2905
+ implementation already does that). Failing to do so
2906
+ will result in undefined behavior.
2907
+
2908
+ !!! note
2909
+
2910
+ Apart from the aforementioned use by the machinery
2911
+ of **`StructuredLogsFormatter`**, this method is also
2912
+ invoked by the **`ExtendedMessage`**'s implementation
2913
+ of **[`__str__`][]** (which is important for formatters
2914
+ that are *not* instances of **`StructuredLogsFormatter`**).
2915
+ """
2916
+ self._ensure_callable_args_and_data_items_resolved()
2917
+
2918
+ # (Compare to the source code of `logging.LogRecord.getMessage()`...)
2919
+ message = self._cached_message
2920
+ if message is None:
2921
+ message = str(self.pattern)
2922
+ if self.args or self.data:
2923
+ message = message.format(*self.args, **self.data)
2924
+
2925
+ # (Such assignments are assumed to be *atomic* operations.)
2926
+ self._cached_message = message
2927
+
2928
+ return message
2929
+
2930
+ def get_record_msg_and_args_equivalent_info(
2931
+ self,
2932
+ *,
2933
+ pattern_result_key: str | None,
2934
+ args_result_key: str | None,
2935
+ ) -> Mapping[str, object]:
2936
+ """
2937
+ Automatically invoked by the [`StructuredLogsFormatter`][]'s
2938
+ machinery to get a value to be included in the *output data*
2939
+ dict under the key corresponding to the log record's `msg`
2940
+ attribute. The returned value is supposed to be a mapping
2941
+ that conveys relevant information from the `ExtendedMessage`
2942
+ instance -- to the extent that corresponds to the information
2943
+ typically conveyed by the `msg` and `args` attributes of log
2944
+ records when `ExtendedMessage` is not used.
2945
+
2946
+ !!! warning "Interface restriction"
2947
+
2948
+ Once this method is invoked on an **`ExtendedMessage`** instance,
2949
+ any attempts (regarding that instance) to replace/mutate any of
2950
+ the objects assigned to the **[`args`][]** and **[`data`][]**
2951
+ attributes or anything inside them (regardless of the level
2952
+ of nesting, if any nested data is present) -- are *no loger*
2953
+ allowed. Doing so will result in undefined behavior.
2954
+
2955
+ The default implementation should be sufficient in most cases. It
2956
+ returns a mapping containing zero, one or two items. Specifically
2957
+ -- *each* of the following *if* the key is not [`None`][] and the
2958
+ value is not *falsy*:
2959
+
2960
+ * the given **`pattern_result_key`** -- mapped to the value of the
2961
+ [`pattern`][] attribute,
2962
+
2963
+ * the given **`args_result_key`** -- mapped to the value of the
2964
+ [`args`][] attribute.
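
The selection rule described above (an item is included only if its key is not `None` and its value is truthy) can be sketched as follows. The function name `msg_and_args_info` is hypothetical, and the sketch leaves out the resolving of callables.

```python
def msg_and_args_info(pattern, args, *, pattern_result_key, args_result_key):
    return {
        key: val
        for key, val in (
            (pattern_result_key, pattern),
            (args_result_key, args),
        )
        if (key is not None) and val
    }


both = msg_and_args_info(
    "answer: {}", (42,), pattern_result_key="pattern", args_result_key="args"
)
no_args = msg_and_args_info(
    "plain text", (), pattern_result_key="pattern", args_result_key="args"
)
args_suppressed = msg_and_args_info(
    "answer: {}", (42,), pattern_result_key="pattern", args_result_key=None
)
```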
2965
+
2966
+ !!! warning "Subclass behavior requirement"
2967
+
2968
+ This method should *always* invoke the
2969
+ **[`_ensure_callable_args_and_data_items_resolved`][]**
2970
+ method before starting the actual work (the default
2971
+ implementation already does that). Failing to do so
2972
+ will result in undefined behavior.
2973
+ """
2974
+ self._ensure_callable_args_and_data_items_resolved()
2975
+
2976
+ return {
2977
+ key: val
2978
+ for key, val in (
2979
+ (pattern_result_key, self.pattern),
2980
+ (args_result_key, self.args),
2981
+ )
2982
+ if (key is not None) and val
2983
+ }
2984
+
2985
+ def __str__(self) -> str:
2986
+ """
2987
+ Invoked when [`str`][] is applied to an `ExtendedMessage` instance.
2988
+ This is done, in particular, by the machinery related to typical
2989
+ non-`StructuredLogsFormatter` formatters (specifically, by the
2990
+ log record method [`getMessage`][logging.LogRecord.getMessage])
2991
+ -- to obtain a string to be assigned to the [`message`
2992
+ attribute](https://docs.python.org/3/library/logging.html#logrecord-attributes)
2993
+ of the log record.
2994
+
2995
+ !!! warning "Interface restriction"
2996
+
2997
+ Once this method is invoked on an **`ExtendedMessage`** instance,
2998
+ any attempts (regarding that instance) to replace/mutate any of
2999
+ the objects assigned to the **[`pattern`][]**, **[`args`][]** and
3000
+ **[`data`][]** attributes or anything inside them (regardless of
3001
+ the level of nesting, if any nested data is present) -- are *no
3002
+ loger* allowed. Doing so will result in undefined behavior.
3003
+
3004
+ The default implementation of this method should be sufficient
3005
+ in most cases. It invokes the [`iter_str_parts`][] method
3006
+ (which, in particular, invokes [`get_message_value`][]...)
3007
+ and concatenates any yielded strings (if more than one) using
3008
+ `" | "` as the separator.
3009
+
3010
+ !!! warning "Subclass behavior requirement"
3011
+
3012
+ This method should *always* invoke the
3013
+ **[`_ensure_callable_args_and_data_items_resolved`][]**
3014
+ method before starting the actual work (the default
3015
+ implementation already does that). Failing to do so
3016
+ will result in undefined behavior.
3017
+ """
3018
+ self._ensure_callable_args_and_data_items_resolved()
3019
+
3020
+ return ' | '.join(self.iter_str_parts())
3021
+
3022
+ @reprlib.recursive_repr(fillvalue='<...>')
3023
+ def __repr__(self) -> str:
3024
+ """
3025
+ Invoked when [`repr`][] is applied to an `ExtendedMessage`
3026
+ instance (typically, for debug purposes).
3027
+
3028
+ The default implementation of this method should be sufficient
3029
+ in most cases. It invokes the [`iter_argument_reprs`][] method,
3030
+ concatenates any yielded strings (if more than one) using `", "`
3031
+ as the separator, adds the parentheses, and prefixes the whole
3032
+ thing with the class name.
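
The effect of the `reprlib.recursive_repr` decorator, which guards against infinite recursion when an instance (directly or indirectly) contains itself, can be seen in this minimal sketch. The `Node` class is hypothetical and much simpler than `ExtendedMessage`.

```python
import reprlib


class Node:
    def __init__(self, **data):
        self.data = data

    @reprlib.recursive_repr(fillvalue="<...>")
    def __repr__(self):
        arguments_repr = ", ".join(
            f"{key}={val!r}" for key, val in self.data.items()
        )
        return f"{type(self).__qualname__}({arguments_repr})"


node = Node(name="root")
node.data["me"] = node  # a self-reference
text = repr(node)
print(text)
```

The nested occurrence of `node` is rendered as the fill value instead of recursing further.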
3033
+ """
3034
+ type_name = type(self).__qualname__
3035
+ arguments_repr = ', '.join(self.iter_argument_reprs())
3036
+ return f'{type_name}({arguments_repr})'
3037
+
3038
+ def iter_str_parts(self) -> Iterator[str]:
3039
+ """
3040
+ Invoked by the [`__str__`][] method.
3041
+
3042
+ !!! warning "Interface restriction"
3043
+
3044
+ Once this method is invoked on an **`ExtendedMessage`** instance,
3045
+ any attempts (regarding that instance) to replace/mutate any of
3046
+ the objects assigned to the **[`pattern`][]**, **[`args`][]** and
3047
+ **[`data`][]** attributes or anything inside them (regardless of
3048
+ the level of nesting, if any nested data is present) -- are *no
3049
+ loger* allowed. Doing so will result in undefined behavior.
3050
+
3051
+ The default implementation of this method yields zero, one
3052
+ or two strings. Specifically -- *each* of the following *if
3053
+ not empty*:
3054
+
3055
+ * the result of an invocation of the [`get_message_value`][]
3056
+ method,
3057
+
3058
+ * a representation of the [`data`][] mapping's items (formatted
3059
+ in a way that resembles the syntax for specifying keyword
3060
+ arguments, but without the parentheses).
3061
+
3062
+ !!! warning "Subclass behavior requirement"
3063
+
3064
+ This method should *always* invoke the
3065
+ **[`_ensure_callable_args_and_data_items_resolved`][]**
3066
+ method before starting the actual work (the default
3067
+ implementation already does that). Failing to do so
3068
+ will result in undefined behavior.
3069
+ """
3070
+ self._ensure_callable_args_and_data_items_resolved()
3071
+
3072
+ if formatted_message := self.get_message_value():
3073
+ yield formatted_message
3074
+ if formatted_data_items := ', '.join(
3075
+ f'{key}={val!a}' for key, val in self.data.items()
3076
+ ):
3077
+ yield formatted_data_items
3078
+
3079
+ def iter_argument_reprs(self) -> Iterator[str]:
3080
+ """
3081
+ Invoked by the [`__repr__`][] method.
3082
+
3083
+ The default implementation of this method yields string
3084
+ representations of the arguments to the [`ExtendedMessage`][]
3085
+ ([`xm`][]) constructor which would be needed to create an
3086
+ instance equivalent to this one (`self`).
3087
+ """
3088
+ if self.args or self.pattern:
3089
+ yield repr(self.pattern)
3090
+ if self.args:
3091
+ yield from map(repr, self.args)
3092
+ if self.exc_info is not None:
3093
+ yield f'exc_info={self.exc_info!r}'
3094
+ if self.stack_info is not False: # noqa
3095
+ yield f'stack_info={self.stack_info!r}'
3096
+ if self.stacklevel != 1 or type(self.stacklevel) is not int:
3097
+ yield f'stacklevel={self.stacklevel!r}'
3098
+ for key, val in self.data.items():
3099
+ yield f'{key}={val!r}'
3100
+
3101
+ #
3102
+ # Semi-protected method (allowed to be invoked in subclasses)
3103
+
3104
+ def _ensure_callable_args_and_data_items_resolved(self) -> None:
3105
+ """
3106
+ !!! exclusion "Interface exclusion"
3107
+
3108
+ This method is _**not**_ part of the public API -- _**except
3109
+ that**_ it is allowed to be invoked by any methods implemented
3110
+ by possible subclasses of **`ExtendedMessage`**.
3111
+
3112
+ This method processes the items of [`args`][] and [`data`][] --
3113
+ by *calling* each encountered instance of any type included in
3114
+ [`ExtendedMessage.recognized_callable_arg_or_data_item_types`][],
3115
+ and then *replacing* that value with the result of that call.
3116
+ Each call is made without arguments.
3117
+
3118
+ This method can be safely invoked multiple times on the same
3119
+ instance, *even* in the case of *concurrent* invocations. The
3120
+ implementation guarantees that *none* of the calls in question
3121
+ will be made more than *once* per instance of `ExtendedMessage`.
3122
+
3123
+ !!! warning "Interface restriction"
3124
+
3125
+ Once this method is invoked on an **`ExtendedMessage`**
3126
+ instance, any other attempts (regarding that instance)
3127
+ to replace/mutate any of the objects assigned to the
3128
+ **[`args`][]** and **[`data`][]** attributes or anything
3129
+ inside them (regardless of the level of nesting, if any
3130
+ nested data is present) -- are *no longer* allowed. Doing
3131
+ so will result in undefined behavior.
3132
+ """
3133
+ if self._callable_args_and_data_items_already_resolved:
3134
+ # OK, already resolved (fast path).
3135
+ return
3136
+
3137
+ with self._callable_args_and_data_items_resolving_meta_lock:
3138
+ # Obtain the instance's lock in a thread-safe manner...
3139
+ lock = self._callable_args_and_data_items_resolving_lock
3140
+ if lock is None:
3141
+ # (We want to defer its creation until this moment, so that
3142
+ # `ExtendedMessage.__init__()` remains as fast as possible.)
3143
+ lock = self._callable_args_and_data_items_resolving_lock = (
3144
+ threading.Lock()
3145
+ )
3146
+
3147
+ if not lock.acquire(timeout=self._CALLABLE_ARGS_AND_DATA_ITEMS_RESOLVING_LOCK_TIMEOUT):
3148
+ raise RuntimeError(
3149
+ f'could not acquire the lock that protects the procedure '
3150
+ f'of ensuring that relevant callable items of the `args` '
3151
+ f'and `data` collections will be resolved'
3152
+ )
3153
+ try:
3154
+ if self._callable_args_and_data_items_already_resolved:
3155
+ # OK, already resolved.
3156
+ return
3157
+ self._resolve_callable_args_and_data_items()
3158
+
3159
+ # (Such assignments are assumed to be *atomic* operations.)
3160
+ self._callable_args_and_data_items_already_resolved = True
3161
+ finally:
3162
+ lock.release()
3163
+
3164
+ #
3165
+ # Internals (should not be used or extended/overridden outside this module!)
3166
+
3167
+ _CALLABLE_ARGS_AND_DATA_ITEMS_RESOLVING_LOCK_TIMEOUT: Final[float] = 9.0
3168
+
3169
+ _callable_args_and_data_items_resolving_meta_lock: Final[threading.Lock] = threading.Lock()
3170
+ _setup_of_record_hooks_still_needs_to_be_done: ClassVar[bool] = True
3171
+
3172
+ @staticmethod
3173
+ def _exc_info_record_hook(record: logging.LogRecord) -> None:
3174
+ if record.exc_info:
3175
+ return
3176
+
3177
+ instance = getattr(record, 'msg', None)
3178
+ if not isinstance(instance, ExtendedMessage):
3179
+ return
3180
+
3181
+ # (Compare to the `exc_info`-related fragments of
3182
+ # the source code of `logging.Logger._log()`...)
3183
+ exc_info = instance.exc_info
3184
+ if exc_info:
3185
+ if isinstance(exc_info, BaseException):
3186
+ exc_info = (type(exc_info), exc_info, exc_info.__traceback__)
3187
+ elif not isinstance(exc_info, tuple):
3188
+ exc_info = sys.exc_info()
3189
+ record.exc_info = exc_info
3190
+
3191
+ @staticmethod
3192
+ def _stack_stuff_record_hook(record: logging.LogRecord) -> None:
3193
+ if record.stack_info:
3194
+ return
3195
+
3196
+ instance = getattr(record, 'msg', None)
3197
+ if not isinstance(instance, ExtendedMessage):
3198
+ return
3199
+
3200
+ if (getattr(record, 'lineno', None) == 0
3201
+ and getattr(record, 'pathname', None) == '(unknown file)'
3202
+ and getattr(record, 'funcName', None) == '(unknown function)'):
3203
+ # It seems that any calls to `logging.Logger.findCaller()`
3204
+ # are either doomed to failure or should not be attempted
3205
+ # because `logging._srcfile` has been set to None...
3206
+ return
3207
+
3208
+ stack_info = instance.stack_info
3209
+ stacklevel = instance.stacklevel
3210
+ if (not stack_info) and stacklevel == 1:
3211
+ return
3212
+
3213
+ # Let's exclude our internals from stack introspection.
3214
+ if _PY_3_11_OR_NEWER:
3215
+ if stacklevel >= 1:
3216
+ stacklevel += 2
3217
+ else:
3218
+ # A bit weird, but oh well... :)
3219
+ if stacklevel > 1:
3220
+ stacklevel += 3
3221
+ elif stacklevel == 1:
3222
+ stacklevel += 2
3223
+
3224
+ # (Compare to the `fn`/`lno`/`func`/`sinfo`-related fragments of
3225
+ # the source code of `logging.Logger._log()`...)
3226
+ try:
3227
+ found = logging.Logger.findCaller(
3228
+ # Here we pass None as a substitute for a logger instance
3229
+ # (the `findCaller()` method makes no use of it anyway).
3230
+ None, # type: ignore[arg-type]
3231
+ stack_info,
3232
+ stacklevel,
3233
+ )
3234
+ except ValueError:
3235
+ return
3236
+
3237
+ pathname, lineno, func, sinfo = found
3238
+ if (lineno == 0
3239
+ and pathname == '(unknown file)'
3240
+ and func == '(unknown function)'):
3241
+ return
3242
+
3243
+ # (Compare to the `pathname`/`lineno`/`funcName`/`stack_info`/
3244
+ # /`filename`/`module`-related fragments of the source code of
3245
+ # `logging.LogRecord.__init__()`...)
3246
+ record.pathname = pathname
3247
+ record.lineno = lineno
3248
+ record.funcName = func
3249
+ record.stack_info = sinfo
3250
+ try:
3251
+ record.filename = os.path.basename(record.pathname)
3252
+ record.module = os.path.splitext(record.filename)[0]
3253
+ except (TypeError, ValueError, AttributeError):
3254
+ record.filename = record.pathname
3255
+ record.module = "Unknown module"
3256
+
3257
+ def _resolve_callable_args_and_data_items(self) -> None:
3258
+ recognized_callable_types = (
3259
+ self.recognized_callable_arg_or_data_item_types
3260
+ )
3261
+ args: tuple[Any, ...] = self.args
3262
+ data: dict[str, Any] = self.data
3263
+
3264
+ # (Such assignments are assumed to be *atomic* operations.)
3265
+ self.args = tuple([
3266
+ val() if isinstance(val, recognized_callable_types) else val
3267
+ for val in args
3268
+ ])
3269
+ for key, val in data.items():
3270
+ if isinstance(val, recognized_callable_types):
3271
+ # (Such assignments are assumed to be *atomic* operations.)
3272
+ data[key] = val()
3273
+
3274
+
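Reviewer note: the resolve-once guarantee documented for `_ensure_callable_args_and_data_items_resolved` relies on a double-checked flag guarded by a lock. The following is a minimal, simplified sketch of that pattern, not the module's code. The `LazyArgs` class and the `expensive` function are hypothetical. Unlike the original, the sketch creates its lock eagerly in `__init__`, acquires it without a timeout, and treats every callable as resolvable instead of checking against a tuple of recognized types.

```python
import threading
from typing import Any


class LazyArgs:
    """Hypothetical stand-in for a message object with lazily resolved args."""

    def __init__(self, *args: Any) -> None:
        self.args = args
        self._resolved = False
        self._lock = threading.Lock()

    def ensure_resolved(self) -> None:
        # Fast path: no locking once the work has been done.
        if self._resolved:
            return
        with self._lock:
            # Re-check under the lock: another thread may have finished
            # the work while this one was waiting.
            if self._resolved:
                return
            self.args = tuple(a() if callable(a) else a for a in self.args)
            self._resolved = True


calls = []


def expensive() -> int:
    calls.append(1)  # record each invocation
    return 42


msg = LazyArgs('static', expensive)
threads = [threading.Thread(target=msg.ensure_resolved) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads have joined, `expensive` has been called exactly once and `msg.args` holds the resolved value in place of the callable.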
3275
+ xm: Final = ExtendedMessage
3276
+ """[`xm`][] is a convenience alias of [`ExtendedMessage`][]."""
3277
+
3278
+
3279
+ def make_constant_value_provider(value: T) -> ValueProvider[T]:
3280
+ """
3281
+ A trivial (yet sometimes useful) helper: given an arbitrary object
3282
+ (**`value`**), create an argumentless function that will always
3283
+ return that object (note that such argumentless functions can be
3284
+ used as [*auto-makers*][register_log_record_attr_auto_maker]).
3285
+ """
3286
+ return (lambda: value)
3287
+
3288
+
3289
+ def register_log_record_attr_auto_maker(
3290
+ rec_attr: str,
3291
+ auto_maker: ValueProvider[object],
3292
+ ) -> None:
3293
+ """
3294
+ For the specified log record attribute name (**`rec_attr`**),
3295
+ register the given *auto-maker* callable (**`auto_maker`**).
3296
+
3297
+ What this means is that by calling this function you ensure that,
3298
+ from now on, the specified attribute will be automatically set on
3299
+ every *new [log record][logging.LogRecord] object* (shortly *after*
3300
+ it is created by a *[logger](https://docs.python.org/3/howto/logging.html#loggers)*,
3301
+ yet *before* being processed by any *[handlers](https://docs.python.org/3/howto/logging.html#handlers)*,
3302
+ *[filters](https://docs.python.org/3/library/logging.html#filter)* and
3303
+ *[formatters](https://docs.python.org/3/howto/logging.html#formatters)*)
3304
+ -- to a value returned by the specified *auto-maker*.
3305
+
3306
+ The *auto-maker* needs to be an argumentless function or any other
3307
+ object that can be called with no arguments. A call to it will be
3308
+ made at most once for each newly created log record (*only* if the
3309
+ logger is enabled for the respective log level), in the thread in
3310
+ which the respective logger method call is being executed. Obviously,
3311
+ the returned values are allowed to vary depending on the context (or
3312
+ even with each call).
3313
+
3314
+ If, for the specified attribute name, some *auto-maker* is already
3315
+ registered, this function raises [`KeyError`][].
3316
+
3317
+ !!! note
3318
+
3319
+ Typically, you _**do not need**_ to use this function directly,
3320
+ because the machinery of **[`StructuredLogsFormatter`][]** does
3321
+ it for you (on the creation of a **`StructuredLogsFormatter`**
3322
+ instance; see also the description of the
3323
+ **[`StructuredLogsFormatter.make_base_auto_makers`][]** method).
3324
+ That machinery will also take care of avoiding record attribute
3325
+ name collisions.
3326
+
3327
+ !!! warning
3328
+
3329
+ If you use this function *directly*, you need to take care of
3330
+ avoiding record attribute name collisions by yourself. When
3331
+ the internal *auto-makers* machinery attempts to assign an
3332
+ *auto-maker*-produced value to the respective attribute of
3333
+ a log record but the log record already has that attribute set,
3334
+ then [`KeyError`][] is raised (which will typically bubble up
3335
+ to the caller of the currently executed logger method). This
3336
+ behavior mimics how the machinery of the standard [`logging`][]
3337
+ module reacts to collisions between *extra* items and existing
3338
+ attributes of a log record.
3339
+ """
3340
+ with _auto_makers_registry_and_internal_record_hooks_maintenance_lock:
3341
+ _ensure_record_factory_with_auto_makers_and_record_hooks_is_set()
3342
+ _add_to_auto_makers_registry(rec_attr, auto_maker)
3343
+
3344
+
3345
+ def unregister_log_record_attr_auto_maker(
3346
+ rec_attr: str,
3347
+ ) -> None:
3348
+ """
3349
+ For the given log record attribute name (**`rec_attr`**), unregister
3350
+ the previously registered *auto-maker*.
3351
+
3352
+ If, for the specified attribute name, no *auto-maker* is currently
3353
+ registered, this function raises [`KeyError`][].
3354
+ """
3355
+ with _auto_makers_registry_and_internal_record_hooks_maintenance_lock:
3356
+ _remove_from_auto_makers_registry(rec_attr)
3357
+
3358
+
3359
+ #
3360
+ # Static typing helpers
3361
+ #
3362
+
3363
+
3364
+ # *Not* part of the public API.
3365
+ T = TypeVar('T')
3366
+
3367
+ # *Not* part of the public API.
3368
+ Value = TypeVar('Value', covariant=True)
3369
+
3370
+
3371
+ class ValueProvider(Protocol[Value]):
3372
+ """
3373
+ ```python
3374
+ __call__() -> Value
3375
+ ```
3376
+
3377
+ A [*protocol*][typing.Protocol] which describes any callable object
3378
+ (e.g., a function) that takes *no arguments* and returns a value of
3379
+ *any type* (returned values may vary with each call). It is worth
3380
+ noting that, in particular, every *auto-maker* is supposed to be
3381
+ such a callable object.
3382
+
3383
+ !!! info "Typing details"
3384
+
3385
+ * In the above `__call__()` signature, the **`Value`** element
3386
+ is a [*type variable*](https://typing.python.org/en/latest/spec/generics.html#generics).
3387
+
3388
+ * That variable has no [*upper
3389
+ bound*](https://typing.python.org/en/latest/spec/generics.html#type-variables-with-an-upper-bound)
3390
+ (or, in other words, its *upper bound* is [`object`][]).
3391
+
3392
+ * The **`ValueProvider`** protocol is
3393
+ [*generic*](https://typing.python.org/en/latest/spec/protocol.html#generic-protocols).
3394
+ It is [*covariant*](https://typing.python.org/en/latest/spec/generics.html#variance)
3395
+ in that variable.
3396
+ """
3397
+ def __call__(self) -> Value: ...
3398
+
3399
+
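Reviewer note: a small sketch of two callables that would satisfy a protocol shaped like `ValueProvider`, assuming only what the docstring above states (no arguments, any return type). The names `constant_provider`, `get_environment` and `get_thread_name` are illustrative and do not come from the package.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar('T')


def constant_provider(value: T) -> Callable[[], T]:
    # Same idea as a constant value provider: always return the same object.
    return lambda: value


# A provider whose result never changes.
get_environment = constant_provider('production')


def get_thread_name() -> str:
    # A provider whose result depends on the calling context.
    return threading.current_thread().name
```

Both objects can be called with no arguments, which is the only structural requirement the protocol describes.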
3400
+ class OutputSerializer(Protocol):
3401
+ """
3402
+ ```python
3403
+ __call__(output_data: dict[str, OutputValue], /) -> str
3404
+ ```
3405
+
3406
+ A [*protocol*][typing.Protocol] which describes any callable object
3407
+ (e.g., a function) that takes an *output data* dict (supposedly,
3408
+ returned by [`StructuredLogsFormatter.get_prepared_output_data`][])
3409
+ as the sole positional argument, and returns a string representing
3410
+ that dict in *serialized* form (typically, but not necessarily, in
3411
+ JSON format).
3412
+
3413
+ !!! info "Typing details"
3414
+
3415
+ In the above `__call__()` signature, as everywhere else, the
3416
+ **[`OutputValue`][]** element is just an alias of [`Any`][]
3417
+ (see below).
3418
+ """
3419
+ def __call__(self, output_data: dict[str, OutputValue], /) -> str: ...
3420
+
3421
+
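Reviewer note: a sketch of a callable matching the `OutputSerializer` call signature shown above (one positional-only dict argument, returning a string). The function name `compact_json_serializer` and the sample data are hypothetical. How such a callable is handed to a formatter is not shown here, because that part of the API is outside this hunk.

```python
import json
from typing import Any


def compact_json_serializer(output_data: dict[str, Any], /) -> str:
    # Compact separators and sorted keys give stable single-line output;
    # `default=str` is a fallback for values `json` cannot encode natively.
    return json.dumps(
        output_data,
        separators=(',', ':'),
        sort_keys=True,
        default=str,
    )


serialized = compact_json_serializer(
    {'message': 'disk full', 'level': 'ERROR', 'free_mb': 12},
)
```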
3422
+ OutputValue: TypeAlias = Any
3423
+ """
3424
+ A [*type alias*](https://typing.python.org/en/latest/spec/aliases.html#type-aliases)
3425
+ which is used to annotate top-level *values* in *output data* dicts.
3426
+
3427
+ Given that every *output data* dict -- always of type `dict[str,
3428
+ OutputValue]` -- is:
3429
+
3430
+ * created by [`StructuredLogsFormatter.get_prepared_output_data`][], with
3431
+ each *value* obtained using [`StructuredLogsFormatter.prepare_value`][]
3432
+ (whose return type is annotated as `OutputValue`),
3433
+
3434
+ * then passed to [`StructuredLogsFormatter.serialize_prepared_output_data`][],
3435
+
3436
+ * then, consequently, passed to [`StructuredLogsFormatter.serializer`][]
3437
+ (expected to be [`OutputSerializer`][]-compliant)
3438
+
3439
+ -- it should be emphasized that:
3440
+
3441
+ * actual *runtime types* of any values considered an `OutputValue` are
3442
+ decided by an implementation of [`StructuredLogsFormatter.prepare_value`][]
3443
+ (*either* the default one *or* some provided by a subclass of
3444
+ `StructuredLogsFormatter`);
3445
+
3446
+ * [`StructuredLogsFormatter.serializer`][], as an object compliant with
3447
+ [`OutputSerializer`][], is *required* to be capable of serializing a
3448
+ dict that maps strings to any values returned by `prepare_value` (each
3449
+ being an instance of some of the said *runtime types*).
3450
+
3451
+ !!! note
3452
+
3453
+ The [`json.dumps`][] function, which is the default
3454
+ **[`serializer`][StructuredLogsFormatter.serializer]**, satisfies
3455
+ this requirement with respect to the default implementation of
3456
+ **[`prepare_value`][StructuredLogsFormatter.prepare_value]**.
3457
+
3458
+ !!! info "Typing details"
3459
+
3460
+ Accurately expressing the requirement in question using static types
3461
+ is hardly possible (at least without making things overly complicated).
3462
+ This is why **`OutputValue`** is simply an alias of the [`Any`][]
3463
+ special type.
3464
+ """
3465
+
3466
+
3467
+ DottedPath: TypeAlias = str
3468
+ """
3469
+ A [*type alias*](https://typing.python.org/en/latest/spec/aliases.html#type-aliases)
3470
+ which is used to annotate strings being a *dotted path* (*importable
3471
+ dotted name*).
3472
+ """
3473
+
3474
+
3475
+ KwargsMappingAsLiteralEvaluableString: TypeAlias = str
3476
+ """
3477
+ A [*type alias*](https://typing.python.org/en/latest/spec/aliases.html#type-aliases)
3478
+ which is used to annotate strings being an [`ast.literal_eval`][]-evaluable
3479
+ representation of a mapping (dict) of keyword arguments compatible with
3480
+ the main (first) signature of the [`StructuredLogsFormatter`][] constructor.
3481
+ """
3482
+
3483
+
3484
+ #
3485
+ # Internal constants and helpers (should be used only within this module!)
3486
+ #
3487
+
3488
+
3489
+ _PY_3_11_OR_NEWER = sys.version_info[:2] >= (3, 11)
3490
+
3491
+
3492
+ #
3493
+ # Machinery of *auto-makers* + internal *log record hooks*
3494
+
3495
+
3496
+ _auto_makers_registry_and_internal_record_hooks_maintenance_lock = threading.Lock()
3497
+ _auto_makers_registry: Sequence[tuple[str, ValueProvider[object]]] = ()
3498
+ _internal_record_hooks: Sequence[Callable[[logging.LogRecord], None]] = ()
3499
+
3500
+
3501
+ def _add_to_auto_makers_registry(
3502
+ rec_attr: str,
3503
+ auto_maker: ValueProvider[object],
3504
+ ) -> None:
3505
+ global _auto_makers_registry
3506
+
3507
+ rec_attr_to_auto_maker = dict(_auto_makers_registry)
3508
+ if rec_attr in rec_attr_to_auto_maker:
3509
+ raise KeyError(f'{rec_attr=!a} already in auto-makers registry')
3510
+ rec_attr_to_auto_maker[rec_attr] = auto_maker
3511
+ new_registry = tuple(rec_attr_to_auto_maker.items())
3512
+
3513
+ # (Such assignments are assumed to be *atomic* operations.)
3514
+ _auto_makers_registry = new_registry
3515
+
3516
+
3517
+ def _remove_from_auto_makers_registry(
3518
+ rec_attr: str,
3519
+ ) -> None:
3520
+ global _auto_makers_registry
3521
+
3522
+ rec_attr_to_auto_maker = dict(_auto_makers_registry)
3523
+ if rec_attr not in rec_attr_to_auto_maker:
3524
+ raise KeyError(f'{rec_attr=!a} not in auto-makers registry')
3525
+ del rec_attr_to_auto_maker[rec_attr]
3526
+ new_registry = tuple(rec_attr_to_auto_maker.items())
3527
+
3528
+ # (Such assignments are assumed to be *atomic* operations.)
3529
+ _auto_makers_registry = new_registry
3530
+
3531
+
3532
+ def _ensure_internal_record_hook_is_set_up(
3533
+ rec_hook: Callable[[logging.LogRecord], None],
3534
+ ) -> None:
3535
+ global _internal_record_hooks
3536
+
3537
+ with _auto_makers_registry_and_internal_record_hooks_maintenance_lock:
3538
+ if rec_hook in _internal_record_hooks:
3539
+ return
3540
+
3541
+ _ensure_record_factory_with_auto_makers_and_record_hooks_is_set()
3542
+ new_sequence = (*_internal_record_hooks, rec_hook)
3543
+
3544
+ # (Such assignments are assumed to be *atomic* operations.)
3545
+ _internal_record_hooks = new_sequence
3546
+
3547
+
3548
+ def _ensure_record_factory_with_auto_makers_and_record_hooks_is_set() -> None:
3549
+ if not _is_record_factory_with_auto_makers_and_record_hooks_impl_already_in_use():
3550
+ record_factory_being_wrapped = logging.getLogRecordFactory()
3551
+ new_record_factory = functools.partial(
3552
+ _record_factory_with_auto_makers_and_record_hooks_impl,
3553
+ record_factory_being_wrapped,
3554
+ )
3555
+ logging.setLogRecordFactory(new_record_factory)
3556
+
3557
+
3558
+ def _is_record_factory_with_auto_makers_and_record_hooks_impl_already_in_use() -> bool:
3559
+ current_record_factory = logging.getLogRecordFactory()
3560
+ flag: list[None] = []
3561
+ try:
3562
+ # (Compare to the call to `_logRecordFactory()` in
3563
+ # the source code of `logging.makeLogRecord()`...)
3564
+ current_record_factory(
3565
+ None, None, '', 0, '', (), None, None,
3566
+ _record_factory_with_auto_makers_and_record_hooks_impl_confirm_flag=flag,
3567
+ )
3568
+ except Exception: # noqa
3569
+ pass
3570
+ return bool(flag)
3571
+
3572
+
3573
+ def _record_factory_with_auto_makers_and_record_hooks_impl(
3574
+ record_factory_being_wrapped: Callable[..., logging.LogRecord],
3575
+ /,
3576
+ *args: Any,
3577
+ _record_factory_with_auto_makers_and_record_hooks_impl_confirm_flag: list[None] | None = None,
3578
+ **kwargs: Any,
3579
+ ) -> logging.LogRecord:
3580
+ flag = _record_factory_with_auto_makers_and_record_hooks_impl_confirm_flag
3581
+ if flag is not None:
3582
+ flag.append(None) # (<- Making the `flag` list *truthy*)
3583
+
3584
+ record = record_factory_being_wrapped(*args, **kwargs)
3585
+ record_attrs: dict[str, object] = record.__dict__
3586
+
3587
+ rec_attr: str
3588
+ auto_maker: ValueProvider[object]
3589
+ for rec_attr, auto_maker in _auto_makers_registry:
3590
+ try:
3591
+ value = auto_maker()
3592
+ except RecursionError:
3593
+ raise
3594
+ except Exception: # noqa
3595
+ # (Compare to the source code of `logging.Handler.handleError()`...)
3596
+ if logging.raiseExceptions and sys.stderr:
3597
+ sys.stderr.write(
3598
+ f"--- Logging error ({__name__!a}-related) ---\n"
3599
+ f"FAILED to auto-make log record's {rec_attr!a}!\n"
3600
+ f"{traceback.format_exc()}\n"
3601
+ )
3602
+ continue
3603
+
3604
+ actually_set_value = record_attrs.setdefault(rec_attr, value)
3605
+ if actually_set_value is not value:
3606
+ # (Compare to `KeyError(...)` in `logging.Logger.makeRecord()`...)
3607
+ raise KeyError(
3608
+ f"attempt to overwrite log record's {rec_attr!a} "
3609
+ f"(existing value: {actually_set_value!a}; "
3610
+ f"new rejected value: {value!a})"
3611
+ )
3612
+
3613
+ for rec_hook in _internal_record_hooks:
3614
+ rec_hook(record)
3615
+
3616
+ return record
3617
+
3618
+
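Reviewer note: the wrapping of the log record factory can be illustrated with the standard `logging.getLogRecordFactory()` and `logging.setLogRecordFactory()` functions alone. This is a simplified sketch under stated assumptions, not the module's implementation: `install_attr_factory` and the `process_tag` attribute are hypothetical, there is a single provider instead of a registry, and provider errors are not caught.

```python
import logging
import os


def install_attr_factory(attr_name, provider):
    """Wrap the current record factory; return the one that was wrapped."""
    wrapped = logging.getLogRecordFactory()

    def factory(*args, **kwargs):
        record = wrapped(*args, **kwargs)
        # Refuse to overwrite an attribute the record already has.
        if attr_name in record.__dict__:
            raise KeyError(f"attempt to overwrite log record's {attr_name!a}")
        setattr(record, attr_name, provider())
        return record

    logging.setLogRecordFactory(factory)
    return wrapped


previous = install_attr_factory('process_tag', lambda: f'pid-{os.getpid()}')
try:
    # Positional arguments: name, level, pathname, lineno, msg, args, exc_info.
    record = logging.getLogRecordFactory()(
        'demo', logging.INFO, 'demo.py', 1, 'hello', (), None,
    )
finally:
    # Restore the previous factory so the sketch leaves no global state behind.
    logging.setLogRecordFactory(previous)
```

Restoring the previous factory in `finally` is a choice made for this sketch only; the module above installs its wrapper once and keeps it in place.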
3619
+ def _clear_auto_makers_and_internal_record_hooks_related_global_state() -> None:
3620
+ # This function is intended to be used *in tests only*.
3621
+
3622
+ global _auto_makers_registry
3623
+ global _internal_record_hooks
3624
+
3625
+ with _auto_makers_registry_and_internal_record_hooks_maintenance_lock:
3626
+ _auto_makers_registry = ()
3627
+ _internal_record_hooks = ()
3628
+ ExtendedMessage._setup_of_record_hooks_still_needs_to_be_done = True
3629
+
3630
+
3631
+ #
3632
+ # Miscellaneous helpers
3633
+
3634
+
3635
+ def _resolve_dotted_path(dotted_path: str) -> Any:
3636
+ """
3637
+ Import an object specified by the given *dotted path*.
3638
+
3639
+ >>> mod = _resolve_dotted_path('collections.abc')
3640
+ >>> import collections.abc
3641
+ >>> mod is collections.abc
3642
+ True
3643
+
3644
+ >>> obj = _resolve_dotted_path('logging.handlers.SocketHandler')
3645
+ >>> from logging.handlers import SocketHandler
3646
+ >>> obj is SocketHandler
3647
+ True
3648
+
3649
+ >>> _resolve_dotted_path('no_such_module_i_hope') # doctest: +ELLIPSIS
3650
+ Traceback (most recent call last):
3651
+ ...
3652
+ ValueError: cannot resolve dotted_path='no_such_module_i_hope' (ModuleNotFoundError...)
3653
+
3654
+ >>> _resolve_dotted_path('logging.no_such_stuff_i_hope') # doctest: +ELLIPSIS
3655
+ Traceback (most recent call last):
3656
+ ...
3657
+ ValueError: cannot resolve dotted_path='logging.no_such_stuff_i_hope' (ModuleNotFoundError...)
3658
+ """
3659
+ # (Compare to the source code of the -- semantically very similar
3660
+ # -- `logging.config.BaseConfigurator.resolve()` method...)
3661
+ importable_name, *rest_parts = dotted_path.split('.')
3662
+ try:
3663
+ obj = importlib.import_module(importable_name)
3664
+ for part in rest_parts:
3665
+ importable_name += f'.{part}'
3666
+ try:
3667
+ obj = getattr(obj, part)
3668
+ except AttributeError:
3669
+ importlib.import_module(importable_name)
3670
+ obj = getattr(obj, part)
3671
+ except ImportError as exc:
3672
+ raise ValueError(
3673
+ f'cannot resolve {dotted_path=!a} '
3674
+ f'({type(exc).__qualname__}: {exc})'
3675
+ ) from exc
3676
+ return obj