victorialogs-handler 0.1.0a1__py3-none-any.whl

@@ -0,0 +1,90 @@
+ Metadata-Version: 2.4
+ Name: victorialogs-handler
+ Version: 0.1.0a1
+ Summary: A log handler for VictoriaLogs.
+ Author-email: Erik Kalkoken <kalkoken87@gmail.com>
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-Expression: MIT
+ Classifier: Environment :: Web Environment
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: System :: Logging
+ License-File: LICENSE
+ Requires-Dist: orjson>3.10
+ Requires-Dist: pdoc ; extra == "docs"
+ Requires-Dist: coverage ; extra == "test"
+ Project-URL: documentation, https://erikkalkoken.github.io/python-victorialogs-handler
+ Project-URL: source, https://github.com/ErikKalkoken/python-victorialogs-handler
+ Provides-Extra: docs
+ Provides-Extra: test
+
+ # victorialogs-handler
+
+ A high-performance Python log handler for VictoriaLogs.
+
+ [![release](https://img.shields.io/pypi/v/python-victorialogs-handler?label=release)](https://pypi.org/project/python-victorialogs-handler/)
+ [![python](https://img.shields.io/pypi/pyversions/python-victorialogs-handler)](https://pypi.org/project/python-victorialogs-handler/)
+ [![CI/CD](https://github.com/ErikKalkoken/python-victorialogs-handler/actions/workflows/cicd.yaml/badge.svg)](https://github.com/ErikKalkoken/python-victorialogs-handler/actions/workflows/cicd.yaml)
+ [![codecov](https://codecov.io/gh/ErikKalkoken/python-victorialogs-handler/graph/badge.svg?token=2pPb3lid2k)](https://codecov.io/gh/ErikKalkoken/python-victorialogs-handler)
+ [![license](https://img.shields.io/badge/license-MIT-green)](https://gitlab.com/ErikKalkoken/python-victorialogs-handler/-/blob/master/LICENSE)
+
+ > [!IMPORTANT]
+ > STATUS: In development. The API may still change.
+
+ ## Description
+
+ **victorialogs-handler** is a high-performance Python log handler tailored for [VictoriaLogs](https://victoriametrics.com/products/victorialogs/). It integrates seamlessly with Python’s native logging module, allowing you to stream log events to a VictoriaLogs instance with minimal configuration.
+
+ - Non-blocking: Logs are stored in a buffer and later processed in a background thread.
+ - Hybrid trigger: Log processing is triggered by a ticker and/or when a size threshold is reached.
+ - Batching: Multiple logs are combined into one request to the log server to minimize the number of requests.
+ - Complete logs: All fields of a log record are transmitted, including exceptions and `extra` fields.
+ - Highly customizable: The handler's behavior can be adjusted through constructor parameters (see also [Documentation](#documentation)).
+
+ ## Installation
+
+ The handler can be installed with pip from PyPI:
+
+ ```sh
+ pip install victorialogs-handler
+ ```
+
+ ## Quick start
+
+ > [!NOTE]
+ > The script assumes that there is a VictoriaLogs server running
+ > on the same system at the default URL: `http://localhost:9428`
+
+ Here is a quick example of how to use the handler in your Python script:
+
+ ```python
+ import logging
+
+ from vlogs_handler import VictoriaLogsHandler
+
+ # Create a custom logger with INFO level
+ logger = logging.getLogger(__name__)
+ logger.setLevel(logging.INFO)
+
+ # Add a handler for VictoriaLogs
+ vlogs_handler = VictoriaLogsHandler()
+ vlogs_handler.setLevel(logging.DEBUG)
+ logger.addHandler(vlogs_handler)
+
+ # Log example
+ logger.info("This is an info message")
+ ```
+
+ Please see the directory `/examples` for additional examples of how to use the handler.
+
+ ## Documentation
+
+ The full documentation can be found here: [Documentation](https://erikkalkoken.github.io/python-victorialogs-handler/).
+
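The `record_to_stream` parameter of `VictoriaLogsHandler` accepts any callable that maps a `LogRecord` to a stream name. As a minimal sketch, the hypothetical function `stream_by_two_levels` below (not part of the package) groups logs by the first two components of the logger name. It uses only the standard library, so it can be tried without a running VictoriaLogs server.

```python
import logging


def stream_by_two_levels(record: logging.LogRecord) -> str:
    """Hypothetical stream function: keep at most two logger name components."""
    parts = record.name.split(".")
    return ".".join(parts[:2])


# Build a record by hand to try the function without attaching any handler.
record = logging.makeLogRecord({"name": "my_package.api.views"})
print(stream_by_two_levels(record))
```

Assuming the constructor signature shown in `handler.py`, such a function would be passed as `VictoriaLogsHandler(record_to_stream=stream_by_two_levels)`.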
@@ -0,0 +1,8 @@
+ vlogs_handler/__init__.py,sha256=7rHydo4SrRUbMtjP2Ox4Uk0JwrK6qGpHViaja1_q-sQ,315
+ vlogs_handler/handler.py,sha256=d7VM2k1vbwghteQaAjSwut78EwHoF0Jgmxr4xvs_2Yg,10080
+ vlogs_handler/request.py,sha256=4njqG5uAOWlNJYOuhIEvX9t1EGWCci6HELChnJeKHAQ,1877
+ vlogs_handler/technical.md,sha256=Jhcnu0lHMwRDvyEFrurBy5ZTsTWzbpuOZozU5qvxWl0,2671
+ victorialogs_handler-0.1.0a1.dist-info/licenses/LICENSE,sha256=jvY6lHKeMpeOzbdm1LfDI5iujiRGfVKUW-sNhkYFfos,1080
+ victorialogs_handler-0.1.0a1.dist-info/WHEEL,sha256=G2gURzTEtmeR8nrdXUJfNiB3VYVxigPQ-bEQujpNiNs,82
+ victorialogs_handler-0.1.0a1.dist-info/METADATA,sha256=LK_wbKaVZZwz-rO5VRI8LdpmOLoCAoXlZVP3GbsT0Xg,3778
+ victorialogs_handler-0.1.0a1.dist-info/RECORD,,
@@ -0,0 +1,4 @@
+ Wheel-Version: 1.0
+ Generator: flit 3.12.0
+ Root-Is-Purelib: true
+ Tag: py3-none-any
@@ -0,0 +1,21 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2026 Erik Kalkoken
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
@@ -0,0 +1,16 @@
+ """
+ This is the documentation for the VictoriaLogs Handler.
+
+ # Readme
+ .. include:: ../../README.md
+    :start-line: 2
+    :end-line: 59
+ .. include:: technical.md
+ """
+
+ from .handler import VictoriaLogsHandler  # noqa: F401
+
+ __title__ = "VictoriaLogs Handler"
+ __version__ = "0.1.0a1"
+
+ __all__ = ["VictoriaLogsHandler"]
@@ -0,0 +1,330 @@
+ """A module that provides the implementation of the vlogs logging handler."""
+
+ import io
+ import logging
+ import os
+ import queue
+ import sys
+ import threading
+ import traceback
+ import urllib.parse
+ from typing import Callable, List, Optional, Tuple
+
+ import orjson
+
+ from . import request
+
+ logger = logging.getLogger(__name__)
+
+ _STANDARD_ATTRS: frozenset[str] = frozenset(
+     {
+         "args",
+         "asctime",
+         "created",
+         "exc_info",
+         "exc_text",
+         "filename",
+         "funcName",
+         "levelname",
+         "levelno",
+         "lineno",
+         "message",
+         "module",
+         "msecs",
+         "msg",
+         "name",
+         "pathname",
+         "process",
+         "processName",
+         "relativeCreated",
+         "stack_info",
+         "taskName",
+         "thread",
+         "threadName",
+     }
+ )
+
+
+ class VictoriaLogsHandler(logging.Handler):
+     """A handler class which dispatches logging records to a VictoriaLogs server.
+
+     Args:
+         batch_size: New logs are submitted immediately once this threshold is reached.
+         flush_interval: New logs are submitted every x seconds.
+         buffer_size: Maximum number of logs the buffer can hold.
+             If buffer_size <= 0, the size is unlimited (not recommended).
+             When the buffer is full any new logs will be discarded.
+             100 000 logs consume approx. 80-100 MB of RAM.
+             Tip: The buffer should be large enough to hold incoming logs
+             while the log server is down for regular maintenance.
+         chunk_size: Maximum number of logs sent per request to the log server.
+         record_to_stream: A function that returns the value for the `stream` field
+             for a log record. The default will return the name of the top package.
+         request_timeout: Timeout when sending a request to the vlogs server in seconds.
+         start_worker: Whether to start the worker at initialization.
+             Alternatively, the worker can be started later by calling `start()`.
+         shutdown_timeout: Timeout when waiting for the worker to shut down.
+         url: URL of the vlogs server, e.g. `"http://localhost:9428"`
+     """
+
+     def __init__(
+         self,
+         batch_size: int = 125,
+         buffer_size: int = 100_000,
+         chunk_size: int = 1_000,
+         flush_interval: float = 5.0,
+         record_to_stream: Optional[Callable[[logging.LogRecord], str]] = None,
+         request_timeout: float = 3.0,
+         shutdown_timeout: float = 2.0,
+         start_worker: bool = True,
+         url: str = "http://localhost:9428",
+     ):
+         """Initializes the instance."""
+
+         super().__init__()
+
+         if batch_size < 1:
+             raise ValueError(f"batch_size must be >= 1: {batch_size}")
+
+         if chunk_size < 1:
+             raise ValueError(f"chunk_size must be >= 1: {chunk_size}")
+
+         if flush_interval < 0:
+             raise ValueError(f"flush_interval must be >= 0: {flush_interval}")
+
+         if record_to_stream and not callable(record_to_stream):
+             raise ValueError("record_to_stream must be a callable")
+
+         if request_timeout <= 0:
+             raise ValueError(f"request_timeout must be > 0: {request_timeout}")
+
+         if shutdown_timeout <= 0:
+             raise ValueError(f"shutdown_timeout must be > 0: {shutdown_timeout}")
+
+         if not request.is_url(url):
+             raise ValueError(f"url is not valid: {url}")
+
+         name = __package__
+         if not name:
+             raise RuntimeError("must run as module")
+
+         self.addFilter(_create_filter(name))
+
+         self._batch_size = int(batch_size)
+         self._buffer = queue.Queue(int(buffer_size))
+         self._chunk_size = int(chunk_size)
+         self._flush_interval = float(flush_interval)
+         self._record_to_stream = record_to_stream or _top_package_name
+         self._request_timeout = float(request_timeout)
+         self._shutdown_timeout = float(shutdown_timeout)
+         self._vlogs_url = (
+             urllib.parse.urljoin(url, "/insert/jsonline")
+             + "?"
+             + urllib.parse.urlencode(
+                 {
+                     "_stream_fields": "stream",
+                     "_time_field": "timestamp",
+                     "_msg_field": "message",
+                 }
+             )
+         )
+         self._worker_thread = threading.Thread(target=self._worker, daemon=True)
+
+         self._lock = threading.Lock()
+         self._worker_started = False
+         self._worker_run = threading.Event()
+         self._worker_shutdown = threading.Event()
+         self._added_count = 0
+
+         if start_worker:
+             self.start()
+
+     def close(self):
+         """Cleanup resources and flush the buffer."""
+         with self._lock:
+             if self._worker_shutdown.is_set():
+                 return  # run shutdown once only
+
+             self._worker_shutdown.set()
+
+         if self._worker_started:
+             self._worker_run.set()
+             self._worker_thread.join(timeout=self._shutdown_timeout)
+
+         try:
+             self.flush()
+         except Exception:
+             pass
+
+         try:
+             count = 0
+             stderr = sys.stderr.fileno()
+             while True:
+                 try:
+                     log = self._buffer.get_nowait()
+                 except queue.Empty:
+                     break
+
+                 try:
+                     os.write(stderr, log + b"\n")  # print log to stderr
+                 except Exception:
+                     break  # abort when writing to stderr no longer possible
+
+                 count += 1
+
+             if count > 0:
+                 print(
+                     f"Dumped {count} remaining logs to stderr during shutdown.",
+                     file=sys.stderr,
+                 )
+         finally:
+             super().close()
+
+     def emit(self, record: logging.LogRecord) -> None:
+         """@private"""
+         try:
+             log = _serialize_log_to_json(record, self._record_to_stream)
+             self._buffer.put_nowait(log)
+
+         except Exception:
+             self.handleError(record)
+             return
+
+         with self._lock:
+             self._added_count += 1
+             # Wake the worker as soon as a full batch has been added.
+             if self._added_count >= self._batch_size:
+                 self._added_count = 0
+                 self._worker_run.set()
+
+     def start(self):
+         """Start the worker. Is a no-op when the worker is already running."""
+         with self._lock:
+             if self._worker_started:
+                 return
+
+             self._worker_started = True
+             self._worker_thread.start()
+
+         logger.debug("Worker started")
+
+     def _worker(self):
+         while not self._worker_shutdown.is_set():
+             self._worker_run.wait(timeout=self._flush_interval)
+             self._worker_run.clear()
+             self.flush()
+
+         logger.debug("Worker stopped")
+
+     def flush(self):
+         """Flush the buffer and send all logs to the log server."""
+         failed: List[bytes] = []
+         done = False
+
+         while not done:
+             logs: List[bytes] = []
+             while len(logs) < self._chunk_size:
+                 try:
+                     logs.append(self._buffer.get_nowait())
+                 except queue.Empty:
+                     done = True
+                     break
+
+             if logs:
+                 ok = request.post_ndjson(
+                     url=self._vlogs_url, objs=logs, timeout=self._request_timeout
+                 )
+                 if not ok:
+                     failed += logs
+                     logger.warning("Failed transmitting %d logs to server", len(logs))
+                 else:
+                     logger.debug("Completed transmitting %s logs to server", len(logs))
+
+         if not failed:
+             return
+
+         n = 0
+         for log in failed:
+             try:
+                 self._buffer.put_nowait(log)
+                 n += 1
+             except queue.Full:
+                 logger.error(
+                     "Discarded %s logs after failed send because buffer is full",
+                     len(failed) - n,
+                 )
+                 break
+
+         logger.debug("Saved %d logs to buffer after failed send", n)
+
+
+ def _serialize_log_to_json(
+     record: logging.LogRecord, record_to_stream: Callable[[logging.LogRecord], str]
+ ) -> bytes:
+     """Serialize a log record into a JSON object and return it."""
+     obj = {
+         "stream": record_to_stream(record),
+         "timestamp": record.created,
+         "level": record.levelname,
+         "logger": record.name,
+         "module": record.module,
+         "function": record.funcName,
+         "line_number": record.lineno,
+         "message": record.getMessage(),
+         "process_name": record.processName,
+         "process": record.process,
+         "thread_name": record.threadName,
+         "thread": record.thread,
+     }
+     if record.exc_info:
+         obj["exception_name"], obj["exception"] = _format_exception(record.exc_info)
+
+     for k, v in record.__dict__.items():
+         if k not in _STANDARD_ATTRS:
+             obj[k] = v
+
+     log = orjson.dumps(obj, default=str)
+     return log
+
+
+ def _create_filter(name: str):
+     def filter_logic(record: logging.LogRecord) -> bool:
+         return not record.name.startswith(name)
+
+     f = logging.Filter()
+     f.filter = filter_logic
+     return f
+
+
+ def _format_exception(ei) -> Tuple[str, str]:
+     """Format and return the name of the exception
+     and specified exception information as strings.
+
+     This default implementation just uses
+     traceback.print_exception()
+
+     Based on: logging.Formatter.formatException()
+     """
+     sio = io.StringIO()
+     tb = ei[2]
+     traceback.print_exception(ei[0], ei[1], tb, None, sio)
+     s = sio.getvalue()
+     sio.close()
+     if s[-1:] == "\n":
+         s = s[:-1]
+
+     name = ei[0].__name__ if ei[0] is not None else ""
+     return name, s
+
+
+ def _top_package_name(record: logging.LogRecord) -> str:
+     """Return the top package name of a log."""
+     if record.name == "__main__":
+         return "(undefined)"
+
+     s = record.name.split(".")
+     if len(s) > 1:
+         return s[0]
+
+     return record.name
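The worker in `handler.py` combines a timer with an early wake-up: it waits on an event with a timeout, and `emit()` sets that event once enough logs have accumulated. The following stand-alone sketch illustrates that pattern under simplifying assumptions. `MiniBatcher` and all of its attributes are hypothetical names, not part of the package, and appending to a list stands in for the HTTP request.

```python
import queue
import threading


class MiniBatcher:
    """Toy illustration: flush on a timer tick or when a batch is full."""

    def __init__(self, batch_size=3, flush_interval=60.0):
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.buffer = queue.Queue()
        self.sent = []  # stands in for requests sent to a log server
        self.flushed = threading.Event()
        self._wake = threading.Event()
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add(self, item):
        self.buffer.put_nowait(item)
        with self._lock:
            self._count += 1
            if self._count >= self.batch_size:
                self._count = 0
                self._wake.set()  # wake the worker before the next tick

    def _run(self):
        while not self._stop.is_set():
            # Returns on the tick (timeout) or as soon as _wake is set.
            self._wake.wait(timeout=self.flush_interval)
            self._wake.clear()
            self._flush()

    def _flush(self):
        batch = []
        while True:
            try:
                batch.append(self.buffer.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.sent.append(batch)
            self.flushed.set()

    def close(self, timeout=2.0):
        self._stop.set()
        self._wake.set()
        self._thread.join(timeout=timeout)
        self._flush()  # final attempt for anything still buffered


batcher = MiniBatcher(batch_size=3, flush_interval=60.0)
for i in range(3):
    batcher.add(i)  # the third add wakes the worker early
batcher.flushed.wait(timeout=5.0)
batcher.close()
print(batcher.sent)
```

With a 60 second interval, the flush in this example can only have been caused by the size threshold, which is the point of the hybrid trigger.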
@@ -0,0 +1,69 @@
+ """A module that provides the functionality to send HTTP requests."""
+
+ import logging
+ import urllib.error
+ import urllib.parse
+ import urllib.request
+ from typing import List, Optional
+
+ logger = logging.getLogger(__name__)
+
+
+ def is_url(url: str) -> bool:
+     """Report whether a string represents a valid URL."""
+     try:
+         result = urllib.parse.urlparse(url)
+         return all([result.scheme, result.netloc])
+     except ValueError:
+         return False
+
+
+ def post_ndjson(
+     *, url: str, objs: List[bytes], timeout: Optional[float] = None
+ ) -> bool:
+     """Send a POST request with the ndjson protocol
+     and report whether it was successful.
+
+     Args:
+         url: request URL
+         objs: list of serialized JSON objects to send
+         timeout: request timeout in seconds.
+             Setting it to None will disable the timeout.
+     """
+
+     data = b"\n".join(objs)
+     req = urllib.request.Request(url, data=data, method="POST")
+     req.add_header("Content-Type", "application/x-ndjson")
+
+     try:
+         with urllib.request.urlopen(req, timeout=timeout) as resp:
+             logger.debug("Submitted: %s %d %s", url, resp.status, resp.reason)
+             return True
+
+     except urllib.error.HTTPError as ex:
+         # Tolerate undecodable bytes so that error reporting cannot fail itself.
+         body = ex.read(4096).decode("utf-8", errors="replace")
+         logger.error(
+             "HTTP Error: %s",
+             ex.reason,
+             extra={
+                 "url": ex.url,
+                 "code": ex.code,
+                 "reason": ex.reason,
+                 "body": body,
+             },
+         )
+
+     except urllib.error.URLError as ex:
+         logger.error(
+             "URL Error: %s",
+             ex.reason,
+             extra={"url": url, "reason": ex.reason},
+         )
+
+     except TimeoutError:
+         logger.warning("Timed out", extra={"url": url})
+
+     except Exception:
+         logger.exception("general exception", extra={"url": url})
+
+     return False
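`post_ndjson()` builds its request body by joining already-serialized JSON objects with newlines, so that each line carries one object. A minimal sketch of that payload format follows. It uses the standard library `json` module instead of `orjson` to stay dependency-free, and `to_ndjson` is a hypothetical helper name, not a function of the package.

```python
import json
from typing import List


def to_ndjson(objs: List[bytes]) -> bytes:
    """Join serialized JSON objects into one body, one object per line."""
    return b"\n".join(objs)


logs = [
    json.dumps({"message": "first", "level": "INFO"}).encode("utf-8"),
    json.dumps({"message": "second", "level": "ERROR"}).encode("utf-8"),
]
body = to_ndjson(logs)

# Each line of the body parses back into exactly one JSON object.
decoded = [json.loads(line) for line in body.split(b"\n")]
print(decoded[1]["level"])
```

This works because JSON serializers escape newlines inside strings, so a raw newline can only appear between objects.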
@@ -0,0 +1,55 @@
+ # Technical overview
+
+ This section gives a technical overview of how the vlogs handler works.
+
+ ## Process Flow
+
+ The high-level process flow of the vlogs handler is as follows:
+
+ 1. When a log event is received, it is converted into JSON and stored
+    in the buffer
+ 1. At the tick of an interval (e.g. 5 seconds) or when a threshold is reached
+    (e.g. 125 logs) a background worker starts the process of submitting logs
+    from the buffer to the log server
+ 1. Logs are combined into chunks (e.g. 1,000 logs per request)
+    and then submitted to the log server using the VictoriaLogs JSON stream API
+
+ ## Failure behavior
+
+ The failure behavior of the handler in key scenarios is as follows:
+
+ - In case the submission to the log server fails
+   (e.g. the server is temporarily down) the logs will be stored
+   in the buffer for later retry
+ - If the buffer is full, new log events will be discarded
+   and reported via `handleError()` (depending on the logging configuration)
+ - The handler will make a final attempt to submit remaining logs during shutdown.
+   If that fails, those logs will be written to stderr.
+
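The first two bullets can be sketched with a bounded queue: logs from a failed request are put back for a later retry, and whatever no longer fits is dropped. This is a simplified illustration, and `requeue` is a hypothetical helper name rather than a function of the package.

```python
import queue
from typing import List, Tuple


def requeue(buffer: queue.Queue, failed: List[bytes]) -> Tuple[int, int]:
    """Put failed logs back into the buffer; return (saved, discarded)."""
    saved = 0
    for log in failed:
        try:
            buffer.put_nowait(log)
            saved += 1
        except queue.Full:
            break  # the remaining logs cannot be kept
    return saved, len(failed) - saved


buffer = queue.Queue(maxsize=2)
result = requeue(buffer, [b'{"message": "a"}', b'{"message": "b"}', b'{"message": "c"}'])
print(result)
```

A larger buffer therefore buys more tolerance for server downtime at the cost of memory.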
+ ## LogRecord fields
+
+ The following fields will be transferred for each log event. They are derived from Python's [LogRecord](https://docs.python.org/3/library/logging.html#logrecord-objects):
+
+ Name | Description | Example | Optional
+ -- | -- | -- | --
+ `exception_name` | Name of the exception | `ZeroDivisionError` | yes
+ `exception` | Full traceback of the exception | `Traceback ...` | yes
+ `function` | Name of the function that emitted the log event | `my_function` | no
+ `level` | Name of the level of the emitted log event | `INFO` | no
+ `line_number` | Line number where the log event was emitted | `89` | no
+ `logger` | Name of the related Python logger | `my_package.my_module` | no
+ `message` | The logged message | `This is a log entry` | no
+ `module` | Name of the module that emitted the log event | `my_module` | no
+ `process` | ID of the process that emitted the log event | `12345` | no
+ `process_name` | Name of the process that emitted the log event | `MainProcess` | no
+ `stream` | This can be configured. By default this is the name of the top-level Python package that emitted the log event. | `my_package` | no
+ `thread` | ID of the thread that emitted the log event | `140234567890` | no
+ `thread_name` | Name of the thread that emitted the log event | `MainThread` | no
+ `timestamp` | Timestamp of the log event, represented as fractional UNIX epoch | `1775081468.4308655` | no
+
+ In addition, any custom `extra` fields will be added as they are encountered.
+
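Custom fields passed through the `extra` argument of a logging call end up as additional attributes on the `LogRecord`, which is how they can be told apart from the standard ones. The sketch below shows the idea with the standard library only. `extra_fields` and `_BASELINE` are hypothetical names, and deriving the baseline from an empty record is an assumption made for this example (the handler itself uses a fixed set of attribute names).

```python
import logging

# Attribute names every LogRecord has, plus two that formatters may add later.
_BASELINE = frozenset(logging.makeLogRecord({}).__dict__) | {"message", "asctime"}


def extra_fields(record: logging.LogRecord) -> dict:
    """Return only the attributes that were supplied via `extra`."""
    return {k: v for k, v in record.__dict__.items() if k not in _BASELINE}


# Build a record the same way a logger would for logger.info(..., extra=...).
record = logging.getLogger("my_package.my_module").makeRecord(
    "my_package.my_module",
    logging.INFO,
    "example.py",
    89,
    "User %s logged in",
    ("alice",),
    None,
    extra={"user_id": 42},
)
payload = {
    "message": record.getMessage(),
    "level": record.levelname,
    **extra_fields(record),
}
print(payload)
```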
+ ## VictoriaLogs special fields
+
+ VictoriaLogs handles three fields in a special way:
+
+ - `_msg`: The logged message. This is a mandatory field and is mapped to `message`.
+ - `_time`: The timestamp of the log event. This field is mapped to `timestamp`.
+ - `_stream`: The source of a log event, which is used to group and filter logs. This field is mapped to `stream`.
+
+ For more information please also see [VictoriaLogs Data model](https://docs.victoriametrics.com/victorialogs/keyconcepts/#data-model).
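The mapping above is expressed through query parameters on the ingestion URL, which `handler.py` assembles from the base URL. A minimal sketch of that URL construction, using the same parameter names as the handler; `build_insert_url` is a hypothetical helper name introduced only for this example.

```python
import urllib.parse


def build_insert_url(base_url: str) -> str:
    """Build the JSON line ingestion URL with the field mapping as query string."""
    query = urllib.parse.urlencode(
        {
            "_stream_fields": "stream",
            "_time_field": "timestamp",
            "_msg_field": "message",
        }
    )
    return urllib.parse.urljoin(base_url, "/insert/jsonline") + "?" + query


print(build_insert_url("http://localhost:9428"))
```

Because the path passed to `urljoin` is absolute, any path component in the base URL is replaced rather than extended.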