wtftools-0.0.0.tar.gz
- wtftools-0.0.0/CHANGELOG.md +398 -0
- wtftools-0.0.0/LICENSE +21 -0
- wtftools-0.0.0/MANIFEST.in +5 -0
- wtftools-0.0.0/PKG-INFO +246 -0
- wtftools-0.0.0/README.md +184 -0
- wtftools-0.0.0/pyproject.toml +108 -0
- wtftools-0.0.0/scripts/build-deb.sh +33 -0
- wtftools-0.0.0/scripts/wtf.bash-completion +134 -0
- wtftools-0.0.0/setup.cfg +4 -0
- wtftools-0.0.0/tests/test_audit.py +331 -0
- wtftools-0.0.0/tests/test_audit_extras.py +116 -0
- wtftools-0.0.0/tests/test_colors.py +91 -0
- wtftools-0.0.0/tests/test_config.py +100 -0
- wtftools-0.0.0/tests/test_cron.py +390 -0
- wtftools-0.0.0/tests/test_explain_deep.py +254 -0
- wtftools-0.0.0/tests/test_info.py +100 -0
- wtftools-0.0.0/tests/test_iteration10.py +455 -0
- wtftools-0.0.0/tests/test_iteration2_extras.py +206 -0
- wtftools-0.0.0/tests/test_iteration3.py +359 -0
- wtftools-0.0.0/tests/test_iteration4.py +368 -0
- wtftools-0.0.0/tests/test_iteration5.py +315 -0
- wtftools-0.0.0/tests/test_iteration6.py +380 -0
- wtftools-0.0.0/tests/test_iteration7.py +405 -0
- wtftools-0.0.0/tests/test_iteration8.py +389 -0
- wtftools-0.0.0/tests/test_main.py +245 -0
- wtftools-0.0.0/tests/test_main_extras.py +144 -0
- wtftools-0.0.0/tests/test_public_api.py +99 -0
- wtftools-0.0.0/tests/test_sysinfo.py +660 -0
- wtftools-0.0.0/wtftools/__init__.py +55 -0
- wtftools-0.0.0/wtftools/__main__.py +10 -0
- wtftools-0.0.0/wtftools/audit.py +809 -0
- wtftools-0.0.0/wtftools/colors.py +111 -0
- wtftools-0.0.0/wtftools/config.py +249 -0
- wtftools-0.0.0/wtftools/cron.py +388 -0
- wtftools-0.0.0/wtftools/events.py +220 -0
- wtftools-0.0.0/wtftools/explain.py +290 -0
- wtftools-0.0.0/wtftools/info.py +90 -0
- wtftools-0.0.0/wtftools/llm.py +129 -0
- wtftools-0.0.0/wtftools/main.py +1328 -0
- wtftools-0.0.0/wtftools/snapshot.py +203 -0
- wtftools-0.0.0/wtftools/sysinfo.py +1608 -0
- wtftools-0.0.0/wtftools.egg-info/PKG-INFO +246 -0
- wtftools-0.0.0/wtftools.egg-info/SOURCES.txt +45 -0
- wtftools-0.0.0/wtftools.egg-info/dependency_links.txt +1 -0
- wtftools-0.0.0/wtftools.egg-info/entry_points.txt +3 -0
- wtftools-0.0.0/wtftools.egg-info/requires.txt +12 -0
- wtftools-0.0.0/wtftools.egg-info/top_level.txt +1 -0
|
@@ -0,0 +1,398 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
|
5
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
6
|
+
|
|
7
|
+
## [Unreleased]
|
|
8
|
+
|
|
9
|
+
### Removed — plugin infrastructure
|
|
10
|
+
- `wtftools/plugin_sdk.py` — Python helper for plugin authors.
|
|
11
|
+
- `wtftools/checks/plugins.py` — discovery / executor / parser for
|
|
12
|
+
`/etc/wtf/checks.d/` scripts (bash + Python).
|
|
13
|
+
- `_plugin_to_check` + `_all_check_callables` glue in `wtftools/audit.py`.
|
|
14
|
+
- `tests/test_plugins.py`, `tests/test_iteration16.py`.
|
|
15
|
+
- README's «Plugins» section and QUICKSTART's «Custom checks (plugins)»
|
|
16
|
+
section.
|
|
17
|
+
|
|
18
|
+
The CLI is now a closed set of built-in checks. Custom logic should live
|
|
19
|
+
upstream (e.g. monitoring tools) or be added as new built-in checks via PR.
|
|
20
|
+
|
|
21
|
+
### Changed — layout flattening
|
|
22
|
+
- `wtftools/checks/cron.py` → `wtftools/cron.py`
|
|
23
|
+
- `wtftools/checks/sysinfo.py` → `wtftools/sysinfo.py`
|
|
24
|
+
- `wtftools/checks/` subpackage removed (`__init__.py` deleted).
|
|
25
|
+
- All imports updated: `from wtftools.checks import X` → `from wtftools import X`.
|
|
26
|
+
|
|
27
|
+
### Build & release
|
|
28
|
+
- `release.yml` no longer publishes a Docker image to GHCR. On a `v*` /
|
|
29
|
+
`*.*.*` tag it now runs tests, builds a `.deb` via `scripts/build-deb.sh`
|
|
30
|
+
(stdeb + debhelper toolchain), and attaches the artifact to the matching
|
|
31
|
+
GitHub Release. PyPI publishing happens separately in `publish.yml` via
|
|
32
|
+
OIDC trusted-publisher.
|
|
33
|
+
- `Dockerfile` stays in-repo for ad-hoc `docker build .` use, but no longer
|
|
34
|
+
ships as a release artifact.
|
|
35
|
+
|
|
36
|
+
### Tooling
|
|
37
|
+
- `black` added to `.pre-commit-config.yaml`, running before `ruff` on
|
|
38
|
+
every commit. Config lives in `pyproject.toml [tool.black]`
|
|
39
|
+
(`line-length=180`, `target-version=["py38"]`). Existing tree is already
|
|
40
|
+
black-compatible — no formatting churn.
|
|
41
|
+
|
|
42
|
+
### Changed — scope cleanup
|
|
43
|
+
wtftools is now strictly a **one-shot CLI**. The daemon / fleet / multi-host
|
|
44
|
+
story was removed in favor of the original PROJECT.md Phase 1 vision:
|
|
45
|
+
one server, one command, immediate answer. Removed:
|
|
46
|
+
|
|
47
|
+
- `wtfd` daemon (HTTP server, periodic audit loop, `POST /run-now`)
|
|
48
|
+
- `wtf serve` subcommand and `wtfd` console-script entry point
|
|
49
|
+
- `wtf fleet` (multi-host aggregation) and `wtf compare HOSTA HOSTB`
|
|
50
|
+
- `wtf plugins` listing subcommand (plugins still load — see
|
|
51
|
+
`wtf audit --list-checks` for the `plugin:*` entries)
|
|
52
|
+
- `wtf motd-install` (replace with three lines of shell, see QUICKSTART)
|
|
53
|
+
- `wtf init` interactive wizard (its useful step — writing the example
|
|
54
|
+
config — is `wtf config --example | sudo tee /etc/wtftools/config.ini`)
|
|
55
|
+
- `--watch` flags on `wtf audit`, `wtf info`, `wtf events`
|
|
56
|
+
- `wtf audit --diff` (the standalone `wtf diff` command remains)
|
|
57
|
+
- Bundled `scripts/wtfd.service` systemd unit
|
|
58
|
+
- `wtftools/daemon.py` and `wtftools/fleet.py` modules
|
|
59
|
+
|
|
60
|
+
Kept: `wtf audit --save`, `wtf diff`, `wtf history` — snapshots are pure
|
|
61
|
+
filesystem operations under `~/.cache/wtftools/`, no daemon required.
|
|
62
|
+
|
|
63
|
+
### Added
|
|
64
|
+
- **`wtf problems`** — alias for `wtf audit --only problem`, surfaces just
|
|
65
|
+
the WARN+FAIL rows. Most common audit invocation during an incident,
|
|
66
|
+
given its own subcommand for typing comfort.
|
|
67
|
+
|
|
68
|
+
## [0.0.0] — 2026-05-20
|
|
69
|
+
|
|
70
|
+
Initial public release. Highlights:
|
|
71
|
+
|
|
72
|
+
- **19 subcommands** covering audit / info / services / logs / events /
|
|
73
|
+
history / diff / crontab / doctor / plugins / config / motd-install /
|
|
74
|
+
init / fleet / compare / explain / top / ports / serve.
|
|
75
|
+
- **`wtfd` daemon** with HTTP API (`/audit`, `/audit.json`, `/audit.prom`,
|
|
76
|
+
`/history`, `/snapshot/N`, `POST /run-now`) — drives the fleet story.
|
|
77
|
+
- **38 built-in checks**, plugin system with bash + Python SDK, six output
|
|
78
|
+
formats (text/json/csv/plain/html/prometheus).
|
|
79
|
+
- **Multi-host fleet aggregation** (`wtf fleet`) + host-to-host drift
|
|
80
|
+
detection (`wtf compare`), both with `--watch` and `--run-now`.
|
|
81
|
+
- **LLM bridge** for `wtf explain --llm ollama|claude|openai|auto`.
|
|
82
|
+
- Distribution: PyPI, debian packaging, Docker image, systemd unit,
|
|
83
|
+
bundled MOTD installer, bash completion, GitHub Actions
|
|
84
|
+
release workflow.
|
|
85
|
+
- **724 tests, 92.6 % coverage.**
|
|
86
|
+
|
|
87
|
+
### Added — Plugin SDK & docs (final iteration)
|
|
88
|
+
- **`wtftools.plugin_sdk`** — tiny helper module so Python plugins don't have
|
|
89
|
+
to remember exit codes or hand-roll JSON:
|
|
90
|
+
|
|
91
|
+
```python
#!/usr/bin/env python3
from wtftools.plugin_sdk import ok, warn, fail, skip
# ... your check ...
fail("internal-api unreachable", detail=["…"])  # exits 2 with JSON
```
|
|
97
|
+
|
|
98
|
+
Exposes `ok / warn / fail / skip` (terminating) and `result(status, message,
|
|
99
|
+
detail=None)` (non-terminating, for scripts that emit multiple results).
|
|
100
|
+
Detail items are coerced to strings.
|
|
101
|
+
- **`examples/plugins/check-http-health.py`** — example Python plugin using
|
|
102
|
+
the SDK; probes an HTTP endpoint with latency thresholds.
|
|
103
|
+
- **`docs/PLUGIN_GUIDE.md`** — comprehensive plugin author's guide. Documents
|
|
104
|
+
both exit-code and JSON contracts, shows bash + Python quickstarts, lists
|
|
105
|
+
best practices, and points at the 5 example plugins.
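
As a rough illustration of the two plugin contracts mentioned above, the sketch below maps the exit codes this changelog lists for the plugin system (0=ok, 1=warn, 2=fail, 77=skip) and the optional one-line JSON object onto a single result. The helper name `parse_plugin_output` and the choice to treat unknown exit codes as `fail` are assumptions made for illustration; this is not the removed `wtftools/checks/plugins.py` code.

```python
import json

# Exit-code contract as listed in this changelog: 0=ok, 1=warn, 2=fail, 77=skip.
EXIT_STATUS = {0: "ok", 1: "warn", 2: "fail", 77: "skip"}


def parse_plugin_output(exit_code, stdout):
    """Turn a plugin's exit code and stdout into a result dict (hypothetical helper)."""
    text = stdout.strip()
    # JSON contract: a one-line object with status, message and optional detail.
    if text.startswith("{"):
        try:
            obj = json.loads(text)
        except ValueError:
            obj = None
        if isinstance(obj, dict) and "status" in obj:
            return {
                "status": str(obj["status"]),
                "message": str(obj.get("message", "")),
                # Detail items are coerced to strings, as the SDK notes describe.
                "detail": [str(item) for item in obj.get("detail") or []],
            }
    # Exit-code contract: stdout becomes the message.
    # Assumption: an unknown exit code is treated as a failure.
    status = EXIT_STATUS.get(exit_code, "fail")
    return {"status": status, "message": text, "detail": []}
```

A script that prints plain text and exits 1 would come back as a `warn` row, while one that prints a JSON object controls its own status, message and detail.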
|
|
106
|
+
|
|
107
|
+
- **`wtf fleet --watch SECONDS`** — auto-refresh fleet aggregation
|
|
108
|
+
(mirrors the existing `audit --watch` and `info --watch`). Default
|
|
109
|
+
off; pick an interval that respects N-hosts × per-fetch cost.
|
|
110
|
+
- **`wtf fleet --run-now`** — POST `/run-now` to every peer before
|
|
111
|
+
fetching, so the aggregator gets fresh data instead of cached snapshots.
|
|
112
|
+
Best-effort: partial failures print a `run-now reached M/N peer(s)`
|
|
113
|
+
status line and the fetch proceeds anyway.
|
|
114
|
+
- **`wtf events --watch SECONDS`** — auto-refresh the event timeline.
|
|
115
|
+
Useful in an incident war room.
|
|
116
|
+
- **`docs/QUICKSTART.md`** — 5-minute onboarding guide (README grew past 250
|
|
117
|
+
lines — newcomers needed something smaller). Covers install, the
|
|
118
|
+
incident-triage flow, fleet/Prometheus setup, custom checks, and a
|
|
119
|
+
cheat-sheet table mapping common questions to commands.
|
|
120
|
+
|
|
121
|
+
- **`wtf events`** — chronological host timeline. Merges six event sources
|
|
122
|
+
into one newest-first view: reboots (via `last -x reboot`), OOM kills,
|
|
123
|
+
kernel errors, failed-unit transitions, SSH auth failures, recent logins.
|
|
124
|
+
Flags: `--since HOURS` (default 24), `--kind KIND` (repeatable filter),
|
|
125
|
+
`--limit N`, `--format json`. Useful during incident post-mortems: one
|
|
126
|
+
command replaces `last`, `journalctl -k`, `journalctl SYSTEMD_UNIT=…`, and
|
|
127
|
+
several SSH-log greps.
|
|
128
|
+
- **`POST /run-now`** in wtfd — trigger an immediate audit from outside the
|
|
129
|
+
scheduler interval. Used by central dashboards that want fresh data on
|
|
130
|
+
demand. Returns `202 Accepted` instantly; the actual run completes in the
|
|
131
|
+
background and appears on the next `/audit.json` fetch. Auth-token-respecting.
|
|
132
|
+
The scheduler now wakes within ~1s of receiving a `/run-now`.
|
|
133
|
+
|
|
134
|
+
- **`wtf compare HOSTA HOSTB`** — side-by-side diff of two wtfd hosts.
|
|
135
|
+
Real-world SRE use case: «two boxes from the same template, why does
|
|
136
|
+
one behave differently?» Fetches `/audit.json` from both, walks the
|
|
137
|
+
merged set of check names, marks each row as `=` (identical), `DIFF`
|
|
138
|
+
(status or message differs), `A→` (only on A), `→B` (only on B).
|
|
139
|
+
`--only-drift` hides identical rows. `--format json` for pipelines.
|
|
140
|
+
`--token-file` if peers require Bearer auth. Exit code: 0 identical,
|
|
141
|
+
1 drift present, 2 if at least one host is unreachable.
|
|
142
|
+
- **`wtf doctor --check-updates`** — opt-in PyPI version check. Queries
|
|
143
|
+
`https://pypi.org/pypi/wtftools/json` (3s timeout) and surfaces a
|
|
144
|
+
`[WARN]` row if a newer release is published. Off by default — `doctor`
|
|
145
|
+
stays an offline operation unless the operator explicitly opts in.
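
The row markers and exit codes described for `wtf compare` earlier in this list can be sketched as below. The input shape (a mapping from check name to a `(status, message)` pair) and the function name `compare_audits` are assumptions; the real `/audit.json` payload is not shown in this changelog, and the exit code 2 for unreachable hosts is left out because it depends on the fetch step.

```python
def compare_audits(checks_a, checks_b):
    """Compare two {name: (status, message)} mappings.

    Returns (rows, exit_code) where each row is (marker, name).
    Hypothetical helper, not the wtftools implementation.
    """
    rows = []
    # Walk the merged set of check names from both hosts.
    for name in sorted(set(checks_a) | set(checks_b)):
        if name not in checks_b:
            marker = "A→"      # only present on host A
        elif name not in checks_a:
            marker = "→B"      # only present on host B
        elif checks_a[name] == checks_b[name]:
            marker = "="       # identical status and message
        else:
            marker = "DIFF"    # status or message differs
        rows.append((marker, name))
    drift = any(marker != "=" for marker, _ in rows)
    # 0 = identical, 1 = drift present.
    return rows, (1 if drift else 0)
```

Filtering the returned rows on `marker != "="` would give the `--only-drift` view.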
|
|
146
|
+
|
|
147
|
+
- **`wtf init`** — interactive setup wizard for fresh hosts. Walks through
|
|
148
|
+
four optional steps:
|
|
149
|
+
1. write `/etc/wtftools/config.ini` (sample with defaults)
|
|
150
|
+
2. install `/etc/update-motd.d/99-wtf-brief` for the ssh-login banner
|
|
151
|
+
3. install + enable the bundled `wtfd.service` (off by default)
|
|
152
|
+
4. add `/etc/cron.d/wtftools-hourly` for an hourly audit snapshot
|
|
153
|
+
|
|
154
|
+
Use `--non-interactive` for scripted deploys; `--dry-run` to preview;
|
|
155
|
+
per-step `--enable-X` / `--no-X` flags override defaults.
|
|
156
|
+
- **`examples/plugins/`** — four ready-to-use plugin scripts:
|
|
157
|
+
- `check-cert-domain.sh` — remote TLS cert expiry probe
|
|
158
|
+
- `check-postgres-connections.sh` — Postgres `pg_stat_activity` vs `max_connections`
|
|
159
|
+
- `check-redis-memory.sh` — Redis `used_memory` vs `maxmemory`
|
|
160
|
+
- `check-disk-write.sh` — quick fsync-write latency test
|
|
161
|
+
|
|
162
|
+
Drop any of these into `/etc/wtf/checks.d/` and `wtf audit` picks them up.
|
|
163
|
+
- **`docs/schema/`** — JSON Schema (draft-07) for `--format json` outputs:
|
|
164
|
+
`audit-v1.json` and `fleet-v1.json`. Use with `check-jsonschema` or any
|
|
165
|
+
validator to build typed parsers in your integration.
|
|
166
|
+
- **`CONTRIBUTING.md`** — dev setup, test/lint commands, how to add a new
|
|
167
|
+
check or subcommand, release flow.
|
|
168
|
+
- **GitHub Actions release workflow** (`.github/workflows/release.yml`) —
|
|
169
|
+
on `v*` tag: runs the suite, builds sdist+wheel, publishes to PyPI
|
|
170
|
+
(via `PYPI_API_TOKEN` secret), builds + pushes a Docker image to GHCR.
|
|
171
|
+
|
|
172
|
+
- **`wtf fleet`** — multi-host aggregation. Pulls `/audit.json` from each
|
|
173
|
+
configured wtfd peer in parallel (`urllib` + ThreadPoolExecutor, no extra
|
|
174
|
+
deps). Renders an at-a-glance fleet view sorted by severity:
|
|
175
|
+
unreachable → fail → warn → ok. Per-host row inlines the top two problems
|
|
176
|
+
so an SRE doesn't need to drill in to know what's broken.
|
|
177
|
+
- Targets from `--hosts a:8765,b:8765` (repeatable), `--hosts-file FILE`
|
|
178
|
+
(one per line, `#` comments), or `[thresholds] fleet_hosts = …` in the
|
|
179
|
+
config file. All sources merge and dedupe.
|
|
180
|
+
- `--token-file FILE` sends `Authorization: Bearer …` to every peer.
|
|
181
|
+
- `--problem-only` hides healthy hosts during incidents.
|
|
182
|
+
- `--format prometheus` emits one set of metrics per host
|
|
183
|
+
(`wtf_fleet_host_up{host="…"}`, `wtf_fleet_summary_count{host,status}`)
|
|
184
|
+
suitable for a single scrape job targeting the aggregator.
|
|
185
|
+
- Exit codes: 0 if all hosts OK; 1 if some unreachable but no FAIL;
|
|
186
|
+
2 if any FAIL or everything is unreachable. CI-friendly.
|
|
187
|
+
- **Dockerfile** — `python:3.12-slim` base with `[full]` extras (psutil)
|
|
188
|
+
plus tools wtftools probes (procps, iproute2, smartmontools, openssl,
|
|
189
|
+
systemd-sysv, cron). `HEALTHCHECK` against `/healthz`, default entrypoint
|
|
190
|
+
is `wtf`. `.dockerignore` keeps the image lean.
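
For the `wtf fleet` entry above, the severity ordering and exit-code rules can be sketched in a few lines. The `(host, state)` tuple shape and both function names are assumptions. The changelog does not say how a WARN-only host affects the exit code, so this sketch treats it like OK.

```python
# Display order from the fleet entry: unreachable, then fail, warn, ok.
SEVERITY_ORDER = {"unreachable": 0, "fail": 1, "warn": 2, "ok": 3}


def sort_hosts(hosts):
    """Sort (host, state) pairs by severity, then by host name."""
    return sorted(hosts, key=lambda h: (SEVERITY_ORDER[h[1]], h[0]))


def fleet_exit_code(hosts):
    """Exit code per the changelog: 2 on any FAIL or if every host is
    unreachable, 1 if some hosts are unreachable, 0 otherwise."""
    states = [state for _, state in hosts]
    if "fail" in states:
        return 2
    if states and all(state == "unreachable" for state in states):
        return 2
    if "unreachable" in states:
        return 1
    # Assumption: WARN-only hosts do not change the exit code.
    return 0
```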
|
|
191
|
+
|
|
192
|
+
- **LLM bridge for `wtf explain`** — closes the loop: instead of piping the
|
|
193
|
+
structured prompt to an LLM by hand, point at a backend directly.
|
|
194
|
+
- `wtf explain --llm ollama` — subprocess call to local ollama (no API key).
|
|
195
|
+
- `wtf explain --llm claude` — uses `anthropic` SDK if installed +
|
|
196
|
+
`ANTHROPIC_API_KEY` env. Default model: `claude-haiku-4-5-20251001`.
|
|
197
|
+
- `wtf explain --llm openai` — uses `openai` SDK + `OPENAI_API_KEY`.
|
|
198
|
+
- `wtf explain --llm auto` — tries ollama → claude → openai, returns the
|
|
199
|
+
first one that responds.
|
|
200
|
+
- `--llm-model` overrides the default model; `--llm-timeout` overrides 60s.
|
|
201
|
+
- No mandatory new dependencies — the SDKs are imported lazily, missing
|
|
202
|
+
backends become a graceful skip with an explanatory message.
|
|
203
|
+
- **`wtf audit --format html`** — self-contained HTML with inline CSS.
|
|
204
|
+
Color-coded rows, collapsible detail. Survives email/ticket paste.
|
|
205
|
+
- **`wtf audit --output FILE` / `-o FILE`** — write the audit to a file
|
|
206
|
+
instead of stdout. Drops ANSI escapes automatically so logs stay clean.
|
|
207
|
+
- **`fail2ban` check** — surfaces currently-banned IP counts per jail
|
|
208
|
+
(informational, not a problem signal). Skips when `fail2ban-client` is missing
|
|
209
|
+
or the daemon is down.
|
|
210
|
+
|
|
211
|
+
- **`wtfd` daemon** — PROJECT.md Phase 2 landed. Stdlib-only single-process
|
|
212
|
+
daemon (`pip install wtftools` ships an extra `wtfd` console script).
|
|
213
|
+
Runs `audit` on a configurable cadence and serves the result over HTTP:
|
|
214
|
+
- `GET /` — brief one-liner (host, fail/warn counts, top problems)
|
|
215
|
+
- `GET /healthz` — liveness probe
|
|
216
|
+
- `GET /audit` / `/audit.txt` — current audit in plaintext
|
|
217
|
+
- `GET /audit.json` — full audit + summary + timestamp + error state
|
|
218
|
+
- `GET /audit.prom` — Prometheus textfile-collector
|
|
219
|
+
- `GET /history` — snapshot dir + list of recent basenames
|
|
220
|
+
- `GET /snapshot/N` — Nth-most-recent snapshot (by index or basename prefix)
|
|
221
|
+
|
|
222
|
+
Flags: `--listen HOST:PORT` (default `127.0.0.1:8765`), `--interval SEC`
|
|
223
|
+
(default 300 = 5 min), `--save` to persist each run as a snapshot,
|
|
224
|
+
`--auth-token-file PATH` for `Authorization: Bearer …` protection.
|
|
225
|
+
Every response carries `X-WTF-Host`, `X-WTF-Last-Audit`, `X-WTF-Version`
|
|
226
|
+
headers for trivial observability. Run via `wtf serve …` or the bare
|
|
227
|
+
`wtfd` console script.
|
|
228
|
+
|
|
229
|
+
- **systemd unit** in `scripts/wtfd.service` — `DynamicUser=yes`,
|
|
230
|
+
`StateDirectory=wtftools`, hardened (`ProtectSystem=strict`,
|
|
231
|
+
`NoNewPrivileges`, `ProtectKernel*`). Drop into `/etc/systemd/system/`,
|
|
232
|
+
`systemctl enable --now wtfd`.
|
|
233
|
+
|
|
234
|
+
- **`http-probes` and `tcp-probes` checks** — declare endpoints in
|
|
235
|
+
`[thresholds]` (`http_probes = http://localhost:80, http://localhost:9090`
|
|
236
|
+
and `tcp_probes = 127.0.0.1:5432, db.internal:6379`). Each becomes its own
|
|
237
|
+
audit row. HTTP non-2xx/3xx → FAIL; connect refused/timeout → FAIL; latency
|
|
238
|
+
≥ `probe_slow_ms` → WARN. Uses stdlib `http.client` + `socket` — no extra
|
|
239
|
+
dependencies. Catches the "service is running but not actually serving"
|
|
240
|
+
failure mode that `failed-units` misses.
|
|
241
|
+
- **`smart` check** — per-disk SMART health via `smartctl -H -j` (requires
|
|
242
|
+
`smartmontools` package, typically also root). One FAILED disk → FAIL with
|
|
243
|
+
device name in detail. Discovers disks via `lsblk`, filters out loop devices
|
|
244
|
+
and partitions.
|
|
245
|
+
- **`wtf diff`** — standalone snapshot diff command. `wtf diff` compares the
|
|
246
|
+
latest snapshot to a fresh audit (same as `wtf audit --diff`).
|
|
247
|
+
`wtf diff --snapshot N` reaches back N snapshots. `wtf diff --against A B`
|
|
248
|
+
diffs two snapshot files directly without running a live audit (useful for
|
|
249
|
+
comparing snapshots shipped from other hosts).
|
|
250
|
+
- **`wtf audit --format plain`** — tab-separated `status<TAB>name<TAB>message`
|
|
251
|
+
rows. No headers, no summary, no colors. Designed for shell pipelines:
|
|
252
|
+
`wtf audit --format plain | awk '$1=="fail"'`.
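
The same filter as the `awk` one-liner can be written in Python when the rows need further processing. This is a minimal sketch that assumes every line has exactly the three tab-separated fields described above; the function names are hypothetical and the sample rows in the usage note are invented.

```python
def parse_plain(text):
    """Parse `status<TAB>name<TAB>message` rows into dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two tabs only, so the message may contain tabs.
        status, name, message = line.split("\t", 2)
        rows.append({"status": status, "name": name, "message": message})
    return rows


def only_failures(rows):
    """Keep the rows whose status is `fail`, like the awk filter."""
    return [row for row in rows if row["status"] == "fail"]
```

Feeding it the captured stdout of `wtf audit --format plain` and calling `only_failures(parse_plain(output))` would return the failing checks as dictionaries.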
|
|
253
|
+
|
|
254
|
+
- **`wtf top`** — focused process top with sort and filters.
|
|
255
|
+
`--sort cpu|rss`, `--user PREFIX`, `--name SUBSTRING`, `--limit N`.
|
|
256
|
+
Cuts through the noise of `wtf info`'s 5-row top section when you need
|
|
257
|
+
the bigger picture.
|
|
258
|
+
- **`wtf ports`** — listening sockets with owning PID, user, command.
|
|
259
|
+
Replaces `ss -tlnp` for the common "who's on :443?" question.
|
|
260
|
+
`--proto tcp|udp|all`, `--public-only` (drops 127.x).
|
|
261
|
+
- **`wtf motd-install`** — installs `/etc/update-motd.d/99-wtf-brief` so
|
|
262
|
+
every SSH login shows a one-line wtftools summary. `--path` to override
|
|
263
|
+
destination, requires root.
|
|
264
|
+
- **`hw-temp` check** — reads `/sys/class/hwmon/*/temp*_input`. ≥75°C WARN,
|
|
265
|
+
≥90°C FAIL (configurable). Reports max + count, lists all sensors in
|
|
266
|
+
`-v` detail. Filters absurd readings (<-50°C or >200°C broken sensors).
|
|
267
|
+
- **`dns` check** — probes well-known hosts via the system resolver.
|
|
268
|
+
Configurable list (`dns_probe_hosts`, default `google.com,cloudflare.com`)
|
|
269
|
+
and a 2s per-probe timeout. All resolve → OK. Some fail → WARN. None
|
|
270
|
+
resolve → FAIL (broken DNS / resolved.service down). Catches silently-
|
|
271
|
+
broken `systemd-resolved`.
|
|
272
|
+
- **`wtf audit --format csv`** — CSV output with name,status,message,detail
|
|
273
|
+
columns. For spreadsheet flows / lightweight reporting.
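
The `hw-temp` thresholds described above can be sketched as a pure function. The kernel exposes `temp*_input` values in millidegrees Celsius; the function name, the default arguments, and returning `skip` when no sensor gives a plausible reading are assumptions for illustration.

```python
def classify_temps(raw_millidegrees, warn_c=75.0, fail_c=90.0):
    """Classify sensor readings taken from /sys/class/hwmon/*/temp*_input.

    Returns (status, hottest_celsius). Hypothetical helper.
    """
    temps = [value / 1000.0 for value in raw_millidegrees]
    # Drop absurd readings from broken sensors (below -50 or above 200 Celsius).
    valid = [t for t in temps if -50.0 <= t <= 200.0]
    if not valid:
        # Assumption: nothing usable to report, so the check is skipped.
        return "skip", None
    hottest = max(valid)
    if hottest >= fail_c:
        return "fail", hottest
    if hottest >= warn_c:
        return "warn", hottest
    return "ok", hottest
```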
|
|
274
|
+
|
|
275
|
+
- **Snapshots, history, and diff** — `wtfd-lite` finally exists.
|
|
276
|
+
- `wtf audit --save` persists the current run to `~/.cache/wtftools/snapshots/`
|
|
277
|
+
(or `/var/lib/wtftools/snapshots/` when running as root, or
|
|
278
|
+
`$WTFTOOLS_SNAPSHOT_DIR` if set). Auto-rotates to keep the newest 48.
|
|
279
|
+
- `wtf audit --diff` compares the current audit to the most recent snapshot,
|
|
280
|
+
flagging regressions / recoveries / new / removed checks. Sorted with
|
|
281
|
+
regressions first.
|
|
282
|
+
- `wtf history` lists stored snapshots with status counts.
|
|
283
|
+
- Snapshot file format is plain JSON — easy to ship to a central host.
|
|
284
|
+
- **`docker` check** — surfaces containers in `unhealthy` or `Restarting` state.
|
|
285
|
+
`unhealthy` → FAIL, `restarting`-only → WARN. Skips cleanly when docker is
|
|
286
|
+
not installed or the daemon is unreachable.
|
|
287
|
+
- **NTP drift magnitude** in the `time-sync` check — when `chronyc tracking`
|
|
288
|
+
is available, the reported offset (ms) augments the binary sync/no-sync
|
|
289
|
+
signal. Drift ≥100ms → WARN, ≥1s → FAIL.
|
|
290
|
+
- **`wtf audit --format prometheus`** — Prometheus textfile-collector output.
|
|
291
|
+
Two metrics: `wtf_check_status{name="..."}` (0/1/2/3 for ok/warn/fail/skip)
|
|
292
|
+
and `wtf_summary_total{status="..."}`. Drop into node_exporter's
|
|
293
|
+
`--collector.textfile.directory` for scraping.
|
|
294
|
+
- **`wtf info --watch SECONDS`** — live-refresh the host snapshot (mirror of
|
|
295
|
+
the existing `wtf audit --watch`).
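
The two metrics named in the `--format prometheus` entry above could be rendered roughly as follows. This sketch omits `# HELP` and `# TYPE` lines, and the `(name, status)` input shape and function name are assumptions rather than the actual wtftools code.

```python
# Numeric encoding from the changelog: 0/1/2/3 for ok/warn/fail/skip.
STATUS_VALUE = {"ok": 0, "warn": 1, "fail": 2, "skip": 3}


def render_prometheus(results):
    """Render (name, status) pairs in the Prometheus text format."""
    lines = []
    counts = {status: 0 for status in STATUS_VALUE}
    for name, status in results:
        # Escape backslash, double quote and newline inside the label value.
        escaped = (
            name.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(
            'wtf_check_status{name="%s"} %d' % (escaped, STATUS_VALUE[status])
        )
        counts[status] += 1
    for status in STATUS_VALUE:
        lines.append('wtf_summary_total{status="%s"} %d' % (status, counts[status]))
    return "\n".join(lines) + "\n"
```

Writing the returned text to a `.prom` file inside the node_exporter textfile directory is what makes it scrapeable.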
|
|
296
|
+
|
|
297
|
+
### Added (earlier in this Unreleased cycle)
|
|
298
|
+
- **`wtf explain`** — turns audit findings into actionable per-check advice.
|
|
299
|
+
A rule-based table maps each `(name, status)` to a 1-2 sentence diagnosis
|
|
300
|
+
and concrete next steps (which command to run, which file to vacuum, etc.).
|
|
301
|
+
Covers every built-in check; unknown checks get a fallback hint.
|
|
302
|
+
- **`wtf explain --prompt`** — emit an LLM-ready prompt summarizing the audit.
|
|
303
|
+
Pipe to `claude`, `ollama run llama3`, or any other LLM for a synthesized
|
|
304
|
+
diagnosis without bundling an LLM dependency. The PROJECT.md headline finally
|
|
305
|
+
has a delivery vehicle.
|
|
306
|
+
- **`wtf audit --alert <cmd>`** — fire a shell command when audit produces
|
|
307
|
+
FAIL (or WARN, with `--alert-on warn`). Audit summary is piped to the
|
|
308
|
+
command's stdin; env vars `WTF_FAIL_COUNT`, `WTF_WARN_COUNT`, `WTF_HOST`
|
|
309
|
+
are set. Cron-driven monitoring without a notification client:
|
|
310
|
+
`wtf audit --alert 'mail -s "wtf $WTF_HOST" sre@example.com'`.
|
|
311
|
+
- **`conntrack` check** — reads `/proc/sys/net/netfilter/nf_conntrack_count`
|
|
312
|
+
vs `nf_conntrack_max`. NAT/firewall/proxy hosts silently drop new
|
|
313
|
+
connections when the table fills; ≥70% WARN, ≥90% FAIL (configurable).
|
|
314
|
+
- **`journal-disk` check** — parses `journalctl --disk-usage`. ≥4GB WARN,
|
|
315
|
+
≥16GB FAIL (configurable). Includes a vacuum-size hint in the message.
|
|
316
|
+
- pyproject installs the bash-completion file system-wide.
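
For the `--alert` hook described above, a handler only needs the three environment variables and the summary on stdin. The sketch below builds a message from them; the function name, the message layout, and the assumption that the count variables hold plain integers are illustrative only.

```python
def build_alert(env, summary):
    """Build an alert message from the --alert environment and stdin text.

    `env` is a mapping such as os.environ; `summary` is the audit summary
    that wtf pipes to the command's stdin. Hypothetical helper.
    """
    host = env.get("WTF_HOST", "unknown-host")
    # Assumption: the counts are passed as decimal integers.
    fails = int(env.get("WTF_FAIL_COUNT", "0"))
    warns = int(env.get("WTF_WARN_COUNT", "0"))
    subject = "wtf %s: %d fail, %d warn" % (host, fails, warns)
    return subject + "\n\n" + summary.strip() + "\n"
```

A small wrapper script could call `build_alert(os.environ, sys.stdin.read())` and hand the result to whatever notification channel is in use.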
|
|
317
|
+
|
|
318
|
+
### Added (earlier in this Unreleased cycle)
|
|
319
|
+
- **Parallel check execution** — checks now run on a `ThreadPoolExecutor`
|
|
320
|
+
(default 8 workers, configurable via `config.ini` `parallel_workers` or env).
|
|
321
|
+
Typical full audit dropped from ~2.3s to ~1.2s on a 24-core dev machine; one
|
|
322
|
+
hung check no longer blocks the rest. Use `wtf audit --serial` to force the
|
|
323
|
+
old sequential path for debugging.
|
|
324
|
+
- **Per-check timeout** — every check gets a default 10s budget. A check that
|
|
325
|
+
exceeds it surfaces a `[SKIP]` result with a clear "timeout" message instead
|
|
326
|
+
of hanging the whole audit. Tune via `config.ini` `check_timeout` or
|
|
327
|
+
`wtf audit --check-timeout SECONDS`.
|
|
328
|
+
- **`psi` check** — reads `/proc/pressure/{cpu,memory,io}` (Linux ≥4.20). The
|
|
329
|
+
modern kernel signal for real resource contention. Thresholds on PSI `some
|
|
330
|
+
avg10`: ≥10% WARN, ≥30% FAIL (configurable). Three result rows: one per
|
|
331
|
+
resource. Auto-skipped when `psi=0` boot cmdline is set.
|
|
332
|
+
- **`kernel-taint` check** — reads `/proc/sys/kernel/tainted`. Non-zero means
|
|
333
|
+
the kernel saw a proprietary/forced/unsigned module, a machine check, a
|
|
334
|
+
soft-lockup, etc. Decodes the bitmask into readable flag names; severe bits
|
|
335
|
+
(`MACHINE_CHECK`, `SOFTLOCKUP`, `DIE`, `BAD_PAGE`) escalate to FAIL.
|
|
336
|
+
- **`cert-expiry` check** — walks server-cert dirs (`/etc/letsencrypt/live`,
|
|
337
|
+
`/etc/nginx/ssl`, `/etc/haproxy/certs`, …), parses `notAfter` via openssl.
|
|
338
|
+
≥30d OK, <30d WARN, <7d FAIL. Bounded to 50 files. Avoids the system CA
|
|
339
|
+
store (`/etc/ssl/certs`) which legitimately ships long-expired root CAs.
|
|
340
|
+
- **`wtf logs`** — recent ERROR+ journal entries grouped by service. Flags:
|
|
341
|
+
`--since '1 hour ago'`, `--priority err`, `--units N`, `--lines N`,
|
|
342
|
+
`--format json`. Natural complement to `wtf services <name>`.
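
The parallel execution and per-check timeout entries above can be combined into one small sketch. It is not the wtftools implementation: the function names and the `(status, message)` result shape are assumptions, and because results are collected one after another, a later check can effectively get a little more than its nominal budget.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_checks(checks, workers=8, timeout=10.0):
    """Run {name: callable} checks in parallel.

    Each callable returns (status, message). A check that exceeds the
    timeout is reported as a skip instead of blocking the others.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=workers)
    try:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except FutureTimeout:
                results[name] = ("skip", "timeout after %gs" % timeout)
            except Exception as exc:  # a crashing check must not kill the audit
                results[name] = ("skip", "check raised: %s" % exc)
    finally:
        # Do not wait for hung workers; threads cannot be killed from outside.
        pool.shutdown(wait=False)
    return results


def slow_check():
    """Stand-in for a hung check (hypothetical, for demonstration only)."""
    time.sleep(0.5)
    return ("ok", "finished late")
```

Calling `run_checks({"slow": slow_check}, timeout=0.1)` reports `slow` as a skip while any fast checks in the same call return normally.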
|
|
343
|
+
|
|
344
|
+
- **`wtf services <name>`** — focused drilldown for one systemd unit: shows
|
|
345
|
+
ActiveState, SubState, Result, UnitFileState, MainPID, NRestarts,
|
|
346
|
+
MemoryCurrent, listening ports owned by the main pid, plus the last N journal
|
|
347
|
+
lines. Replaces the SSH dance of `systemctl status … && journalctl -u … && ss -tlnp`.
|
|
348
|
+
- **Config file** — INI at `/etc/wtftools/config.ini`, `/etc/wtf/config.ini`,
|
|
349
|
+
or `~/.config/wtftools/config.ini`. Customizable thresholds for disk, memory,
|
|
350
|
+
swap, load, iowait, fds, pids, tcp-retrans, auth, service restarts, plus
|
|
351
|
+
`[ignore]` lists. Global `--config PATH` stacks a further file on top.
|
|
352
|
+
- **`wtf config`** — print effective values + search paths. `wtf config --example`
|
|
353
|
+
prints a fully-commented template ready for `> /etc/wtftools/config.ini`.
|
|
354
|
+
- **`wtf audit --ignore NAME`** — skip a check by short-name OR by result-name
|
|
355
|
+
(e.g. `--ignore "disk /mnt/Backup"` to hush a single noisy mount). Repeatable.
|
|
356
|
+
- **`tcp-retrans`** check — samples `/proc/net/snmp` TCP RetransSegs/OutSegs
|
|
357
|
+
over a 1-second window; ≥1% WARN, ≥5% FAIL (configurable).
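
The layered config described above maps naturally onto `configparser`, where values read later override earlier ones. The sketch below assumes that all search paths are merged in order and that `--config PATH` is appended last; the changelog does not spell out the merge order. The function names and the placement of `check_timeout` under `[thresholds]` are also assumptions.

```python
import configparser
import os

# Search paths as listed in the changelog entry.
SEARCH_PATHS = [
    "/etc/wtftools/config.ini",
    "/etc/wtf/config.ini",
    "~/.config/wtftools/config.ini",
]


def layer_configs(ini_texts):
    """Merge INI documents in order; later texts override earlier ones."""
    parser = configparser.ConfigParser()
    for text in ini_texts:
        parser.read_string(text)
    return parser


def load_files(extra_path=None, search_paths=SEARCH_PATHS):
    """Read every existing config file, then stack the extra file on top."""
    paths = list(search_paths) + ([extra_path] if extra_path else [])
    texts = []
    for path in paths:
        expanded = os.path.expanduser(path)
        if os.path.isfile(expanded):
            with open(expanded, encoding="utf-8") as handle:
                texts.append(handle.read())
    return layer_configs(texts)


def get_threshold(parser, key, default):
    """Look up a numeric threshold, falling back to the built-in default."""
    return parser.getfloat("thresholds", key, fallback=default)
```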
|
|
358
|
+
|
|
359
|
+
### Changed
|
|
360
|
+
- All audit thresholds now read from the active config (no more hardcoded
|
|
361
|
+
85/95/30/70). Defaults match prior behavior exactly.
|
|
362
|
+
- `run_audit()` accepts `ignore=` and merges it with the config's
|
|
363
|
+
`[ignore]` lists.
|
|
364
|
+
|
|
365
|
+
- **Plugin system**: drop executable scripts into `/etc/wtf/checks.d/`,
|
|
366
|
+
`/etc/wtftools/checks.d/`, or `~/.config/wtftools/checks.d/`. Exit codes
|
|
367
|
+
`0=ok / 1=warn / 2=fail / 77=skip`; stdout becomes the message. A plugin
|
|
368
|
+
may also emit a one-line JSON object `{"status":..., "message":...,
|
|
369
|
+
"detail":[...]}` for full control. Plugins show up in `wtf audit` and
|
|
370
|
+
`wtf audit --list-checks` under the `plugin:<name>` namespace.
|
|
371
|
+
- `wtf plugins` — list discovery dirs and registered plugins.
|
|
372
|
+
- `restart-loops` audit check — flags active services where systemd has had
|
|
373
|
+
to bring them back ≥3 times (`NRestarts`). ≥10 → FAIL (the "flaky daemon"
|
|
374
|
+
case where the service technically "runs" but isn't healthy).
|
|
375
|
+
- `network-errors` audit check — reads `/sys/class/net/*/statistics/` and
|
|
376
|
+
surfaces interfaces with non-zero rx/tx errors or drops (≥1000 → WARN).
|
|
377
|
+
- `wtf audit --brief` / `-b` — one-line summary suitable for MOTD / SSH
|
|
378
|
+
banners: `wtf: 1 fail, 3 warn — swap: 99% · …`. Exit code mirrors severity.
|
|
379
|
+
- Example plugin in `scripts/example-plugin-check-tmp.sh` (warns when /tmp
|
|
380
|
+
usage crosses 80% / 95%).
|
|
381
|
+
- `wtf doctor` — self-diagnostic that probes which CLI tools (`systemctl`,
|
|
382
|
+
`journalctl`, `apt`, `timedatectl`, …) and `/proc` files are available.
|
|
383
|
+
Explains why checks may be skipped on this host.
|
|
384
|
+
- `wtf audit --check NAME` — run a single named check (repeatable). For CI
|
|
385
|
+
and scripted use (e.g. `wtf audit --check disks --check memory --format json`).
|
|
386
|
+
- `wtf audit --list-checks` — print the short names of every registered check.
|
|
387
|
+
- `wtf audit --only fail|warn|problem|skip|ok|all` — filter output by status.
|
|
388
|
+
Useful on a terminal: `wtf audit --only problem` shows just what's broken.
|
|
389
|
+
- `wtf audit --since HOURS` — configurable look-back window for OOM, kernel
|
|
390
|
+
errors and failed-auth checks (was hardcoded to 24h).
|
|
391
|
+
- `wtf audit --watch SECONDS` — live mode that re-runs the audit and re-prints
|
|
392
|
+
every N seconds (Ctrl-C to exit).
|
|
393
|
+
- Bash completion in `scripts/wtf.bash-completion`.
|
|
394
|
+
- GitHub Actions CI workflow running tests + coverage on Python 3.10–3.12.
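
The `--only` filter from the list above reduces to a lookup table, with `problem` covering the WARN and FAIL rows. This is a minimal sketch; the `(status, name, message)` row shape and the function name are assumptions.

```python
# Which statuses each --only value lets through.
ONLY_FILTERS = {
    "fail": {"fail"},
    "warn": {"warn"},
    "problem": {"warn", "fail"},
    "skip": {"skip"},
    "ok": {"ok"},
    "all": {"ok", "warn", "fail", "skip"},
}


def filter_results(results, only="all"):
    """Keep the (status, name, message) rows matching the --only value."""
    wanted = ONLY_FILTERS[only]
    return [row for row in results if row[0] in wanted]
```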
|
|
395
|
+
|
|
396
|
+
### Changed
|
|
397
|
+
- Audit registry now keys checks by stable short names so `--check` / `--list-checks`
|
|
398
|
+
expose a documented, scriptable surface.
|
wtftools-0.0.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Aleksandr Pimenov
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
wtftools-0.0.0/PKG-INFO ADDED
@@ -0,0 +1,246 @@
+Metadata-Version: 2.4
+Name: wtftools
+Version: 0.0.0
+Summary: One command to see what is going on with your Linux server right now.
+Author-email: Aleksandr Pimenov <wachawo@gmail.com>
+Maintainer-email: Aleksandr Pimenov <wachawo@gmail.com>
+License: MIT License
+
+        Copyright (c) 2026 Aleksandr Pimenov
+
+        Permission is hereby granted, free of charge, to any person obtaining a copy
+        of this software and associated documentation files (the "Software"), to deal
+        in the Software without restriction, including without limitation the rights
+        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+        copies of the Software, and to permit persons to whom the Software is
+        furnished to do so, subject to the following conditions:
+
+        The above copyright notice and this permission notice shall be included in all
+        copies or substantial portions of the Software.
+
+        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+        SOFTWARE.
+
+Project-URL: Homepage, https://github.com/wachawo/wtftools
+Project-URL: Repository, https://github.com/wachawo/wtftools.git
+Project-URL: Documentation, https://github.com/wachawo/wtftools#readme
+Project-URL: Bug Reports, https://github.com/wachawo/wtftools/issues
+Keywords: devops,sre,linux,diagnostics,monitoring,cron,system,audit,cli
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: System Administrators
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Operating System :: POSIX :: Linux
+Classifier: Topic :: System :: Systems Administration
+Classifier: Topic :: System :: Monitoring
+Classifier: Topic :: Utilities
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Provides-Extra: full
+Requires-Dist: psutil>=5.9.0; extra == "full"
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0.0; extra == "dev"
+Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
+Requires-Dist: coverage>=7.0.0; extra == "dev"
+Requires-Dist: ruff>=0.4.0; extra == "dev"
+Requires-Dist: build>=1.0.0; extra == "dev"
+Requires-Dist: stdeb>=0.10.0; extra == "dev"
+Requires-Dist: pre-commit>=3.0.0; extra == "dev"
+Dynamic: license-file
+
+# wtftools
+
+> One command to see what is going on with your Linux server right now.
+
+**Status:** v0.0.0 — initial public release. 14 subcommands, 38 built-in
+checks, snapshot/diff/history, LLM-driven explain. One-shot CLI; no daemon,
+no fleet aggregator, no plugin extension API.
+
+> **In a hurry?** See [docs/QUICKSTART.md](docs/QUICKSTART.md) for the 5-minute version.
+
+```
+$ wtf
+─────────── AUDIT ────────────
+[ OK ] uptime                 3d 4h 12m
+[ OK ] load average           0.42 0.51 0.55 / 8 CPU
+[ OK ] memory                 4.1GB / 16.0GB used (25%)
+[WARN] disk /var              17.0GB / 20.0GB used (85%)
+[ OK ] zombie processes       0 zombies
+[FAIL] failed systemd units   1 failed unit(s)
+[ OK ] crontab syntax         14 cron line(s), no errors
+
+Summary: 12 ok · 1 warn · 1 fail · 2 skip
+```
+
+## Subcommands
+
+| command | what it does |
+|---------------------|-------------------------------------------------------------|
+| `wtf` / `wtf audit` | green/yellow/red checklist: what is OK and what is not |
+| `wtf problems` | alias for `audit --only problems` — show WARN+FAIL only |
+| `wtf explain` | per-check actionable advice; `--llm` to pipe to LLM |
+| `wtf info` | one-page snapshot: host, uptime, load, mem, disks, top, net |
+| `wtf top` | focused process top: sort by cpu/rss, filter user/name |
+| `wtf ports` | listening sockets with owning PID/user/command |
+| `wtf services NAME` | drilldown one service: state, restarts, mem, ports, journal |
+| `wtf logs` | recent ERROR+ journal entries grouped by service |
+| `wtf events` | chronological timeline: reboots, OOM, failed units, … |
+| `wtf history` | list saved audit snapshots (`wtf audit --save` to create) |
+| `wtf diff` | compare current state to a saved snapshot |
+| `wtf crontab` | validate all standard crontab locations + per-user crontabs |
+| `wtf doctor` | self-diagnostic: which tools wtftools can actually use |
+| `wtf config` | show effective config / print example |
+
+`wtftools` absorbs and supersedes [`checkcrontab`](https://github.com/wachawo/checkcrontab) — the same cron validator now lives at `wtf crontab`.
+
+## Install
+
+### From PyPI
+
+```bash
+pip install wtftools # core, stdlib-only
+pip install wtftools[full] # + psutil for richer metrics
+```
+
+After install the short command `wtf` (and the long alias `wtftools`) is on `$PATH`.
+
+### From apt (Debian/Ubuntu)
+
+```bash
+sudo apt install python3-psutil
+sudo dpkg -i wtftools_0.0.0-1_all.deb
+```
+
+A `.deb` is built from the same source via `scripts/build-deb.sh` (uses `stdeb`).
+
+### From source
+
+```bash
+git clone https://github.com/wachawo/wtftools
+cd wtftools
+pip install -e .
+# or test without installing:
+python3 wtf.py audit
+```
+
+## Usage
+
+```bash
+wtf # short audit summary (default)
+wtf problems # only WARN+FAIL rows
+wtf info # detailed system snapshot
+wtf info --format json # machine-readable
+
+wtf audit # full audit with [OK]/[WARN]/[FAIL] markers
+wtf audit -v # show extra detail (failed units, OOM events)
+wtf audit --strict # exit 1 on warnings (CI-friendly)
+wtf audit --format json # JSON output for pipelines
+wtf audit --check memory --check disks # run named checks only
+wtf audit --list-checks # show all available check short-names
+wtf audit --since 1 # look-back window for OOM/auth/kernel (default 24h)
+wtf audit --brief # one-line summary for MOTD / SSH banners
+wtf audit --ignore swap --ignore "disk /mnt/Backup" # silence specific checks
+wtf audit --format csv > audit.csv # spreadsheet-friendly
+wtf audit --format plain | awk '$1=="fail"' # shell-pipeline-friendly
+wtf audit --format html -o report.html # self-contained HTML for tickets
+
+wtf audit --save # save snapshot to ~/.cache/wtftools/
+wtf diff # what changed vs last snapshot
+wtf diff --snapshot 5 # vs 5 snapshots ago
+wtf history # list saved snapshots
+
+wtf explain # per-check actionable advice
+wtf explain --prompt | ollama run llama3 # pipe to local LLM
+wtf explain --llm ollama # built-in: call ollama directly
+wtf explain --llm claude # anthropic SDK + ANTHROPIC_API_KEY
+wtf explain --llm auto # try ollama → claude → openai
+
+wtf audit --alert 'mail -s "wtf $WTF_HOST" sre@example.com'
+wtf audit --alert-on warn --alert 'curl -X POST $SLACK_WEBHOOK -d @-'
+
+wtf top # top processes
+wtf top --sort rss --user www-data --limit 5 # top RAM consumers for one user
+wtf ports # listening TCP + owning process
+
+wtf services nginx # state + restarts + ports + last 20 journal lines
+wtf logs # last hour, ERROR+
+wtf events --since 48 # 48-hour incident timeline
+wtf events --kind oom --kind failed-unit # filter to specific kinds
+
+wtf doctor # show which CLI tools wtf can use on this host
+wtf doctor --check-updates # also query PyPI for a newer version
+```
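The `--format plain | awk '$1=="fail"'` line in the usage block filters rows on their first field. A rough Python equivalent is sketched below. It assumes only what the awk example implies: each plain-format row starts with a lowercase status word followed by whitespace. The function name `rows_with_status` and the sample rows are hypothetical, and real `wtf audit --format plain` output may look different.

```python
def rows_with_status(plain_output, status="fail"):
    """Return the lines whose first whitespace-separated field equals `status`."""
    matches = []
    for line in plain_output.splitlines():
        fields = line.split()
        if fields and fields[0] == status:
            matches.append(line)
    return matches


# Made-up sample rows, shaped only by the awk example (status word first).
sample = (
    "ok uptime 3d 4h 12m\n"
    "warn disk /var 85%\n"
    "fail failed systemd units 1 failed unit(s)\n"
)

for row in rows_with_status(sample):
    print(row)
```

Passing `status="warn"` selects the warning rows instead, which may be handy when the result feeds a ticketing script rather than a shell pipeline.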
+
+## Exit codes
+
+| code | meaning |
+|------|--------------------------------------------------|
+| 0 | everything OK (`audit`) / no errors (`crontab`) |
+| 1 | warnings with `--strict`, or crontab errors |
+| 2 | audit found a `[FAIL]` |
+| 130 | interrupted (Ctrl-C) |
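The exit codes in the table can drive a CI step or a cron wrapper. The sketch below maps the documented codes to short messages and wraps a `wtf audit` call with `subprocess`. The names `EXIT_MEANINGS`, `describe_exit` and `run_audit` are illustrative and not part of wtftools. `run_audit` assumes `wtf` is installed and on `$PATH`.

```python
import subprocess

# Meanings copied from the exit-code table; anything else is treated as unexpected.
EXIT_MEANINGS = {
    0: "ok",
    1: "warnings (--strict) or crontab errors",
    2: "audit found a [FAIL]",
    130: "interrupted (Ctrl-C)",
}


def describe_exit(code):
    """Translate a wtf exit code into a short human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_audit(extra_args=()):
    """Run `wtf audit` with optional flags and return (exit_code, meaning).

    Raises FileNotFoundError if the `wtf` command is not installed.
    """
    proc = subprocess.run(["wtf", "audit", *extra_args], check=False)
    return proc.returncode, describe_exit(proc.returncode)
```

Calling `run_audit(["--strict"])` in a pipeline and failing the job on any non-zero code is one way to use the "CI-friendly" behaviour mentioned in the usage block.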
+
+## Built-in checks
+
+uptime · system state · load average · CPU iowait · PSI cpu/memory/io ·
+TCP retransmits · memory · swap · disk (per mount) · inodes ·
+read-only mounts · failed systemd units · enabled-but-down services ·
+restart loops · network errors · conntrack · journal disk usage · zombies ·
+D-state processes · OOM kills · kernel errors · kernel taint · cert expiry ·
+open file descriptors · process count · failed auth · time sync ·
+pending updates · reboot required · cron daemon · crontab syntax · docker ·
+hw temperatures · disk SMART · DNS · HTTP/TCP probes · fail2ban.
+
+Run `wtf audit --list-checks` for the full list of short names usable with
+`--check` and `--ignore`.
+
+## Config
+
+Drop an INI at any of:
+
+- `/etc/wtftools/config.ini`
+- `/etc/wtf/config.ini`
+- `~/.config/wtftools/config.ini`
+
+Or stack one ad-hoc via `wtf --config /path/to.ini …`. Run `wtf config --example`
+for a fully-commented template. Headlines:
+
+```ini
+[thresholds]
+disk_warn = 85
+disk_fail = 95
+swap_warn = 50
+swap_fail = 90
+tcp_retrans_warn = 1.0
+tcp_retrans_fail = 5.0
+
+[ignore]
+checks = swap, updates
+result_names =
+    disk /mnt/Backup
+    disk /mnt/Video
+```
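A minimal sketch of how the `[thresholds]` and `[ignore]` sections shown above could be read with Python's standard `configparser`. This is not wtftools' own loader. `load_settings` and `classify` are hypothetical names. Treating a value equal to a threshold as a hit is an assumption: the sample audit output reports 85% disk usage as WARN, but the README does not state the comparison rule. Note that `configparser` only accepts the multi-line `result_names` value when its continuation lines are indented.

```python
import configparser

# A trimmed copy of the example config; continuation lines must be indented.
EXAMPLE_INI = """\
[thresholds]
disk_warn = 85
disk_fail = 95

[ignore]
checks = swap, updates
result_names =
    disk /mnt/Backup
    disk /mnt/Video
"""


def load_settings(text):
    """Parse INI text into (thresholds, ignored check names, ignored result names)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    thresholds = {key: float(value) for key, value in parser["thresholds"].items()}
    ignore = parser["ignore"]
    # `checks` is a comma-separated list; `result_names` has one entry per line.
    checks = [c.strip() for c in ignore.get("checks", "").split(",") if c.strip()]
    names = [n.strip() for n in ignore.get("result_names", "").splitlines() if n.strip()]
    return thresholds, checks, names


def classify(percent_used, warn, fail):
    """Assumed rule: at or above `fail` is FAIL, at or above `warn` is WARN."""
    if percent_used >= fail:
        return "FAIL"
    if percent_used >= warn:
        return "WARN"
    return "OK"


thresholds, ignored_checks, ignored_names = load_settings(EXAMPLE_INI)
print(classify(85, thresholds["disk_warn"], thresholds["disk_fail"]))
print(ignored_checks, ignored_names)
```

Keeping `checks` comma-separated and `result_names` line-separated mirrors the example, where result names such as `disk /mnt/Backup` contain spaces and would be awkward to split on commas or whitespace.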
+
+## Compatibility
+
+- Python 3.8+
+- Linux (any systemd-based distribution is the happy path; the tool degrades
+  gracefully when `systemctl` / `journalctl` are missing)
+- No network access required for the core CLI
+- Optional network: `wtf explain --llm claude/openai`, `wtf doctor --check-updates`
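The compatibility notes say the tool degrades gracefully when `systemctl` or `journalctl` are missing. One common way to detect that situation is to probe `$PATH` before running a check, as sketched below with `shutil.which`. This illustrates the general technique and is not wtftools' actual detection logic. `available_tools` and `can_check_systemd` are hypothetical helpers.

```python
import shutil


def available_tools(names=("systemctl", "journalctl")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def can_check_systemd(tools):
    """A systemd-related check only makes sense when systemctl was found."""
    return tools.get("systemctl") is not None


tools = available_tools()
for name, path in tools.items():
    print(f"{name}: {path or 'missing, related checks would be skipped'}")
```

Probing once and passing the result to each check keeps the "skip" decision in one place, which resembles what a self-diagnostic such as `wtf doctor` reports.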
+
+## License
+
+MIT