ironops 1.0.0__tar.gz
- ironops-1.0.0/LICENSE +21 -0
- ironops-1.0.0/PKG-INFO +281 -0
- ironops-1.0.0/README.md +257 -0
- ironops-1.0.0/pyproject.toml +60 -0
- ironops-1.0.0/setup.cfg +4 -0
- ironops-1.0.0/src/ironops/__init__.py +7 -0
- ironops-1.0.0/src/ironops/__main__.py +9 -0
- ironops-1.0.0/src/ironops/cli.py +302 -0
- ironops-1.0.0/src/ironops.egg-info/PKG-INFO +281 -0
- ironops-1.0.0/src/ironops.egg-info/SOURCES.txt +12 -0
- ironops-1.0.0/src/ironops.egg-info/dependency_links.txt +1 -0
- ironops-1.0.0/src/ironops.egg-info/entry_points.txt +2 -0
- ironops-1.0.0/src/ironops.egg-info/requires.txt +3 -0
- ironops-1.0.0/src/ironops.egg-info/top_level.txt +1 -0
ironops-1.0.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Aziz Khemiri
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
ironops-1.0.0/PKG-INFO
ADDED
|
@@ -0,0 +1,281 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: ironops
|
|
3
|
+
Version: 1.0.0
|
|
4
|
+
Summary: DevOps CLI tool for remote server health monitoring over SSH
|
|
5
|
+
Author: Aziz Khemiri
|
|
6
|
+
License: MIT
|
|
7
|
+
Project-URL: Homepage, https://github.com/AzizKhemiri/IronOps
|
|
8
|
+
Project-URL: Repository, https://github.com/AzizKhemiri/IronOps
|
|
9
|
+
Project-URL: Issues, https://github.com/AzizKhemiri/IronOps/issues
|
|
10
|
+
Keywords: devops,ssh,monitoring,cli,server,health
|
|
11
|
+
Classifier: Programming Language :: Python :: 3
|
|
12
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
13
|
+
Classifier: Operating System :: OS Independent
|
|
14
|
+
Classifier: Topic :: System :: Monitoring
|
|
15
|
+
Classifier: Topic :: System :: Systems Administration
|
|
16
|
+
Classifier: Environment :: Console
|
|
17
|
+
Requires-Python: >=3.8
|
|
18
|
+
Description-Content-Type: text/markdown
|
|
19
|
+
License-File: LICENSE
|
|
20
|
+
Requires-Dist: paramiko>=3.4.0
|
|
21
|
+
Requires-Dist: PyYAML>=6.0
|
|
22
|
+
Requires-Dist: python-dotenv>=1.0.0
|
|
23
|
+
Dynamic: license-file
|
|
24
|
+
|
|
25
|
+
# IronOps — DevOps CLI tool for remote server health monitoring
|
|
26
|
+
<div align="center">
|
|
27
|
+
<img src="images/IronOps.png" width="700" alt="IronOps Logo">
|
|
28
|
+
</div>
|
|
29
|
+
|
|
30
|
+
### DevOps Project
|
|
31
|
+
|
|
32
|
+
> A Python-based tool to check the health of multiple remote Linux servers over SSH — automatically collecting CPU, RAM, disk, and service metrics, then generating reports and sending alerts.
|
|
33
|
+
|
|
34
|
+
---
|
|
35
|
+
|
|
36
|
+
## Table of Contents
|
|
37
|
+
- [What This Project Does](#what-this-project-does)
|
|
38
|
+
- [How It Works (Architecture)](#how-it-works-architecture)
|
|
39
|
+
- [Tools & Technologies](#tools--technologies)
|
|
40
|
+
- [Installation](#installation)
|
|
41
|
+
- [Configuration](#configuration)
|
|
42
|
+
- [Usage](#usage)
|
|
43
|
+
- [Example Terminal Output](#example-terminal-output)
|
|
44
|
+
- [Scheduling with Cron](#scheduling-with-cron)
|
|
45
|
+
- [Project Structure](#project-structure)
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
49
|
+
## What This Project Does
|
|
50
|
+
|
|
51
|
+
This tool connects to a list of remote Linux servers via **SSH**, runs health checks on each one, and:
|
|
52
|
+
|
|
53
|
+
- Collects: **CPU usage**, **RAM usage**, **disk space**, **uptime**, **running services**
|
|
54
|
+
- Generates a **readable JSON report** for each run
|
|
55
|
+
- Sends **email/Slack alerts** when a server exceeds thresholds (e.g., disk > 90%)
|
|
56
|
+
- Can run **automatically on a schedule** using cron
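
As a rough illustration of the threshold check described in the list above, the sketch below classifies one server's metrics. The function name `classify`, the threshold keys, and the warning/critical levels are assumptions made for this example; they are not taken from the IronOps source.

```python
# Hypothetical sketch: names and threshold values are assumptions, not IronOps's API.
DEFAULT_THRESHOLDS = {
    "cpu_percent": {"warning": 75.0, "critical": 90.0},
    "ram_percent": {"warning": 75.0, "critical": 90.0},
    "disk_percent": {"warning": 75.0, "critical": 90.0},
}


def classify(metrics, thresholds=DEFAULT_THRESHOLDS):
    """Return (status, alerts) for one server's metrics dict."""
    status = "healthy"
    alerts = []
    for key, limits in thresholds.items():
        value = metrics.get(key)
        if value is None:
            continue  # metric was not collected; skip it
        if value > limits["critical"]:
            status = "critical"
            alerts.append(f"{key} is {value}% (critical, > {limits['critical']}%)")
        elif value > limits["warning"]:
            if status != "critical":
                status = "warning"  # never downgrade an existing critical status
            alerts.append(f"{key} is {value}% (warning, > {limits['warning']}%)")
    return status, alerts


status, alerts = classify(
    {"cpu_percent": 23.4, "ram_percent": 61.2, "disk_percent": 78.5}
)
print(status, alerts)
```

With these assumed levels, only the disk metric crosses a limit, so the server is reported as `warning` with a single alert message.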
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
## How It Works (Architecture)
|
|
61
|
+
<div align="center">
|
|
62
|
+
<img src="images/IronOps-Architecture.png" width="700" alt="Architecture Diagram">
|
|
63
|
+
</div>
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## Tools & Technologies
|
|
67
|
+
|
|
68
|
+
| Tool | Why we use it | Learn more |
|
|
69
|
+
|------|--------------|-----------|
|
|
70
|
+
| **Python 3** | Main language | [python.org](https://python.org) |
|
|
71
|
+
| **paramiko** | Python SSH library to connect to servers | [paramiko.org](https://paramiko.org) |
|
|
72
|
+
| **PyYAML** | Read the servers config file (YAML format) | [pyyaml.org](https://pyyaml.org) |
|
|
73
|
+
| **smtplib** | Built-in Python email sender for alerts | Python standard library |
|
|
74
|
+
| **cron** | Linux scheduler to run the script automatically | `man cron` |
|
|
75
|
+
| **JSON** | Report format | Built-in Python |
|
|
76
|
+
|
|
77
|
+
### Why SSH?
|
|
78
|
+
SSH (Secure Shell) is the industry-standard way to remotely access Linux servers. You connect using:
|
|
79
|
+
- **Password**
|
|
80
|
+
- **SSH Key Pair**
|
|
81
|
+
|
|
82
|
+
Our project supports **both methods**.
|
|
83
|
+
|
|
84
|
+
---
|
|
85
|
+
|
|
86
|
+
## Installation
|
|
87
|
+
|
|
88
|
+
### Step 1 — Install Python 3
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
# Ubuntu/Debian
|
|
92
|
+
sudo apt update && sudo apt install python3 python3-pip -y
|
|
93
|
+
|
|
94
|
+
# macOS (using Homebrew)
|
|
95
|
+
brew install python3
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
### Step 2 — Clone this project
|
|
99
|
+
|
|
100
|
+
```bash
|
|
101
|
+
git clone https://github.com/AzizKhemiri/IronOps.git
|
|
102
|
+
cd IronOps
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
### Step 3 — Install Python dependencies
|
|
106
|
+
|
|
107
|
+
```bash
|
|
108
|
+
pip3 install -r requirements.txt
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
This installs:
|
|
112
|
+
- `paramiko` — SSH connections
|
|
113
|
+
- `pyyaml` — reading YAML config files
|
|
114
|
+
|
|
115
|
+
### Step 4 — Set up your SSH access
|
|
116
|
+
|
|
117
|
+
**Using SSH keys (recommended for production):**
|
|
118
|
+
```bash
|
|
119
|
+
# Generate an SSH key pair (if you don't have one already)
|
|
120
|
+
ssh-keygen -t rsa -b 4096 -C "health-monitor"
|
|
121
|
+
|
|
122
|
+
# Copy your public key to each remote server
|
|
123
|
+
ssh-copy-id user@your-server-ip
|
|
124
|
+
|
|
125
|
+
# Test it works
|
|
126
|
+
ssh user@your-server-ip "echo connected successfully"
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
### Step 5 — Configure your servers
|
|
130
|
+
|
|
131
|
+
Edit `config/servers.yaml`; see the [Configuration](#configuration) section below.
|
|
132
|
+
|
|
133
|
+
### Step 6 — Run this command to track your servers
|
|
134
|
+
|
|
135
|
+
```bash
|
|
136
|
+
python3 health_monitor.py
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## Configuration
|
|
142
|
+
|
|
143
|
+
Edit `config/servers.yaml` to add your servers, for example:
|
|
144
|
+
|
|
145
|
+
```yaml
|
|
146
|
+
# List of servers to monitor
|
|
147
|
+
servers:
|
|
148
|
+
- name: "web-server-01"
|
|
149
|
+
host: "192.168.1.10"
|
|
150
|
+
port: 22
|
|
151
|
+
user: "ubuntu"
|
|
152
|
+
auth: "key" # "key"
|
|
153
|
+
key_path: "~/.ssh/id_rsa" # path to your private key
|
|
154
|
+
```
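
To show how an entry like the one above might be checked before any SSH connection is attempted, here is a minimal validation sketch. `validate_server` and `REQUIRED_KEYS` are hypothetical names for this example, and the rules (default port 22, `key_path` required when `auth` is `"key"`) are assumptions inferred from the sample entry rather than documented IronOps behavior.

```python
import os

# Hypothetical validator; the required keys mirror the example entry above.
REQUIRED_KEYS = ("name", "host", "user", "auth")


def validate_server(entry):
    """Return a normalized copy of one server entry or raise ValueError."""
    missing = [k for k in REQUIRED_KEYS if k not in entry]
    if missing:
        raise ValueError(f"server entry is missing keys: {', '.join(missing)}")
    server = dict(entry)
    server["port"] = int(server.get("port", 22))  # fall back to the SSH default
    if not 1 <= server["port"] <= 65535:
        raise ValueError(f"invalid port: {server['port']}")
    if server["auth"] == "key":
        if "key_path" not in server:
            raise ValueError("auth is 'key' but key_path is not set")
        # Expand "~" so the path can be passed to an SSH library unchanged.
        server["key_path"] = os.path.expanduser(server["key_path"])
    return server


example = {
    "name": "web-server-01",
    "host": "192.168.1.10",
    "port": 22,
    "user": "ubuntu",
    "auth": "key",
    "key_path": "~/.ssh/id_rsa",
}
print(validate_server(example)["key_path"])
```

Failing early on a malformed entry tends to give a clearer error than a connection failure later in the run.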
|
|
155
|
+
|
|
156
|
+
---
|
|
157
|
+
|
|
158
|
+
## Usage
|
|
159
|
+
|
|
160
|
+
### Basic run (check all servers once)
|
|
161
|
+
```bash
|
|
162
|
+
python3 health_monitor.py
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
### Check a single specific server
|
|
166
|
+
```bash
|
|
167
|
+
python3 health_monitor.py --server web-server-01
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
### Show verbose output
|
|
171
|
+
```bash
|
|
172
|
+
python3 health_monitor.py --verbose
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
### Save report to a custom location
|
|
176
|
+
```bash
|
|
177
|
+
python3 health_monitor.py --output /tmp/my-report.json
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
---
|
|
181
|
+
|
|
182
|
+
## Example Terminal Output
|
|
183
|
+
|
|
184
|
+
**python3 health_monitor.py --my-mac**
|
|
185
|
+
<div align="center">
|
|
186
|
+
<img src="images/4-mac-demo.png" width="700" alt="Mac demo">
|
|
187
|
+
</div>
|
|
188
|
+
|
|
189
|
+
**python3 health_monitor.py**
|
|
190
|
+
<div align="center">
|
|
191
|
+
<img src="images/1-demo.png" width="700" alt="Full health check demo">
|
|
192
|
+
</div>
|
|
193
|
+
|
|
194
|
+
**monitor.log**
|
|
195
|
+
<div align="center">
|
|
196
|
+
<img src="images/3-demo.png" width="700" alt="Monitor log output">
|
|
197
|
+
</div>
|
|
198
|
+
|
|
199
|
+
**Demo fake data (test)**
|
|
200
|
+
<div align="center">
|
|
201
|
+
<img src="images/2-demo.png" width="700" alt="Demo with fake data">
|
|
202
|
+
</div>
|
|
203
|
+
|
|
204
|
+
**JSON report (reports/health_20241115_143201.json):**
|
|
205
|
+
```json
|
|
206
|
+
{
|
|
207
|
+
"run_id": 1,
|
|
208
|
+
"timestamp": "2024-11-15T14:32:01Z",
|
|
209
|
+
"summary": {
|
|
210
|
+
"total": 3,
|
|
211
|
+
"healthy": 2,
|
|
212
|
+
"warning": 1,
|
|
213
|
+
"critical": 0,
|
|
214
|
+
"unreachable": 0
|
|
215
|
+
},
|
|
216
|
+
"servers": [
|
|
217
|
+
{
|
|
218
|
+
"name": "web-server-01",
|
|
219
|
+
"host": "192.168.1.10",
|
|
220
|
+
"status": "warning",
|
|
221
|
+
"metrics": {
|
|
222
|
+
"cpu_percent": 23.4,
|
|
223
|
+
"ram_percent": 61.2,
|
|
224
|
+
"disk_percent": 78.5,
|
|
225
|
+
"uptime": "14 days, 3 hours"
|
|
226
|
+
}
|
|
227
|
+
}
|
|
228
|
+
]
|
|
229
|
+
}
|
|
230
|
+
```
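
The `summary` block of a report like this can in principle be derived from the `servers` array. The sketch below shows one way to do that with the standard library; `summarize` is a hypothetical helper, and the server names `db-server-01` and `cache-01` are invented for the example.

```python
import json
from collections import Counter

STATUSES = ("healthy", "warning", "critical", "unreachable")


def summarize(servers):
    """Count servers per status; 'total' is the number of entries."""
    counts = Counter(s.get("status", "unreachable") for s in servers)
    summary = {"total": len(servers)}
    summary.update({status: counts.get(status, 0) for status in STATUSES})
    return summary


# Invented sample data in the same shape as the report above.
report_text = """
{"servers": [
  {"name": "web-server-01", "status": "warning"},
  {"name": "db-server-01", "status": "healthy"},
  {"name": "cache-01", "status": "healthy"}
]}
"""
report = json.loads(report_text)
print(summarize(report["servers"]))
```

Computing the counts from the list keeps `total` equal to the sum of the per-status counts by construction.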
|
|
231
|
+
|
|
232
|
+
---
|
|
233
|
+
|
|
234
|
+
## Scheduling with Cron
|
|
235
|
+
|
|
236
|
+
To run the health check automatically every 15 minutes:
|
|
237
|
+
|
|
238
|
+
```bash
|
|
239
|
+
# Open the cron editor
|
|
240
|
+
crontab -e
|
|
241
|
+
|
|
242
|
+
# Add this line (runs every 15 minutes)
|
|
243
|
+
*/15 * * * * /usr/bin/python3 /path/to/server-health-monitor/health_monitor.py >> /var/log/health_monitor.log 2>&1
|
|
244
|
+
```
|
|
245
|
+
|
|
246
|
+
Cron schedule format: `minute hour day month weekday`
|
|
247
|
+
```
|
|
248
|
+
*/15 * * * * → every 15 minutes
|
|
249
|
+
0 * * * * → every hour
|
|
250
|
+
0 8 * * * → every day at 8am
|
|
251
|
+
0 8 * * 1 → every Monday at 8am
|
|
252
|
+
```
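
To make the field syntax above concrete, the following sketch expands a single cron field into the values it matches. It is an illustration only: `expand_field` is a hypothetical helper that handles just `*`, `*/n`, and a single integer, which is a small subset of what cron accepts.

```python
def expand_field(field, low, high):
    """Expand one cron field ('*', '*/n', or a single integer) into a list."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError("step must be positive")
        return list(range(low, high + 1, step))
    value = int(field)
    if not low <= value <= high:
        raise ValueError(f"{value} is outside {low}-{high}")
    return [value]


minute, hour, day, month, weekday = "*/15 * * * *".split()
print(expand_field(minute, 0, 59))  # [0, 15, 30, 45]
```

So `*/15` in the minute position means the job starts at minutes 0, 15, 30, and 45 of every hour.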
|
|
253
|
+
|
|
254
|
+
---
|
|
255
|
+
|
|
256
|
+
## Project Structure
|
|
257
|
+
|
|
258
|
+
```
|
|
259
|
+
IronOps/
|
|
260
|
+
│
|
|
261
|
+
├── health_monitor.py # Main entry point (run this)
|
|
262
|
+
├── requirements.txt # Python dependencies
|
|
263
|
+
├── README.md # This file
|
|
264
|
+
│
|
|
265
|
+
├── config/
|
|
266
|
+
│ └── servers.yaml # Your server list + settings
|
|
267
|
+
│
|
|
268
|
+
├── modules/
|
|
269
|
+
│ ├── ssh_client.py # Handles SSH connections
|
|
270
|
+
│ ├── health_check.py # Runs metric commands on servers to check
|
|
271
|
+
│ ├── reporter.py # Formats and saves reports
|
|
272
|
+
│ └── alerter.py # Sends email/Slack alerts
|
|
273
|
+
│
|
|
274
|
+
├── reports/ # Auto-generated JSON reports
|
|
275
|
+
│ └── health_YYYYMMDD_HHMMSS.json
|
|
276
|
+
│
|
|
277
|
+
└── logs/ # Run logs
|
|
278
|
+
└── monitor.log
|
|
279
|
+
```
|
|
280
|
+
|
|
281
|
+
---
|
ironops-1.0.0/README.md
ADDED
|
@@ -0,0 +1,257 @@
|
|
|
1
|
+
# IronOps — DevOps CLI tool for remote server health monitoring
|
|
2
|
+
<div align="center">
|
|
3
|
+
<img src="images/IronOps.png" width="700" alt="IronOps Logo">
|
|
4
|
+
</div>
|
|
5
|
+
|
|
6
|
+
### DevOps Project
|
|
7
|
+
|
|
8
|
+
> A Python-based tool to check the health of multiple remote Linux servers over SSH — automatically collecting CPU, RAM, disk, and service metrics, then generating reports and sending alerts.
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## Table of Contents
|
|
13
|
+
- [What This Project Does](#what-this-project-does)
|
|
14
|
+
- [How It Works (Architecture)](#how-it-works-architecture)
|
|
15
|
+
- [Tools & Technologies](#tools--technologies)
|
|
16
|
+
- [Installation](#installation)
|
|
17
|
+
- [Configuration](#configuration)
|
|
18
|
+
- [Usage](#usage)
|
|
19
|
+
- [Example Terminal Output](#example-terminal-output)
|
|
20
|
+
- [Scheduling with Cron](#scheduling-with-cron)
|
|
21
|
+
- [Project Structure](#project-structure)
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
## What This Project Does
|
|
26
|
+
|
|
27
|
+
This tool connects to a list of remote Linux servers via **SSH**, runs health checks on each one, and:
|
|
28
|
+
|
|
29
|
+
- Collects: **CPU usage**, **RAM usage**, **disk space**, **uptime**, **running services**
|
|
30
|
+
- Generates a **readable JSON report** for each run
|
|
31
|
+
- Sends **email/Slack alerts** when a server exceeds thresholds (e.g., disk > 90%)
|
|
32
|
+
- Can run **automatically on a schedule** using cron
|
|
33
|
+
|
|
34
|
+
---
|
|
35
|
+
|
|
36
|
+
## How It Works (Architecture)
|
|
37
|
+
<div align="center">
|
|
38
|
+
<img src="images/IronOps-Architecture.png" width="700" alt="Architecture Diagram">
|
|
39
|
+
</div>
|
|
40
|
+
---
|
|
41
|
+
|
|
42
|
+
## Tools & Technologies
|
|
43
|
+
|
|
44
|
+
| Tool | Why we use it | Learn more |
|
|
45
|
+
|------|--------------|-----------|
|
|
46
|
+
| **Python 3** | Main language | [python.org](https://python.org) |
|
|
47
|
+
| **paramiko** | Python SSH library to connect to servers | [paramiko.org](https://paramiko.org) |
|
|
48
|
+
| **PyYAML** | Read the servers config file (YAML format) | [pyyaml.org](https://pyyaml.org) |
|
|
49
|
+
| **smtplib** | Built-in Python email sender for alerts | Python standard library |
|
|
50
|
+
| **cron** | Linux scheduler to run the script automatically | `man cron` |
|
|
51
|
+
| **JSON** | Report format | Built-in Python |
|
|
52
|
+
|
|
53
|
+
### Why SSH?
|
|
54
|
+
SSH (Secure Shell) is the industry-standard way to remotely access Linux servers. You connect using:
|
|
55
|
+
- **Password**
|
|
56
|
+
- **SSH Key Pair**
|
|
57
|
+
|
|
58
|
+
Our project supports **both methods**.
|
|
59
|
+
|
|
60
|
+
---
|
|
61
|
+
|
|
62
|
+
## Installation
|
|
63
|
+
|
|
64
|
+
### Step 1 — Install Python 3
|
|
65
|
+
|
|
66
|
+
```bash
|
|
67
|
+
# Ubuntu/Debian
|
|
68
|
+
sudo apt update && sudo apt install python3 python3-pip -y
|
|
69
|
+
|
|
70
|
+
# macOS (using Homebrew)
|
|
71
|
+
brew install python3
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
### Step 2 — Clone this project
|
|
75
|
+
|
|
76
|
+
```bash
|
|
77
|
+
git clone https://github.com/AzizKhemiri/IronOps.git
|
|
78
|
+
cd IronOps
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
### Step 3 — Install Python dependencies
|
|
82
|
+
|
|
83
|
+
```bash
|
|
84
|
+
pip3 install -r requirements.txt
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
This installs:
|
|
88
|
+
- `paramiko` — SSH connections
|
|
89
|
+
- `pyyaml` — reading YAML config files
|
|
90
|
+
|
|
91
|
+
### Step 4 — Set up your SSH access
|
|
92
|
+
|
|
93
|
+
**Using SSH keys (recommended for production):**
|
|
94
|
+
```bash
|
|
95
|
+
# Generate an SSH key pair (if you don't have one already)
|
|
96
|
+
ssh-keygen -t rsa -b 4096 -C "health-monitor"
|
|
97
|
+
|
|
98
|
+
# Copy your public key to each remote server
|
|
99
|
+
ssh-copy-id user@your-server-ip
|
|
100
|
+
|
|
101
|
+
# Test it works
|
|
102
|
+
ssh user@your-server-ip "echo connected successfully"
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
### Step 5 — Configure your servers
|
|
106
|
+
|
|
107
|
+
Edit `config/servers.yaml`; see the [Configuration](#configuration) section below.
|
|
108
|
+
|
|
109
|
+
### Step 6 — Run this command to track your servers
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
python3 health_monitor.py
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
---
|
|
116
|
+
|
|
117
|
+
## Configuration
|
|
118
|
+
|
|
119
|
+
Edit `config/servers.yaml` to add your servers, for example:
|
|
120
|
+
|
|
121
|
+
```yaml
|
|
122
|
+
# List of servers to monitor
|
|
123
|
+
servers:
|
|
124
|
+
- name: "web-server-01"
|
|
125
|
+
host: "192.168.1.10"
|
|
126
|
+
port: 22
|
|
127
|
+
user: "ubuntu"
|
|
128
|
+
auth: "key" # "key"
|
|
129
|
+
key_path: "~/.ssh/id_rsa" # path to your private key
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
---
|
|
133
|
+
|
|
134
|
+
## Usage
|
|
135
|
+
|
|
136
|
+
### Basic run (check all servers once)
|
|
137
|
+
```bash
|
|
138
|
+
python3 health_monitor.py
|
|
139
|
+
```
|
|
140
|
+
|
|
141
|
+
### Check a single specific server
|
|
142
|
+
```bash
|
|
143
|
+
python3 health_monitor.py --server web-server-01
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
### Show verbose output
|
|
147
|
+
```bash
|
|
148
|
+
python3 health_monitor.py --verbose
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
### Save report to a custom location
|
|
152
|
+
```bash
|
|
153
|
+
python3 health_monitor.py --output /tmp/my-report.json
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
---
|
|
157
|
+
|
|
158
|
+
## Example Terminal Output
|
|
159
|
+
|
|
160
|
+
**python3 health_monitor.py --my-mac**
|
|
161
|
+
<div align="center">
|
|
162
|
+
<img src="images/4-mac-demo.png" width="700" alt="Mac demo">
|
|
163
|
+
</div>
|
|
164
|
+
|
|
165
|
+
**python3 health_monitor.py**
|
|
166
|
+
<div align="center">
|
|
167
|
+
<img src="images/1-demo.png" width="700" alt="Full health check demo">
|
|
168
|
+
</div>
|
|
169
|
+
|
|
170
|
+
**monitor.log**
|
|
171
|
+
<div align="center">
|
|
172
|
+
<img src="images/3-demo.png" width="700" alt="Monitor log output">
|
|
173
|
+
</div>
|
|
174
|
+
|
|
175
|
+
**Demo fake data (test)**
|
|
176
|
+
<div align="center">
|
|
177
|
+
<img src="images/2-demo.png" width="700" alt="Demo with fake data">
|
|
178
|
+
</div>
|
|
179
|
+
|
|
180
|
+
**JSON report (reports/health_20241115_143201.json):**
|
|
181
|
+
```json
|
|
182
|
+
{
|
|
183
|
+
"run_id": 1,
|
|
184
|
+
"timestamp": "2024-11-15T14:32:01Z",
|
|
185
|
+
"summary": {
|
|
186
|
+
"total": 3,
|
|
187
|
+
"healthy": 2,
|
|
188
|
+
"warning": 1,
|
|
189
|
+
"critical": 0,
|
|
190
|
+
"unreachable": 0
|
|
191
|
+
},
|
|
192
|
+
"servers": [
|
|
193
|
+
{
|
|
194
|
+
"name": "web-server-01",
|
|
195
|
+
"host": "192.168.1.10",
|
|
196
|
+
"status": "warning",
|
|
197
|
+
"metrics": {
|
|
198
|
+
"cpu_percent": 23.4,
|
|
199
|
+
"ram_percent": 61.2,
|
|
200
|
+
"disk_percent": 78.5,
|
|
201
|
+
"uptime": "14 days, 3 hours"
|
|
202
|
+
}
|
|
203
|
+
}
|
|
204
|
+
]
|
|
205
|
+
}
|
|
206
|
+
```
|
|
207
|
+
|
|
208
|
+
---
|
|
209
|
+
|
|
210
|
+
## Scheduling with Cron
|
|
211
|
+
|
|
212
|
+
To run the health check automatically every 15 minutes:
|
|
213
|
+
|
|
214
|
+
```bash
|
|
215
|
+
# Open the cron editor
|
|
216
|
+
crontab -e
|
|
217
|
+
|
|
218
|
+
# Add this line (runs every 15 minutes)
|
|
219
|
+
*/15 * * * * /usr/bin/python3 /path/to/server-health-monitor/health_monitor.py >> /var/log/health_monitor.log 2>&1
|
|
220
|
+
```
|
|
221
|
+
|
|
222
|
+
Cron schedule format: `minute hour day month weekday`
|
|
223
|
+
```
|
|
224
|
+
*/15 * * * * → every 15 minutes
|
|
225
|
+
0 * * * * → every hour
|
|
226
|
+
0 8 * * * → every day at 8am
|
|
227
|
+
0 8 * * 1 → every Monday at 8am
|
|
228
|
+
```
|
|
229
|
+
|
|
230
|
+
---
|
|
231
|
+
|
|
232
|
+
## Project Structure
|
|
233
|
+
|
|
234
|
+
```
|
|
235
|
+
IronOps/
|
|
236
|
+
│
|
|
237
|
+
├── health_monitor.py # Main entry point (run this)
|
|
238
|
+
├── requirements.txt # Python dependencies
|
|
239
|
+
├── README.md # This file
|
|
240
|
+
│
|
|
241
|
+
├── config/
|
|
242
|
+
│ └── servers.yaml # Your server list + settings
|
|
243
|
+
│
|
|
244
|
+
├── modules/
|
|
245
|
+
│ ├── ssh_client.py # Handles SSH connections
|
|
246
|
+
│ ├── health_check.py # Runs metric commands on servers to check
|
|
247
|
+
│ ├── reporter.py # Formats and saves reports
|
|
248
|
+
│ └── alerter.py # Sends email/Slack alerts
|
|
249
|
+
│
|
|
250
|
+
├── reports/ # Auto-generated JSON reports
|
|
251
|
+
│ └── health_YYYYMMDD_HHMMSS.json
|
|
252
|
+
│
|
|
253
|
+
└── logs/ # Run logs
|
|
254
|
+
└── monitor.log
|
|
255
|
+
```
|
|
256
|
+
|
|
257
|
+
---
|
ironops-1.0.0/pyproject.toml
ADDED
|
@@ -0,0 +1,60 @@
|
|
|
1
|
+
# ================================================================
|
|
2
|
+
# IronOps — pyproject.toml
|
|
3
|
+
# ================================================================
|
|
4
|
+
# This file tells Python HOW to package and install IronOps.
|
|
5
|
+
#
|
|
6
|
+
# Key concept — entry_points:
|
|
7
|
+
# This is the magic that creates the `ironops` command.
|
|
8
|
+
# When someone runs `pip install ironops`, pip reads this file
|
|
9
|
+
# and creates a small script in their PATH called `ironops`
|
|
10
|
+
# that calls our cli.main() function automatically.
|
|
11
|
+
#
|
|
12
|
+
# Standard: PEP 517/518 — the modern Python packaging standard.
|
|
13
|
+
# ================================================================
|
|
14
|
+
|
|
15
|
+
[build-system]
|
|
16
|
+
requires = ["setuptools>=68", "wheel"]
|
|
17
|
+
build-backend = "setuptools.build_meta"
|
|
18
|
+
|
|
19
|
+
[project]
|
|
20
|
+
name = "ironops"
|
|
21
|
+
version = "1.0.0"
|
|
22
|
+
description = "DevOps CLI tool for remote server health monitoring over SSH"
|
|
23
|
+
readme = "README.md"
|
|
24
|
+
license = { text = "MIT" }
|
|
25
|
+
requires-python = ">=3.8"
|
|
26
|
+
|
|
27
|
+
authors = [
|
|
28
|
+
{ name = "Aziz Khemiri" }
|
|
29
|
+
]
|
|
30
|
+
|
|
31
|
+
keywords = ["devops", "ssh", "monitoring", "cli", "server", "health"]
|
|
32
|
+
|
|
33
|
+
classifiers = [
|
|
34
|
+
"Programming Language :: Python :: 3",
|
|
35
|
+
"License :: OSI Approved :: MIT License",
|
|
36
|
+
"Operating System :: OS Independent",
|
|
37
|
+
"Topic :: System :: Monitoring",
|
|
38
|
+
"Topic :: System :: Systems Administration",
|
|
39
|
+
"Environment :: Console",
|
|
40
|
+
]
|
|
41
|
+
|
|
42
|
+
# Runtime dependencies
|
|
43
|
+
dependencies = [
|
|
44
|
+
"paramiko>=3.4.0",
|
|
45
|
+
"PyYAML>=6.0",
|
|
46
|
+
"python-dotenv>=1.0.0",
|
|
47
|
+
]
|
|
48
|
+
|
|
49
|
+
# When user types `ironops --scan`, Python calls cli.main()
|
|
50
|
+
[project.scripts]
|
|
51
|
+
ironops = "ironops.cli:main"
|
|
52
|
+
|
|
53
|
+
[project.urls]
|
|
54
|
+
Homepage = "https://github.com/AzizKhemiri/IronOps"
|
|
55
|
+
Repository = "https://github.com/AzizKhemiri/IronOps"
|
|
56
|
+
Issues = "https://github.com/AzizKhemiri/IronOps/issues"
|
|
57
|
+
|
|
58
|
+
# Package discovery
|
|
59
|
+
[tool.setuptools.packages.find]
|
|
60
|
+
where = ["src"]
|
ironops-1.0.0/src/ironops/cli.py
ADDED
|
@@ -0,0 +1,302 @@
|
|
|
1
|
+
"""
|
|
2
|
+
src/ironops/cli.py
|
|
3
|
+
==================
|
|
4
|
+
The CLI entry point for IronOps.
|
|
5
|
+
|
|
6
|
+
This is the function that runs when a user types:
|
|
7
|
+
$ ironops --scan
|
|
8
|
+
$ ironops --status
|
|
9
|
+
$ ironops --monitor
|
|
10
|
+
$ ironops --help
|
|
11
|
+
|
|
12
|
+
How it works:
|
|
13
|
+
pyproject.toml maps the `ironops` command to this file:
|
|
14
|
+
ironops = "ironops.cli:main"
|
|
15
|
+
pip creates a small wrapper script in PATH that calls main().
|
|
16
|
+
|
|
17
|
+
Commands:
|
|
18
|
+
--scan One-time health check of all servers
|
|
19
|
+
--status Quick summary: how many servers are up/down
|
|
20
|
+
--monitor Continuous loop — checks every N seconds
|
|
21
|
+
--version Print IronOps version
|
|
22
|
+
--help Show usage (built-in to argparse)
|
|
23
|
+
"""
|
|
24
|
+
|
|
25
|
+
import sys
|
|
26
|
+
import time
|
|
27
|
+
import argparse
|
|
28
|
+
import os
|
|
29
|
+
import yaml
|
|
30
|
+
from dotenv import load_dotenv
|
|
31
|
+
|
|
32
|
+
from ironops import __version__
|
|
33
|
+
|
|
34
|
+
load_dotenv()
|
|
35
|
+
|
|
36
|
+
|
|
37
|
+
# ANSI colors
|
|
38
|
+
GREEN = "\033[92m"
|
|
39
|
+
YELLOW = "\033[93m"
|
|
40
|
+
RED = "\033[91m"
|
|
41
|
+
CYAN = "\033[96m"
|
|
42
|
+
BOLD = "\033[1m"
|
|
43
|
+
RESET = "\033[0m"
|
|
44
|
+
|
|
45
|
+
|
|
46
|
+
def _banner():
|
|
47
|
+
print(f"""
|
|
48
|
+
{BOLD}{CYAN}
|
|
49
|
+
██╗██████╗ ██████╗ ███╗ ██╗ ██████╗ ██████╗ ███████╗
|
|
50
|
+
██║██╔══██╗██╔═══██╗████╗ ██║██╔═══██╗██╔══██╗██╔════╝
|
|
51
|
+
██║██████╔╝██║ ██║██╔██╗ ██║██║ ██║██████╔╝███████╗
|
|
52
|
+
██║██╔══██╗██║ ██║██║╚██╗██║██║ ██║██╔═══╝ ╚════██║
|
|
53
|
+
██║██║ ██║╚██████╔╝██║ ╚████║╚██████╔╝██║ ███████║
|
|
54
|
+
╚═╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═══╝ ╚═════╝ ╚═╝ ╚══════╝
|
|
55
|
+
{RESET} {CYAN}DevOps CLI tool for remote server health monitoring{RESET}
|
|
56
|
+
{YELLOW}v{__version__} — Author: Aziz Khemiri{RESET}
|
|
57
|
+
""")
|
|
58
|
+
|
|
59
|
+
|
|
60
|
+
def _get_core_path():
|
|
61
|
+
"""
|
|
62
|
+
Locate the project root (where config/ and modules/ live).
|
|
63
|
+
Works whether installed via pip or run from source.
|
|
64
|
+
"""
|
|
65
|
+
# When installed via pip, look for config next to the package
|
|
66
|
+
candidate = os.path.join(os.path.dirname(__file__), "..", "..", "..")
|
|
67
|
+
candidate = os.path.abspath(candidate)
|
|
68
|
+
if os.path.exists(os.path.join(candidate, "config", "servers.yaml")):
|
|
69
|
+
return candidate
|
|
70
|
+
|
|
71
|
+
# Fallback: current working directory
|
|
72
|
+
return os.getcwd()
|
|
73
|
+
|
|
74
|
+
|
|
75
|
+
def _run_scan(args):
|
|
76
|
+
"""
|
|
77
|
+
ironops --scan
|
|
78
|
+
One-time health check of all servers (or a specific one).
|
|
79
|
+
"""
|
|
80
|
+
_banner()
|
|
81
|
+
root = _get_core_path()
|
|
82
|
+
sys.path.insert(0, root)
|
|
83
|
+
|
|
84
|
+
try:
|
|
85
|
+
# ✅ Load YAML directly, no env_loader needed
|
|
86
|
+
with open(os.path.join(root, "config", "servers.yaml"), "r") as f:
|
|
87
|
+
config = yaml.safe_load(f)
|
|
88
|
+
|
|
89
|
+
from modules.health_check import check_server
|
|
90
|
+
from modules.reporter import print_report, save_json_report
|
|
91
|
+
from modules.alerter import send_alerts
|
|
92
|
+
except ImportError as e:
|
|
93
|
+
print(f"{RED} Error: could not load IronOps modules: {e}{RESET}")
|
|
94
|
+
print(f" Make sure you are running from the project directory.\n")
|
|
95
|
+
sys.exit(1)
|
|
96
|
+
except FileNotFoundError as e:
|
|
97
|
+
print(f"{RED} Error: config file not found: {e}{RESET}")
|
|
98
|
+
print(f" Run: {BOLD}ironops --init{RESET}\n")
|
|
99
|
+
sys.exit(1)
|
|
100
|
+
|
|
101
|
+
from datetime import datetime
|
|
102
|
+
from concurrent.futures import ThreadPoolExecutor, as_completed
|
|
103
|
+
import logging
|
|
104
|
+
|
|
105
|
+
logging.basicConfig(level=logging.WARNING)
|
|
106
|
+
|
|
107
|
+
servers = config.get("servers", [])
|
|
108
|
+
thresholds = config.get("thresholds", {})
|
|
109
|
+
alerts_cfg = config.get("alerts", {})
|
|
110
|
+
|
|
111
|
+
if not servers:
|
|
112
|
+
print(f"{RED} Error: No servers defined in config/servers.yaml{RESET}\n")
|
|
113
|
+
sys.exit(1)
|
|
114
|
+
|
|
115
|
+
# Filter to one server if --server was passed
|
|
116
|
+
if args.server:
|
|
117
|
+
servers = [s for s in servers if s.get("name") == args.server]
|
|
118
|
+
if not servers:
|
|
119
|
+
print(f"{RED} No server named '{args.server}' found.{RESET}\n")
|
|
120
|
+
sys.exit(1)
|
|
121
|
+
|
|
122
|
+
print(f" {CYAN}Scanning {len(servers)} server(s)...{RESET}\n")
|
|
123
|
+
start = datetime.now()
|
|
124
|
+
results = []
|
|
125
|
+
|
|
126
|
+
with ThreadPoolExecutor(max_workers=min(5, len(servers))) as pool:
|
|
127
|
+
futures = {pool.submit(check_server, s, thresholds): s for s in servers}
|
|
128
|
+
for future in as_completed(futures):
|
|
129
|
+
try:
|
|
130
|
+
results.append(future.result())
|
|
131
|
+
except Exception as e:
|
|
132
|
+
s = futures[future]
|
|
133
|
+
results.append({
|
|
134
|
+
"name": s.get("name", s["host"]), "host": s["host"],
|
|
135
|
+
"status": "unreachable", "metrics": {}, "alerts": [],
|
|
136
|
+
"error": str(e),
|
|
137
|
+
})
|
|
138
|
+
|
|
139
|
+
# Preserve config order
|
|
140
|
+
order = {s.get("name", s["host"]): i for i, s in enumerate(servers)}
|
|
141
|
+
results.sort(key=lambda r: order.get(r["name"], 999))
|
|
142
|
+
|
|
143
|
+
print_report(results, run_id=1, start_time=start)
|
|
144
|
+
save_json_report(results, run_id=1)
|
|
145
|
+
send_alerts(results, alerts_cfg)
|
|
146
|
+
|
|
147
|
+
has_critical = any(r["status"] in ("critical", "unreachable") for r in results)
|
|
148
|
+
sys.exit(1 if has_critical else 0)
|
|
149
|
+
|
|
150
|
+
|
|
151
|
+
def _run_status(args):
|
|
152
|
+
"""
|
|
153
|
+
ironops --status
|
|
154
|
+
Fast summary only — no full metric output, just up/down counts.
|
|
155
|
+
"""
|
|
156
|
+
_banner()
|
|
157
|
+
root = _get_core_path()
|
|
158
|
+
sys.path.insert(0, root)
|
|
159
|
+
|
|
160
|
+
try:
|
|
161
|
+
# ✅ Load YAML directly
|
|
162
|
+
with open(os.path.join(root, "config", "servers.yaml"), "r") as f:
|
|
163
|
+
config = yaml.safe_load(f)
|
|
164
|
+
|
|
165
|
+
from modules.ssh_client import SSHClient
|
|
166
|
+
except ImportError as e:
|
|
167
|
+
print(f"{RED} Error: {e}{RESET}\n")
|
|
168
|
+
sys.exit(1)
|
|
169
|
+
except FileNotFoundError as e:
|
|
170
|
+
print(f"{RED} Error: config file not found: {e}{RESET}\n")
|
|
171
|
+
sys.exit(1)
|
|
172
|
+
|
|
173
|
+
servers = config.get("servers", [])
|
|
174
|
+
|
|
175
|
+
if not servers:
|
|
176
|
+
        print(f"{RED} Error: No servers defined in config/servers.yaml{RESET}\n")
        sys.exit(1)

    print(f" {CYAN}Checking connectivity for {len(servers)} server(s)...{RESET}\n")

    up = down = 0
    for s in servers:
        name = s.get("name", s["host"])
        client = SSHClient(s)
        if client.connect():
            print(f" {GREEN}[UP] {BOLD}{name}{RESET} {s['host']}")
            client.disconnect()
            up += 1
        else:
            print(f" {RED}[DOWN]{RESET} {BOLD}{name}{RESET} {s['host']}")
            down += 1

    print(f"\n {BOLD}{'='*38}{RESET}")
    print(f" {GREEN}{up} UP{RESET} | {RED}{down} DOWN{RESET} | {up+down} total\n")


def _run_monitor(args):
    """
    ironops --monitor
    Continuous loop — scans every N seconds until Ctrl+C.
    """
    interval = args.interval or 60
    _banner()
    print(f" {CYAN}Monitor mode — scanning every {interval}s (Ctrl+C to stop){RESET}\n")

    run = 0
    try:
        while True:
            run += 1
            print(f" {YELLOW}── Run #{run} {time.strftime('%H:%M:%S')} ──{RESET}")
            args._mode = "scan_quiet"
            _run_scan(args)
            print(f"\n {CYAN}Next scan in {interval}s...{RESET}")
            time.sleep(interval)
    except KeyboardInterrupt:
        print(f"\n\n {YELLOW}Monitor stopped.{RESET}\n")


def main():
    """
    Entry point called by the `ironops` shell command.
    Registered in pyproject.toml under [project.scripts].
    """
    parser = argparse.ArgumentParser(
        prog="ironops",
        description=f"{BOLD}IronOps{RESET} — DevOps CLI tool for remote server health monitoring",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=f"""
{BOLD}Examples:{RESET}
  ironops --scan                     # check all servers once
  ironops --scan --server web-01     # check one specific server
  ironops --status                   # quick up/down summary
  ironops --monitor                  # continuous scan (every 60s)
  ironops --monitor --interval 30    # continuous scan (every 30s)

{BOLD}Author:{RESET} Aziz Khemiri
{BOLD}Version:{RESET} {__version__}
        """
    )

    # ── Mutually exclusive commands ──────────────────────────────────
    group = parser.add_mutually_exclusive_group(required=True)

    group.add_argument(
        "--scan",
        action="store_true",
        help="Run a one-time health check on all servers"
    )
    group.add_argument(
        "--status",
        action="store_true",
        help="Quick connectivity check — show which servers are up or down"
    )
    group.add_argument(
        "--monitor",
        action="store_true",
        help="Continuous monitoring loop (checks every --interval seconds)"
    )
    group.add_argument(
        "--version",
        action="store_true",
        help="Print IronOps version and exit"
    )

    # ── Optional flags ───────────────────────────────────────────────
    parser.add_argument(
        "--server",
        metavar="NAME",
        help="Target a single server by name (use with --scan)"
    )
    parser.add_argument(
        "--interval",
        type=int,
        metavar="SECONDS",
        default=60,
        help="Scan interval in seconds for --monitor mode (default: 60)"
    )
    parser.add_argument(
        "--config",
        metavar="PATH",
        default="config/servers.yaml",
        help="Path to config file (default: config/servers.yaml)"
    )

    args = parser.parse_args()

    # ── Route to the right action ────────────────────────────────────
    if args.version:
        print(f"ironops v{__version__} — by Aziz Khemiri")

    elif args.scan:
        _run_scan(args)

    elif args.status:
        _run_status(args)

    elif args.monitor:
        _run_monitor(args)


if __name__ == "__main__":
    main()
@@ -0,0 +1,281 @@
Metadata-Version: 2.4
Name: ironops
Version: 1.0.0
Summary: DevOps CLI tool for remote server health monitoring over SSH
Author: Aziz Khemiri
License: MIT
Project-URL: Homepage, https://github.com/AzizKhemiri/IronOps
Project-URL: Repository, https://github.com/AzizKhemiri/IronOps
Project-URL: Issues, https://github.com/AzizKhemiri/IronOps/issues
Keywords: devops,ssh,monitoring,cli,server,health
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: System :: Monitoring
Classifier: Topic :: System :: Systems Administration
Classifier: Environment :: Console
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: paramiko>=3.4.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: python-dotenv>=1.0.0
Dynamic: license-file

# IronOps — DevOps CLI tool for remote server health monitoring
<div align="center">
  <img src="images/IronOps.png" width="700" alt="IronOps Logo">
</div>

### DevOps Project

> A Python-based tool to check the health of multiple remote Linux servers over SSH — automatically collecting CPU, RAM, disk, and service metrics, then generating reports and sending alerts.

---

## Table of Contents
- [What This Project Does](#-what-this-project-does)
- [How It Works (Architecture)](#-how-it-works-architecture)
- [Tools & Technologies](#-tools--technologies)
- [Installation](#-installation)
- [Configuration](#-configuration)
- [Usage](#-usage)
- [Example Output](#-example-output)
- [Scheduling with Cron](#-scheduling-with-cron)
- [Project Structure](#-project-structure)

---

## What This Project Does

This tool connects to a list of remote Linux servers via **SSH**, runs health checks on each one, and:

- Collects: **CPU usage**, **RAM usage**, **disk space**, **uptime**, **running services**
- Generates a **readable JSON report** for each run
- Sends **email/Slack alerts** when a server exceeds thresholds (e.g., disk > 90%)
- Can run **automatically on a schedule** using cron
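
The threshold-based alerting described above can be sketched as follows. This is only an illustration: the `classify` function and the `WARNING`/`CRITICAL` values are assumptions made for this example (only the disk > 90% case is mentioned above), and the metric keys are borrowed from the sample JSON report in this README.

```python
# Hypothetical thresholds; only "disk > 90%" is taken from the README.
WARNING = {"cpu_percent": 75.0, "ram_percent": 75.0, "disk_percent": 75.0}
CRITICAL = {"cpu_percent": 90.0, "ram_percent": 90.0, "disk_percent": 90.0}


def classify(metrics):
    """Return 'critical', 'warning', or 'healthy' for one server's metrics."""
    status = "healthy"
    for key, value in metrics.items():
        if key not in CRITICAL:
            continue  # skip fields without a threshold, such as uptime
        if value > CRITICAL[key]:
            return "critical"
        if value > WARNING[key]:
            status = "warning"
    return status


print(classify({"cpu_percent": 23.4, "ram_percent": 61.2, "disk_percent": 78.5}))
```

With these assumed values, a server whose disk is at 78.5% is reported as `warning`, and anything above 90% as `critical`.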

---

## How It Works (Architecture)
<div align="center">
  <img src="images/IronOps-Architecture.png" width="700" alt="Architecture Diagram">
</div>

---

## Tools & Technologies

| Tool | Why we use it | Learn more |
|------|--------------|-----------|
| **Python 3** | Main language | [python.org](https://python.org) |
| **paramiko** | Python SSH library to connect to servers | [paramiko.org](https://paramiko.org) |
| **PyYAML** | Read the servers config file (YAML format) | [pyyaml.org](https://pyyaml.org) |
| **smtplib** | Built-in Python email sender for alerts | Python standard library |
| **cron** | Linux scheduler to run the script automatically | `man cron` |
| **JSON** | Report format | Built-in Python |

### Why SSH?
SSH (Secure Shell) is the industry-standard way to remotely access Linux servers. You connect using:
- **Password**
- **SSH Key Pair**

Our project supports **both methods**.
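
As a rough sketch of how the two methods could be told apart, the helper below turns one server entry into keyword arguments for `paramiko.SSHClient.connect`, which accepts `hostname`, `port`, `username`, `key_filename`, and `password`. The function name `build_connect_kwargs`, the `"password"` value for `auth`, and the `password` field are assumptions for illustration; the project's own SSH client may be organised differently.

```python
import os


def build_connect_kwargs(server):
    """Map one server entry to keyword arguments for an SSH connect call."""
    kwargs = {
        "hostname": server["host"],
        "port": int(server.get("port", 22)),
        "username": server["user"],
    }
    if server.get("auth", "key") == "key":
        # Key-based login: expand "~" so the key path is absolute.
        kwargs["key_filename"] = os.path.expanduser(server["key_path"])
    else:
        # Password login: the "password" field name is an assumption.
        kwargs["password"] = server["password"]
    return kwargs
```

The resulting dictionary could then be passed as `client.connect(**kwargs)` on a `paramiko.SSHClient` instance.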

---

## Installation

### Step 1 — Install Python 3

```bash
# Ubuntu/Debian
sudo apt update && sudo apt install python3 python3-pip -y

# macOS (using Homebrew)
brew install python3
```

### Step 2 — Clone this project

```bash
git clone https://github.com/yourname/server-health-monitor.git
cd server-health-monitor
```

### Step 3 — Install Python dependencies

```bash
pip3 install -r requirements.txt
```

This installs:
- `paramiko` — SSH connections
- `pyyaml` — reading YAML config files

### Step 4 — Set up your SSH access

**Using SSH keys (recommended for production):**
```bash
# Generate an SSH key pair (if you don't have one already)
ssh-keygen -t rsa -b 4096 -C "health-monitor"

# Copy your public key to each remote server
ssh-copy-id user@your-server-ip

# Test that it works
ssh user@your-server-ip "echo connected successfully"
```

### Step 5 — Configure your servers

Edit `config/servers.yaml` — see the [Configuration](#-configuration) section below.

### Step 6 — Run this command to check your servers

```bash
python3 health_monitor.py
```

---

## Configuration

Edit `config/servers.yaml` to add your servers, for example:

```yaml
# List of servers to monitor
servers:
  - name: "web-server-01"
    host: "192.168.1.10"
    port: 22
    user: "ubuntu"
    auth: "key"                  # authentication method
    key_path: "~/.ssh/id_rsa"    # path to your private key
```
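
Before connecting, it can help to sanity-check the parsed configuration. The sketch below works on the dictionary that `yaml.safe_load` would return for a file like the one above. The `validate_servers` function and the choice of required fields are assumptions for illustration and are not part of the project.

```python
REQUIRED_FIELDS = ("host", "user")  # assumed minimum; not confirmed by the project


def validate_servers(config):
    """Return a list of human-readable problems found in a parsed config dict."""
    problems = []
    servers = config.get("servers") or []
    if not servers:
        problems.append("no servers defined")
    for index, server in enumerate(servers):
        label = server.get("name", f"entry #{index}")
        for field in REQUIRED_FIELDS:
            if not server.get(field):
                problems.append(f"{label}: missing '{field}'")
        port = server.get("port", 22)
        if not isinstance(port, int) or not 1 <= port <= 65535:
            problems.append(f"{label}: invalid port {port!r}")
        if server.get("auth") == "key" and not server.get("key_path"):
            problems.append(f"{label}: auth is 'key' but key_path is missing")
    return problems
```

An empty list means no problems were found; otherwise each string names the entry and the field at fault.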

---

## Usage

### Basic run (check all servers once)
```bash
python3 health_monitor.py
```

### Check a single specific server
```bash
python3 health_monitor.py --server web-server-01
```

### Show verbose output
```bash
python3 health_monitor.py --verbose
```

### Save report to a custom location
```bash
python3 health_monitor.py --output /tmp/my-report.json
```

---

## Example Terminal Output

**python3 health_monitor.py --my-mac**
<div align="center">
  <img src="images/4-mac-demo.png" width="700" alt="Mac demo">
</div>

**python3 health_monitor.py**
<div align="center">
  <img src="images/1-demo.png" width="700" alt="Full health check demo">
</div>

**monitor.log**
<div align="center">
  <img src="images/3-demo.png" width="700" alt="Monitor log output">
</div>

**Demo fake data (test)**
<div align="center">
  <img src="images/2-demo.png" width="700" alt="Demo with fake data">
</div>

**JSON report (reports/health_20241115_143201.json):**
```json
{
  "run_id": 1,
  "timestamp": "2024-11-15T14:32:01Z",
  "summary": {
    "total": 3,
    "healthy": 2,
    "warning": 1,
    "critical": 0,
    "unreachable": 1
  },
  "servers": [
    {
      "name": "web-server-01",
      "host": "192.168.1.10",
      "status": "warning",
      "metrics": {
        "cpu_percent": 23.4,
        "ram_percent": 61.2,
        "disk_percent": 78.5,
        "uptime": "14 days, 3 hours"
      }
    }
  ]
}
```
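
A report in this shape is easy to post-process. The sketch below lists the servers whose status is anything other than `healthy`. The `servers_needing_attention` function and the second server in the sample (`db-server-01`) are made up for this example; only the `name`, `host`, and `status` keys come from the report above.

```python
import json


def servers_needing_attention(report_text):
    """Return (name, status) pairs for servers whose status is not 'healthy'."""
    report = json.loads(report_text)
    return [
        (server["name"], server["status"])
        for server in report.get("servers", [])
        if server.get("status") != "healthy"
    ]


# Trimmed-down sample; "db-server-01" is a hypothetical second entry.
sample = """
{
  "run_id": 1,
  "servers": [
    {"name": "web-server-01", "host": "192.168.1.10", "status": "warning"},
    {"name": "db-server-01", "host": "192.168.1.20", "status": "healthy"}
  ]
}
"""
print(servers_needing_attention(sample))
```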

---

## Scheduling with Cron

To run the health check automatically every 15 minutes:

```bash
# Open the cron editor
crontab -e

# Add this line (runs every 15 minutes)
*/15 * * * * /usr/bin/python3 /path/to/server-health-monitor/health_monitor.py >> /var/log/health_monitor.log 2>&1
```

Cron schedule format: `minute hour day month weekday`
```
*/15 * * * *   → every 15 minutes
0 * * * *      → every hour
0 8 * * *      → every day at 8am
0 8 * * 1      → every Monday at 8am
```
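
To make the `*/15` notation concrete, the sketch below checks which minutes of an hour match a cron minute field. It is a simplified illustration, not how cron itself is implemented: the `minute_matches` helper is hypothetical and handles only `*`, `*/N`, and a single number.

```python
def minute_matches(field, minute):
    """Check whether `minute` (0-59) matches a cron minute field.

    Supports only "*", "*/N", and a single number; real cron also accepts
    lists and ranges, which this sketch deliberately leaves out.
    """
    if field == "*":
        return True
    if field.startswith("*/"):
        return minute % int(field[2:]) == 0
    return minute == int(field)


fired = [m for m in range(60) if minute_matches("*/15", m)]
print(fired)  # [0, 15, 30, 45]
```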

---

## Project Structure

```
IronOps/
│
├── health_monitor.py          # Main entry point (run this)
├── requirements.txt           # Python dependencies
├── README.md                  # This file
│
├── config/
│   └── servers.yaml           # Your server list + settings
│
├── modules/
│   ├── ssh_client.py          # Handles SSH connections
│   ├── health_check.py        # Runs metric commands on the servers
│   ├── reporter.py            # Formats and saves reports
│   └── alerter.py             # Sends email/Slack alerts
│
├── reports/                   # Auto-generated JSON reports
│   └── health_YYYYMMDD_HHMMSS.json
│
└── logs/                      # Run logs
    └── monitor.log
```

---
@@ -0,0 +1,12 @@
LICENSE
README.md
pyproject.toml
src/ironops/__init__.py
src/ironops/__main__.py
src/ironops/cli.py
src/ironops.egg-info/PKG-INFO
src/ironops.egg-info/SOURCES.txt
src/ironops.egg-info/dependency_links.txt
src/ironops.egg-info/entry_points.txt
src/ironops.egg-info/requires.txt
src/ironops.egg-info/top_level.txt
@@ -0,0 +1 @@
@@ -0,0 +1 @@
ironops