I’ve got Monitorix, which gives some understanding of server-wide resource use. But sometimes I find it hard to work out which user (website) is using a lot of resources, sometimes even triggering an OOM kill.
So I wonder: what is a good monitoring tool that also gives some insight into the resources used per user?
Thank you. Yes, this works great for debugging, and the cause of the OOM kill was found swiftly.
However, I was looking for something visual, like Monitorix or Netdata, but with the resource use split per user. But maybe this is not available and would cause a lot of drag on the server.
It’s more of a nice-to-have, to see the impact per website on a server.
Had some time to spare and came up with this Python script.
You can run it as a service or in the background.
It monitors the CPU load and sends an email alert based on your settings.
It needs Python 3 and the psutil package, and should be run as root or a similarly privileged user.
import psutil
import time
import smtplib
from email.mime.text import MIMEText
from subprocess import Popen, PIPE
# Email configuration
SMTP_SERVER = 'smtp.your-email-server.com'
SMTP_PORT = 587  # assumed value (common STARTTLS submission port); adjust for your server
SMTP_USERNAME = 'your-email@example.com'
SMTP_PASSWORD = 'your-email-password'
EMAIL_FROM = 'your-email@example.com'
EMAIL_TO = 'recipient@example.com'
EMAIL_SUBJECT = 'High CPU Usage Alert'
# Monitoring parameters
CPU_THRESHOLD = 2  # 1-minute load average threshold
MONITOR_INTERVAL = 60 # seconds between checks
ALERT_AFTER_MINUTES = 3 # Alert after 3 minutes
def get_cpu_load():
    # 1-minute load average
    return psutil.getloadavg()[0]
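def get_top_cpu_processes():
    # The first cpu_percent() reading for a process is always 0.0, so prime
    # the counters, wait a moment, then take the real measurement.
    for proc in psutil.process_iter():
        try:
            proc.cpu_percent(None)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    time.sleep(1)

    processes = []
    for proc in psutil.process_iter(['pid', 'name', 'username', 'cpu_percent']):
        try:
            proc_info = proc.info
            processes.append(proc_info)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    # Sort by CPU usage (values that could not be read count as 0)
    processes = sorted(processes, key=lambda x: x['cpu_percent'] or 0.0, reverse=True)
    return processes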
def send_email(body):
    msg = MIMEText(body)
    msg['Subject'] = EMAIL_SUBJECT
    msg['From'] = EMAIL_FROM
    msg['To'] = EMAIL_TO
    with smtplib.SMTP(SMTP_SERVER, SMTP_PORT) as server:
        server.starttls()  # assumes the server supports STARTTLS
        server.login(SMTP_USERNAME, SMTP_PASSWORD)
        server.sendmail(EMAIL_FROM, EMAIL_TO, msg.as_string())
def alert_if_needed():
    high_load_counter = 0
    while True:
        load = get_cpu_load()
        print(f"Current load: {load}")
        if load > CPU_THRESHOLD:
            high_load_counter += 1
            print(f"High load detected for {high_load_counter} minute(s).")
            # If CPU usage is above threshold for more than ALERT_AFTER_MINUTES
            if high_load_counter >= ALERT_AFTER_MINUTES:
                print("Threshold exceeded, gathering process details...")
                processes = get_top_cpu_processes()
                # Create email body with process details
                body = f"CPU load has been above {CPU_THRESHOLD} for more than {ALERT_AFTER_MINUTES} minutes.\n"
                body += f"Current load: {load}\n\n"
                body += "Top CPU-consuming processes:\n"
                for proc in processes[:5]:  # Limit to top 5 processes
                    body += f"User: {proc['username']}, Process: {proc['name']} (PID: {proc['pid']}), CPU: {proc['cpu_percent']}%\n"
                # Send the email
                send_email(body)
                print("Alert email sent!")
                # Reset counter after sending the alert
                high_load_counter = 0
        else:
            # Load dropped back below the threshold, start counting again
            high_load_counter = 0
        time.sleep(MONITOR_INTERVAL)
if __name__ == "__main__":
    alert_if_needed()
You will get an email like:
CPU load has been above 2 for more than 3 minutes.
Current load: 4.4169921875
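
For the per-user split asked about at the top of the thread, here is a rough sketch of how the same psutil data could be summed per user instead of listed per process. It is not part of the script above. The names `usage_by_user`, `collect_samples` and `print_report` are made up for this example. Note that summing RSS per user counts shared memory more than once, so treat the numbers as an indication rather than an exact figure.

```python
import time
from collections import defaultdict


def usage_by_user(samples):
    """Sum CPU percent and resident memory per username.

    `samples` is an iterable of dicts with 'username', 'cpu_percent' and
    'rss' keys; missing (None) values are counted as zero.
    """
    totals = defaultdict(lambda: {"cpu_percent": 0.0, "rss": 0, "procs": 0})
    for s in samples:
        user = s.get("username") or "unknown"
        totals[user]["cpu_percent"] += s.get("cpu_percent") or 0.0
        totals[user]["rss"] += s.get("rss") or 0
        totals[user]["procs"] += 1
    # Heaviest memory users first, since memory is what leads to an OOM kill
    return sorted(totals.items(), key=lambda kv: kv[1]["rss"], reverse=True)


def collect_samples(interval=1.0):
    """Take one snapshot of all processes (needs psutil, ideally as root)."""
    import psutil  # imported here so usage_by_user() can be used without it

    procs = list(psutil.process_iter())
    # Prime the CPU counters; the first cpu_percent() reading is always 0.0
    for p in procs:
        try:
            p.cpu_percent(None)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    time.sleep(interval)

    samples = []
    for p in procs:
        try:
            with p.oneshot():
                samples.append({
                    "username": p.username(),
                    "cpu_percent": p.cpu_percent(None),
                    "rss": p.memory_info().rss,
                })
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue  # process exited or is not readable, skip it
    return samples


def print_report(limit=10):
    """Print the `limit` heaviest users, one line each."""
    for user, t in usage_by_user(collect_samples())[:limit]:
        print(f"{user:<16} procs={t['procs']:<4} "
              f"cpu={t['cpu_percent']:6.1f}%  rss={t['rss'] / 2**20:8.1f} MiB")
```

Calling `print_report()` from a cron job or a small loop gives a per-user table. The same totals could also go into the alert email body in place of the top-5 process list.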