Report - Postfix Dead, ClamScan Hanging?

SYSTEM INFORMATION
OS type and version: Debian Linux 11
Usermin version: 2.102
Virtualmin version: 7.20.2
Theme version: 21.20.7
Apache version: 2.4.62
Package updates: 31 package updates are available, of which 1 is a security update

mail_version = 3.5.25

ClamAV 0.103.10/27465/Fri Nov 22 04:41:26 2024

This is related to this post

Today, Postfix is failing, and when restarting it I am having the same issue as in the post above. Postfix is taking up too much memory, and my 6 processors are also consumed to practically 100%.

postfix@-.service: A process of this unit has been killed by the OOM killer

I also have 8 GB of RAM on the VM, and this system usually floats around 3.5 GB of RAM (not sure about spikes when Postfix activates clamscan).

Anyway, Postfix is up but accepting nothing (connection refused on port 587), and the OOM killer seems to kill ClamAV when Postfix uses it or activates it to scan?

Nov 22 11:13:50 web.domain.ca kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/clamav-daemon.service,task=clamd,pid=776,uid=115
Nov 22 11:13:50 web.domain.ca kernel: Out of memory: Killed process 776 (clamd) total-vm:1484020kB, anon-rss:1374004kB, file-rss:0kB, shmem-rss:0kB, UID:115 pgtables:2828kB oom_score_adj:0
Nov 22 11:13:51 web.domain.ca systemd[1]: clamav-daemon.service: A process of this unit has been killed by the OOM killer.
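For anyone reading similar logs, here is a minimal sketch of how the kernel lines above could be condensed into one line per killed process. The function name `oom_summary` is made up for illustration, and it assumes the kernel uses the same "Out of memory: Killed process" wording shown above.

```shell
#!/usr/bin/env bash
# Summarise kernel "Out of memory: Killed process" lines read from stdin.
# Prints the PID, the process name, and the anonymous RSS converted from
# kB to MiB (integer division, so the value is rounded down).
# oom_summary is a hypothetical helper name, not part of any package.
oom_summary() {
  sed -n -E 's/.*Out of memory: Killed process ([0-9]+) \(([^)]+)\).*anon-rss:([0-9]+)kB.*/\1 \2 \3/p' |
    while read -r pid name rss_kb; do
      echo "pid=$pid name=$name anon_rss_mib=$(( rss_kb / 1024 ))"
    done
}
```

It can be fed from the kernel log, for example `journalctl -k | oom_summary` or `dmesg | oom_summary`. For the clamd line above it prints `pid=776 name=clamd anon_rss_mib=1341`, i.e. clamd was holding roughly 1.3 GiB when it was killed.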

I was not sure how to restart clamd, since systemctl status clamd was finding nothing. But I do know that Postfix only starts ClamAV for email scanning.

I found out how to start clamav this way,

/etc/init.d/clamav-daemon start

It takes up memory as mentioned and stays up, but as soon as I start Postfix I get the error: Postfix is up but refusing connections on 587, and clamav-daemon goes down… again, I do not think you need this service up.

I have never had this issue before and never had the need for more than 8gigs.

After a reboot everything was back to normal.
All services are up and my memory is running at

Mem[||||||||||||||||||||||||||||||||||||||                                           2.25G/7.46G]

On reboot I noticed that systemctl status clamav-daemon.service was in a failed state. Again, I think this isn’t necessary, because Postfix just scans email and ClamAV doesn’t need to be up and running (please correct me if I’m wrong).

So I decided to boot it up to see how it will go and it did work and stayed online

systemctl start clamav-daemon.service

it is up and my memory is

Mem[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||                      3.68G/7.46G]

I have tested Postfix by sending email and everything is working fine. It is weird how this happened, and I am not sure why memory spiked. I now know that clamav-daemon does not need to be running, but prior to my reboot I was only trying to restart Postfix, and as you can see in the logs, something was blowing up the memory and crashing shortly after Postfix tried to run task=clamscan.

So I just got an email. My memory went from 3.5 to 5 GB, nothing crashed, and it went back down to 3.83 GB. This, however, is with clamav-daemon.service running. So I shut down clamav-daemon to ensure I have no mail issues for now.

Memory went from 2.51 GB to almost 3.83 GB, so does Postfix start clamav-daemon to scan and then shut it back down? That is over a gigabyte while an email is being scanned, and this is the log:

Nov 22 11:47:38 web.domain.ca postfix/smtpd[7081]: disconnect from mail-ot1-f47.google.com[209.85.210.47] ehlo=2 starttls=1 mail=1 rcpt=1 bdat=2 quit=1 commands=8
Nov 22 11:47:59 web.domain.ca spamd[2233]: spamd: connection from ::1 [::1]:41188 to port 783, fd 5
Nov 22 11:47:59 web.domain.ca spamd[2233]: spamd: setuid to gstlouis.domain succeeded
Nov 22 11:47:59 web.domain.ca spamd[2233]: spamd: processing message <CALYYd5P4Vyx3pK53M0bExKC7+wUPsfsMbY_9_rqpKSMiwwBgGA@mail.gmail.com> for gstlouis.domain:1011
Nov 22 11:47:59 web.domain.ca spamd[2233]: spamd: clean message (-1.2/5.0) for gstlouis.domain:1011 in 0.4 seconds, 65643 bytes.
Nov 22 11:47:59 web.domain.ca spamd[2233]: spamd: result: . -1 - BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DRUG_ED_CAPS,FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS scantime=0.4,size=65643,user=gstlouis.domain,uid=1011,required_score=5.0,rhost=::1,raddr=::1,rport=41188,mid=<CALYYd5P4Vyx3pK53M0bExKC7+wUPsfsMbY_9_rqpKSMiwwBgGA@mail.gmail.com>,bayes=0.000000,autolearn=no autolearn_force=no
Nov 22 11:47:59 web.domain.ca postfix/local[7085]: D036B6035B: to=<gstlouis.domain@web.domain.ca>, orig_to=<gstlouis@domain.ca>, relay=local, delay=22, delays=0.18/0/0/21, dsn=2.0.0, status=sent (delivered to command: /usr/bin/procmail-wrapper -o -a $DOMAIN -d $LOGNAME)
Nov 22 11:47:59 web.domain.ca postfix/qmgr[1675]: D036B6035B: removed
Nov 22 11:47:59 web.domain.ca spamd[1204]: prefork: child states: II

I have attached and refined my mail logs, one with the problem and the other after reboot if anyone cares for it.

I do not think this is a Virtualmin issue at all; I just thought I’d add more information to the previously closed post in case this pops up again. Sucks that you have to reboot and do not really know the root cause.

Archive 2.zip (117.8 KB)

No. You’re misunderstanding what’s happening.

The OOM killer does not kill the biggest process. It kills the least active process it can kill that will free up the amount of memory it needs to allocate for new work. In this case, you are using ClamAV, so it is certainly one of the big processes that is using a lot of your memory. Postfix is almost certainly not a big process (something would have to be misconfigured or you’d have to have a huge mail server with a lot of users and mail coming and going for it to be very large).

You need to actually look at memory usage to see what’s using memory. ClamAV is definitely huge (more than 1GB, now), but you may have other problem processes, and only you can figure out what they are.
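One way to get that overview without an interactive tool is to total resident memory per command name. This is a rough sketch, not a precise accounting: the helper names `rss_by_command` and `top_memory` are made up for illustration, and summing RSS across processes counts shared pages more than once, so the totals overstate heavily forked services.

```shell
#!/usr/bin/env bash
# Sum resident memory (RSS, in kB) per command name from "rss command"
# lines on stdin, then print the totals in MiB, largest first.
# rss_by_command and top_memory are hypothetical helper names.
rss_by_command() {
  awk '{
         rss = $1              # first field is RSS in kB
         $1 = ""               # drop it so $0 holds only the command name
         sub(/^ +/, "")
         kb[$0] += rss
       }
       END { for (c in kb) printf "%d %s\n", kb[c] / 1024, c }' |
    sort -rn
}

# Feed it live data from ps; the trailing "=" suppresses the header row.
# The optional argument is how many rows to show (default 10).
top_memory() {
  ps -eo rss=,comm= | rss_by_command | head -n "${1:-10}"
}
```

Running `top_memory 5` shows the five largest consumers by summed RSS. If clamd is near the top at well over 1 GB, that matches what the OOM log reported, and anything unexpectedly large next to it is worth a closer look.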

Anyway, to be clear: The OOM killer is an indiscriminate killer. It does not kill the guilty process, it kills what it thinks is the least destructive to kill. Running out of memory is catastrophic. Something terrible has to happen when you run out of memory. The kernel is trying (within its limited heuristic abilities) to do the least damage it can to your running system. But, there is no safe OOM killer event. Something extremely bad has happened: You ran out of memory. And, the only solution is to reduce memory usage (stop/disable ClamAV, maybe fix other apps) or add more memory.
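To see how the kernel currently ranks candidates, Linux exposes a per-process score in /proc/PID/oom_score, where a higher value means the process is more likely to be picked. The following is a small sketch that lists the highest scores; the function name `oom_candidates` and the `PROC_ROOT` override are made up for illustration (the override exists only so the function can be pointed at a test directory).

```shell
#!/usr/bin/env bash
# List processes by the kernel's current OOM score, highest first.
# Output columns: score, PID, command name.
# oom_candidates and PROC_ROOT are hypothetical names for this sketch.
oom_candidates() {
  local root="${PROC_ROOT:-/proc}" dir pid score name
  for dir in "$root"/[0-9]*; do
    [ -r "$dir/oom_score" ] || continue
    pid=${dir##*/}
    # A process can exit between the check and the read, so skip failures.
    score=$(cat "$dir/oom_score" 2>/dev/null) || continue
    name=$(cat "$dir/comm" 2>/dev/null) || continue
    echo "$score $pid $name"
  done | sort -rn | head -n "${1:-10}"
}
```

Calling `oom_candidates 5` on a busy server gives a quick idea of what would probably be killed next, which is not necessarily the process that caused the shortage.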

Postfix does nothing with ClamAV in a Virtualmin default configuration. Postfix hands the mail to procmail-wrapper, which hands the mail to procmail, which checks the mail with SpamAssassin and/or ClamAV if you have enabled spam and AV scanning features.

If you have configured ClamAV to run on-demand, clamscan would be started every time you receive mail. But, the ability to enable that mode has been removed for years…I don’t know how you could have set that up, unless you did it manually. No one should use the on-demand scanning mode, anymore. ClamAV is just too big and too slow to start. Even a huge system would be overwhelmed immediately with that mode.

You’re doing a bunch of stuff manually without looking at the configuration of Virtualmin for these features, so I can’t tell you what is actually happening with your mail (and it sounds like maybe you have done something custom outside of Virtualmin, so I can’t predict what’s happening even if you tell us about your Virtualmin configuration).

But, you cannot reasonably use clamscan for mail. If you must have AV scanning, you need to use clamdscan with the always-on ClamAV service that Virtualmin sets up (I think it’s called clamd@scan, but maybe different on different OSes, it’s been a long time since I looked at it…maybe it’s clamav-daemon on Ubuntu/Debian).

But, merely starting the service will not make Virtualmin use clamdscan. You need to configure Virtualmin to use the right scanner. (But, again, since you’re seeing changing behavior based on whether you start that service, I think you have something custom outside of the control of Virtualmin, in which case I can’t help. I don’t know what’s happening.)
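Before changing the scanner setting, it may help to confirm that the daemon is running and that the clamdscan client is installed. This is only a sketch: `clamd_ready` is a made-up helper name, the default unit name is the Debian one that appears earlier in this thread, and it does not check which scanner Virtualmin is actually configured to use.

```shell
#!/usr/bin/env bash
# Report whether the ClamAV daemon unit is active and clamdscan is present.
# The default unit is the Debian name seen in this thread; pass another
# name (for example clamd@scan) on systems that use it.
# clamd_ready is a hypothetical helper name.
clamd_ready() {
  local unit="${1:-clamav-daemon}"
  if ! systemctl is-active --quiet "$unit"; then
    echo "$unit is not active"
    return 1
  fi
  if ! command -v clamdscan >/dev/null 2>&1; then
    echo "clamdscan is not installed"
    return 1
  fi
  echo "$unit is active and clamdscan is available"
}
```

A non-zero return here means clamdscan would have no daemon to talk to, so switching the scanner in Virtualmin at that point would not help.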

Postfix does not scan mail for viruses or spam in a Virtualmin default configuration.

OOM killer seemingly kills what it thinks best which might not be your offender.

(Written before @Joe responded but was interrupted by wife. :wink: )

Run top, then press Shift+M to sort by memory.

Debian 11 and I did the updates. Also, you don’t mention virtual memory.
482 clamav 20 0 1699032 1.3g 8424 S 0.0 8.7 20:08.90 clamd

I’ve updated our documentation to provide more details on how and why the OOM Killer terminates processes.

The full document is available here:

It did not occur to me to check live memory usage. I was too fixated on targeting postfix and clam daemon.

I mean, the spike was happening exactly when the mail was being processed, so I thought it was logical to assume something was going on between Postfix and clamd, but I should not have assumed.

@Joe I recall seeing clamd@scan, so you are correct. Thank you for your comments and for giving me better ideas on how to identify the problem. I would have enjoyed troubleshooting this more before I did the reboot so emails could get back online. I will be more conscientious about investigating the memory use if this ever happens again. All has been running smoothly since…

This is just a traditional install, there really shouldn’t be anything out of the ordinary. It is a VM and not a container, so I do have lots of other services/apps working together I installed in one huge virtual machine I wanted to try, but now I know that containerization is the way :slight_smile: