Collectinfo.pl High CPU Usage

OS: Debian 10 (Buster)

Hey,

We have the latest OS, packages, Webmin and Virtualmin versions. PHP 5.6 and all of the 7.x versions are also installed. The install option was Virtualmin Minimal, and we installed AWStats. There are ~150 domains on the server, all of them using PHP-FPM with Apache (nginx is not installed).

As you can see in this old thread (collectinfo.pl makes %100 CPU [#54532] | Virtualmin), we are having an issue with collectinfo.pl whenever it is enabled. Every 15 minutes (the collection interval we set), it starts a new collectinfo.pl instance that never finishes.

After 15 minutes, here are the two instances running at the same time:

Any help would be appreciated. Thank you.

Hi,

Interesting.

Could you strace the process together with its children, save the output to a file, archive it and send it to us? That would show where the actual bottleneck is.
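For reference, a trace like this could be captured roughly as follows. This is only a sketch: it assumes strace is installed, and the `trace_pid` helper name and the default output path are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: attach strace to a running process and log its syscalls.
# Usage: trace_pid <pid> [output-file]
trace_pid() {
    pid="$1"
    out="${2:-/tmp/collectinfo.strace}"   # assumed output path, change as needed
    # -f  follow child processes as they are forked
    # -tt prefix each line with a timestamp
    # -T  show the time spent inside each syscall
    # -p  attach to an already running process
    # -o  write the trace to a file instead of stderr
    strace -f -tt -T -p "$pid" -o "$out"
}
```

Run it as root with the PID of the stuck process, stop it with Ctrl-C after a minute or so, and compress the resulting file (for example with `gzip`) before sending it.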

@Jamie, shouldn’t we check whether the previous process has finished before running another instance of collectinfo.pl? Perhaps we could additionally notify the master administrator by email about a stuck process, possibly with a more detailed report on what went wrong?
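A guard of the kind suggested here could look like the sketch below. It is not how collectinfo.pl implements its own check; the `run_once` name and the lock path are hypothetical, and the only thing it relies on is that `mkdir` is atomic.

```shell
#!/bin/sh
# Hypothetical guard: skip a run while an earlier one is still active.
# LOCKDIR is an assumed path for illustration, not one Virtualmin uses.
LOCKDIR="${LOCKDIR:-/tmp/collectinfo-example.lock}"

run_once() {
    # mkdir either creates the directory or fails, so only one caller
    # at a time can take the lock.
    if mkdir "$LOCKDIR" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$LOCKDIR"        # release the lock when the command ends
        return "$status"
    fi
    echo "previous run still active, skipping" >&2
    return 75                   # arbitrary non-zero code meaning "busy"
}
```

A cron entry would then call something like `run_once /path/to/job`. Note that a killed job leaves a stale lock behind in this simple form; storing the PID and checking that it is still alive would be the usual next step, and the "busy" branch is where an email notification could go.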

@mahony0 If you increase the update interval to let’s say 60 minutes, does it still happen?

What are the server specs?

@Ilia I just disabled info collection after one instance had started. Now only one instance of collectinfo.pl is running, and the local disk space usage statistic is dropping by roughly 10 MB every 3 seconds.

I had info collection disabled for 3-4 months, so maybe it’s cleaning some things up, but I don’t know exactly what.

This is one of our production servers, so I can’t install extra packages, but here are the specs:

CPU: Xeon E3-1275 v6
RAM: 64 GB
Disk: 512 GB NVMe SSD (35% used)
OS: Debian 10 (Buster)
Webmin: 1.962
Virtualmin: 6.14
Installation: Virtualmin Minimal
Extra packages: ProFTPD, AWStats, PHP 7.x versions with FPM (Apache)

We already have a check to prevent multiple concurrent runs of collectinfo.pl - see virtualmin-gpl/collectinfo.pl at master · virtualmin/virtualmin-gpl · GitHub

Specs look good to me.

Do you run Webmin behind a proxy, or do you connect directly using host:port? And we are talking about the automatic recollect call, not one triggered manually from the Dashboard?

What is the output of:

ps aux |grep recollect.cgi

Also, could you go to Webmin > System > Running Processes > PID, find the collectinfo.pl process, click on it and then click on Trace Process? It would also be interesting to see the Files and Connections output.

I meant:

ps aux | grep collectinfo

What is the output of this as well:

du -h /var/webmin | perl -e 'sub h{%h=(K=>10,M=>20,G=>30);($n,$u)=shift=~/([0-9.]+)(\D)/; return $n*2**$h{$u}}print sort{h($b)<=>h($a)}<>;'
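On systems with GNU coreutils, `sort -h` understands the same size suffixes, so a shorter equivalent is possible. A sketch, assuming GNU `sort` is available; the `largest_dirs` name is made up here:

```shell
#!/bin/sh
# Hypothetical helper: list the largest directories under a path, biggest first.
# Usage: largest_dirs [path] [count]
largest_dirs() {
    # du -h prints human-readable sizes (K, M, G);
    # sort -h compares those suffixes numerically, -r puts the largest first.
    du -h "${1:-/var/webmin}" 2>/dev/null | sort -rh | head -n "${2:-20}"
}
```

Calling `largest_dirs /var/webmin 10` would show the ten largest directories, which is usually enough to spot what is filling the disk.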

Thanks, Jamie. I wonder how it would still be possible then.

I think this could fix it. @mahony0 could you please try the patch from the link and see if it fixes your issue?

Hey @Ilia, sorry for the late reply.

We are not using a proxy (we connect via host:port), and recollect is running automatically. I’m not triggering it manually.

By the way, after 3-4 hours the process finished and disk usage dropped by nearly 20 GB. Collection is now running as expected (every 15 minutes) and finishing successfully.

But collectinfo.pl has a bug in its check against simultaneous instances, so it hogs CPU if a single instance takes too long.

Could you also apply this patch and the patch above, restart Webmin with /etc/webmin/restart, and see if that solves the problem?

Is this a bare-metal machine?

Yes, this is a bare-metal dedicated server.

My problem is already solved, thank you.

Because I didn’t run the script from the CLI, I won’t see any effect from applying the patches above. The .pl scripts were all run automatically.
