This may have happened at the time of the 4.18 GPL update, but the System Logs option under
Webmin > System no longer takes me to the list of system logs; instead it shows the error:
System Logs: The syslog configuration file /etc/rsyslog.conf was not found on your system. Maybe syslog is not installed, or a newer version like syslog-ng is in use, or the module configuration is incorrect.
One of the things I tried was the ‘Refresh Modules’ option. Now System Logs only appears under Un-used Modules, together with System Logs NG.
The error messages I get after clicking on them are:
System Logs: The syslog configuration file /etc/rsyslog.conf was not found on your system. Maybe syslog is not installed, or a newer version like syslog-ng is in use, or the module configuration is incorrect.
System Logs NG: The system logging daemon syslog-ng was not found on your system, or is not the syslog-ng program. Adjust the module configuration to use the correct path.
The only syslog-related file/folder in the /etc directory is
It has occurred to me that rsyslog might be using CPU to process the log messages that accumulated while rsyslog wasn't running (at least I think that's what it does).
I've temporarily disabled kernel logging, which has brought CPU usage down a bit.
Yes, it was strange, especially as I was able to access the log files the last time I checked (a few weeks ago, though).
If by kernel logs you mean those in Webmin > System > System Logs - Kernel messages (Output from dmesg), then that is empty (I had already re-enabled kernel logging).
Perhaps of interest: after stopping and starting rsyslog, running the command
sudo cat /var/log/messages | grep rsyslog
gives me
Oct 23 16:57:13 server1 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="28602" x-info="http://www.rsyslog.com"] exiting on signal 15.
Oct 23 16:57:24 server1 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="31028" x-info="http://www.rsyslog.com"] start
Oct 23 16:57:35 server1 rsyslogd-2177: imjournal: begin to drop messages due to rate-limiting
Oct 23 17:07:27 server1 rsyslogd-2177: imjournal: 17983 messages lost due to rate-limiting
I've only included the lines relevant to the most recent stop/start; the log shows similar dropped-message alerts for the other stop/starts I carried out this afternoon, 934250 being the highest count.
I wonder if that means it is attempting to read /var/log/messages and its archives.
Either way, rsyslogd is bubbling away at well over 90% CPU.
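In case it is useful to anyone else seeing the "messages lost due to rate-limiting" alerts, imjournal's rate limiting can be relaxed in /etc/rsyslog.conf (a sketch using the legacy-format directive names documented for the imjournal module in rsyslog 7.x; the numbers are only examples, and worth checking against your rsyslog version):

$ModLoad imjournal                  # already loaded in the stock CentOS 7 config
$imjournalRatelimitInterval 600     # rate-limit window in seconds (0 disables rate limiting)
$imjournalRatelimitBurst 20000      # messages allowed per window

followed by systemctl restart rsyslog.service to apply it.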
This could be because of a well-known bug with CentOS 7. If anyone has a problem with log files, please do the following:
1. In “/var/lib/rsyslog/” delete the file “imjournal.state”.
2. Restart “rsyslog”.
3. Check if everything works and then open /etc/rc.local and add at the end (leave one empty line at the bottom):
rm -f /var/lib/rsyslog/imjournal.state
/bin/systemctl restart rsyslog.service
Point 3 is good to have so that the file “imjournal.state” is deleted after every server restart. This bug sometimes happens when the system time is changed, but there can be other causes. There is no harm in having “imjournal.state” deleted after every server restart; just don't forget what changes you made on your system.
Hope this helps you all.
P.S. This is valid only for CentOS 7! Please be careful if you are using something else, because I don't know how it will work out; my test server is CentOS 7 and I didn't test on any other OS.
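A rough sketch of those three steps on a CentOS 7 box (the chmod at the end is an extra assumption worth checking: CentOS 7 ships /etc/rc.d/rc.local non-executable, and rc.local only runs at boot if it is executable):

# steps 1 and 2: remove the state file and restart rsyslog
rm -f /var/lib/rsyslog/imjournal.state
systemctl restart rsyslog.service

# step 3: the two lines to append to the end of /etc/rc.local
rm -f /var/lib/rsyslog/imjournal.state
/bin/systemctl restart rsyslog.service

# CentOS 7 only runs rc.local if it is executable
chmod +x /etc/rc.d/rc.local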
Having tried 1 & 2, the CPU usage for rsyslogd is still high, although it seems to fluctuate wildly between 40% and 99%. I've been reading up on this CentOS bug and have seen the references to time warps, and possibly this one:
Thank you for your input, Diabolico and Eric; I think the issue may be resolved.
Seeing that /var/log had grown large, I archived off 2.5GB of files from /var/log/journal/f52647523d934ab29d19829bb209f045/ to my local machine.
(To make sure that rsyslogd was not caught in a loop, I ran journalctl --no-pager; it processed the journal files and then stopped.)
Kernel logging is back on and rsyslogd is no longer consuming any CPU, so it seems it was just the processing of those accumulated journal files that was using up so much CPU time.
Assuming I haven't created any problems by doing this, I hope that's it - thanks for your help.
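For anyone else in the same situation, the journal can also be inspected and trimmed in place rather than archiving files by hand (a sketch; --vacuum-size requires the systemd version shipped with recent CentOS 7 releases, and 500M is only an example figure, not a recommendation from this thread):

journalctl --disk-usage         # show how much space the journal is using
journalctl --vacuum-size=500M   # delete archived journal files until usage drops to roughly 500 MB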
By any chance, do you know how old all those log files consuming 2.5GB were? Logrotate should prevent log files from piling up to that amount, or else something is wrong with the system and is flooding the logs. We have several servers with big and busy Magento stores (I'm talking tens of thousands of articles), but the log files never come even close to 1GB. I have seen many times powerful servers literally dying because of the size of /var/log/, and usually it was a problem with logrotate or something flooding the logs.
Unless that amount of logs is many months, if not more than a year, old, I think this will come back again and the actual problem is still there. Can you examine the log files and see what is taking up so much space, and why?
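A quick way to see where the space under /var/log is actually going (a sketch):

du -sh /var/log/* | sort -h     # per-file/per-directory totals, largest last
du -sh /var/log/journal         # total size of the systemd journal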
The journal log files dated back to June 2015 when I first installed Virtualmin on this new server.
The 2.5GB was the total of approximately 20 files that I archived; there were about 30 in total. They mainly consisted of files of standardised sizes (134,217,728 and 8,388,608 bytes).
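Those sizes are exactly 128 MiB and 8 MiB, which looks like journald's own file rotation at work. If the journal keeps growing, a persistent cap can be set in /etc/systemd/journald.conf (a sketch; the value is illustrative):

[Journal]
SystemMaxUse=500M     # cap the total disk space used by the persistent journal

followed by systemctl restart systemd-journald to apply it.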
Looking back through the messages in System Logs > /var/log/messages, the most common entries are
Sep 20 13:40:02 server1 systemd: Created slice user-0.slice.
Sep 20 13:40:03 server1 systemd: Starting Session c4165 of user root.
Sep 20 13:40:03 server1 systemd: Started Session c4165 of user root.
Sep 20 13:40:03 server1 systemd: Starting Session c4166 of user root.
Sep 20 13:40:03 server1 systemd: Started Session c4166 of user root.
Sep 20 13:40:06 server1 su: (to postgres) root on none
and
Oct 26 01:08:27 server1 named[904]: zone my.subdomain.com/IN: 'my.subdomain.com' found SPF/TXT record but no SPF/SPF record found, add matching type SPF record
Oct 26 01:08:27 server1 named[904]: zone my.subdomain.com/IN: loaded serial 1442154377
[I've renamed my subdomain (alias) address to my.subdomain.com. This is an example of one subdomain (alias); I have a lot of subdomains.]
I don't want to say that 4 months is not enough to get 2.5GB of log files, but it looks like a little too much.
Sep 20 13:40:03 server1 systemd: Starting Session c4165 of user root.
You should check what is causing this. It is probably some cron job, but it would not hurt to see what exactly.
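A quick way to check which scheduled jobs run as root (a sketch):

crontab -l -u root                                 # root's personal crontab
cat /etc/crontab                                   # system-wide crontab
ls /etc/cron.d /etc/cron.hourly /etc/cron.daily    # drop-in cron jobs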
Oct 25 18:30:13 server1 saslauthd[260]: do_auth : auth failure: [user=tomcat] [service=smtp] [realm=] [mech=pam] [reason=PAM auth error]
Classic brute-force attack. Do you have any protection installed, like fail2ban?
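If fail2ban is in place, SMTP auth failures like this can be covered by the postfix-sasl filter that ships with most fail2ban packages (a sketch for jail.local; the log path assumes a CentOS-style /var/log/maillog and the numbers are only examples):

[postfix-sasl]
enabled  = true
port     = smtp,submission
logpath  = /var/log/maillog
maxretry = 5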
Oct 26 01:08:27 server1 named[904]: zone my.subdomain.com/IN: 'my.subdomain.com' found SPF/TXT record but no SPF/SPF record found, add matching type SPF record
Check if you have both lines in your zone file:
domain.com.  IN  TXT  "v=spf1 a mx a:domain.com mx:domain.com ip4:xxx.xxx.xxx.xxx -all"
domain.com.  IN  SPF  "v=spf1 a mx a:domain.com mx:domain.com ip4:xxx.xxx.xxx.xxx -all"
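A quick way to confirm what the zone is actually serving (a sketch; replace my.subdomain.com with the real name):

dig +short TXT my.subdomain.com @localhost
dig +short SPF my.subdomain.com @localhost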
If your websites don't have thousands of visits per day, that still doesn't explain why the log files got so big, but it is possible. Maybe bots were hammering your server, which could produce a large volume of log entries, but not if you have something like fail2ban. If you ask me, I would keep an eye on the log files and see how they grow over the next few days.
Oh, one more thing: if you are the only one using the server, then aside from fail2ban (in case you don't have it), I would change the ports for SSH, FTP, Webmin and Usermin (first, don't forget to open those ports in iptables).
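A sketch of what that might look like for SSH on CentOS 7 (2222 is only an example port; the semanage line assumes SELinux is enforcing, and "service iptables save" assumes the iptables-services package rather than firewalld):

iptables -I INPUT -p tcp --dport 2222 -j ACCEPT    # open the new port first
service iptables save                              # persist the rule (iptables-services)
semanage port -a -t ssh_port_t -p tcp 2222         # let SELinux accept sshd on 2222
# then set "Port 2222" in /etc/ssh/sshd_config and restart
systemctl restart sshd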
Sep 20 13:40:03 server1 systemd: Starting Session c4165 of user root.
You should check what is causing this. It is probably some cron job, but it would not hurt to see what exactly.
I’ll check this.
Oct 25 18:30:13 server1 saslauthd[260]: do_auth : auth failure: [user=tomcat] [service=smtp] [realm=] [mech=pam] [reason=PAM auth error]
Classic brute-force attack. Do you have any protection installed, like fail2ban?
Yes, I do have fail2ban installed.
Oct 26 01:08:27 server1 named[904]: zone my.subdomain.com/IN: 'my.subdomain.com' found SPF/TXT record but no SPF/SPF record found, add matching type SPF record
Check if you have both lines in your zone file:
I'll check this as well.
If your websites don't have thousands of visits per day, that still doesn't explain why the log files got so big, but it is possible. Maybe bots were hammering your server, which could produce a large volume of log entries, but not if you have something like fail2ban. If you ask me, I would keep an eye on the log files and see how they grow over the next few days.
I get about 220 visits / 1,900 pages per day on my main site, so I guess that would explain some of it, although that site has only been live on this server for two months. Some of the big log files might be from when I started learning/setting things up and installing mods, before fail2ban for example. The file dates for the journal files have changed since I downloaded them, so it's hard to tell now.
Oh, one more thing: if you are the only one using the server, then aside from fail2ban (in case you don't have it), I would change the ports for SSH, FTP, Webmin and Usermin (first, don't forget to open those ports in iptables).
Good point; I think it had crossed my mind when starting off, but I put it on the back-burner. I will do this ASAP.
Honestly, 220 visits is nothing from the server's point of view, but I can give you a few more ideas:
- you could use fail2ban (f2b) to stop some of the bad bots
- use iptables to limit how many connections are allowed per IP (useful if you see strange IPs bursting through your entire website in a few seconds)
- with .htaccess you can block the majority of well-known bad bots
- use robots.txt to instruct bots how to behave; for example, "Crawl-delay: 20" tells them to slow down to one page request every 20 seconds (you don't really need bots going over the entire website in 1-2 seconds)
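A couple of those as concrete sketches (the numbers are illustrative, and not every crawler honours Crawl-delay):

# robots.txt - ask well-behaved bots to slow down
User-agent: *
Crawl-delay: 20

# iptables - cap concurrent connections to port 80 per source IP
iptables -I INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 20 -j REJECT --reject-with tcp-reset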
And like I said, keep an eye on the log files; wait a few days, and if you see the problem coming back, come back here with any new info you have.