zone - loading from master - failed: end of file

Hello,

I get this error on new server creation:

zone subdomain.domain.com/IN: loading from master file /var/lib/bind/subdomain.domain.com.hosts failed: end of file

The issue is that a new line is added to the zone file…
If I delete the new line at the end of the file and restart BIND, everything is okay.

What can the cause of this be?
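Before deleting anything, it can help to look at what actually sits at the end of the zone file. The following is only a rough sketch: the helper names `count_trailing_blank_lines` and `show_tail_bytes` are made up for illustration, while `named-checkzone` is BIND's own zone checker and the zone name and path are the ones from the error message.

```shell
# Count blank (empty or whitespace-only) lines at the very end of a file.
count_trailing_blank_lines() {
    awk '{ if ($0 ~ /^[[:space:]]*$/) n++; else n = 0 } END { print n + 0 }' "$1"
}

# Show the last bytes of a file (default 64) with control characters made
# visible, so a stray newline, CR or truncated record is easy to spot.
show_tail_bytes() {
    tail -c "${2:-64}" "$1" | od -c
}

# Usage (zone name and path taken from the error above):
#   count_trailing_blank_lines /var/lib/bind/subdomain.domain.com.hosts
#   show_tail_bytes /var/lib/bind/subdomain.domain.com.hosts
#   named-checkzone subdomain.domain.com /var/lib/bind/subdomain.domain.com.hosts
```

If `named-checkzone` reports the same "end of file" error, the problem is in the file itself rather than in how BIND was restarted.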

Did you modify the Server Template for zone records?

Only changed some of the settings.

BIND DNS records for new domains is set to No additional records.

Is it just one domain? i.e. are other zones working fine? Was it imported or migrated? I’m pretty much out of ideas, as I’ve never seen this happen.

This was a new virtual server. No errors on other zones.

Tried to create another one with options:

  • [Setup DNS zone?]

No other options.

  • Additional feature options
    • Slave zone master DNS servers [IP to master]

No errors on Webmin → Servers → BIND DNS Server → Check BIND Config.

The server that got the error was set up with DNS, Nginx, Postgres and MySQL.

So, seems I’m not able to reproduce the problem. I’m a bit puzzled… :thinking:

Oh…I bet I know what’s happening. How much memory do you have?

Check the kernel log for out of memory errors. I bet you have one (or more). I’d almost be willing to bet this is the OOM killer biting you.
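A minimal sketch of that check, assuming a systemd-based system where `journalctl -k` or `dmesg` shows kernel messages. The function name `count_oom_events` is made up for illustration, and the pattern covers the usual kernel wording ("invoked oom-killer", "Out of memory: Kill process" / "Killed process").

```shell
# Count kernel log lines that look like OOM-killer activity.
# Reads the files given as arguments, or stdin when there are none.
# Note: errors such as a missing file are swallowed by "|| true".
count_oom_events() {
    grep -Eic 'invoked oom-killer|out of memory: kill' "$@" || true
}

# Usage:
#   journalctl -k | count_oom_events
#   dmesg | count_oom_events
```

Anything other than 0 here means the kernel really did kill something for lack of memory.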

I have seen a few of those, so yes that could be what’s happening.

There’s 4GiB Real memory and 8GiB virtual.

Is it okay to use “OOMScoreAdjust” in the systemd service for webmin?

Or are there any other options besides increasing the memory/cost? :grimacing:

If you’ve seen OOM killer events, that’s definitely what’s happening, and it’s disastrous. You can’t run a production server that randomly kills processes.

4GiB of real memory should be plenty. You must have something leaking memory, or it’s a very heavily loaded server with tons of services.

Well, the OOM killer is overly aggressive; that’s all I know.

Memory is around 40%, virtual 17% at the moment, so I cannot say it’s heavily loaded.

Is this an OpenVZ or Virtuozzo instance?

If it is not (e.g. it’s physical hardware or VM running under KVM or Xen), it isn’t possible for it to be overly aggressive. It only kills processes when it’s literally out of memory. At that point it has no choice. Something literally must be killed.

OOMScoreAdjust should never come into play. You need to fix it so processes aren’t being randomly killed. But, if you can’t do that, I guess you can make your least important processes first on the chopping block.
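If it does come to that, a systemd drop-in is one way to set it. This is a sketch only: `write_oom_dropin` and the file name `oom.conf` are made up for illustration, and `packagekit.service` is just an example of a unit someone might consider unimportant. `OOMScoreAdjust=` itself is a standard systemd service setting that takes values from -1000 to 1000.

```shell
# Write a systemd drop-in that sets OOMScoreAdjust for one unit.
# Higher scores (up to 1000) make a unit a more likely OOM-killer target;
# lower scores (down to -1000) protect it.
#   $1 = unit name, $2 = score, $3 = config root (default /etc/systemd/system)
write_oom_dropin() {
    dir="${3:-/etc/systemd/system}/$1.d"
    mkdir -p "$dir" &&
    printf '[Service]\nOOMScoreAdjust=%s\n' "$2" > "$dir/oom.conf"
}

# Usage (as root; the unit is only an example):
#   write_oom_dropin packagekit.service 500
#   systemctl daemon-reload && systemctl restart packagekit.service
```

This only changes which process gets picked first. It does nothing about the system running out of memory in the first place.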

But, figure out what’s taking up your memory and fix it. Running out of memory is a disaster to be avoided at any cost. You must reduce your usage or increase available memory (or choose a host that doesn’t use OpenVZ or Virtuozzo and oversell memory, which causes random memory errors no matter how much memory it claims you have).
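A quick way to see where the memory is going is to total resident memory per command name. A rough sketch, assuming a `ps` that accepts `-eo rss=,comm=` (procps on Linux does); `rss_by_command` is a made-up helper name, and command names containing spaces would be cut at the first space.

```shell
# Sum resident set size (KiB) per command name, largest first.
# Expects "RSS COMMAND" pairs on stdin, one process per line.
rss_by_command() {
    awk '{ rss[$2] += $1 } END { for (c in rss) printf "%d %s\n", rss[c], c }' |
        sort -rn
}

# Usage:
#   ps -eo rss=,comm= | rss_by_command | head
```

RSS counts shared pages once per process, so the totals overstate real usage for services that fork many workers, but the ranking is usually enough to spot the heavy ones.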

It’s a DigitalOcean droplet.

It has been a common problem for a long time now that I have to manually restart webmin, usermin, csf and lfd on several (at least 4) droplets.

They should have enough memory and swap, so there’s something going on here that I have to figure out, I guess.

OK, they use KVM or Xen, so it’s not oversold. So, yes, you need to reduce memory usage. You cannot run production services on a system that’s having OOM killer events. Nothing I or Webmin can do about that.

The only right number of OOM killer events is zero.

Here’s what’s been killed from syslog just today:

grep -i kill /var/log/syslog
Mar  9 00:00:02 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 01:36:30 vps1 systemd[1]: packagekit.service: Main process exited, code=killed, status=15/TERM
Mar  9 03:46:08 vps1 systemd[1]: packagekit.service: Main process exited, code=killed, status=15/TERM
Mar  9 05:00:45 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 06:51:50 vps1 spamd[15781]: spamd: child [16087] killed successfully: interrupted, signal 2 (0002)
Mar  9 06:51:50 vps1 spamd[15781]: spamd: child [16088] killed successfully: interrupted, signal 2 (0002)
Mar  9 07:00:10 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 07:37:42 vps1 systemd[1]: packagekit.service: Main process exited, code=killed, status=15/TERM
Mar  9 11:32:22 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 13:40:50 vps1 systemd[1]: packagekit.service: Main process exited, code=killed, status=15/TERM
Mar  9 14:52:16 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 15:27:19 vps1 systemd[1]: lfd.service: Main process exited, code=killed, status=9/KILL
Mar  9 19:35:54 vps1 usermin[32541]: /etc/usermin/stop: 4: kill: No such process
Mar  9 19:41:20 vps1 systemd[1]: packagekit.service: Main process exited, code=killed, status=15/TERM

Yesterday is even worse…

Edit: an OOM kill should show up as “invoked oom-killer” though, and it doesn’t.

That’s not the OOM killer. OOM killer is a kernel event, and will appear in the kernel log.
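The `code=killed, status=9/KILL` lines in syslog come from systemd and only say which signal ended a unit's main process, not who sent it. To get an overview of which units are affected, something like the following sketch can summarize them; `kills_by_unit` is a made-up name, and the pattern assumes the systemd message format shown in the grep output above.

```shell
# Summarize systemd "Main process exited, code=killed" lines as
# "<count> <unit> <signal>", reading syslog lines from stdin.
kills_by_unit() {
    sed -n 's/.* \([^ ]*\.service\): Main process exited, code=killed, status=\([^ ]*\).*/\1 \2/p' |
        sort | uniq -c
}

# Usage:
#   kills_by_unit < /var/log/syslog
```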

There’s nothing in the kernel logs about the OOM killer.

OK, then it’s some other problem. I don’t have good guesses about what. It seems like it has to have been Webmin being killed while it was writing the file (or it ran out of space), and if it’s not the OOM killer, I don’t know what else it’d be.
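The out-of-space possibility is cheap to rule out. A small sketch, assuming GNU `df` (as on Debian or Ubuntu); `fs_usage` is a made-up helper name, and `/var/lib/bind` is the zone directory from the original error. It checks inodes as well as blocks, since a filesystem can run out of either.

```shell
# Print the use% figures for blocks and inodes on the filesystem
# that holds the given path.
fs_usage() {
    blocks=$(df -P "$1" | awk 'NR == 2 { print $5 }')
    inodes=$(df -Pi "$1" | awk 'NR == 2 { print $5 }')
    printf 'blocks=%s inodes=%s\n' "$blocks" "$inodes"
}

# Usage:
#   fs_usage /var/lib/bind
```

This only shows the current state, so a disk that filled up briefly and was cleaned up afterwards would not show here.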
