Skip loading broken zone with BIND?

SYSTEM INFORMATION
OS type and version Red Hat Enterprise Linux 9.4
Webmin version 2.111
Virtualmin version 7.20.2
Webserver version 2.4.57
Related packages BIND 9.16

The other day I had an issue with BIND refusing to start because of a broken zone. Out of the blue, the zone’s records had simply disappeared. I recreated the zone and BIND started. A few days later the virtual server turned numeric, but that’s a different problem.

The question is: is there a way to make BIND skip loading a broken zone, so that it can at least start without it? I did some Googling but could not find any useful information.

Not that I’m aware of. But I think finding out why it disappeared is worth spending time on.

Did any OOM killer messages appear in the kernel log? I don’t know of any bugs that would lead to empty zone files, but a process being killed in the middle of writing might explain it.
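For anyone wanting to check this quickly, here is a minimal sketch in Python that filters kernel log lines for the phrases the OOM killer normally logs. The function name `find_oom_lines` is made up for illustration, and the two patterns are the common kernel messages rather than an exhaustive list.

```python
import re

# Phrases the Linux kernel commonly logs when the OOM killer acts.
# Older kernels log "Kill process", newer ones "Killed process".
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(ed)? process",
    re.IGNORECASE,
)


def find_oom_lines(log_lines):
    """Return the log lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in log_lines if OOM_PATTERN.search(line)]
```

The input can be any iterable of lines, for example the output of `journalctl -k --no-pager` saved to a file and opened with `open()`. An empty result only means these phrases were not found in the log you supplied, not that memory pressure can be ruled out.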

Let me give you a little bit of backstory on this machine.

The server was originally running ext4 in MBR boot mode, with / at 2 TB. The potentially destructive part came when the server owner asked for a disk upgrade from 2 TB to 4 TB. Knowing that / can only go up to 2 TB on an MBR-partitioned disk, I decided to convert it to GPT. However, I can’t remember the exact steps I took, or why I decided to convert the filesystem from ext4 to XFS, which I did with fstransform. Knowing myself, I probably ran xfs_repair afterwards (the XFS equivalent of fsck) to make sure the filesystem was sound.

Fast forward two months: three virtual server IDs had turned numeric. Strangely, all of them were expired domains.

I somehow fixed those, but meanwhile noticed that, for example, the “speedometers” showing disk, RAM, etc. usage were missing from the dashboard. I did some reading which hinted at a corrupted config file, but I let the issue slide at the time.

Some months later (in June, maybe), I noticed that the speedometers had returned, so I thought the issue had fixed itself.

Fast forward to July: the server suffered a power outage shortly before this issue occurred. However, the zone in question belongs to an expired domain, and the chance that it was being edited when the server went down is close to zero.

A few days later, after another power outage, the virtual server corresponding to this same domain turned numeric again, out of the blue. This also stopped BIND from reloading. I made a backup of the website files, then deleted the virtual server’s ID from /etc/webmin/virtual-server/domains and re-imported it.

Then I noticed the speedometers were gone again. I remembered what I had read about the Virtualmin configuration file and checked it. It was corrupted too: I compared /etc/webmin/virtual-server/config to the one on another box and it was smaller. So I recreated it by going through Virtualmin > Virtualmin Configuration, then redid the virtual server templates as well, comparing against the working box. The speedometers reappeared and everything appears to be working properly now.
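Comparing file sizes only shows that something is missing, not what. A rough sketch of a more precise comparison is below, assuming the config is a plain list of `key=value` lines. The helper names `parse_config` and `missing_keys` are made up for this example, and skipping `#` lines is a precaution rather than something the file is known to need.

```python
def parse_config(text):
    """Parse key=value lines into a dict.

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, value = line.split("=", 1)
        settings[key.strip()] = value
    return settings


def missing_keys(suspect_text, reference_text):
    """Return the keys present in the reference config but absent from the suspect one."""
    suspect = parse_config(suspect_text)
    reference = parse_config(reference_text)
    return sorted(key for key in reference if key not in suspect)
```

Feeding it the contents of the suspect file and of the copy from the working box gives a list of settings that were lost. Keys that legitimately differ between the two servers will also show up, so the result needs reading with that in mind.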

I have also configured weekly remote FTP backups of the Virtualmin configuration and virtual servers, but the server owner is a bit stubborn and just wants the issue of BIND not restarting after a power outage solved.

For now it’s operational, but the question is whether the same thing will happen again on the next outage.
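One thing that might help catch it early is a quick sanity check of the zone files after each reboot, before relying on BIND to come up. Below is a minimal sketch in Python that flags zone files that are missing, empty, or contain no SOA record. The function names and the zone-to-path mapping are made up for illustration, and this is much cruder than BIND’s own `named-checkzone`, which does a full syntax check.

```python
import os
import re

# Every text-format master zone file should contain an SOA record.
SOA_PATTERN = re.compile(r"\bSOA\b")


def check_zone_file(path):
    """Return 'ok', 'missing', 'empty', or 'no SOA record' for one zone file."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    # errors="replace" keeps the check from crashing on non-text content.
    with open(path, "r", errors="replace") as handle:
        if not SOA_PATTERN.search(handle.read()):
            return "no SOA record"
    return "ok"


def broken_zones(zone_files):
    """Given {zone name: file path}, return {zone name: problem} for the bad ones."""
    problems = {}
    for zone, path in zone_files.items():
        status = check_zone_file(path)
        if status != "ok":
            problems[zone] = status
    return problems
```

It only reports problems and does not change anything, so what to do with a flagged zone (restore it from backup, recreate it, or comment it out of the BIND configuration) is still a manual decision.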

I hope my story makes sense. I suspect the GPT and XFS conversions may have been destructive, but I can’t tell at this stage. Maybe a good idea would be to just back up all virtual servers and reinstall.