Cannot write to directory /etc/webmin/virtual-server/ Still an issue

SYSTEM INFORMATION
OS type and version Rocky Linux 8.6
Webmin version Webmin 1.994
Virtualmin version Latest Pro Version - The system will not tell me from the GUI

This system has been in service for a year or more. It was converted using the CentOS 8 to Rocky 8 script provided by Rocky. That was done over 6 months ago. I’ve done this on more than just this system without issues.

Our entire network went through an IP address change which ended a couple of months ago. Obviously I was working on addressing all over the system.

It was rebooted about 17 days ago. I did not look, but I believe that is when the problem started.

When I open Virtualmin, it is on the post install wizard. If I go through those steps it ends with:
Cannot write to directory /etc/webmin/virtual-server/

If I try running the Recheck Configuration I get:

The feature Administration user cannot be disabled, as it is used by the following virtual servers:

followed by the list of vservers and ends with system is not ready.

I was able to put a check in the box on features, but I’m not able to check off all that should be available without hitting this same error.

Ahhh… and more playing in here… it fails when I check off Apache. It allowed me to add Maria DB and a few other things, but as soon as I check off Apache, I get the error. I’ve run through my Apache configs and cannot find any error. It does restart, stop, start and run just fine.

I’m now running up on LetsEncrypt expirations as that has disappeared. I can’t make simple changes to any vhost, such as administrator address to stop the warnings to clients.

Obviously, I need to get this fixed. I wonder if I can run backup and restore and if I do, will it get all the right info? So far, all seems to actually be functioning to the public, other than LetsEncrypt. On my side I can do very little.

If you want a login Jamie, just let me know.

Thanks!

I just found that one of my servers had the same problem. For me, this was fixed by restoring a file; see here what I did:

Thank you Steven. FYI, the config file on the broken system is very short, about 50 lines. On another Rocky 8 system it is almost 500 lines. I’ll grab that and edit it (IP addresses, names and such) and do as you suggested. I’ll be doing this in the wee hours of the morning and will report back tomorrow.

Gee I hope this works. Backup doesn’t work in Virtualmin… I was looking at creating all the accounts on another system and rsyncing them over. Not so bad, but it is much better to know a cure should it happen on one of my more loaded up systems which would take a very long time to move.

It is not clear what causes this kind of config file corruption … @Jamie also mentioned that it must happen … @dumorian are you sure that your hard drives aren’t failing?

I have a round robin backup… the typical 11 backups. Looking back one month… which should be 3 weeks about now, I have a backup of the config file which seems to be complete. I think I’ll back up the running config and restore the newest version (this is Feb 9th for certain). I’ll look to find where it broke.

On July 18, which would have contained most of July 17th activity, the config file dropped from 8385 to 423. I restored the latest of the large files, restarted Webmin and I’m back in business! I had no idea where to begin. Pointing me to the config file was the key!
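For anyone else hunting through a backup rotation for the point where the file shrank, here is a rough Python sketch of that search. It assumes you have already extracted one copy of the config file per backup into ordinary files; the function name `find_size_drops` and the 50% threshold are my own choices for illustration, not anything Webmin or Virtualmin provides.

```python
from pathlib import Path


def find_size_drops(paths, ratio=0.5):
    """Return (previous, current) pairs where the file size fell sharply.

    Copies are ordered by modification time, oldest first. A drop is
    reported when a copy is smaller than `ratio` times the copy before it.
    """
    ordered = sorted((Path(p) for p in paths), key=lambda p: p.stat().st_mtime)
    drops = []
    for prev, cur in zip(ordered, ordered[1:]):
        prev_size = prev.stat().st_size
        cur_size = cur.stat().st_size
        if prev_size > 0 and cur_size < prev_size * ratio:
            drops.append((prev, cur))
    return drops
```

Calling it with the list of extracted copies returns the pair of backups on either side of the truncation, which narrows down the day to look at in the Webmin Actions Log.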

Solved with backup file.

Can you post the contents of the smaller config file? Or at least a diff between the shorter config and the backup?

@dumorian Also, do you have any overlapping background services running that could potentially try to obtain a lock and write to a config file?

@Jamie, what happens, if two separate background processes are running pretty much at the same time and one process obtains a lock on a config file trying to write it, while the other process restarts Webmin?

It would be interesting to see if the config file left on the problematic system is just a beginning part of an actual config file …
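If posting the raw files is a concern, a comparison that only reports key names might be enough to answer this. Below is a minimal Python sketch, assuming the config is plain `key=value` lines as in a Webmin module config; `parse_config` and `compare_configs` are hypothetical helper names, and the prefix check is a plain byte-for-byte comparison of the text.

```python
def parse_config(text):
    """Parse key=value lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key] = value
    return settings


def compare_configs(short_text, full_text):
    """Summarise how a short config differs from a full one, by key name only."""
    short = parse_config(short_text)
    full = parse_config(full_text)
    return {
        # True only if the short file is literally the start of the full file
        "is_prefix": full_text.startswith(short_text),
        "missing_keys": sorted(k for k in full if k not in short),
        "extra_keys": sorted(k for k in short if k not in full),
        "changed_keys": sorted(
            k for k in short if k in full and short[k] != full[k]
        ),
    }
```

Reading the broken file and the restored backup into two strings and passing them to `compare_configs` gives lists of key names that can be posted without exposing IP addresses, hostnames or passwords.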

I backed up my config file before I overwrote it with an updated copy from its partner slave-dns host. While I do not have a backed up version of its original content, I can provide the file that caused web/virtualmin to break. Also, in my case it was something along the lines of 700 bytes vs the 7 KB or more it is on almost all of my other servers.

Given that the file contains IP addresses and hostnames i prefer to not openly post the files. ( the one that was broken vs the copy of one of the other servers i modified the ip’s and hostnames contained and then used to replace the broken file with.)

Looking somewhat deeper into the logs i have determined it seems to be caused by modification of the “Virtualmin Server Templates”, likely triggered by a missing single or double-quote in doing so.

Given that it’s one of my public dns servers, testing is not really an option, though i can image the host and run a test copy to try and figure out the history of this host.

Steven

Please post the config and truncate/replace any passwords, IPs and hostnames, as this information is irrelevant to the problem.

Interesting! Can we see the log? Do you remember which template section it was? What exactly was entered that could trigger this problem?

The way Webmin / Virtualmin writes config files, even a restart shouldn’t cause a file to be truncated. All writes are done to a temp file first which is then renamed over the original file in an atomic operation. Nothing should ever restart Webmin entirely, and a HUP signal sent to reload the config should only restart the main process and leave in-process operations running.
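For readers unfamiliar with the pattern, writing to a temp file and renaming it over the original can be sketched as follows. This is not Webmin's actual code (Webmin is written in Perl); it is only a minimal Python illustration of the general technique, and `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, text):
    """Replace the file at `path` with `text` via a temp file and rename.

    The temp file is created in the same directory so the final rename
    stays on one filesystem, which is what makes it atomic on POSIX.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        # Clean up the temp file if anything failed before the rename
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern an interrupted write leaves the original file untouched, so a short file with a clean first and last line would more likely come from complete but wrong content being written than from a write that was cut off. Note that `mkstemp` creates the temp file with owner-only permissions, so a real implementation would also need to carry over the original file's mode and ownership.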

Contents of broken config file:


virt=1
initsub_template=1
last_letsencrypt_mass_renewal=1658882348
init_template=0
virt6=1
licence_script=
name_max=20
iface=eno4
index_fcols=
display_max=
index_cols=dom,user,owner,users,aliases
name_mode=0
domains_sort=sub
show_lastlogin=0
show_mailsize=0
show_mailuser=1
ip6enabled=1
dns_ip=
defip6=fe80::
iface6=
netmask6=
defip=
allow_symlinks=0
old_defip6=
old_defip=
external_ip_cache=
preload_mode=2
mysql=1
postgres=0
mysql_size=huge
status=1
dns=0
group_quotas=
web=0
mail_quotas=
virus=0
webalizer=0
logrotate=3
unix=3
plugins=virtualmin-signup
ftp=0
spam=0
ssl=0
disable=
last_check=1658679936
mail=1
webmin=1
dir=3
plugins_inactive=virtualmin-awstats virtualmin-nginx virtualmin-nginx-ssl virtualmin-vsftpd virtualmin-notes virtualmin-google-analytics virtualmin-init virtualmin-dav virtualmin-registrar virtualmin-git virtualmin-mailrelay virtualmin-signup virtualmin-mailman virtualmin-messageoftheday virtualmin-oracle virtualmin-powerdns virtualmin-htpasswd virtualmin-sqlite virtualmin-disable virtualmin-slavedns virtualmin-styles-oswd virtualmin-styles-openwebdesign virtualmin-svn virtualmin-support virtualmin-iframe
home_quotas=
----end----------------------

I did notice later that the dashboard is saying “Virtualmin’s configuration has not been checked since it was last updated. Click the button below to verify it now.” I have not done this yet. I’ll be sure to back up the running config before doing this.

When moving to the new network, I had to run both IPv4 addresses side by side in Apache. Those entries have been gone for months. Likely as many as 6 months ago.

the content of the broken config file, i just renamed this as a means of “backing it up”:

virt6=1
status=0
last_letsencrypt_mass_renewal=1658849463
initsub_template=1
init_template=0
virt=1
allow_symlinks=0
iface=eth0
old_defip=XX.XX.XX.XX
old_defip6=
external_ip_cache=XX.XX.XX.XX
postgres=0
mysql=0

the logs according to webmin:

Logged actions between 07/01/2022 and 07/27/2022 …

| Action | Module | User | Client Address | Date |
| --- | --- | --- | --- | --- |
| Changed the IP address of 1 virtual servers | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Database servers | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Memory use | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Database servers | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Memory use | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Changed routing and gateways options | Network Configuration | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Database servers | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Install Wizard step : Memory use | Virtualmin Virtual Servers | root | xxx.xxx.xxx.xxx | 2022/07/26 |
| Refreshed available packages | Software Package Updates | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Applied changes | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Applied changes to host50.dom.tld | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Changed zone options for host50.dom.tld | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Applied changes to host50.dom.tld | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Started DNS server | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Stopped DNS server | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Started DNS server | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/24 |
| Started DNS server | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/23 |
| Stopped DNS server | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/23 |
| Deleted 2 zones | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/23 |
| Installed 1 updated packages | Software Package Updates | root | xxx.xxx.xxx.xxx | 2022/07/23 |
| Refreshed available packages | Software Package Updates | root | xxx.xxx.xxx.xxx | 2022/07/18 |
| Logged into Webmin | None | root | xxx.xxx.xxx.xxx | 2022/07/18 |
| Applied changes to host10.dom.tld | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/04 |
| Installed 10 updated packages | Software Package Updates | root | xxx.xxx.xxx.xxx | 2022/07/04 |
| Deleted 1 zones | BIND DNS Server | root | xxx.xxx.xxx.xxx | 2022/07/04 |
| Logged into Webmin | None | root | xxx.xxx.xxx.xxx | 2022/07/04 |

If i search in the logs for changes that affect the given config file, it only shows up for the ip change i have had to do after replacing the file with a non-corrupted one from a remote server ( as the ip in the file was from my ns2 server and virtualmin picked up on that as a change of ip )

Before that I had only installed updates and removed 2 domains, as for some reason the cluster-slave-dns module refused to set the master / slaves that were allowed to update / transfer.

All 3 servers ( ns1/ns2/ns3 ) are kept up to date and in sync as to their config, i apply the same changes on all three of them in a row, so any change that would be breaking based on a bug would likely happen on all three of them. yet, only ns3 was affected.

I have run disk checks on both the physical host, and the vm itself, nothing there. the disks are fine, no smart errors, no power outages, no hard resets, nothing.

The change for the gateway IP was done on all three ns servers, as that was a change request by the ISP. Seeing that the other user with this issue has been changing IPs, that might be a likely culprit to look into…

although that would make no sense at the same time as the other two ns servers have had the exact same change happen at the same time frame some minutes from one another.

So, home_quotas= is the last line in that broken Virtualmin config file?

The last and first lines in mine are different. I have posted the info, though my post got spam flagged by akismet.

@Jamie, can you think of anything that could cause an incomplete config file to be returned in the first place, and later written back as incomplete as well? Perhaps cache mis-invalidation?

Yes, home_quotas=

RedHat’s default filesystem for RHEL 8 is xfs for /home. I’m still weeding through fixing the commands to display quotas and such. Oddly, under webmin quotas, I can see the quotas as they were before using backup and restore to move these websites. Virtualmin user settings did not by default have the ability to see nor set a user quota.

I let the Rocky minimal install use the defaults for setting up the drives. Webmin/Virtualmin doesn’t seem to detect this. But, that was going to be a different topic. :)

The system I’m working on does not do bind and doesn’t contain email accounts. Basically just a webserver. I put these domains’ email on an email server.

I really can’t see how this could happen other than filesystem corruption. It might be worth using the Webmin Actions Log module to find which operation corrupted that file…

Below are the last few actions up until the config shrank in size:

Changed module configuration Virtualmin Virtual Servers user ###.##.###.## 2022/07/07 9:49:37 AM
Logged into Webmin None user ###.###.##.### 2022/07/06 4:55:33 PM
Installed 4 updated packages Software Package Updates user ###.##.###.## 2022/06/23 9:51:27 AM
Installed 11 updated packages Software Package Updates user ###.##.###.## 2022/06/21 9:59:24 AM

Here is the detail from the “Changed module configuration” entry:

[Changed file /etc/webmin/virtual-server/config]
16a17,22 
> ip6enabled=1 
> dns_ip= 
> defip6= 
> iface6= 
> netmask6= 
> defip=###.##.##.##

I will have to disagree on this specific point, for the following reasons:

  • The file is less than a single sector in size
  • I have checked both the host and guest filesystems
  • The file has a perfect start and end line, which is highly unlikely to happen in corruption cases

What i can see however is the following:

I have done the exact same, just a few days earlier. Now I wonder, @dumorian: could you check the yum/dnf history and see if you have also updated some of these packages:

Loaded plugins: fastestmirror
Transaction ID : 78
Begin time     : Wed Jun  8 01:06:44 2022
Begin rpmdb    : 675:4b5e255c9a5531d308daf821d2eb7e1b8d5a4ef5
End time       :            01:11:20 2022 (276 seconds)
End rpmdb      : 675:2b1e6cadadf939e476148695f18be3d60bbbccb7
User           : System <unset>
Return-Code    : Success
Command Line   : -y install clamav.x86_64 clamav-filesystem.noarch clamav-lib.x86_64 clamav-update.x86_64 clamd.x86_64 glibc.x86_64 glibc-common.x86_64 glibc-devel.x86_64 glibc-headers.x86_64 grub2.x86_64 grub2-common.noarch grub2-pc.x86_64 grub2-pc-modules.noarch grub2-tools.x86_64 grub2-tools-extra.x86_64 grub2-tools-minimal.x86_64 gzip.x86_64 kernel.x86_64 kernel-headers.x86_64 kernel-tools.x86_64 kernel-tools-libs.x86_64 python-perf.x86_64 rsyslog.x86_64 usermin.noarch wbm-virtual-server.noarch wbm-virtualmin-awstats.noarch wbm-virtualmin-htpasswd.noarch wbm-virtualmin-nginx-ssl.noarch webmin.noarch zlib.x86_64 zlib-devel.x86_64
Packages Altered:
../ snip /..
    Updated usermin-1.834-1.noarch                                  @virtualmin-universal
    Update          1.840-1.noarch                                  @virtualmin-universal
    Updated wbm-virtual-server-3:6.17.gpl-3.noarch                  @virtualmin-universal
    Update                     3:7.1.gpl-1.noarch                   @virtualmin-universal
    Updated wbm-virtualmin-awstats-2:5.11-1.noarch                  @virtualmin-universal
    Update                         2:6.0-1.noarch                   @virtualmin-universal
    Updated wbm-virtualmin-htpasswd-2:3.0-1.noarch                  @virtualmin-universal
    Update                          2:3.1-1.noarch                  @virtualmin-universal
    Updated wbm-virtualmin-nginx-ssl-1.16-1.noarch                  @virtualmin-universal
    Update                           1.17-1.noarch                  @virtualmin-universal
    Updated webmin-1.990-1.noarch                                   @virtualmin-universal
    Update         1.994-1.noarch                                   @virtualmin-universal
    Updated zlib-1.2.7-19.el7_9.x86_64                              @updates
    Update       1.2.7-20.el7_9.x86_64                              @updates
    Updated zlib-devel-1.2.7-19.el7_9.x86_64                        @updates
    Update             1.2.7-20.el7_9.x86_64                        @updates

The reason I ask this:

In this update I can see that almost every config file has been updated / modified. If this is the case, the bug has been lingering around ever since the update from wbm-virtual-server-3:6.17.gpl-3.noarch to 3:7.1.gpl-1.noarch.