Cannot write to directory /etc/webmin/virtual-server/ Still an issue

dumorian · July 28, 2022, 2:28pm

Last metadata expiration check: 2:16:59 ago on Thu May 26 08:13:10 2022.
Dependencies resolved.
================================================================================
 Package              Arch     Version             Repository              Size
================================================================================
Upgrading:
 libsss_autofs        x86_64   2.6.2-4.el8_6       baseos                 120 k
 libsss_certmap       x86_64   2.6.2-4.el8_6       baseos                 163 k
 libsss_idmap         x86_64   2.6.2-4.el8_6       baseos                 122 k
 libsss_nss_idmap     x86_64   2.6.2-4.el8_6       baseos                 129 k
 libsss_sudo          x86_64   2.6.2-4.el8_6       baseos                 118 k
 remi-release         noarch   8.6-1.el8.remi      remi-safe               29 k
 rocky-gpg-keys       noarch   8.6-3.el8           baseos                  12 k
 rocky-release        noarch   8.6-3.el8           baseos                  21 k
 rocky-repos          noarch   8.6-3.el8           baseos                  14 k
 rsync                x86_64   3.1.3-14.el8_6.2    baseos                 404 k
 sssd-client          x86_64   2.6.2-4.el8_6       baseos                 226 k
 sssd-common          x86_64   2.6.2-4.el8_6       baseos                 1.6 M
 sssd-kcm             x86_64   2.6.2-4.el8_6       baseos                 251 k
 sssd-nfs-idmap       x86_64   2.6.2-4.el8_6       baseos                 119 k
 wbm-virtual-server   noarch   3:7.1.pro-1         virtualmin-universal    25 M
 webmin               noarch   1.994-1             virtualmin-universal    38 M

This is where Webmin and Virtualmin were last updated.

Steven_Verheij · July 28, 2022, 3:23pm

With the command

dnf history

you can list the history of dnf runs. The output also shows the ID number of each update/install run. To get more details on a specific run, find its ID and then check it with the command

dnf history info 205

Here I used the ID 205 just as an example.

What I suspect is that you also updated from 6.17 to 7.1. If that is the case, it may help pinpoint where to look.
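To narrow a long dnf listing down to the Webmin and Virtualmin packages, a minimal sketch could look like this. The function name webmin_pkg_lines is made up for illustration, and the "webmin" / "wbm-" prefixes are an assumption based on the package names in the upgrade listing earlier in this thread.

```shell
# Hypothetical helper: keep only the lines of piped-in text that mention
# a package whose name starts with "webmin" or "wbm-".
webmin_pkg_lines() {
    # "|| true" so that "no match" is not treated as a failure.
    grep -E '(^|[[:space:]])(webmin|wbm-)' || true
}

# Example use (needs dnf, so it is only shown as a comment):
#   dnf history info 205 | webmin_pkg_lines
```

It only filters text, so it works on any saved or pasted dnf output as well.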

dumorian · July 28, 2022, 3:40pm

Here are a couple of issues:

The install did not set up quota commands for the XFS filesystem. Defaults did not work. I have not had time to sort this yet.
I have been fighting with IPv6. On the new network I do have a block available to me, but I have not yet started working on that. Here is my interface config:

# cat ifcfg-eno4
UUID=########-####-####-####-##########
DEVICE=eno4
DNS2=###.##.##.#
IPV4_FAILURE_FATAL=no
BROADCAST=###.##.##.##
IPADDR=###.##.##.##
IPV6INIT=no
DEFROUTE=yes
NETWORK=###.##.##.#
IPV6_AUTOCONF=yes
BOOTPROTO=none
PROXY_METHOD=none
NETMASK=###.###.###.###
DNS1=127.0.0.1
BROWSER_ONLY=no
GATEWAY=###.##.##.#
DNS3=###.##.##.##
DNS4=###.##.##.##
IPV6_FAILURE_FATAL=no
DOMAIN=<our domain name>.com
TYPE=Ethernet
ONBOOT=yes
IPV6_ADDR_GEN_MODE=stable-privacy
IPV6_DEFROUTE=yes
==================

IPV6INIT=no

When I look at network interfaces, it also is set to disable IPv6, but when I look at the active interface, it does show a default IPv6 address.
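The listing above has IPV6INIT=no next to IPV6_AUTOCONF=yes and IPV6_DEFROUTE=yes. Whether those leftover keys have any effect depends on what reads the file, which is not settled in this thread. As a sketch only, this is one way to spot that combination; the function name is hypothetical and it assumes unquoted KEY=value lines like the ones shown.

```shell
# Hypothetical check: if an ifcfg file switches IPv6 off with IPV6INIT=no,
# print any other IPV6_* keys that are still set to "yes".
ifcfg_ipv6_leftovers() {
    # Nothing to report unless IPv6 is disabled in this file.
    grep -q '^IPV6INIT=no$' "$1" || return 0
    # List the remaining IPV6_* switches that are still enabled, sorted.
    { grep -E '^IPV6_[A-Z_]+=yes$' "$1" || true; } | LC_ALL=C sort
}

# Example use:
#   ifcfg_ipv6_leftovers ifcfg-eno4
```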

I see DNS1 is still not correct.

At one point the system was showing eno1 in the modules configs instead of eno4. eno1 is not connected. I’m not sure how that happened but the system was operating properly through eno4. A bad guess by Virtualmin?

Apache vhosts have no IPv6 entries. I did have to manually run our old IPv4 address and the new IPv4 address such as this:

These have all been edited to remove the old address which is how I know no IPv6 addresses are in the conf file.

I mention this because I have mostly been working on networking and new addresses. When I found eno1 in the modules config, I changed that to eno4 and deleted the IPv6 address as IPv6 was disabled. Then Virtualmin complained about not being able to find the default IPv6 address, so I copied over the one from the active interface.

I mention this because one of the last things I edited on that system was the DNS servers to update one old IPv4 address when the old network went dark. I do not run bind on this system. I made this change on my other systems without issue. I had 2 CentOS 8 systems running at the time IBM made their announcement about CentOS. At that point I waited for Rocky to happen and for Webmin/Virtualmin to get up to speed with Rocky 8. So only 2 systems went through the conversion to Rocky, this being one. My other new Rocky systems are not showing any issues, other than XFS quotas. They are enabled, but Webmin doesn’t seem to think they are. I do need to get back to that to create and test the XFS commands that Webmin and Virtualmin both need. I did find two places where these need to be set. And yes, quotas are being enforced and were transferred over to one of my other Rocky systems during the backup/restore, backup from the old and restore to the new. I have had people hit quota.

And that’s all that I know of on the issues list for my Rocky systems.

Steven_Verheij · July 28, 2022, 3:52pm

You should easily be able to fix this by editing the file

/etc/fstab

In the file there should be a line that looks like this:

/dev/mapper/VG_System-root / xfs quota,seclabel,inode64,relatime,attr2,rw 0 0

In this line, just add the following two options to the options field, before the 0 0:

,grpquota,usrquota

Do this for all the mount points you want to enable quotas on. The line should then look something like this:

/dev/mapper/VG_System-root / xfs quota,seclabel,inode64,relatime,attr2,grpquota,usrquota,rw 0 0

After a reboot, the quotas for those filesystems should now be enabled.
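The edit described above can be sketched as a small filter. This is an illustration under assumptions, not a tested procedure for any particular system: the function name add_quota_opts is made up, it appends the two options at the end of the options field, and it prints the result instead of touching the file. As far as I know, the root filesystem on some setups takes its XFS quota flags from the kernel command line rather than from /etc/fstab; that case is not covered here.

```shell
# Hypothetical helper: print an fstab-format file with usrquota and
# grpquota added to the options of one mount point.
#   $1 = fstab-format file, $2 = mount point
add_quota_opts() {
    awk -v mp="$2" '
        # Only touch non-comment entries for the requested mount point.
        $0 !~ /^[ \t]*#/ && NF >= 4 && $2 == mp {
            # Append each option only if it is not already listed.
            # Changing $4 makes awk rejoin the line with single spaces.
            if ($4 !~ /(^|,)usrquota(,|$)/) $4 = $4 ",usrquota"
            if ($4 !~ /(^|,)grpquota(,|$)/) $4 = $4 ",grpquota"
        }
        { print }
    ' "$1"
}

# Example use: review the output before replacing anything.
#   add_quota_opts /etc/fstab /home > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Writing to a separate file first keeps a typo from leaving the system with an fstab it cannot boot from.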

dumorian · August 2, 2022, 1:34am

I just had this happen again on a different Rocky 8 server. The last thing I did was on the 28th, copying email over with imapsync. Today, I have the recheck config button. Clicking that, it can’t find IP addresses. Going to module conf, it shows interface eno1 when only eno4 is connected. After fixing that and retrying, I get an Apache mod_actions warning. mod_actions.so is in the modules directory.

I’m sorry but this is a destroyer. Can we get a script to create maybe 10 dated backups of the config file? Sort of like a log rotate?
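A rough sketch of what such a rotation could look like, as a plain shell function. This is not something Virtualmin ships; the function name and the backup directory in the example are made up, and the config path is the one mentioned elsewhere in this thread.

```shell
# Hypothetical helper: copy SRC into DEST_DIR under a dated name and keep
# only the newest KEEP copies (default 10).
#   $1 = SRC, $2 = DEST_DIR, $3 = KEEP
# The body runs in a subshell so its variables do not leak to the caller.
backup_rotate() (
    src=$1
    dest_dir=$2
    keep=${3:-10}
    base=$(basename "$src")
    stamp=$(date +%Y%m%d-%H%M%S)

    mkdir -p "$dest_dir" || exit 1
    cp -p "$src" "$dest_dir/$base.$stamp" || exit 1

    # The timestamp format sorts chronologically, so the first lines of
    # the sorted list are the oldest copies.
    total=$(ls -1 "$dest_dir/$base".* | wc -l)
    excess=$((total - keep))
    if [ "$excess" -gt 0 ]; then
        ls -1 "$dest_dir/$base".* | LC_ALL=C sort | head -n "$excess" |
            while IFS= read -r old; do
                rm -f -- "$old"
            done
    fi
)

# Example use (destination directory is an assumption):
#   backup_rotate /etc/webmin/virtual-server/config /root/vm-config-backups 10
```

Run from cron, this would give the "10 dated backups" behaviour. Two runs within the same second would overwrite each other, which should not matter at cron intervals.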

Sadly, this is a new system replacing another CentOS 8 system and I have not yet configured my backups on this system so… After many days of careful work, I’m totally stuck. I tried bringing over a config from another system and failed. I will try again. I must have missed an edit somewhere in there. I notice one instance of FPM will not start/run. Likely my doing in the process of doing PHP upgrades.

I know I’m a bit outside of the norm running Rocky. The website states it is supported, but the install.sh says it is not. I did have to download and run an install.sh from GitHub that is said to support Rocky.

I am totally under the gun to get this up and running.

Steven_Verheij · August 4, 2022, 3:54pm

And I have had it happen again, this time on a bare-metal CentOS 7 system with virtually no domains, as it is mainly used as a hypervisor host. I think it is caused by something in the network stack or by changes to networking. I am not even sure whether it is caused by using Webmin, or whether Webmin just goes boom after the network settings are changed outside of Webmin itself (CLI edit).

@Ilia @Jamie if wanted, I can share access to this host (and its 2 partners, which I am about to decommission / repurpose after having replaced them with new, bigger machines). There is only one domain on this host, the one for the hostname itself, so it can get SSL / DNSSEC / valid SMTP and such.

This needs more attention, as this bug / fault is breaking full hosts / installs, and it seems extremely hard to figure out the cause or to regain a working system. For sure it’s not a corrupt hard disk / filesystem. This host uses RAID 1, and it’s too coincidental for 2 fully independent systems to have the exact same file corrupted, and only that file.

Now, while writing this post and looking into this problem at the same time, I found a file that seems to contain a copy of the contents from before it broke. Not sure why that was not visible before.

@dumorian I guess by now you have fixed your issue, though I just found out there is a file in that directory named last-config that seems to contain the info that was lost when the bug showed up. Not sure if this is a recent change made by Jamie / Ilia, though maybe restoring that will work for you to fix it. I will not touch the server further until someone from staff says they are not going to take a look. Otherwise that is my next step to see and fix the problem, or better said, to at least try to get back a working setup.

Steven

Jamie · August 4, 2022, 9:17pm

@Steven_Verheij remote access to your system would be very useful! You can send me login details at jcameron@virtualmin.com … please reference this bug in your email.

dumorian · August 4, 2022, 9:52pm

I get the ‘feeling’ it may start at network settings. That’s what pops up first as wrong when this happens. At the moment, I do not wish to have IPv6 working on my systems. Only as of a few months ago have I been able to request a block of IPv6 addresses. I had disabled IPv6 on the interface. I had disabled it in the Virtualmin module configs. I noticed that the running interface was picking up a default IPv6 address when it started, in spite of having it disabled in the interface settings.

Anyway, I don’t know. I only know that when I went to the networking settings module, it had hopped off of eno4 and showed eno1 which is not set to start at boot and is not plugged in.

@Steven_Verheij Hah! Yes, I had noticed that file as well, but apparently when I followed the suggestion given during the failure, it overwrote that file as well. Yes, even though that system was almost finished… millions of files transferred… I opted to start over just in case something went wrong during the install. I haven’t messed as much with network interfaces on this system. I also have been regularly backing up the config file until I add this system to my backup system. Thanks for that though.

Steven_Verheij · August 5, 2022, 12:50am

Mail has been sent @Jamie

Thanks and good luck, if there is any way i can help, let me know.

Steven

dumorian · August 5, 2022, 2:46am

I too hope he can “step in the trap”. I never said it, but I also don’t think this is file corruption on my systems. I have not noticed any corruption on these systems. It would be really odd for it to happen to only that file on 2 systems.

Good luck @Jamie and a big thanks to @Steven_Verheij for the wide variety of help. I’ve wound up way under the gun on getting a system moved.

Steven_Verheij · August 5, 2022, 1:04pm

The mail got stuck in the outbox for a few hours, so it was sent with a delay… though it has been out since about an hour ago. Sorry for that (Outlook was not running properly, and after a restart of Outlook it went out as expected).

Steven

Ilia · August 5, 2022, 8:48pm

Webmin doesn’t care about you changing network configuration outside of it. This cannot be the source of the issue.

Check this post out, perhaps it helps?

dumorian · August 6, 2022, 2:25pm

An FYI. I had a power blip yesterday evening. 2 of my Rocky systems are not yet on battery, so they rebooted. One of the two did this reconfig thing. The /etc/webmin/virtual-server/config was back down to 8k as it happens. I did not try to rerun the Virtualmin config check. I just went in via console and found that last-config was identical in size to my backup from 8/1/22. I copied mine over and all was well.
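The size comparison above can be sketched as a quick check to run before clicking anything in Virtualmin. This is only an illustration: the function name and the "less than half the size" threshold are assumptions, and how or when Virtualmin writes last-config is not established in this thread.

```shell
# Hypothetical check: succeed (exit 0) if the live config is much smaller
# than a saved copy, which is how the breakage showed up in this thread.
#   $1 = live config, $2 = saved copy (e.g. last-config or a dated backup)
config_looks_truncated() (
    [ -r "$1" ] && [ -r "$2" ] || exit 2
    live=$(($(wc -c < "$1")))
    saved=$(($(wc -c < "$2")))
    # Assumed threshold: less than half the saved size counts as suspicious.
    [ "$live" -lt $((saved / 2)) ]
)

# Example use: set both files aside first, since a recheck was reported
# earlier in the thread to overwrite last-config as well.
#   cd /etc/webmin/virtual-server
#   if config_looks_truncated config last-config; then
#       cp -p config config.broken
#       cp -p last-config last-config.saved
#   fi
```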

I do not know if that system was in the somewhat broken/lost state before the reboot so maybe it had nothing to do with the power outage other than making the reconfig become obvious.

Webmin has continued to function each time this has happened to Virtualmin; however, it does not have complete information available. For instance, after this happens, if I go to webmin>servers>apache, none of my vhosts show in the interface, although they still exist in the http conf file. It seems there is at least some interaction, just not total breakage.

Either way, this has now affected 3 out of 4 of my Rocky systems.

Steven_Verheij · August 9, 2022, 4:47pm

While I would love to believe that, I am sure it does care at least somewhat: it monitors IPs and interfaces with the intent to change the IP in services it maintains. Unless that is not what you meant, I am sure Webmin follows the IPs at least.

It also looks them up for Virtualmin at the time it creates a virtual server…

And nope, I do not run Ubuntu. This system has been running Virtualmin since early 2019 on CentOS 7. I assume things will be figured out once Jamie has looked into it. I just had another one doing this belly-up thing, this time an internal-only machine (mainly used as a simple file server, with a single HTML file on the default domain with links to resources on the LAN / net).

I certainly hope Jamie will come up with some info on the problem, or find the cause in the machines I gave access to.

Steven

Ilia · August 9, 2022, 6:37pm

That’s Virtualmin.

It also looks them up for Virtualmin at the time it creates a virtual server

Yes, correct.

Steven, do you have a way to reproduce the problem? Also, are you sure that this problem is created by Virtualmin?

Steven_Verheij · August 16, 2022, 12:28pm

Hey all,

I am aware that Webmin does the “system wide services” configuration in the sense of “root level”, while Virtualmin uses those (and some more) services to provide virtual hosting on top of them. To the best of my awareness, the two are somewhat integrated, in the sense of them calling “API style” commands on each other to get things done. Some of the modules do direct edits on files to change the working of the setup. I think I know which option most of them use.

As of today I have been unable to reproduce the issue, although I had it happen on two different systems, one virtual, one physical. Given that these systems are on fully separate physical machines, I rule out hardware failures.

The virtual system runs as a slave nameserver. Virtualmin is installed, though it only runs a single domain, the default domain, so SSL and such are nice and in check.

The physical server is used as a VM hypervisor. Once again, Virtualmin is installed to keep SSL and such in check from a single default domain.

Both systems are running some low-level loads and are hardly ever logged into by a human, other than to do some updates if Webmin sends alerts for them. They have been running mostly fine like that since the beginning of 2019. The nameserver gets its RPC routines called for DNS from around 10 remote systems that are actually hosting content.

Given that both the config file and the localips file have the exact same timestamp once a system breaks, I conclude it has something to do with the network stack, or with a change picked up by Webmin / Virtualmin. Given the contents of the config file, and Virtualmin breaking on its corruption, I assume it’s mainly a file for Virtualmin’s use. There are no software management packages or anything else that could touch those files other than Webmin / Virtualmin itself.

Given that the 2 files mentioned before seem to mostly relate to networking, I looked back in the few notes I have made for myself. In both cases the only thing that has been done in the networking stack is “a fix” on the IPv6 sections of the machines. The NS had an attempt to modify IPv6 into a properly working state, as I want to get IPv6 ready / enabled. The physical host was phased out and replaced, and I had started reshaping its networking to fit into the new-style network layout I want. In that process I renamed its WAN br0 to br-wan06. While I renamed the interface config file, I forgot to rename the route6-br0 config file, so v6 was not working properly.
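A leftover route file like that route6-br0 can be spotted with a small check. This is a sketch under assumptions: the function name is made up, and it assumes the usual RHEL-family layout where ifcfg-NAME, route-NAME and route6-NAME files sit together in one directory.

```shell
# Hypothetical check: list route-* / route6-* files that have no matching
# ifcfg-* file in the same directory.
#   $1 = directory (defaults to the usual network-scripts location)
# The body runs in a subshell so its variables do not leak to the caller.
orphan_route_files() (
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/route-* "$dir"/route6-*; do
        [ -e "$f" ] || continue        # an unmatched glob stays literal
        name=${f##*/}                  # strip the directory part
        iface=${name#route6-}          # drop a "route6-" prefix, if any
        iface=${iface#route-}          # otherwise drop a "route-" prefix
        [ -e "$dir/ifcfg-$iface" ] || printf '%s\n' "$name"
    done
)

# Example use:
#   orphan_route_files /etc/sysconfig/network-scripts
```

After renaming br0 to br-wan06 without renaming the route file, this would print route6-br0.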

In both cases the problem showed up in Virtualmin after installing updates (via Webmin) and then following up with a reboot to apply them. Both files have a timestamp that is just a few seconds after the system booted up, and at that point Virtualmin was broken, while before the reboot, yet after the updates, the system was in a working state.

So yes, given all the above, I think it’s Virtualmin breaking itself over something that changed in the network stack and getting “confused” somehow. I suspect it needs a certain precondition to appear just after the reboot is finished and Webmin / Virtualmin starts.

I will keep trying in the next few days to reproduce the problem, as it’s not easy to figure out what is actually broken once Virtualmin corrupts its config file. Most of Virtualmin just hangs or throws misleading errors. It took me a long while to figure out the config file was corrupted, by comparing the files and folders with those from my other Virtualmin systems. Thereafter it took me another while to figure out there was a file named last-config that can actually be used to restore to a working status.

Steven

dumorian · August 22, 2022, 1:20pm

I built a new Rocky system. I found out what is going on with Quotas.

On a new test build, I could not get quotas to function. I looked at user and group quotas under disk and network filesystems. I was most interested in the /home directory. When editing that, I found that both user and group quotas were enabled. The save button was green. I decided to click the save button anyway and boom, it started working. This interface is clearly not reading and reporting the state of on/off.

Ilia · August 22, 2022, 6:19pm

Wait, what started working?

dumorian · August 22, 2022, 7:50pm

@Ilia Quotas. The Virtualmin script to check if the system is ready checks for IPv6 and then Quotas. I reported problems with Quotas above. It might be Quotas is where this error happens and maybe not networking.

Ilia · August 23, 2022, 11:15am

@Jamie, what is this code intended to do, and when?