Backup fails after saving files, with "failed to list directory via SSH"

SYSTEM INFORMATION
OS type and version: AlmaLinux 9.1
Webmin version: 2.021
Virtualmin version: 7.5
Related packages: SSH 8.7

This is similar to two older posts, neither of which was resolved.

I have a fresh install of AlmaLinux 9.1 and Virtualmin 7.5, with data restored from VM backup. All good so far.
I have two backup schedules - a weekly full and daily incremental.
The full backup works well, after I added the -O parameter to SCP to enable SSH 8.7 to connect to the destination Synology NAS.
The incremental also saves the backup files, but then fails with the message

Deleting backups from local file /usr/libexec/webmin/virtual-server/web1/incremental matching ...... older than 32 days …
**… failed to list directory via SSH : Invalid multiplex command.**

(This occurred on the first run of this incremental backup. web1 is the name of the VM server.)
The folder /usr/libexec/webmin/virtual-server/web1 does not exist, at least not after the backup finished with the above error.

The only clue I can find is from IBM (so its relevance is questionable). It indicates that the “Invalid multiplex command” error may occur because “the ssh -O option is set to an unsupported value”. As far as I can see, the -O option in SCP doesn’t take a value; it just says “don’t use SFTP”.

Any help to understand why this is happening is most appreciated.

Hey Peter,
I never did get a reply or a resolution to my issue; I ended up moving away to an alternate backup solution and then eventually away from Virtualmin. Given that, please don’t rely on any of my thoughts here…

This may not help but from what I can gather from the following links (and I could be wrong):

My understanding is that multiplexing is a way to run concurrent ssh sessions, or multiple commands at once, over a single connection. It appears to fail if multiplexing is not enabled or configured, if the socket file is misconfigured, or if multiplexing is configured but the options given to the command are invalid. It seems it can be related to either or both of “-o” and “-O”.

Reading between the lines, it looks like -o is used to specify an option that can alternatively be set in the .ssh config. For example,

ssh -o "ControlMaster=auto" 

Can equally be set in .ssh/config as

Host *
    ControlMaster auto

Misconfiguring it can cause a multiplex command error.

ssh -O, according to man pages:
-O ctl_cmd
Control an active connection multiplexing master process.
When the -O option is specified, the ctl_cmd argument is
interpreted and passed to the master process. Valid
commands are: “check” (check that the master process is
running), “forward” (request forwardings without command
execution), “cancel” (cancel forwardings), “exit” (request
the master to exit), and “stop” (request the master to stop
accepting further multiplexing requests).

So if it’s using “-O” with an incorrect ‘ctl_cmd’, it would throw a similar error.
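To make that concrete, here is a minimal shell sketch that assumes only the list of control commands quoted in the man page excerpt above. The function name `is_valid_ctl_cmd` and the host `nas.example.com` are made up for illustration; neither comes from OpenSSH or Virtualmin. It shows that any word after `-O` other than those listed would be rejected, which is what would happen if an ssh command line carried a bare `-O` (as used with scp) and the next word was really a hostname or another argument.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the word is one of the ctl_cmd
# values listed in the man page excerpt quoted above.
is_valid_ctl_cmd() {
    case "$1" in
        check|forward|cancel|exit|stop) return 0 ;;
        *) return 1 ;;
    esac
}

# Try two real control commands and one placeholder hostname.
for word in check stop nas.example.com; do
    if is_valid_ctl_cmd "$word"; then
        echo "$word: valid ctl_cmd"
    else
        echo "$word: not a ctl_cmd"
    fi
done
```

This is only a model of the argument check, not a reproduction of what ssh does internally. Whether Virtualmin really passes `-O` to ssh in this way can only be confirmed by seeing the command it runs.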

Honestly, to diagnose further, you’d need to see the full ssh command it’s actually running, but I’m not sure how to fetch that in Virtualmin, and I no longer have access.

Again, could be wrong on all of this, take my thoughts with a grain of salt.

Thanks for these links and your comments.
For others following I can confirm:

  1. I have CentOS 7.9 and AlmaLinux 8.7 VM installations that back up using identical procedures and continue to work without error.
  2. The problem started with AlmaLinux 9.1, but I think the relevant thing is that Alma 9 uses SSH 8.7, whereas Alma 8 uses SSH 8.0 and CentOS 7.9 something even older.
  3. The problem is related to removing old backup files. To do that it has to read the directory, and the error implies that this is where things go wrong.
  4. I can’t see any mention of “ControlMaster auto” in ssh_config or in 50-redhat.conf in ssh_config.d. From reading the articles on how to enable multiplexing, that line would be in a Host section of the config file.
    However, the error does refer to multiplex. Could it be the default in SSH 8.7?
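One way to check whether multiplexing is enabled by default is to ask ssh for its effective client configuration rather than reading the config files by hand. The sketch below assumes `ssh -G` is available (it prints the resolved options for a host without connecting). The helper name `multiplex_settings` and the host `nas.example.com` are placeholders, not anything taken from this thread.

```shell
#!/bin/sh
# Hypothetical helper: keep only the multiplexing-related lines
# from "ssh -G" style output read on stdin.
multiplex_settings() {
    grep -iE '^(controlmaster|controlpath|controlpersist) ' || true
}

# Intended use, run as the user the backup runs as (host is a placeholder):
#   ssh -G nas.example.com | multiplex_settings

# Offline demonstration with sample input, so the sketch runs without ssh:
printf '%s\n' 'user root' 'controlmaster false' 'controlpath none' 'port 22' |
    multiplex_settings
```

A `controlmaster` value of `false` or `no` in the real output would suggest that multiplexing is not being switched on by any config file, which would point back at the command-line options instead.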

I’m not sure where to go from here. Perhaps I should enable multiplexing and see what happens.

Thanks again for your help.

  1. Could it be that the option passed using -O is no longer valid in SSH 8.7?
  2. I am assuming that when it tries to execute the command to read the directory, it’s using multiplexing to pass this command to an existing ssh connection.
  3. The config I’ve referenced is in the user’s home directory, under .ssh/config. Which file applies depends on which user is executing the command.

I’d try enabling multiplexing, but I’d also suggest trying to find the actual command Virtualmin is executing. Obviously it’s ssh -O <do something/read directory>, but what exactly? Is there an audit log or something that shows what commands have been run?
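Without relying on any Virtualmin-specific log, one generic way to see the command is to watch the process list while the backup runs. The sketch below is an assumption-laden illustration: the helper name `ssh_cmdlines`, the log path, and the sample lines are all made up, and polling once a second can miss very short-lived commands.

```shell
#!/bin/sh
# Hypothetical helper: from "ps -eo pid,args" style input, keep lines whose
# command is ssh, scp or sftp (with or without a leading path).
ssh_cmdlines() {
    awk '{
        n = split($2, p, "/")
        if (p[n] == "ssh" || p[n] == "scp" || p[n] == "sftp") print
    }'
}

# Intended use while a backup is running (stop with Ctrl-C):
#   while :; do ps -eo pid,args | ssh_cmdlines; sleep 1; done | tee /tmp/ssh-commands.log

# Offline demonstration with made-up sample lines (not real Virtualmin output):
printf '%s\n' \
    '101 /usr/sbin/sshd -D' \
    '202 ssh -O backup@nas.example.com ls' \
    '303 /usr/bin/scp -O file backup@nas.example.com:dir' |
    ssh_cmdlines
```

The filter matches on the program name only, so the sshd daemon is excluded while full ssh and scp command lines, including any `-O` and whatever follows it, are kept for inspection.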

I’m still at a loss how to solve this, but here are the results from 3 servers with different OS.

I have three servers backing up with identical configs.
CentOS 7.9 works flawlessly
Alma 8.7 works flawlessly
Alma 9.1 gets the error.

Now for the detail
CentOS 7.9 and Alma 8.7 report
Deleting backups from local file /usr/libexec/webmin/virtual-server/host1/incremental matching ...... older than 32 days …
Deleting file host1/incremental/230317Fri0130 via SSH, which is 32 days old …
… deleted 4 KiB.

Alma 9.1 reports
Deleting backups from local file /usr/libexec/webmin/virtual-server/host2/incremental matching ...... older than 32 days …
… failed to list directory via SSH : Invalid multiplex command

Interestingly, none of the servers contains such a directory under /usr/libexec/webmin/virtual-server; at least, ls does not show one.

With the Alma 9.1 machine I had to add the -O option to SSH to get it to work with my Synology backup server.
The backup and file transfers work, it’s only at the clean up stage that the failure occurs.

Question: How can I find the SSH commands that Virtualmin constructs and executes for me? Seeing them might help to troubleshoot the problem.

Thanks in advance
