Virtualmin SSH backups started failing.

Virtualmin version 3.99 will be out in a day or two, and should include a fix for this.

I have the latest version, and nothing got fixed.
For 3 of the roughly 50 domains I have, the error is:

Write failed: Broken pipe
lost connection

… completed in 3 minutes, 27 seconds

Hello,

I’m running Virtualmin 3.99 and I’m also facing some backup issues.
These backup schedules are running on 2 servers:


First backup schedule:


  • All virtual servers
  • All features and settings
  • to SSH server
    File on server: /home/backupname/backup/%Y-%m-%d-%H%M
  • User & Pass set
  • Delete old backups after 35 days
  • Additional destination options: Do strftime-style time substitutions, Transfer each virtual server after it is backed up
  • Backup format: one file per server, create destination directory
  • Action on error: Continue
  • Backup level: FULL
  • E-mail backup report to: my e-mail
  • Scheduled Backup Time: Complex schedule: Every Monday at 03:00

Second backup schedule:


  • All virtual servers
  • All features and settings
  • to SSH server
    File on server: /home/backupname/backup/INCR/%Y-%m-%d-%H%M
  • User & Pass set
  • Delete old backups after 35 days
  • Additional destination options: Do strftime-style time substitutions, Transfer each virtual server after it is backed up
  • Backup format: one file per server, create destination directory
  • Action on error: Continue
  • Backup level: Incremental
  • E-mail backup report to: my e-mail
  • Scheduled Backup Time: Complex schedule: At cron time 0,30 * * * *
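For reference, the %Y-%m-%d-%H%M part of both destination paths is a strftime pattern; with the strftime-style substitution option enabled, Virtualmin expands it at backup time much like date(1) does. A minimal sketch of the equivalent expansion (the path is the one from the schedules above, shown for illustration only):

```shell
# Hypothetical expansion of the destination path's strftime pattern using
# date(1); Virtualmin performs the equivalent substitution internally.
DEST="/home/backupname/backup/$(date +%Y-%m-%d-%H%M)"
echo "$DEST"
```

So each run lands in its own timestamped directory, which is also what the 35-day purge operates on.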

The output of the mount command:

[root@turing remi]# mount
/dev/md4 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/md0 on /boot type ext4 (rw)
/dev/md1 on /data type ext4 (rw)
/dev/md2 on /secure type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

Output of df -h:

Filesystem      Size  Used Avail Use% Mounted on
/dev/md4        3.6T  7.1G  3.4T   1% /
tmpfs            16G  4.0K   16G   1% /dev/shm
/dev/md0        496M   51M  420M  11% /boot
/dev/md1        3.6T  1.3T  2.2T  38% /data
/dev/md2        496M   11M  460M   3% /secure

Error message I get by mail:

Uploading archive to SSH server XX.XX.XX.XX …
… upload failed! sshbackup@XX.XX.XX.XX’s password:
backup.be.tar.gz 0% 0 0.0KB/s --:-- ETA
backup.be.tar.gz 100% 22KB 21.8KB/s 00:00
Write failed: Broken pipe
lost connection

or:

Uploading archive to SSH server XX.XX.XX.XX …
… upload failed! sshbackup@XX.XX.XX.XX’s password:
286162_31137_28_backup.pl 0% 0 0.0KB/s --:-- ETA
286162_31137_28_backup.pl 100% 6060 5.9KB/s 00:00
Write failed: Broken pipe
lost connection

… completed in 2 hours, 0:48 minutes

Let me also note that both links have at least 100 Mbit/s uplinks (the SSH server has 120 Mbit/s, the production server 100 Mbit/s).

I hope this is of use to somebody and helps solve the problem.

I don’t ‘really’ have a problem with this, since backups run every 30 minutes, but as you can see, the last failed one took 2:48h to complete. That might create a problem for me …

This has now become an everyday occurrence. :frowning:

Do you have access to the SSH server logs from the remote system? It looks like the connection is being killed, and the logs might explain why.

Thanks, but it’s a whole lotta nothing:

Apr 21 00:54:13 localhost sshd[22882]: Accepted password for backup from 192.168.1.11 port 33354 ssh2
Apr 21 00:54:13 localhost sshd[22882]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:15 localhost sshd[22882]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:17 localhost sshd[22902]: Accepted password for backup from 192.168.1.11 port 33355 ssh2
Apr 21 00:54:17 localhost sshd[22902]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:18 localhost sshd[22902]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:20 localhost sshd[22922]: Accepted password for backup from 192.168.1.11 port 33356 ssh2
Apr 21 00:54:20 localhost sshd[22922]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:21 localhost sshd[22922]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:25 localhost sshd[22942]: Accepted password for backup from 192.168.1.11 port 33357 ssh2
Apr 21 00:54:25 localhost sshd[22942]: pam_unix(sshd:session): session opened for user backup by (uid=0)

Does the remote user perhaps have a restricted shell like scponly that prevents execution of commands like mkdir?

Nope, just a regular bash shell for everyone.

Since this problem seems to be limited to a few users, it would be useful if I could get remote access to your system to see what is really going wrong. Email me directly at jcameron@virtualmin.com if that is possible.

Hi, I also have this problem. I’m using GPL Webmin / Virtualmin 1.782.

I get an email after 2 days and 10 hours with this error:

gzip: stdout: Broken pipe
/bin/tar: -: Wrote only 4096 of 10240 bytes
/bin/tar: Error is not recoverable: exiting now

Can anyone please help fix this?

Just another quick update: I realised that I DO have automatic spin-down enabled on my backup server after 45 minutes of non-use.

It’s a long shot, since I know other users have the same error, so this probably isn’t the cause, but I’m running another backup now with auto-spindown turned off just to see whether it is the cause of the issue. If it is causing the problem, is there some kind of “keep-alive” mechanism that Webmin/Virtualmin could use while backing up, to keep the backup drive mounted and powered up?
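For what it’s worth, OpenSSH itself has client-side keepalives that are independent of Webmin/Virtualmin; if an idle link were being dropped, these generic settings on the machine running the backups would keep the TCP connection alive. A sketch (the Host alias is hypothetical, the options are standard OpenSSH client settings):

```
# ~/.ssh/config (standard OpenSSH client options, not a Virtualmin setting)
Host backup-host                 # hypothetical alias for the backup server
    ServerAliveInterval 30       # send a keepalive probe every 30 seconds
    ServerAliveCountMax 4        # give up after 4 unanswered probes
```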

Hi, further to this: turning off the spin-downs on my backup box hasn’t fixed it. I now get this error:

Uploading archive to SSH server 192.168.0.2 …
… upload failed! root@192.168.0.2’s password:
/home/webmintemp/426161_12111_39_backup.pl: No such file or directory

… completed in 2 days, 2 hours, 35:18 minutes

Can anyone please help with a fix?

I now have the fix, as per help from a Webmin bug request. Go to Webmin Configuration, Advanced Options, and set ‘Maximum Age of Temporary Files’ to a larger number of days, such as 30. The backups now complete, because very large backups are no longer getting cleaned up before they’ve finished!
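To illustrate why raising that setting helps, here is a minimal sketch, assuming the temp-file cleaner behaves like a find-based sweep by modification time (the directory and file names below are stand-ins, not Webmin’s real paths):

```shell
# Sketch of the failure mode: a cleaner that removes files older than a
# maximum age will also remove a backup working file that is still in use
# if the backup takes longer than that age.
WORKDIR=$(mktemp -d)

# Simulate a backup working file that has been streaming for 3 days
touch -d '3 days ago' "$WORKDIR/backup_in_progress.tar.gz"

# Simulate a cleaner with a 1-day maximum age: the still-in-use file is swept
find "$WORKDIR" -type f -mtime +1 -delete

# The upload step would now fail with "No such file or directory"
```

That matches the “/home/webmintemp/…backup.pl: No such file or directory” error above: the multi-day backup outlived the temporary files it depended on.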

Thanks and I hope this helps.