Virtualmin SSH backups started failing.

Virtualmin version 3.99 will be out in a day or two, and should include a fix for this.

I have the latest version, and nothing got fixed.
For 3 of the roughly 50 domains I have, the error is:

Write failed: Broken pipe
lost connection

… completed in 3 minutes, 27 seconds

Hello,

I’m running Virtualmin 3.99 and I’m also facing some backup issues.
These backup schedules are running on 2 servers:


First backup schedule:


  • All virtual servers
  • All features and settings
  • to SSH server
    File on server: /home/backupname/backup/%Y-%m-%d-%H%M
  • User & Pass set
  • Delete old backups after 35 days
  • Additional destination options: Do strftime-style time substitutions, Transfer each virtual server after it is backed up
  • Backup format: one file per server, create destination directory
  • Action on error: Continue
  • Backup level: FULL
  • E-mail backup report to: my e-mail
  • Scheduled Backup Time: Complex schedule: Every Monday at 03:00

Second backup schedule:


  • All virtual servers
  • All features and settings
  • to SSH server
    File on server: /home/backupname/backup/INCR/%Y-%m-%d-%H%M
  • User & Pass set
  • Delete old backups after 35 days
  • Additional destination options: Do strftime-style time substitutions, Transfer each virtual server after it is backed up
  • Backup format: one file per server, create destination directory
  • Action on error: Continue
  • Backup level: Incremental
  • E-mail backup report to: my e-mail
  • Scheduled Backup Time: Complex schedule: At cron time 0,30 * * * *
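For reference, the %Y-%m-%d-%H%M part of both destination paths is a strftime pattern; with the strftime-style substitution option enabled, Virtualmin expands it at backup time much like date(1) does. A minimal sketch of the equivalent expansion (the path is the one from the schedules above, shown for illustration only):

```shell
# Hypothetical expansion of the destination path's strftime pattern using
# date(1); Virtualmin performs the equivalent substitution internally.
DEST="/home/backupname/backup/$(date +%Y-%m-%d-%H%M)"
echo "$DEST"
```

So each run lands in its own timestamped directory, which is also what the 35-day purge operates on.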

The output of the mount command:

[root@turing remi]# mount
/dev/md4 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/md0 on /boot type ext4 (rw)
/dev/md1 on /data type ext4 (rw)
/dev/md2 on /secure type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

Output of df -h:

Filesystem      Size  Used Avail Use% Mounted on
/dev/md4        3.6T  7.1G  3.4T   1% /
tmpfs            16G  4.0K   16G   1% /dev/shm
/dev/md0        496M   51M  420M  11% /boot
/dev/md1        3.6T  1.3T  2.2T  38% /data
/dev/md2        496M   11M  460M   3% /secure

Error message I get by mail:

Uploading archive to SSH server XX.XX.XX.XX …
… upload failed! sshbackup@XX.XX.XX.XX’s password:
backup.be.tar.gz 0% 0 0.0KB/s --:-- ETA
backup.be.tar.gz 100% 22KB 21.8KB/s 00:00
Write failed: Broken pipe
lost connection

or:

Uploading archive to SSH server XX.XX.XX.XX …
… upload failed! sshbackup@XX.XX.XX.XX’s password:
286162_31137_28_backup.pl 0% 0 0.0KB/s --:-- ETA
286162_31137_28_backup.pl 100% 6060 5.9KB/s 00:00
Write failed: Broken pipe
lost connection

… completed in 2 hours, 0:48 minutes

Let me also note that both links have at least 100 Mbit/s uplinks (the SSH server has 120 Mbit/s, the production server 100 Mbit/s).

I hope this is of use to somebody and helps solve the problem.

I don’t ‘really’ have a problem with this, since backups run every 30 minutes, but as you can see, the last failed one took 2:48h to complete. That might create a problem for me …

This has now become an everyday occurrence. :frowning:

Do you have access to the SSH server logs from the remote system? It looks like the connection is being killed, and the logs might explain why.

Thanks, but it’s a whole lotta nothing:

Apr 21 00:54:13 localhost sshd[22882]: Accepted password for backup from 192.168.1.11 port 33354 ssh2
Apr 21 00:54:13 localhost sshd[22882]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:15 localhost sshd[22882]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:17 localhost sshd[22902]: Accepted password for backup from 192.168.1.11 port 33355 ssh2
Apr 21 00:54:17 localhost sshd[22902]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:18 localhost sshd[22902]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:20 localhost sshd[22922]: Accepted password for backup from 192.168.1.11 port 33356 ssh2
Apr 21 00:54:20 localhost sshd[22922]: pam_unix(sshd:session): session opened for user backup by (uid=0)
Apr 21 00:54:21 localhost sshd[22922]: pam_unix(sshd:session): session closed for user backup
Apr 21 00:54:25 localhost sshd[22942]: Accepted password for backup from 192.168.1.11 port 33357 ssh2
Apr 21 00:54:25 localhost sshd[22942]: pam_unix(sshd:session): session opened for user backup by (uid=0)

Does the remote user perhaps have a restricted shell like scponly that prevents execution of commands like mkdir?

Nope, just a regular bash shell for everyone.

Since this problem seems to be limited to a few users, it would be useful if I could get remote access to your system to see what is really going wrong. Email me directly at jcameron@virtualmin.com if that is possible.

Hi, I also have this problem. I’m using GPL Webmin / Virtualmin 1.782.

I get an email after 2 days and 10 hours with this error:

gzip: stdout: Broken pipe
/bin/tar: -: Wrote only 4096 of 10240 bytes
/bin/tar: Error is not recoverable: exiting now

Can anyone please help fix this?

Just another quick update: I realised that I DO have automatic spin-down enabled on my backup server after 45 minutes of non-use.

It’s a long shot, since I know other users have the same error, so this probably isn’t the cause, but I’m running another backup now with auto-spindown turned off just to see whether it is the cause of the issue. If it is causing the problem, is there some kind of “keep-alive” mechanism that Webmin/Virtualmin could use while backing up, to keep the backup drive mounted and powered up?
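For what it’s worth, OpenSSH itself has client-side keepalives that are independent of Webmin/Virtualmin; if an idle link were being dropped, these generic settings on the machine running the backups would keep the TCP connection alive. A sketch (the Host alias is hypothetical, the options are standard OpenSSH client settings):

```
# ~/.ssh/config (standard OpenSSH client options, not a Virtualmin setting)
Host backup-host                 # hypothetical alias for the backup server
    ServerAliveInterval 30       # send a keepalive probe every 30 seconds
    ServerAliveCountMax 4        # give up after 4 unanswered probes
```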

Hi, further to this: turning off the spin-downs on my backup box hasn’t fixed it. I now get this error:

Uploading archive to SSH server 192.168.0.2 …
… upload failed! root@192.168.0.2’s password:
/home/webmintemp/426161_12111_39_backup.pl: No such file or directory

… completed in 2 days, 2 hours, 35:18 minutes

Can anyone please help with a fix?

I now have the fix, as per help from a Webmin bug request. Go to Webmin Configuration, Advanced Options, and set ‘Maximum Age of Temporary Files’ to a larger number of days, such as 30. The backups now complete, because very large backups are no longer getting cleaned up before they’ve finished!
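To illustrate why raising that setting helps, here is a minimal sketch, assuming the temp-file cleaner behaves like a find-based sweep by modification time (the directory and file names below are stand-ins, not Webmin’s real paths):

```shell
# Sketch of the failure mode: a cleaner that removes files older than a
# maximum age will also remove a backup working file that is still in use
# if the backup takes longer than that age.
WORKDIR=$(mktemp -d)

# Simulate a backup working file that has been streaming for 3 days
touch -d '3 days ago' "$WORKDIR/backup_in_progress.tar.gz"

# Simulate a cleaner with a 1-day maximum age: the still-in-use file is swept
find "$WORKDIR" -type f -mtime +1 -delete

# The upload step would now fail with "No such file or directory"
```

That matches the “/home/webmintemp/…backup.pl: No such file or directory” error above: the multi-day backup outlived the temporary files it depended on.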

Thanks and I hope this helps.