rsync

Feb 25, 2012

Backing up Windows computers with dirvish

by Eric Smith — last modified Feb 26, 2012 07:15 PM

We use dirvish to back up our Linux servers and workstations. Until now, we have not been able to use it to back up Windows computers. That changes with an open source program we wrote.

dirvish

For many years we have used dirvish to back up our Linux servers efficiently. dirvish presents what appear to be full snapshot backups, but they are incremental in both the time needed to create them and the disk space needed to store them. dirvish relies heavily on rsync for efficient transfers and hard links for efficient storage. The backup server connects to the machine to be backed up (the target) over ssh for authentication and privacy, then runs rsync on the target to perform the incremental backup.
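
As a rough illustration of how hard links make every snapshot look full while storing only the changes, here is a minimal Python sketch. It is not dirvish's code (dirvish delegates this work to rsync, whose --link-dest option behaves along these lines), and the function name and directory layout are made up for the example.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, current):
    """Create `current` as a complete-looking copy of `source`.

    Files that are unchanged since the `previous` snapshot are hard-linked
    to it instead of being copied, so they use no extra disk space.
    Pass previous=None for the very first snapshot.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(current, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            new = os.path.join(current, rel, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the existing file data
            else:
                shutil.copy2(src, new)  # new or changed: store a fresh copy
```

Each snapshot directory can be browsed or restored on its own, yet an unchanged file exists only once on disk no matter how many snapshots refer to it.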

Problems with Windows and dirvish

It has always been problematic to back up Windows computers with dirvish. There are several problems:

  • Windows has no native ssh server.
  • Windows does not come with an rsync program.
  • Unlike Linux, Windows does not easily allow you to back up files that are open by another process.

Let's look at these one by one.

ssh server

There are commercial ssh servers available for Windows. An excellent one that I've used is VShell, by VanDyke Software. There are no doubt others. However, these are expensive if you want to install them on every target computer you wish to back up. VShell, for example, is $350 with 3 years of updates.

Another option is cygwin. cygwin is a free and open source product that provides many Unix-like utilities, including an ssh server. However, until recently the cygwin ssh server has had a well-known problem where it hangs when using rsync. Fortunately this problem has been fixed. cygwin version 1.7.11-1 does not hang like older versions did.

So now we have a free ssh server we can use.

rsync

Since we're already using cygwin's ssh server, cygwin's rsync program is the obvious choice. No problems here.

Backing up open files

The final problem we're left with is what to do with open files on the Windows target computers. For a long time I tried to skip all open files by explicitly listing them in the dirvish configuration file. But some files are never closed: files accessed by services such as databases, and by long-running programs such as email clients, would never get backed up. And there's no way to list every file that might be open. What if I'm editing a Word file during a scheduled backup? By default, dirvish treats an unreadable open file as a fatal error and marks the entire backup as unusable. While you can get around this by editing the dirvish source code, that is not a very elegant procedure. And you're still left with the fact that these open files are never backed up.

An obvious solution to this is to use the Windows Volume Shadow Copy Service, VSS. VSS allows you to take a read-only snapshot of an entire drive. Once you have that snapshot, it can be made available under another drive letter. And the best part is that every single file in the snapshot can be read. You'll never get an "in use" error when trying to read the files.

So now we have an internally consistent set of readable files to back up.

Putting it all together

We have all of the pieces we need to use dirvish to back up a Windows target computer. But how to put it together?

It would seem like we could use dirvish's pre-client and post-client hooks to create and tear down the VSS snapshot. These are commands that dirvish will run before and after it runs rsync on the target. Unfortunately that won't work, because while the pre-client hook can create the snapshot, it will be inaccessible once the pre-client hook ends and rsync is executed.

So what we need is a program that looks and works just like rsync, but keeps a VSS snapshot in place for the duration of the rsync run.

I thought of modifying the source to rsync, but that would create a permanent maintenance burden.

Enter tb-rsync-vss

So what I did was write a program to bring together all of the parts: tb-rsync-vss. This is an open source program, licensed under the Apache License, Version 2.0. tb-rsync-vss is a native Windows executable that creates a VSS snapshot, maps it to a drive letter (which is a cygwin rsync requirement), and then calls the real rsync program with modified parameters to actually perform the backup. When rsync is complete, tb-rsync-vss cleans up and exits.

As far as the dirvish server is concerned, it's just running a custom version of rsync. As far as rsync knows, it's just running against a new drive (maybe drive "x:" instead of drive "c:"). The specifics of the configuration are covered in the README file.
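
To make the idea of "modified parameters" concrete, here is a small Python sketch of the kind of rewriting a wrapper could do before handing its arguments to the real rsync. This is only an illustration, not how tb-rsync-vss is actually implemented (it is a native Windows program, and the README describes its real configuration). The function name is hypothetical, and the sketch assumes paths arrive in cygwin's default /cygdrive/<letter>/ form.

```python
def remap_drive(args, source_drive="c", snapshot_drive="x"):
    """Return a copy of the rsync arguments with paths on the source drive
    redirected to the drive letter where the snapshot is mapped.

    Assumes cygwin's default /cygdrive/<letter>/ path prefix; every other
    argument (options, ".", and so on) is passed through untouched.
    """
    old = f"/cygdrive/{source_drive}/"
    new = f"/cygdrive/{snapshot_drive}/"
    return [new + arg[len(old):] if arg.startswith(old) else arg for arg in args]
```

For example, a request to read /cygdrive/c/Users/ would be turned into a request to read /cygdrive/x/Users/, while all of the options are left alone.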

On the tb-rsync-vss bitbucket page I've provided the source code, a Visual Studio 2010 project file, and pre-built .msi files for the 32- and 64-bit versions of tb-rsync-vss. True Blade is providing these back to the dirvish community as a thank-you for the many years we've used and benefited from dirvish and so many other open source products.

Jan 29, 2010

Using rsync with an Amazon EC2 Fedora 8 image

by Eric Smith — last modified Jan 30, 2010 07:10 AM

Amazon provides a number of Fedora 8 images. Unfortunately the provided kernels cause a problem with rsync. Find out how to resolve the problem.

We've recently started investigating Amazon EC2 for some of our computing needs. So far our progress has been excellent. I've been focusing on using the Amazon-supplied Fedora 8 (F8) images, in particular ami-48aa4921, although the problem I describe here applies to all of the F8 AMIs that Amazon provides.

One significant roadblock has been a problem with rsync. In particular, we use the excellent dirvish for our online backups. Unfortunately, Amazon uses the 2.6.21 kernel in its F8 images, and this version does not support the lutimes system call, which isn't available until the 2.6.22 kernels. The version of rsync that comes with F8 uses lutimes to set the modification time on directories. For more information on the issue with rsync and lutimes, see the rsync bug entry.
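
A quick way to tell whether a given machine is affected is to compare its kernel release against 2.6.22. The following Python sketch does that. The function name is made up for illustration, and the 2.6.22 threshold is simply the one stated above.

```python
import re


def supports_lutimes(release):
    """Return True if a kernel release string is 2.6.22 or newer."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing third component (as in "3.2") is treated as 0.
    version = tuple(int(part) if part else 0 for part in match.groups())
    return version >= (2, 6, 22)
```

On a running system, the release string can be obtained with platform.release() from the standard library and passed to this function.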

The symptom is errors in the dirvish rsync_error log files of the form:

rsync: failed to set times on "<directory-name>": Function not implemented (38)

Dirvish sees these as fatal errors and marks the images as failed. This prevents dirvish from performing its incremental backups.
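
To check whether a failed image was caused by this particular problem, the error log can be scanned for that message. Here is a minimal Python sketch. The function name is hypothetical, and it assumes each error appears on its own line exactly in the form shown above.

```python
import re

# Matches the error line shown above and captures the quoted directory name.
LUTIMES_ERROR = re.compile(
    r'^rsync: failed to set times on "(?P<path>.*)": '
    r'Function not implemented \(38\)$'
)


def directories_with_time_errors(lines):
    """Return the directory names from all lutimes-style errors in the log."""
    paths = []
    for line in lines:
        match = LUTIMES_ERROR.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths
```

Passing an open rsync_error file to this function returns the list of affected directories; an empty list suggests the failure has some other cause.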

Because rsync does not have a runtime switch to ignore lutimes, the easiest way to solve this is to produce a version of rsync that doesn't use the call at all. Unfortunately rsync does not have an autoconf switch to turn off lutimes, so I had to patch configure.in and rebuild. The change is simple. Here's the diff I use:

--- rsync-2.6.9/configure.in.orig        2010-01-29 15:37:35.000000000 -0500
+++ rsync-2.6.9/configure.in    2010-01-29 15:38:07.000000000 -0500
@@ -528,7 +528,7 @@
 AC_FUNC_UTIME_NULL
 AC_FUNC_ALLOCA
 AC_CHECK_FUNCS(waitpid wait4 getcwd strdup chown chmod lchmod mknod mkfifo \
-    fchmod fstat ftruncate strchr readlink link utime utimes lutimes strftime \
+    fchmod fstat ftruncate strchr readlink link utime utimes strftime \
     memmove lchown vsnprintf snprintf vasprintf asprintf setsid glob strpbrk \
     strlcat strlcpy strtol mallinfo getgroups setgroups geteuid getegid \
     setlocale setmode open64 lseek64 mkstemp64 mtrace va_copy __va_copy \

I created a new RPM for rsync. Here's the diff for the .spec file:

--- rsync.spec.orig     2008-04-09 10:36:56.000000000 -0400
+++ rsync.spec  2010-01-29 22:21:13.000000000 -0500
@@ -1,7 +1,7 @@
 Summary: A program for synchronizing files over a network.
 Name: rsync
 Version: 2.6.9
-Release: 5%{?dist}
+Release: 5%{?dist}.trueblade.0
 Group: Applications/Internet
 # TAG: for pre versions use
 #Source:       ftp://rsync.samba.org/pub/rsync/rsync-%{version}pre1.tar.gz
@@ -10,6 +10,7 @@
 Patch1: rsync-2.6.9-acl-xattr-delete-bug.patch
 Patch2: rsync-2.6.9-hlink-segv.patch
 Patch3: rsync-3.0.1-xattr-alloc.diff
+Patch4: rsync-2.6.9-disable-lutimes.patch
 BuildRequires: libacl-devel, libattr-devel, autoconf, make, gcc, popt-devel
 Prefix: %{_prefix}
 BuildRoot: /var/tmp/%{name}-root
@@ -33,6 +34,7 @@
 %patch1 -p1 -b .acl_xattrs_bug
 %patch2 -p1 -b .hlink_segv
 %patch3 -p1 -b .xattr-alloc
+%patch4 -p1 -b .lutimes
 
 %build
 rm -fr autom4te.cache
@@ -62,6 +64,10 @@
 %{_mandir}/man5/rsyncd.conf.5*
 
 %changelog
+* Fri Jan 29 2010 Eric V. Smith <eric@trueblade.com> 2.6.9-5.fc8.trueblade.0
+- Added patch4 to remove lutimes, since the EC2 kernel in ami-48aa4921 does
+  not support it.
+
 * Tue Apr  8 2008 Simo Sorce <ssorce@redhat.com> 2.6.9-5.fc8
 - Security release: http://rsync.samba.org/security.html#s3_0_2 

Once I had the new RPM, I signed it and added it to our local RPM repository. Because its release (5.fc8.trueblade.0) is newer than the one supplied with F8 (5.fc8), it will automatically be picked up by "yum update".

The only remaining complication is that the AMI I'm using supplies its own copy of rsync in /usr/local/bin, in addition to the one supplied by default in /usr/bin. I'm not sure why Amazon did this, because the /usr/local/bin version has the same problem as the /usr/bin one. The /usr/local/bin version does not come from an RPM, so I just delete it using our automated server configuration tool. The files to delete are:

/usr/local/bin/rsync
/usr/local/man/man1/rsync.1
/usr/local/man/man5/rsyncd.conf.5
/usr/local/share/man/man1/rsync.1
/usr/local/share/man/man5/rsyncd.conf.5
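
We do this with our configuration tool, but the same cleanup can be sketched in a few lines of Python. The function name and the root parameter are made up for this example; root exists only so the function can be tried against a scratch directory before being pointed at a real system.

```python
import os

# The files listed above, all installed outside of RPM.
STALE_FILES = [
    "/usr/local/bin/rsync",
    "/usr/local/man/man1/rsync.1",
    "/usr/local/man/man5/rsyncd.conf.5",
    "/usr/local/share/man/man1/rsync.1",
    "/usr/local/share/man/man5/rsyncd.conf.5",
]


def remove_stale_rsync(root="/"):
    """Delete whichever of the stale files exist under root; return their paths."""
    removed = []
    for path in STALE_FILES:
        full = os.path.join(root, path.lstrip("/"))
        if os.path.lexists(full):  # also catches dangling symlinks
            os.remove(full)
            removed.append(full)
    return removed
```

The function is safe to run repeatedly: files that are already gone are skipped, so a second run returns an empty list.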

Once the new RPM updates rsync and the unneeded /usr/local files are deleted, rsync and dirvish are again working correctly.

Ideally this problem would be solved by upgrading the kernel to 2.6.22 or newer, but upgrading the kernel is a non-trivial task with AWS. I'd rather let Amazon handle that issue and instead focus on using the provided images. This way we can more easily upgrade when Amazon produces newer AMIs.

For a number of other approaches (but with fewer specifics), see this thread.