Creating Xen Redundant Virtual Machines with Backup Procedures

It is a good idea to back up the whole virtual machine to a separate physical machine for redundancy. 99% uptime and full redundancy can be achieved with on-the-fly mirroring, i.e. network RAID 1, although hardware and network performance will determine whether this method is workable. Several software packages can do this; many Linux administrators use DRBD together with Heartbeat.

An alternative approach is to do a full nightly backup plus incremental hourly backups during the day. This is less write-intensive, but up to an hour's worth of work can be lost if the live server goes down. Still, it is a decent solution when there are hardware constraints, and it is the method I will focus on.

Here is the idea. Imagine we have two physical machines, machine1 and machine2. lvphp4 is a logical volume holding a PHP 4 virtual machine that runs on machine2, and a backup copy of that volume lives on machine1. On machine2, write a cron script that runs every hour. The script mounts the lvphp4 data partition (say partition 2) locally, sshes into machine1 to mount the same partition of the backup volume there, and then syncs the data over to machine1. Once done, it unmounts everything on both machines and, if you want, sends an email to the administrator. Assuming the base OS does not change during the day, we sync the data partition only.

# map and mount the data partition locally (on machine2)
kpartx -a /dev/vg/lvphp4
mount /dev/mapper/lvphp4p2 /mnt/lvphp4p2
# map and mount the matching partition of the backup volume on machine1
ssh root@machine1 "kpartx -a /dev/vg/lvphp4; mount /dev/mapper/lvphp4p2 /mnt/lvphp4p2"
# copy the data across; --delete removes files on machine1 that no longer exist here
rsync -av -e ssh --delete --stats --progress /mnt/lvphp4p2 root@machine1:/mnt/
# now unmount and clean up everything, remote side first
ssh root@machine1 "umount /mnt/lvphp4p2; kpartx -d /dev/vg/lvphp4"
umount /mnt/lvphp4p2
kpartx -d /dev/vg/lvphp4
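
For an hourly cron job it helps if the cleanup still runs when an earlier step fails. Here is a minimal sketch of the same steps wrapped into a script. The variable names (LV, VG, PART, PEER, DRY_RUN) and the run helper are my own illustration, not part of the setup above, and DRY_RUN defaults to 1 so that the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: LV, VG, PART, PEER and DRY_RUN are illustrative names.
LV=${LV:-lvphp4}
VG=${VG:-vg}
PART=${PART:-2}
PEER=${PEER:-root@machine1}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

MAPPED=/dev/mapper/${LV}p${PART}
MNT=/mnt/${LV}p${PART}

# run CMD...: print the command in dry-run mode, execute it otherwise
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Undo the mounts and partition mappings on both machines.
cleanup() {
    run ssh "$PEER" "umount $MNT; kpartx -d /dev/$VG/$LV"
    run umount "$MNT"
    run kpartx -d "/dev/$VG/$LV"
}

# Mount both sides, rsync the data, then always clean up.
sync_lv() {
    run kpartx -a "/dev/$VG/$LV" &&
    run mount "$MAPPED" "$MNT" &&
    run ssh "$PEER" "kpartx -a /dev/$VG/$LV; mount $MAPPED $MNT" &&
    run rsync -a -e ssh --delete --stats "$MNT" "$PEER:/mnt/"
    status=$?
    cleanup
    return $status
}

sync_lv
```

Saved as, say, /usr/local/sbin/sync-lvphp4.sh (a hypothetical path), it could be scheduled with a root crontab line such as `0 * * * * DRY_RUN=0 /usr/local/sbin/sync-lvphp4.sh`. Cron mails any output of the job to the address in MAILTO, which covers the optional email to the administrator.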

If the lvphp4 virtual machine on machine2 fails for whatever reason, we can bring up the backup on machine1 very quickly by sshing into machine1 and running

xm create /etc/xen/lvphp4

I believe this step could be integrated into monitoring software (Nagios, for example) to automate the failover.
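
Short of a full monitoring setup, the check could be as simple as a script on machine1 that probes the guest and starts the backup after several failed probes in a row. This is only a rough sketch: the guest address, the PROBE and START commands, the DELAY between probes and the function name are all assumptions of mine, and START only echoes the command until you remove the echo.

```shell
#!/bin/sh
# Sketch only: GUEST_IP, PROBE, START and DELAY are illustrative assumptions.
GUEST_IP=${GUEST_IP:-192.0.2.10}                  # hypothetical address of the lvphp4 guest
PROBE=${PROBE:-"ping -c 1 -W 2 $GUEST_IP"}        # must succeed while the guest is up
START=${START:-"echo xm create /etc/xen/lvphp4"}  # drop the echo to really start the backup

# failover_if_down TRIES: start the backup only after TRIES consecutive failed probes
failover_if_down() {
    tries=$1
    i=0
    while [ "$i" -lt "$tries" ]; do
        if $PROBE >/dev/null 2>&1; then
            echo "guest is up"
            return 0
        fi
        i=$((i + 1))
        sleep "${DELAY:-5}"
    done
    echo "guest is down after $tries probes, starting backup"
    $START
}
```

Calling `failover_if_down 3` from cron on machine1 would then start the backup only after three failed probes, which reduces the chance of starting it because of a brief network blip.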

The most likely reason for lvphp4 failing on machine2 is a hardware failure in machine2 itself. Do not configure lvphp4 to autostart, so that when machine2 boots up again lvphp4 does not start by itself. Once machine2 is repaired, choose a night to transfer the backup from machine1 back to machine2. On machine1:

# shut the guest down by domain name (assumed to be lvphp4) and wait (-w) until it has stopped
xm shutdown -w lvphp4
dd if=/dev/vg/lvphp4 | ssh root@machine2 "dd of=/dev/vg/lvphp4"
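
Before starting lvphp4 on machine2 again, it may be worth checking that the copy arrived intact by comparing checksums of the two volumes. A minimal sketch follows, assuming both logical volumes are exactly the same size and sha256sum is installed on both machines. The helper names digest_of and same_image are hypothetical.

```shell
#!/bin/sh
# Sketch only: digest_of and same_image are hypothetical helpers.

# digest_of PATH: print just the SHA-256 digest of a file or block device
digest_of() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# same_image DIGEST_A DIGEST_B: succeed only if both digests are non-empty and equal
same_image() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Intended use on machine1, while the guest is still shut down:
#   local_sum=$(digest_of /dev/vg/lvphp4)
#   remote_sum=$(ssh root@machine2 "sha256sum /dev/vg/lvphp4" | cut -d ' ' -f 1)
#   same_image "$local_sum" "$remote_sum" && echo "copy verified"
```

Reading a whole volume twice takes a while, so this fits best in the same maintenance window as the dd transfer.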

I am using this method in a live environment and it works perfectly. rsync does its job really well.
