A few days ago, I learned a very important lesson about filesystems and snapshots. I learned that a complete copy is not always a Good Thing™.

I help manage a server for our local Linux Users Group. We have about 250 users on the system, and all of our system administration is done by volunteers.

A few months ago, I made a complete backup of our /home partition using the guidelines that have been told to me by Smart People™:

  • make a snapshot volume of /home (called home-snap)
  • make a new empty volume (called home-backup)
  • use ‘dd‘ to copy from home-snap to home-backup
  • remove the home-snap snapshot volume

All was fine, until a few months later, when we decided to reboot.

When the machine rebooted, it mounted the WRONG copy of /home. It looked in /etc/fstab to see what to mount, read the UUID, and started looking for that filesystem among the logical volumes.

Here’s a list of the available filesystems and their UUID’s.

root@pilot:~# blkid
/dev/mapper/vg01-home: UUID="1a578e6f-772b-4892-86e3-1181aadda119" TYPE="ext3" SEC_TYPE="ext2"
/dev/mapper/vg01-home-backup: UUID="1a578e6f-772b-4892-86e3-1181aadda119" TYPE="ext3" SEC_TYPE="ext2"
/dev/mapper/vg01-swap: TYPE="swap" UUID="303f2743-da69-466b-a200-40a1a369fa1c"
/dev/mapper/vg01-u804: UUID="b5689a93-b7ad-4011-a0f9-ffaf2d68bf6f" TYPE="ext3"
/dev/sdb: UUID="Uh0TI1-pxD4-M1Pm-5kP3-zU1a-IRgm-bD0JAq" TYPE="lvm2pv"
/dev/sda: UUID="9oZhBo-3DPP-1eay-kgGM-fd06-yuJB-c2eCo7" TYPE="lvm2pv"
/dev/sdc1: UUID="5c15308e-a81b-4fd9-b2c2-7ef3fe39ce0b" SEC_TYPE="ext2" TYPE="ext3"
/dev/sdc2: TYPE="swap" UUID="08c55fa5-3379-4f6a-b798-4b8f3ead6790"
/dev/sdc3: UUID="5a544a7f-90ed-474c-b096-1b5929c83109" SEC_TYPE="ext2" TYPE="ext3"
root@pilot:~#

Notice anything goofy? Yes, the UUID for the home volume is the same as the UUID for the home-backup volume! Of course it is… I used ‘dd‘ to copy the entire volume!

So our machine booted up, looked for a filesystem whose UUID was ‘1a578e6f-772b-4892-86e3-1181aadda119’ and it mounted it on /home. Unfortunately, it found the home-backup volume before it found the real home volume, and so our 250 users took a step back in time for the evening.

All of the files in our home directories looked like they did back in May.

On the surface, this does not seem like such a Bad Thing™. But over the course of the next few hours, users started receiving email, and logging IRC chats, and doing all of the other things that users do. These new emails and log files were written to home-backup instead of home, and so now we were starting to mix old and new files.

This is a lot like the movie “Back to the Future”, when Marty’s mom tries to kiss him. Except the characters involved here are not as good-looking.

The fix was quick and painless. I simply generated a new UUID for the home-backup volume, and then rebooted. The magic command is simply:

 tune2fs -U random /dev/mapper/vg01-home-backup

But the cleanup would come later. If someone were interested in the emails or log files that were mistakenly written to the wrong volume (their “past life”), then they would need to look on that volume for “new” files. Pretty easy work.

find /mnt/home-backup/porter -mtime -7

This will show all files in my “backup” home directory that are less than a week old. Since the backup was made four months ago, I would expect all files in that directory to either be more than four months old, or just one day old. This command will show you the new files.

So I am revising the backup procedure as follows:

  • make a snapshot volume of /home (called home-snap)
  • make a new empty volume (called home-backup)
  • use ‘dd‘ to copy from home-snap to home-backup
  • remove the home-snap snapshot volume
  • change the UUID on home-backup ◄— new

In fact, now that we already have a base to work with, I might just use rsync to copy files instead of dd to copy the entire volume. This will leave the backup with its own UUID, and will avoid collisions like the one we saw.