Linux: Backup Strategies

From ReceptiveIT

What is a Backup?

So, what is a backup? In simple terms, it is a complete, separate copy of all your important data. In practice, a backup is only one string to your bow of risk mitigation; other measures, such as RAID, are outside the scope of this article.

All of us know that we need to do backups, but because we don't need them just yet, they are often overlooked until it is too late.

In most real-world scenarios, a backup runs once every 24 hours during the work week. This means that if the computer happens to catch fire, we would lose, at most, 24 hours' worth of work. That, for most people, is an acceptable risk.

Where do you keep your backups? Some people will buy a separate hard disk to perform the backups, but if the fire happens to spread and burn down the whole building, then you have lost your data and your backup in one hit. In most real-world scenarios, there is an off-site component to the backups. Either a staff member takes some tapes home or, as in this article, the data is automatically synchronised over the internet to another location.

In this example, we will have the following backup schedule:

  • Hourly, we will synchronise our important data to an off-site computer
  • Daily, we will back up all our data to a separate hard drive that is permanently attached, but not part of the same RAID array
  • Weekly, we will archive a single backup to an archive server

Hourly

The Problem

  • We need to create a backup script that will take a list of important directories and copy them to an off-site server.
  • We need to make sure that the script isn't running more than once on the same machine
  • We need to make sure that people know when the script has succeeded, or failed.
  • Some directories cannot be backed up without either stopping the services, or taking a snapshot of the filesystem.

The Solution

  • I will create a BASH script that will automate the task of backing up, but keep the script generic with all configuration in separate configuration files.
  • I will make sure that the script enters an endless loop, and the script will be initiated with Upstart (You could also use inittab if your distribution does not have Upstart)
  • I will build email notifications into the script
  • I will use LVM to take a snapshot of the path in question before backing up the data, so that the data doesn't change halfway through the backup process
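
The script in this article avoids overlapping runs by being a single endless loop that Upstart keeps alive. For comparison, here is a minimal sketch of a different and commonly used guard, a lock file taken with flock(1) from util-linux. The lock file path and the acquire_lock function name are assumptions made for this example and are not used anywhere else in the article.

```shell
#!/bin/bash
# Hypothetical lock file location; the article's script does not use one.
LOCK_FILE="${LOCK_FILE:-/tmp/syncserver.lock}"

# Open the lock file on file descriptor 9 and try to take an exclusive lock
# without waiting. The lock is released when the script exits.
acquire_lock() {
  exec 9>"$LOCK_FILE" && flock -n 9
}

if acquire_lock; then
  echo "lock acquired, safe to start syncing"
else
  echo "another instance is already running" >&2
  exit 1
fi
```

A second copy started while the first is still running fails to get the lock and exits straight away, which is the behaviour the endless loop is meant to guarantee.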

Script Dependencies

The Disaster Recovery Server

Directories

  • I will assume that you have a directory called /netsync and it has sufficient space to receive a backup

Packages

apt-get install rsync

Configuration Files

vi /etc/rsyncd.conf
 
# GLOBAL OPTIONS

#motd file=/etc/motd
#log file=/var/log/rsyncd
# for pid file, don't use /var/run/rsync.pid unless you're not going to run
# rsync out of the init.d script. The /var/run/rsyncd.pid below is OK.
pid file=/var/run/rsyncd.pid
#syslog facility=daemon
#socket options=

# Netsync module
[server-backup]
comment = Location for off-site backup
path = /netsync/server-backup
hosts allow = 192.168.10.10
read only = no
list = yes
uid=0
gid=0
dont compress = *.gz *.tgz *.zip *.z *.rpm *.deb *.iso *.bz2 *.tbz

vi /etc/default/rsync

Set RSYNC_ENABLE to true, then start the daemon:

/etc/init.d/rsync start

Hooray! You now have a listening rsync server.
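
Before moving on, it can help to confirm which modules the daemon will actually offer. The sketch below simply reads an rsyncd.conf-style file and prints each module name with its path. list_rsync_modules is a hypothetical helper written for this article, not something that ships with rsync.

```shell
#!/bin/bash
# list_rsync_modules FILE
# Prints "module path" for every [module] section in FILE that sets a path.
list_rsync_modules() {
  awk '
    /^[ \t]*\[.*\][ \t]*$/ {            # a "[module]" header line
      gsub(/^[ \t]*\[|\][ \t]*$/, "")   # strip the brackets
      module = $0
      next
    }
    module != "" && /^[ \t]*path[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")          # keep only the value after "="
      print module, $0
    }
  ' "$1"
}

# Usage (file name as used in this article):
#   list_rsync_modules /etc/rsyncd.conf
```

From the office server, running rsync with just the host name followed by two colons (rsync offsite.domain.com::) asks the daemon itself for its module list, which works here because the module sets list = yes.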

The Office Server

Assumptions

  • Linux distribution is Ubuntu Server 10.04 64-bit
  • Filesystem is XFS
  • Filesystems are stored in LVM containers
  • There is at least 15G of unallocated space in the LVM volume group
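
The last assumption is easy to check up front. The following is a rough sketch: vg_has_room is a hypothetical helper, and vg0 in the usage comment is a placeholder for your own volume group name.

```shell
#!/bin/bash
# vg_has_room FREE_GIB NEEDED_GIB
# Succeeds when FREE_GIB >= NEEDED_GIB. awk does the comparison because
# shell arithmetic is integer-only and vgs reports fractional sizes.
vg_has_room() {
  awk -v free="$1" -v need="$2" 'BEGIN { exit !(free + 0 >= need + 0) }'
}

# Usage on a real system ("vg0" is a placeholder volume group name):
#   free=$(vgs --noheadings --nosuffix --units g -o vg_free vg0)
#   vg_has_room "$free" 15 || echo "less than 15G free in vg0" >&2
```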

Directories

mkdir -p /etc/backup
mkdir -p /var/log/backup
mkdir -p /mnt/sync-snapshot

Packages

apt-get install rsync bsd-mailx gawk

Configuration Files

vi /etc/backup/sync.conf

SYNC_NOTIFY="[email protected]"
SYNC_SOURCES="/etc/ /var/lib/cyrus/ /var/spool/cyrus/ /usr/local/bin/" 
SYNC_TARGETS="offsite.domain.com"
SYNC_MODULE="server-backup"
SYNC_FREQUENCY="3600"
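
A typo in this file only shows up once the sync runs, so a quick sanity check can save a wasted cycle. This is a minimal sketch, assuming that SYNC_SOURCES, SYNC_TARGETS and SYNC_MODULE are the settings you always want present. check_sync_conf is a hypothetical helper and is not called by the sync script.

```shell
#!/bin/bash
# check_sync_conf FILE
# Sources FILE in a subshell, so the caller's variables stay untouched, and
# reports settings that are missing, empty or not usable.
check_sync_conf() (
  unset SYNC_SOURCES SYNC_TARGETS SYNC_MODULE SYNC_FREQUENCY
  . "$1" || exit 1
  status=0
  for name in SYNC_SOURCES SYNC_TARGETS SYNC_MODULE; do
    eval "value=\${$name}"
    if [ -z "$value" ]; then
      echo "missing or empty: $name" >&2
      status=1
    fi
  done
  # SYNC_FREQUENCY is handed to sleep, so it should be a number of seconds.
  case "${SYNC_FREQUENCY:-3600}" in
    *[!0-9]*)
      echo "SYNC_FREQUENCY must be a whole number of seconds" >&2
      status=1
      ;;
  esac
  exit "$status"
)

# Usage:
#   check_sync_conf /etc/backup/sync.conf && echo "sync.conf looks sane"
```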

Snapshot selection

Mail servers like Cyrus have files that change all the time, so we need to snapshot the Cyrus directories before making a backup. An empty .sync-snapshot file in a directory tells the script to do so:

touch /var/lib/cyrus/.sync-snapshot
touch /var/spool/cyrus/.sync-snapshot
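
To see at a glance which source directories are currently flagged, something like the following sketch can be used. sources_needing_snapshot is a hypothetical helper; the marker name matches the SNAPSHOT_MAGIC default in the script.

```shell
#!/bin/bash
SNAPSHOT_MAGIC=".sync-snapshot"   # same marker name as the script's default

# sources_needing_snapshot DIR...
# Prints each directory that contains the marker file.
sources_needing_snapshot() {
  for dir in "$@"; do
    if [ -e "${dir%/}/${SNAPSHOT_MAGIC}" ]; then
      echo "$dir"
    fi
  done
}

# Usage with the list from sync.conf (unquoted on purpose, so that the
# space-separated list splits into one argument per directory):
#   . /etc/backup/sync.conf
#   sources_needing_snapshot $SYNC_SOURCES
```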

The Script

First create the Upstart launcher script

vi /etc/init/sync.conf
# server sync

start on mounted

stop on runlevel [01456]

respawn
exec /usr/local/bin/syncserver.sh

Then create the sync script itself

vi /usr/local/bin/syncserver.sh
#!/bin/bash
# Hotsync Script
#

defaults () {
 # Defaults
 EXIT_CODE=""
 SNAPSHOT_MAGIC=".sync-snapshot"
 SNAPSHOT_PATH="/mnt/sync-snapshot/"
 SNAPSHOT_SIZE="15G"
 SNAPSHOT_MOUNTOPTIONS="ro,nouuid"
 SNAPSHOT_LVNAME="syncsnap"
 SYNC_USERNAME=""
 SYNC_PASSWORD_FILE=""
 SYNC_MODULE=""
 SYNC_CONFDIR="/etc/backup"
 SYNC_CONF_FILE="sync.conf"
 SYNC_LOGDIR="/var/log/backup"
 SYNC_LOG="/var/log/backup/sync.log"
 SYNC_NOTIFY="root"
 SYNC_HOSTNAME=`hostname`
 SYNC_FREQUENCY=3600
 SYNC_INCNUM=0
 SYNC_MAXINC=24
 SYNC_TIMEOUT=1800
 SYNC_SUCCESS_FILE="/var/log/backup/.sync-success"
}

function rsync_dereference () {
 ERROR_CODE=""

 case $2 in
  0)
   ERROR_CODE="Success. Move along, nothing to see here!"
   ;;
  1)
   ERROR_CODE="Syntax or usage error"
   ;;
  2)
   ERROR_CODE="Protocol incompatibility"
   ;;
  3)
   ERROR_CODE="Errors selecting input/output files, dirs"
   ;;
  4)
   ERROR_CODE="Requested action not supported: an attempt was made to manipulate 64-bit files on a platform that can- not support them; or an option was specified that is supported by the client and not by the server."
   ;;
  5)
   ERROR_CODE="Error starting client-server protocol"
   ;;
  6)
   ERROR_CODE="Daemon unable to append to log-file"
   ;;
  10)
   ERROR_CODE="Error in socket I/O"
   ;;
  11)
   ERROR_CODE="Error in file I/O"
   ;;
  12)
   ERROR_CODE="Error in rsync protocol data stream"
   ;;
  13)
   ERROR_CODE="Errors with program diagnostics"
   ;;
  14)
   ERROR_CODE="Error in IPC code"
   ;;
  20)
   ERROR_CODE="Received SIGUSR1 or SIGINT"
   ;;
  21)
   ERROR_CODE="Some error returned by waitpid()"
   ;;
  22)
   ERROR_CODE="Error allocating core memory buffers"
   ;;
  23)
   ERROR_CODE="Partial transfer due to error"
   ;;
  24)
   ERROR_CODE="Partial transfer due to vanished source files"
   ;;
  25)
   ERROR_CODE="The --max-delete limit stopped deletions"
   ;;
  30)
   ERROR_CODE="Timeout in data send/receive"
   ;;
  *)
   ERROR_CODE="Unknown error. Really, I don't know!"
   ;;
 esac

 ERROR_CODE="(${2}) ${ERROR_CODE}"
 eval "$1=\"${ERROR_CODE}\""
}

# Populate default variables
defaults

# Datestamp of when the sync process started
date >> $SYNC_LOG

# Get configuration
if [ ! -s ${SYNC_CONFDIR}/${SYNC_CONF_FILE} ]
then
        echo "Sync configuration does not exist!" >> $SYNC_LOG
        cat $SYNC_LOG | mail -s"BACKUP FAILED" $SYNC_NOTIFY
        exit 1
fi

# Source the backup configuration
. ${SYNC_CONFDIR}/${SYNC_CONF_FILE}

# Go Into a loop....makes sure sync doesn't overlap.
while true
do

CURRENTHOUR=`date +"%H"`

SYNCERR=0

DATE=`date +"%F %R"`
EPOCHDATE=`date +%s`
EPOCHDATE=$(($EPOCHDATE - $(($EPOCHDATE % 86400))))
TODAY=`echo | gawk 'BEGIN {print strftime("%F", ARGV[1])}' $EPOCHDATE`
EPOCHLAST=$(($EPOCHDATE - 86400))
YESTERDAY=`echo | gawk 'BEGIN {print strftime("%F", ARGV[1])}' $EPOCHLAST`
DAYOFWEEK=`date +%u`
THISSTAMP=`date +"%Y%m%d%H%M"`
EPOCHNOW=`date +%s`

SYNC_INCNUM=$(($SYNC_INCNUM + 1))
if [ $SYNC_INCNUM -gt $SYNC_MAXINC ]
then
  SYNC_INCNUM=0
fi

LOG="$SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-${SYNC_INCNUM}.log"
FULL_LOG="$SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-${SYNC_INCNUM}-full.log"
echo "Full sync $DATE." | tee $LOG > $FULL_LOG

for SYNC_SOURCE in $SYNC_SOURCES
do
 # Check to see if the source path needs to have a snapshot taken
 if [ -e /${SYNC_SOURCE}/${SNAPSHOT_MAGIC} ]
 then
  echo "Creating snapshot for ${SYNC_SOURCE} before sync..." | tee -a $LOG >> $FULL_LOG
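  # Find mountpoint, walking up from the deepest directory. Default to "/",
  # which the loop below never tests, so that a value left over from a
  # previous source is not reused.
  SNAPSHOT_MOUNTPOINT="/"
  SNAPSHOT_DIRSPLIT=(`echo ${SYNC_SOURCE} | tr '/' ' '`)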

  for (( i=${#SNAPSHOT_DIRSPLIT[@]}+1; i>1; i-- ));
  do
   SNAPSHOT_TEST=`echo ${SYNC_SOURCE} | cut -d '/' -f1-${i}`
   mountpoint -q "${SNAPSHOT_TEST}"
   if [ $? -eq 0 ]
   then
     # Mountpoint found
     SNAPSHOT_MOUNTPOINT=${SNAPSHOT_TEST}
     break
   fi
  done

  # Find device for mountpoint
  SNAPSHOT_DMDEVICE=`df ${SNAPSHOT_MOUNTPOINT}  | awk 'NR==2 {print $1}'`
  SNAPSHOT_VGNAME=`echo ${SNAPSHOT_DMDEVICE:12} | cut -d '-' -f1`
  SNAPSHOT_SOURCELV=`echo ${SNAPSHOT_DMDEVICE:12} | cut -d '-' -f2`

  lvcreate -s -n ${SNAPSHOT_LVNAME} -L ${SNAPSHOT_SIZE} /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_SOURCELV}
  mount -o ${SNAPSHOT_MOUNTOPTIONS} /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_LVNAME} ${SNAPSHOT_PATH}
  SNAPSHOT=1
 else
  echo "Snapshot not required for ${SYNC_SOURCE} to sync..." | tee -a $LOG >> $FULL_LOG
  SNAPSHOT=0
 fi

 for SYNC_TARGET in $SYNC_TARGETS
 do
  # Reset working variables
  HOST_USERNAME=""
  HOST_PASSWORD_FILE=""
  HOST_MODULE=""
  RSYNC_USERNAME=""
  RSYNC_PASSWORD_FILE=""
  RSYNC_MODULE=""

  # Check to see if there is a custom config for this sync target
  if [ -e /${SYNC_CONFDIR}/${SYNC_TARGET}.conf ]
  then
   # Source the host specific backup configuration
   echo ${SYNC_CONFDIR}/${SYNC_TARGET}.conf
   . ${SYNC_CONFDIR}/${SYNC_TARGET}.conf

   # Populate host specific options
   if [ ${HOST_USERNAME} ]
   then
    RSYNC_USERNAME="${HOST_USERNAME}@"
   fi

   if [ ${HOST_PASSWORD_FILE} ]
   then
    RSYNC_PASSWORD_FILE="--password-file ${HOST_PASSWORD_FILE}"
   fi

   if [ ${HOST_MODULE} ]
   then
    RSYNC_MODULE="${HOST_MODULE}"
   fi
  else
   if [ ${SYNC_USERNAME} ]
   then
    RSYNC_USERNAME="${SYNC_USERNAME}@"
   else
    RSYNC_USERNAME=""
   fi

   if [ ${SYNC_PASSWORD_FILE} ]
   then
    RSYNC_PASSWORD_FILE="--password-file ${SYNC_PASSWORD_FILE}"
   else
    RSYNC_PASSWORD_FILE=""
   fi

   if [ ${SYNC_MODULE} ]
   then
    RSYNC_MODULE="${SYNC_MODULE}"
   else
    echo "Sync module does not exist!" >> $SYNC_LOG
    cat $SYNC_LOG | mail -s"BACKUP FAILED" $SYNC_NOTIFY
    exit 1
   fi
  fi

  echo -n "Syncing $SYNC_SOURCE to $SYNC_TARGET..." | tee -a $LOG >> $FULL_LOG

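  if [ $SNAPSHOT -eq 1 ]
  then
   # Read from the mounted snapshot. With -R, rsync only recreates the part
   # of the path after "/./" on the target. This path assumes the snapshot
   # is of the volume mounted at /; for a volume mounted elsewhere, the part
   # of SYNC_SOURCE above the mountpoint would need to be stripped.
   RSYNC_SOURCE=${SNAPSHOT_PATH}/./${SYNC_SOURCE}
  else
   RSYNC_SOURCE=${SYNC_SOURCE}
  fi

  # Perform Sync
  rsync -auR -v --timeout=$SYNC_TIMEOUT --delete ${RSYNC_PASSWORD_FILE} ${RSYNC_SOURCE} ${RSYNC_USERNAME}${SYNC_TARGET}::${RSYNC_MODULE}/${SYNC_HOSTNAME} >> $FULL_LOG 2>&1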

  # Catch errors
  EXIT_CODE=$?

  if [ ${EXIT_CODE} -ne 0 ]
  then
   # Dereference Rsync error code
   rsync_dereference EXIT_CODE ${EXIT_CODE}

   echo "ERROR ${EXIT_CODE}" | tee -a $LOG >> $FULL_LOG
   SYNCERR=1
  else
   echo "OK" | tee -a $LOG >> $FULL_LOG
  fi

 done

 # Unmount and remove the snapshot once every target has been synced
 if [ $SNAPSHOT -eq 1 ]
 then
  echo "Unmounting and removing snapshot..." | tee -a $LOG >> $FULL_LOG
  umount ${SNAPSHOT_PATH}
  lvremove -f /dev/${SNAPSHOT_VGNAME}/${SNAPSHOT_LVNAME}
 fi
done

if [ $SYNCERR -gt 0 ] 
then
        echo "Sync was not completed successfully." | tee -a $LOG >> $FULL_LOG
        cat $LOG | mail -s "$THISSTAMP Sync Failed $SYNC_INCNUM - ${SYNC_HOSTNAME}" $SYNC_NOTIFY
else
        echo "Sync SUCCESSFUL!" | tee -a $LOG >> $FULL_LOG
        touch $SYNC_SUCCESS_FILE
fi

if [ $SYNC_INCNUM -eq $SYNC_MAXINC ]
then
        cat $SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-?.log $SYNC_LOGDIR/${SYNC_HOSTNAME}-sync-??.log | mail -s "$THISSTAMP Sync digest $SYNC_INCNUM - ${SYNC_HOSTNAME}" $SYNC_NOTIFY
fi

sleep ${SYNC_FREQUENCY}
done
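
One part of the script worth a closer look is how it turns the device reported by df into a volume group and logical volume name. It drops the first 12 characters (/dev/mapper/) and cuts on the hyphen, which works as long as neither name contains a hyphen. Device-mapper doubles any hyphen that is part of a name, so a volume group called my-vg shows up as my--vg. The sketch below handles that case. split_dm_name is a hypothetical helper and is not part of the script above.

```shell
#!/bin/bash
# split_dm_name /dev/mapper/NAME
# Prints "VG LV" for a device-mapper name of the form VG-LV, where hyphens
# inside either name appear doubled.
split_dm_name() (
  name=${1#/dev/mapper/}
  # Protect doubled hyphens with "/", which cannot occur in the name itself.
  protected=$(printf '%s\n' "$name" | sed 's|--|/|g')
  # The first remaining hyphen separates VG from LV; then restore hyphens.
  vg=$(printf '%s\n' "${protected%%-*}" | tr '/' '-')
  lv=$(printf '%s\n' "${protected#*-}" | tr '/' '-')
  printf '%s %s\n' "$vg" "$lv"
)

# Usage (device names are examples only):
#   split_dm_name /dev/mapper/vg0-mail
#   split_dm_name /dev/mapper/my--vg-mail--spool
```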

Start the sync

initctl start sync

Now all you have to do is watch the log files in /var/log/backup and check your emails at [email protected]

Multiple Targets

  • This script can support multiple targets. That is, it will rsync the contents of each source directory to every server listed in SYNC_TARGETS in sync.conf
  • You can specify a configuration file per target to set things like username, password and module name. The script looks for it in /etc/backup under the target's name with .conf appended, for example /etc/backup/offsite.domain.com.conf:

HOST_USERNAME="backup-user"
HOST_PASSWORD_FILE="/etc/backup/secure.conf"
HOST_MODULE="rsync_backup"
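
The file named in HOST_PASSWORD_FILE is handed to rsync's --password-file option, which expects the password on the first line and rejects a file that other users can read. Here is a minimal sketch for creating one. write_password_file is a hypothetical helper and the password shown is a placeholder.

```shell
#!/bin/bash
# write_password_file FILE PASSWORD
# Writes PASSWORD as the only line of FILE, readable by the owner only.
write_password_file() (
  umask 077                  # new files get no group/other access
  printf '%s\n' "$2" > "$1"
  chmod 600 "$1"             # also tighten a file that already existed
)

# Usage (path taken from the example above, password is a placeholder):
#   write_password_file /etc/backup/secure.conf 'example-secret'
```

For the username and password to mean anything, the module on the disaster recovery server also needs matching auth users and secrets file settings in rsyncd.conf, which the configuration shown earlier in this article does not set.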

Conclusion

The script has hopefully told you that it has successfully backed up your data, and you should be able to confirm that the data is also on the disaster recovery server.

  • The script is always running. It will sync, then sleep for SYNC_FREQUENCY, and then start all over again.
  • If a directory needs to have a snapshot taken before a sync, simply touch a file named after SNAPSHOT_MAGIC (.sync-snapshot by default) in that directory. If the file exists, a snapshot will be taken.
  • Snapshot size is by default 15G, but can be overridden by setting SNAPSHOT_SIZE
  • I use XFS as my preferred filesystem. SNAPSHOT_MOUNTOPTIONS might need to be tweaked for other filesystems.
  • If there was a problem with the sync, the script will send an email before sleeping.
  • If the sync succeeded, the script will touch SYNC_SUCCESS_FILE. This is useful if you use pro-active monitoring like Zabbix.
  • A digest email is sent once per cycle of the log counter. The interval is roughly SYNC_MAXINC x SYNC_FREQUENCY seconds (about 24 hours with the defaults), plus the time the syncs themselves take.
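
For the monitoring case, the useful number is how long ago SYNC_SUCCESS_FILE was last touched. The sketch below shows one way to turn that into a check, assuming GNU stat. The function names and the two-hour threshold are assumptions for this example, and how you feed the result into Zabbix or another monitor is up to you.

```shell
#!/bin/bash
# sync_age_seconds FILE
# Prints how many seconds ago FILE was last modified (GNU stat assumed).
sync_age_seconds() {
  now=$(date +%s)
  modified=$(stat -c %Y "$1") || return 1
  echo $((now - modified))
}

# sync_is_fresh FILE MAX_AGE_SECONDS
# Succeeds when FILE exists and is no older than MAX_AGE_SECONDS.
sync_is_fresh() {
  age=$(sync_age_seconds "$1") || return 1
  [ "$age" -le "$2" ]
}

# Usage: tolerate one missed hourly sync before raising an alarm.
#   sync_is_fresh /var/log/backup/.sync-success 7200 || echo "sync is stale"
```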

Daily

Weekly