March 14, 2003
Online backups
One of the really neat things about having a computer is the hidden feature better known as crashes. We pay big money for this option called stability, but truth be told, there will never be 100% stability. With this in mind, the onus is always on the user to do this thing called backups.
"There are only two types of data. Data that is backed up, and data that is yet to be lost."
A great quote that sums up any situation. In the past I've used my servers as a backup medium for my desktops and laptop. I was under the impression that this should be more than enough. Recently I began to doubt this after having a machine disappear from my online grasp. Steps were taken to prevent that from happening again, but nothing was done to protect the data within.
The first step was to install a second drive in the machine. While it doesn't provide any kind of RAID services, we have essentially faked it with a not-so-fine granularity. That protects the data locally. Unfortunately, it still leaves the machine itself as a single point of failure. Enter the joy of two servers and a small command known as rsync.
rsync is a fairly easy utility to understand just from reading the man page. It takes a few minutes to set up, test, and confirm, and then you're off and running full backups. One of the more interesting (but not often discussed) features is the --bwlimit option, which caps rsync's average transfer rate at a number you choose. Please note that it is an average and not true bandwidth limiting, so the traffic can be bursty. Our tests tonight show that setting bwlimit to 16 gets us an average of about 25kB/s (typical), with bursts as high as 50kB/s.
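For anyone who wants to script this, here is a minimal sketch of what a bandwidth-limited run might look like when driven from Python. It is not the exact command we ran. The paths and the host name backup.example.com are made up for illustration; only -a and --bwlimit are real rsync options, and the man page documents the --bwlimit value as kilobytes per second.

```python
import shlex


def build_rsync_command(source, destination, bwlimit_kbps=16):
    """Return an rsync argument list for a bandwidth-limited archive copy."""
    return [
        "rsync",
        "-a",                         # archive mode: recurse, keep permissions, times, links
        f"--bwlimit={bwlimit_kbps}",  # cap on the average rate, in kilobytes per second
        source,
        destination,
    ]


# Hypothetical source path and destination host, for illustration only.
cmd = build_rsync_command("/home/", "backup.example.com:/backups/home/")

# Print the command so it can be reviewed before anything is run.
print(shlex.join(cmd))
```

Once the printed command looks right, passing the same list to subprocess.run(cmd, check=True) would start the transfer. Building the command as a list keeps paths with spaces from being split by a shell.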
But how does one secure rsync? You can invoke it over an ssh shell pretty easily, but one of the bigger questions I had was what exactly this does. A little web searching later and I have an answer. The tunneling is pretty much standard-issue ssh tunneling and requires setting up ssh keys. Bennot Todd, though, has found a much more interesting means of locking down the process at the ssh level. It makes sense when you think about it. I am glad he found rsync's --server option, though.
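As a rough sketch of the idea, and not a copy of the write-up mentioned above, the two halves could look something like this. The client runs rsync with -e so the transfer goes over ssh with a dedicated key, and the server's authorized_keys entry for that key carries a command="..." option so sshd will only ever run that one rsync --server command. The key path, host name, public key text, and the exact --server argument string are all assumptions here. The real server-side string depends on your rsync version and options, so copy it from your own setup (it shows up in the server's process list while a transfer is running).

```python
def ssh_rsync_command(source, destination, key_path):
    """Return an rsync argument list that runs the transfer over ssh."""
    return [
        "rsync",
        "-a",
        "-e", f"ssh -i {key_path}",  # -e picks the remote shell; ssh -i picks the key
        source,
        destination,
    ]


def restricted_key_line(forced_command, public_key):
    """Return an authorized_keys entry that pins a key to a single command."""
    if '"' in forced_command:
        # Keep the sketch simple: no quote escaping inside command="...".
        raise ValueError("forced command must not contain double quotes")
    options = [
        f'command="{forced_command}"',  # sshd runs this, whatever the client asks for
        "no-port-forwarding",
        "no-X11-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ]
    return ",".join(options) + " " + public_key


# Hypothetical values throughout; substitute your own.
cmd = ssh_rsync_command(
    "/home/",
    "backup@backup.example.com:/backups/home/",
    "/root/.ssh/backup_key",
)
line = restricted_key_line(
    "rsync --server -logDtpr . /backups/home/",  # illustrative only, not authoritative
    "ssh-rsa AAAA...example backup@client",      # placeholder public key
)
print(line)
```

The printed line is what would go into ~/.ssh/authorized_keys for the backup account on the receiving server. If that key ever leaks, the most it can do is run that one rsync command.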
After some time trying to figure out why an ssh key wasn't working, Kevin and I ran our first test. Things went very smoothly, and tonight at about midnight the first full backup of the servers began.
Next up will be database data and log files.
Posted by Dan at March 14, 2003 09:54 PM