Thursday, May 17, 2007

Backup Strategy for SVN Repositories

Recently I started out setting up backup of SVN Repositories. Our source repositories are in GB's, so i needed some facility to do this easily. Unfortunately i couldn't find good routines in SVN distributed source to this task, so had to implement this myself.

The choices provided by SVN are:
1) hotcopy
2) dump
3) sync

Here's how i break out the choices
Hotcopy
This option allows you to make full repository backup. Very useful, but for large repositories, doing this daily gets prohibitive. Recovering is a snap, just plug the copy back at its original place.
Dump
This option allows you to do full or incremental dumps. I liked the incremental option. So i thought i would do a full hotcopy monthly and then incremental dumps daily. I scripted this , but during recovery i found that applying the first incremental to hotcopy gave me errors ,rest of incremental dumps went fine. No idea why that was happening, so gave up on this thought.

I then started thinking; just do way with hotbackup and go with full dump monthly and incremental dumps daily. This works out like a charm, so this is the choice i went with.
I had to script this as well, a nice shell script. I can't share that but the logic goes like :
First do a full dump, then daily do incremental dumps. After 200 or so dumps do a full dump and repeat. Since our repository is large, creating a full dump takes long time.
I have written recover script as well, which loops through all files and applies them in order.

Extending this thought, i said why don't i use this technique for post-commit operation, so i get incremental dumps for each commit. Since after 200 files generated, it does full dump, question became what happens when full dumps happen, since it takes some minutes ,if another commit happens then? To prevent 2 dumps from happening simultaneously, i put a lock file ,so only one dump happens at a time. This works great!

So, i have a robust backup scheme with this option. Note that you have to save your conf files and hooks file separately with this option.

Synch
I was thinking of doing this on same machine as a replacement to dump option, but didn't have time to research it further. Doing sync on separate machine is better. Somehow synch as backup didn't sound good to me. Anyone has any comments on synch??

Hopefully this post helps people figuring out SVN repository backup and restore.

1 comment:

dhsoni said...

http://svn.haxx.se/users/archive-2007-04/1006.shtml