
ExaGrid's Eye on Deduplication


Theories and Reality About Offsite Replication in Disk-based Backup with Deduplication

Posted by Bill Andrews on Thu, Oct 16, 2008 @ 10:08 AM

One theory holds that inline de-duplication, compared with post-process de-duplication, gets replicated data to the offsite location sooner.

Several factors need to be understood to determine whether the theory is in fact correct.

With the inline process, de-duplication occurs on the fly, so the unique blocks can begin replicating to the second site immediately. However, there are two flaws in the theory. First, because inline de-duplication takes longer than post process, the inline de-duplication and replication are still running long after the post-process de-duplication is complete. Second, if you turn on replication in an inline approach, the processor and memory are shared between inline de-duplication and replication. That further slows de-duplication and lengthens the backup window, which in turn extends the time to complete replication.

With post process, the backups are written to disk at disk speed, and the backup completes long before it would with the inline approach. The post-process system then kicks off de-duplication and replication in parallel. The first step is serial: the backup must complete first. The next steps, de-duplicating the data and replicating it, run in parallel. Post-process de-duplication and replication start while the inline approach is still backing up. The question is: does the post process complete de-duplication and replication before the inline approach completes backup and replication?

Let's try some math...

With inline, the backup window could be 6 hours. Turning on replication could expand that window to 8 hours. The time to back up and replicate to the second site is therefore 8 hours.

With post process, the backup window is shorter, say 4 hours. Once the backup is complete, the de-duplication and replication work begins and may take another 4 hours. The total time to back up and replicate to the second site is 8 hours.
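The arithmetic above can be written out as a small Python sketch. The function and parameter names are hypothetical, chosen only for illustration, and the hour figures are the example numbers used in this post, not measurements from any product.

```python
def inline_total_hours(backup_hours: float, replication_overhead_hours: float) -> float:
    """Inline: replication runs during the backup, so the total time
    is the backup window after replication has stretched it."""
    return backup_hours + replication_overhead_hours


def post_process_total_hours(backup_hours: float, dedupe_and_replicate_hours: float) -> float:
    """Post process: the backup finishes first (serial step), then
    de-duplication and replication run together (parallel step)."""
    return backup_hours + dedupe_and_replicate_hours


# Example figures from the text (assumed, illustrative values).
inline_total = inline_total_hours(backup_hours=6, replication_overhead_hours=2)
post_total = post_process_total_hours(backup_hours=4, dedupe_and_replicate_hours=4)

print(f"inline: {inline_total} h, post process: {post_total} h")
```

Changing the inputs shows how sensitive the comparison is to the assumed backup window and replication overhead for a given environment.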

The time is roughly the same for both approaches; there are no free lunches in technology.

Bill Andrews is President and CEO of ExaGrid Systems, a company that provides fast, low-cost, and scalable disk-based backup solutions with data de-duplication.
