
ExaGrid's Eye on Deduplication


4 Myths on Disk-based Backup with Post-Process Deduplication

Posted by Marc Crespi on Wed, May 19, 2010 @ 01:01 PM

One of the blogs I read most frequently is the Mr. Backup Blog, written by Curtis Preston. He continues his valuable role as a customer watchdog for our industry. Most recently, he observed that vendors of disk-based backup appliances that use post-process deduplication publish only their ingest rates, not the rates at which they post-process (i.e., deduplicate and replicate) data.

Curtis raises a good point about publishing these numbers. At ExaGrid, in conversations with our prospective partners and customers, we have always discussed the following two rates:

  • Product ingest rate - this is the speed at which data lands on disk during backup, and it defines the customer's backup window. ExaGrid's current maximum ingest rate for a customer with 100 TB of primary data is 18 TB/hour with Symantec OST (both Backup Exec and NetBackup) and 12.6 TB/hour without OST. As a result, we provide the shortest backup window in the market.
  • Post-processing rate - this is the speed at which we deduplicate and replicate the data. Here, we achieve 7.2 TB/hour for the configuration described above. This rate is among the fastest for deduplication engines in our market, including both post-processing and in-line implementations.
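
To make the two rates concrete, here is a rough back-of-the-envelope sketch in Python. The three rates are the ones quoted above. The 100 TB full backup size is an assumption for illustration only (the post gives 100 TB as the amount of primary data, not the size of any one backup job), and the function name is hypothetical.

```python
# Rates quoted in this post for the 100 TB primary-data configuration.
INGEST_TB_PER_HOUR_OST = 18.0
INGEST_TB_PER_HOUR_NO_OST = 12.6
POST_PROCESS_TB_PER_HOUR = 7.2


def hours_needed(data_tb, rate_tb_per_hour):
    """Return the hours required to move data_tb at a sustained rate."""
    if rate_tb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return data_tb / rate_tb_per_hour


# Assumption for illustration: one full backup job of 100 TB.
full_backup_tb = 100.0

backup_window_ost_hours = hours_needed(full_backup_tb, INGEST_TB_PER_HOUR_OST)
backup_window_no_ost_hours = hours_needed(full_backup_tb, INGEST_TB_PER_HOUR_NO_OST)
post_process_hours = hours_needed(full_backup_tb, POST_PROCESS_TB_PER_HOUR)

print(f"Backup window with OST:    {backup_window_ost_hours:.1f} h")
print(f"Backup window without OST: {backup_window_no_ost_hours:.1f} h")
print(f"Post-processing time:      {post_process_hours:.1f} h")
```

Under these assumptions the backup window is roughly 5.6 hours with OST and 7.9 hours without, and post-processing the same 100 TB takes roughly 13.9 hours. Real jobs rarely sustain the maximum rate, so treat these as best-case figures.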

However, we have typically not displayed these rates prominently on our web site. Our upcoming web site updates will change this.

As always, the goals of ExaGrid's advanced post-processing deduplication approach are to:

  • Deliver the customer the shortest possible backup window by landing backups as quickly as possible with no in-flight processing to slow backups.
  • Retain a full copy of the most recent backup set to provide the fastest restores without incurring the overhead of reading deduplicated data. Customers will often say that 95% of restores are from that most recent backup, so this approach avoids deduplication overhead for 95% of all restores.
  • Get the data offsite as rapidly as possible to protect against disaster.
  • Let the customer tune the product by selecting when deduplication and replication run, so that they can balance the length of the local backup window against the time it takes to replicate the data off site.

Obviously, there will be those who try to create myths about post-processing deduplication. Here are my favorites:

  • Myth 1 - Post-processing systems require more complex management
    • Truth: ExaGrid's software does it all. You point your backups at a NAS share, and we do the rest automatically in our software.
  • Myth 2 - Backups cannot run while post-process deduplication is running
    • Truth #1: ExaGrid's deduplication engine automatically allows backups to flow simultaneously if and when necessary. In fact, if a customer environment cannot take full advantage of our ingest rate, we will automatically kick off deduplication and replication during the backups.
    • Truth #2: When sensible, a customer can actually configure the ExaGrid product to land backups and deduplicate simultaneously.
  • Myth 3 - Restores cannot be performed while deduplication is running
    • Truth: Restores can be run at any time after a backup job is complete, whether deduplication is running or not.
  • Myth 4 - With limited hours during the day, post-processing has to run a race to keep up.
    • Truth #1: This is simple. When ExaGrid sizes a system for prospective customers, we take into account how much data they back up per day, per week, etc. In fact, about 95% of our customers do a full backup on the weekend and back up a subset of their data on weekdays (incrementals on files, fulls on databases and e-mail). So the largest backup is done on the weekend, and our rapid post-processing completes long before Monday's backups begin.
    • Truth #2: As stated before, deduplication can run during backups if and when necessary, so there is no backup failure if the processes overlap.
    • Truth #3: All systems, regardless of how and when deduplication is done, have to be sized appropriately to handle daily backup amounts.
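
The sizing argument in Myth 4 can be sketched as a simple slack calculation. This is only an illustration, not ExaGrid's sizing method: the function name, the backup sizes, and the hours between backups are hypothetical, and the sketch takes the conservative case in which post-processing starts only after ingest has finished. The 18 TB/hour and 7.2 TB/hour rates are the ones quoted earlier in this post.

```python
def post_processing_slack_hours(backup_tb, ingest_tb_per_hour,
                                post_process_tb_per_hour,
                                hours_until_next_backup):
    """Hours to spare between the end of post-processing and the next backup.

    Conservative assumption: post-processing begins only once ingest is done.
    A negative result means post-processing would overlap the next backup.
    """
    ingest_hours = backup_tb / ingest_tb_per_hour
    post_process_hours = backup_tb / post_process_tb_per_hour
    return hours_until_next_backup - (ingest_hours + post_process_hours)


# Hypothetical weekend: a 100 TB full backup, next backup 72 hours later.
weekend_slack = post_processing_slack_hours(100.0, 18.0, 7.2, 72.0)

# Hypothetical weeknight: a 15 TB backup, next backup 24 hours later.
weeknight_slack = post_processing_slack_hours(15.0, 18.0, 7.2, 24.0)

print(f"Weekend slack:   {weekend_slack:.1f} h")
print(f"Weeknight slack: {weeknight_slack:.1f} h")
```

With these made-up numbers both results are comfortably positive (about 52.6 and 21.1 hours). If a result came out negative, that would be the case covered by Truth #2, where deduplication and backups run at the same time.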

ExaGrid typically relies on our installed customers to validate our story.  With over 2400 systems installed at over 600 unique organizations, we have a large population of customers who will attest to the benefits our approach brings.  You can see 160 of them and read their stories at:

http://www.exagrid.com/why_exagrid/customer_success_stories.asp



COMMENTS

What about global dedupe?

Ask for references, talk to one with your amount of data, your type of data, and your requirements.

Always POC to make sure the solution meets your expectations set by the salesman.

posted @ Friday, February 11, 2011 4:18 PM by Really Curtis?


Comments have been closed for this article.
