
ExaGrid's Eye on Deduplication


Who's Afraid of Next Generation Deduplication Architectures?

Posted by Marc Crespi on Tue, Jun 02, 2009 @ 10:37 AM

One of the things that continues to trouble me about how vendors use their blogs is that they often degenerate into nothing more than competitive mud-slinging. Don't get me wrong: vendor blogs are perfectly good vehicles for putting factual information about your product out there and for comparing and contrasting it with other approaches. However, it seems that the factual product content continues to decline while the rumor, innuendo, and FUD continue to rise. I cite Rich Colbert's recent blog entry over at Data Domain's Dedupe Matters as a prime example. Given ExaGrid's message to the market of having a next generation architecture for disk-based backup with data deduplication, I do not think it is a stretch to assume he was referring to ExaGrid in his tantrum.

While I plan to briefly respond to some of his points, I will quickly get back to information regarding the ExaGrid product as I think that is the source of the clear frustration Rich is feeling about the challenge by a vendor he will not name.

Rich's first premise is that if you were not first to market, you are "late" or "disorganized" and can never hope to rival the first mover's technology or adoption. Thankfully, there are many examples showing that this assertion is inaccurate. Just ask some of the other late and disorganized companies such as Dell, Microsoft, or even Sun Microsystems, none of whom were the first entrants into the markets they ultimately dominated. And while not all of their current fortunes are bright, there is no question they entered markets with already crowned leaders, figured out what those leaders had missed, and exploited it to great success. And there have been first entrants that have withered as times changed. Anyone remember Novell?

On another point, contrary to Rich's implication that only Data Domain is established in this market, ExaGrid now has close to 400 customers in its portfolio with over 2,000 installed systems across a wide variety of verticals. We continue to have record quarter after record quarter, even in this tough climate. 

And, our customers love us.  In fact, more than 110 of our customers demonstrated their satisfaction by having their deduplication success with ExaGrid documented with their names and titles. This is more customer success stories than all other vendors in our space combined, including Data Domain.

While Rich's current employer can cite more customers than ExaGrid due to being the first entrant, there certainly is no question that ExaGrid's technology has been validated by the market and is responsible for our rapid growth and greater than 70% competitive win rate.

But with all of that said, IT buyers need information about products, not random musings by vendors. We as vendors simply need to put our products forward so that customers can decide which one better meets their requirements. On the product front, ExaGrid brings the following unique things to this market that were not present in first generation approaches:

  • Scalability - our GRID-based architecture maintains a customer's backup window and restore performance as their data grows, and avoids forklift upgrades when a system reaches its capacity.
  • Backup/Restore performance - our post-process architecture provides faster backups (a maximum of 5 TB/hour) and optimized restores and tape copies of the most recent data by eliminating deduplication overhead.
    • Contrary to assertions by exclusively in-line vendors, it is meaningless to compare restore rates for deduplicated data. If 95% of the time restores come from the most recent backup, which ExaGrid keeps in non-deduplicated form, then what is the point?
    • Suggesting that it is important to compare restores from deduplicated data is like saying an airline with a 5% on-time arrival record is equal to one with a 95% on-time record because the better airline is also late 5% of the time.
  • Unified management - ExaGrid's management interface places an entire multi-site installation in a single web interface for all configuration and management, reducing management time and complexity.
  • Backup job aware reporting - ExaGrid uniquely provides deduplication ratios and replication status by backup job, so that users can maximize their space savings and understand exactly which backup jobs are ready for restore at a DR location.
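
The restore-performance point in the list above can be put into rough numbers. The sketch below is only an illustration: the 95% share of restores from the most recent backup is the figure cited above, while the restore size and both throughput values are made-up placeholders, not measurements of any product.

```python
def expected_restore_hours(size_tb, recent_fraction, recent_tb_per_hour, dedup_tb_per_hour):
    """Average restore time when restores split between two data forms.

    recent_fraction: share of restores served from the most recent,
                     non-deduplicated backup (0.95 in the text above).
    The two rates are hypothetical inputs chosen by the caller.
    """
    fast_hours = size_tb / recent_tb_per_hour   # restore from the non-deduplicated copy
    slow_hours = size_tb / dedup_tb_per_hour    # restore that must reassemble deduplicated data
    return recent_fraction * fast_hours + (1 - recent_fraction) * slow_hours


if __name__ == "__main__":
    # Hypothetical example: a 1 TB restore, 2 TB/hour from the recent copy,
    # 0.5 TB/hour from deduplicated data.
    hours = expected_restore_hours(1.0, 0.95, 2.0, 0.5)
    print(f"expected restore time: {hours:.3f} hours")
```

With these placeholder inputs the average is dominated by the fast path, which is the argument being made: the restore rate for the common case matters far more than the rate for the rare one.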

I wonder if it is the above differences that made Rich afraid to "help our organic search results" by mentioning us by name? If a company is as invincible as he made Data Domain sound, why be afraid that your prospective customers will find a later-to-market company making wild claims such as a markedly better approach?


Clarity about client side, inline and post-process data deduplication

Posted by Bill Andrews on Tue, Sep 16, 2008 @ 04:15 PM

There is great debate around disk-based backup systems with data de-duplication as to whether client-level, inline, or post-process de-duplication is better.

The idea of data de-duplication is to avoid storing redundant data. Some systems store only unique blocks of roughly 8KB, and some store only the bytes that actually change, working at the byte level. Both of these methods deliver similar de-duplication rates. But the question remains... where is the best place to de-duplicate the data?
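
To make the block-based idea concrete, here is a minimal sketch. It assumes fixed 8KB blocks identified by a SHA-256 digest; the `BlockStore` class and its method names are hypothetical and do not describe how any particular product is implemented.

```python
import hashlib

BLOCK_SIZE = 8 * 1024  # "roughly 8KB" blocks, fixed-size here for simplicity


class BlockStore:
    """Toy store that keeps one copy of each unique block (hypothetical)."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes

    def write(self, data):
        """Store data and return the list of digests needed to rebuild it."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            # Only blocks not seen before consume new space.
            self.blocks.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its list of digests."""
        return b"".join(self.blocks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total bytes actually held after de-duplication."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = BlockStore()
    monday = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
    tuesday = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE
    store.write(monday)
    store.write(tuesday)
    print("bytes backed up:", len(monday) + len(tuesday))
    print("bytes stored:   ", store.stored_bytes())
```

In this example two backups of three blocks each differ in a single block, so only four blocks end up stored instead of six.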

Client-level de-duplication processes the data where the "backup agent" lives, on each application server. The advantage of this approach is that less traffic is sent over the network, so it yields the shortest backup window. The disadvantage is that you have to replace your existing backup application with the new client-based de-duplication application.

Inline de-duplication is when the disk-based backup appliance, connected to your existing backup server, de-duplicates the data on the way to the disk. The advantage of this approach is that it uses less disk than post-process and theoretically should cost less. The disadvantage is that it produces the longest backup windows, because the de-duplication work slows the backups down as they write to disk. These systems also require more memory and processor, so they are not necessarily less costly.

Post-process de-duplication is when the disk-based backup appliance, connected to your existing backup server, allows the data to write directly to the disk from the backup server, at disk speed. The de-duplication work begins after the backup is complete. The advantage is that backups occur much faster than with the inline approach, resulting in a shorter backup window. The disadvantage is that more disk is required to land the backup and then compare. However, the cost of the additional disk is no more than the cost of the additional processor and memory required for inline processing, and therefore post-process systems do not cost more than inline. In fact, in most cases they cost less. If you choose a post-process system, make sure that the system is sized properly to de-dup all your data well in advance of the next backup coming in.
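
The sizing advice for post-process systems comes down to simple arithmetic, sketched below. Every number in the example is a hypothetical placeholder, and the function names are invented for illustration; the only behavior taken from the description above is that de-duplication starts after the backup has fully landed on disk.

```python
def backup_window_hours(data_tb, ingest_tb_per_hour):
    """Time the backup server spends writing to the appliance."""
    return data_tb / ingest_tb_per_hour


def post_process_fits(data_tb, ingest_tb_per_hour, dedup_tb_per_hour, hours_between_backups):
    """True if landing the backup and then de-duplicating it both finish
    before the next backup is due to start."""
    landing_hours = backup_window_hours(data_tb, ingest_tb_per_hour)
    dedup_hours = data_tb / dedup_tb_per_hour  # starts only after landing completes
    return landing_hours + dedup_hours <= hours_between_backups


if __name__ == "__main__":
    nightly_tb = 10.0  # hypothetical nightly backup size

    # Hypothetical rates: writing straight to disk vs. de-duplicating on the way in.
    print("post-process window:", backup_window_hours(nightly_tb, 5.0), "hours")
    print("inline window:      ", backup_window_hours(nightly_tb, 2.0), "hours")

    # Sizing check: does de-duplication finish before tomorrow's backup?
    print("sized properly:", post_process_fits(nightly_tb, 5.0, 2.5, 24.0))
```

A system that fails this check would still be de-duplicating yesterday's data when the next backup arrives, which is the situation the sizing advice is meant to avoid.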

There is a further advantage / disadvantage debate as to which approach allows for replication of changed data to be received at the offsite system the fastest. I plan to expand upon this in a separate post.

Bill Andrews is President and CEO of ExaGrid Systems, a company that provides fast, low-cost, and scalable disk-based backup with data de-duplication.

