By Bill Andrews, ExaGrid CEO
Many customers believe that to solve their backup challenges, they simply need to replace tape with disk, and that to keep disk storage to a minimum, they simply need to turn on data deduplication.
Disk does solve many of the backup problems that tape causes. And data deduplication does reduce the amount of disk required, because it stores only the blocks or bytes that are unique from backup to backup. Disk and data deduplication are therefore table stakes; any disk-based backup appliance must include both.
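To make "stores only the unique blocks" concrete, here is a minimal sketch of block-level deduplication. It is illustrative only: it assumes fixed-size chunks and SHA-256 fingerprints, whereas real backup appliances use more sophisticated (often variable-size or zone-level) schemes.

```python
import hashlib

def dedupe(backup_stream: bytes, store: dict, chunk_size: int = 4096) -> list:
    """Split a backup into fixed-size chunks and store only unseen chunks.

    Returns a 'recipe' (ordered list of chunk fingerprints) from which the
    original backup can be rebuilt (rehydrated)."""
    recipe = []
    for i in range(0, len(backup_stream), chunk_size):
        chunk = backup_stream[i:i + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in store:      # unique chunk: pay the disk cost once
            store[fp] = chunk
        recipe.append(fp)        # duplicate chunk: just reference it
    return recipe

# Two nightly backups that share most of their data:
store = {}
night1 = b"A" * 8192 + b"B" * 4096
night2 = b"A" * 8192 + b"C" * 4096   # only the last chunk changed
r1 = dedupe(night1, store)
r2 = dedupe(night2, store)
print(len(r1) + len(r2), "chunks backed up,", len(store), "unique chunks on disk")
```

Six chunks are logically backed up across the two nights, but only three unique chunks land on disk; a restore must reassemble the chunks from the store, which is the rehydration step discussed below.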
However, how data deduplication is implemented matters. The wrong implementation can mean:
- Slower backups than writing directly to disk
- A backup window that grows and grows as data grows
- Slow restores
- Slow offsite tape copies
- Slow VM instant recoveries
- Costly forklift upgrades along the way
Bottom line: Not all implementations of data deduplication for backup are created equal.
For example, deduplicating inline (as the backups occur) inserts a very compute-intensive process into the backup path, which slows the backups down. Furthermore, because all of the data is deduplicated, every request for a restore, offsite tape copy, or VM instant recovery has to wait for the data to be put back together, or rehydrated. These inline approaches tend to be scale-up architectures, in which all of the backups pass through a single front-end controller to be deduplicated. As data grows, only disk capacity is added; no additional processor, memory, or bandwidth comes with it, so an ever-increasing amount of data is passing through, and being deduplicated inline by, the same controller. The backup window grows longer and longer until it no longer fits in the allotted time. The only way out is to replace the front-end controller with a bigger, faster, and more expensive one. This is called a “forklift upgrade,” and it is a painful and expensive way to keep disk with data deduplication in the backup process.
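The scale-up problem described above is simple arithmetic. The model below uses made-up numbers (5 TB/hour of controller throughput, an 8-hour allotted window) purely to illustrate the effect: when throughput is fixed and only disk is added, the backup window stretches until it misses the allotted time.

```python
# Illustrative model (made-up numbers): a scale-up appliance has one
# front-end controller with fixed inline-dedup ingest throughput; only
# disk is added as data grows, so the nightly backup window stretches.
CONTROLLER_TB_PER_HOUR = 5.0    # fixed: no processor/memory/bandwidth added
ALLOTTED_WINDOW_HOURS = 8.0

def backup_window_hours(data_tb: float) -> float:
    """Time to push one full backup cycle through the single controller."""
    return data_tb / CONTROLLER_TB_PER_HOUR

for data_tb in (20, 40, 60):    # data grows over time
    hours = backup_window_hours(data_tb)
    fits = "fits" if hours <= ALLOTTED_WINDOW_HOURS else "MISSES the window"
    print(f"{data_tb} TB -> {hours:.1f} h ({fits})")
```

At 20 TB the backups finish in 4 hours; at 60 TB the same controller needs 12 hours and blows through the 8-hour window, which is the point at which the forklift upgrade becomes unavoidable.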
The alternative is to back up the data to a disk landing zone and deduplicate it after the backups are complete. This not only provides faster backups; because the most recent backups sit in the landing zone in their complete, undeduplicated form, they are instantly ready for fast restores, fast offsite tape copies, and VM instant recoveries in seconds to single-digit minutes. In addition, if the system is a scale-out architecture, growth adds more than disk capacity: appliances with all four resources (processor, memory, bandwidth, and disk) are added into a GRID system. This ensures that as the data grows, so do all of the resources needed to process, deduplicate, and move it. The backup window stays fixed in length, and there is never a need for an expensive forklift upgrade. The result is the fastest backups, the fastest restores, the fastest offsite tape copies, the fastest VM instant recoveries, and a fixed-length backup window as data grows. On top of that, these systems cost the same as, and often less than, inline scale-up approaches.
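The fixed-window claim for scale-out can be sketched with the same kind of arithmetic, again with made-up numbers (each hypothetical appliance contributing 20 TB of capacity and 5 TB/hour of ingest throughput): because every appliance added for capacity also adds processor, memory, and bandwidth, aggregate throughput grows in step with the data.

```python
import math

# Illustrative model (made-up numbers): in a scale-out GRID, each appliance
# added for capacity also brings its own processor, memory, and bandwidth,
# so aggregate ingest throughput grows with the data.
APPLIANCE_CAPACITY_TB = 20.0
APPLIANCE_TB_PER_HOUR = 5.0

def backup_window_hours(data_tb: float) -> float:
    """Backup window when appliances are added as capacity is consumed."""
    appliances = math.ceil(data_tb / APPLIANCE_CAPACITY_TB)  # whole units only
    return data_tb / (appliances * APPLIANCE_TB_PER_HOUR)

for data_tb in (20, 40, 60):    # same growth as the scale-up case
    print(f"{data_tb} TB -> {backup_window_hours(data_tb):.1f} h")
```

At 20, 40, and 60 TB the window stays at 4 hours, because two and then three appliances are sharing the load; contrast this with a fixed-throughput controller, whose window would triple over the same growth.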
ExaGrid delivers this best-of-both-worlds approach with its disk landing zone and scale-out architecture. Why pay more for slower backups, an ever-expanding backup window, and slow restores, copies, and recoveries when you can pay the same or less for faster backups, a fixed-length backup window, and faster restores, tape copies, and recoveries?
To learn more about how you can design a backup infrastructure that meets the needs of your virtualized server environment today and well into the future, join us February 11th for a webinar with George Crump, founder and lead analyst at Storage Switzerland, and Kevin Russell, VP of systems engineering at ExaGrid.
All pre-registrants will receive an advance copy of Storage Switzerland’s latest whitepaper, “Disk Backup Is About More Than Deduplication.”