Is Disk With Deduplication Really Faster than LTO4 Tape for backups?
Posted by Marc Crespi on Mon, May 03, 2010 @ 11:19 AM
The short answer to the question is yes. However, to really understand why, read on. In fact, there are multiple reasons that disk is faster than tape for backups, and one of them is the BIG reason.
Sometimes I will be approached by someone who will cite the LTO4 speeds with compression over a fibre network as they try to convince me that disk is actually slower or at best equal to tape. They cite the fact that most disk-based backup appliances connect over Ethernet and utilize a NAS interface. Therefore, there are bottlenecks that do not exist with tape.
Let's make this easy. Let's not even debate whether a single backup job to tape is slower than a single backup job to disk. For this moment in this blog post, let's call it a tie. Or better yet, for argument's sake, let's allow just for this moment that tape could be faster (gasp!). Even in the rare case when this could be true, this is not at all the horse race you care about. The key thing in backup is to complete all of your backups in the shortest time possible, specifically within your allowed backup window.
Disk gains a lot of its speed advantage through concurrency: the ability to run a lot of backup jobs in parallel. With tape, the number of backup jobs you can run simultaneously is limited by the number of tape drives in your tape library. With a 4-drive tape library, you can run 4 backup jobs simultaneously. The fifth, sixth, seventh, and eighth jobs patiently wait for a drive to be free. With disk, you do not have this limitation. For example, each ExaGrid appliance can handle between 6 and 20 concurrent backup jobs. A 60 TB ExaGrid System can support 120 concurrent backup jobs or streams, whereas other systems may support as few as 18 for that amount of data.
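To see how waiting for a free drive stretches the backup window, here is a minimal sketch of that queueing effect. The job lengths, the 50% slowdown for each disk stream, and the function name backup_window_hours are all made up for illustration; they are not measurements of any real tape library or appliance.

```python
import heapq

def backup_window_hours(job_hours, slots):
    """Hours until the last job finishes when at most `slots` jobs run at once."""
    # Each heap entry is the time at which one drive (or stream) becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)
    finish = 0.0
    for hours in job_hours:
        start = heapq.heappop(free_at)  # the next job takes the earliest free slot
        end = start + hours
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Hypothetical workload: eight jobs that each take one hour on a tape drive.
tape_jobs = [1.0] * 8
# Assumption for the sake of argument: each disk stream is 50% slower per job.
disk_jobs = [hours * 1.5 for hours in tape_jobs]

tape_window = backup_window_hours(tape_jobs, slots=4)  # jobs 5 to 8 wait for a drive
disk_window = backup_window_hours(disk_jobs, slots=8)  # all eight jobs start at once

print(f"tape: {tape_window} h, disk: {disk_window} h")
```

With these made-up numbers, the four-drive library needs two hours because half the jobs sit in the queue, while the eight slower disk streams finish in an hour and a half.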
A good comparison for this is draining a swimming pool. Imagine a swimming pool with 100 gallons of water in it. Let's assume I install two drains in the pool, each capable of draining 1 gallon per hour. Obviously, it will take 50 hours to drain the pool (100 gallons divided by 2 gallons/hour = 50 hours).
Now imagine the same pool but instead of two drains, I install 10 drains each capable of draining ½ gallon per hour. Even though each individual drain can only drain at half the rate of the drains in the other example, the pool will drain faster. The 10 drains will drain the pool at a rate of 5 gallons per hour versus 2 gallons per hour, or 2.5 times as fast. The pool will drain in 20 hours versus 50 hours.
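If you want to plug in your own numbers, the pool arithmetic fits in a few lines. This is only a sketch of the analogy; the function name drain_hours is mine, and the figures are the ones from the two pool examples.

```python
def drain_hours(gallons, drains, gallons_per_hour_each):
    """Time to empty the pool when every drain runs in parallel."""
    total_rate = drains * gallons_per_hour_each  # combined gallons per hour
    return gallons / total_rate

two_fast_drains = drain_hours(100, drains=2, gallons_per_hour_each=1.0)
ten_slow_drains = drain_hours(100, drains=10, gallons_per_hour_each=0.5)

print(two_fast_drains, ten_slow_drains)  # 50.0 20.0
```

Swap gallons for terabytes and drains for backup streams and the same division applies: what matters is the combined rate, not the rate of any single drain.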
So, before you upgrade your tape library to the blazing fast speeds seen with LTO4, remember you will always be limited by the number of tape drives.
Folks, it's time for disk-based backup. And with data deduplication in a plug-and-play appliance, it fits within even some of the tightest IT budgets.