Backup and Recovery vs. Data Protection
Where We Were
Traditionally, backup and recovery involved an installed agent copying files from the host to the backup server, which then cataloged and stored the changed files on a tape, or *gasp*, an optical disk platter.
Daily backups were easy to do when RAID sets had a maximum size of 1-2TB, data mapped perfectly to a host, and information was spread across the environment with plenty of spare CPU and RAM available in the off-hours for backup operations.
Before the nostalgia sets in, let's also recall the excessive amount of data written to SCSI-attached 4mm DAT or 8mm AIT drives, with a set of tapes sitting on top of each server waiting to be swapped out every day.
That’s right, nothing to pine for here.
The "big iron" storage vendors have been wrestling with how to make themselves more relevant in the backup and recovery space for more than a decade. One of the earliest successes was NetApp's SnapProtect line, which enabled the vaulting and mirroring of snapshots with other NetApp arrays or volumes. In more recent times, we’ve seen EMC's vision of ViPR controlling the creation, retention, and tiering of data throughout a storage environment. While ViPR will gladly assimilate Data Domain into its “borg-like” storage topology, where does that leave all of the Avamar and NetWorker customers? ViPR was announced at the 2013 EMC World conference with great fanfare, but market traction has been light.
Data Protection is Different
While the mission, data center, corporate culture, infrastructure, storage arrays, and entire thought process of IT have shifted several times, the tried-and-true backup has remained fairly consistent. We now have deduplication and changed block tracking (CBT) backups to help handle the ever-accelerating sprawl of data throughout and beyond our data center, but backup is still just backup.
Do you feel it?
A quiet slipping of the sand down the shore.
A change is coming.
It began in the earliest days of snapshots, when customers started pondering, "Why am I backing this up?" In many of the earliest cases it was because the data was incredibly transactional in nature: snaps of Oracle log volumes, for example, were taken every 5-15 minutes and were essentially obsolete within the hour.
Today’s mentality for data protection versus traditional backup and recovery presents a very fine gray line. Customers have essentially been forced into a fundamental shift of the principles driving buying behaviors. They've accommodated virtualization and work hard to coordinate storage array snapshots into coherent data protection strategies, but the word that comes up more often than not in discussions on snapshot effectiveness is "kludge".
Today’s crop of data protection products, as exemplified by Actifio, Cohesity, Zerto, and Rubrik, is much more concerned with managing copies of your production data, managing the replication of the information between sites (gradually including integration with array-based replication), or accommodating native storage of that information in next-gen storage topologies. Today’s products aim to meet today’s requirements for lightning-fast restores from online snapshots, virtual data provisioning, or recovery operations for DevOps with intelligent converged solutions sporting the latest in scale-out hardware architectures.
A legacy backup administrator mitigates the one-in-a-million chance of a catastrophic disaster recovery event through integration with next-generation storage. Some customers are realizing the infeasibility of taking full backups of data over the weekend and are moving to an incremental forever or “synthetic full backup” approach. Even with deduplication solving many data copy storage issues, the process of gathering everything together is overly laborious, which is why every major backup vendor now offers some kind of "incremental forever" or "synthetic full" approach.
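To make the "synthetic full" idea concrete, here is a minimal Python sketch. It assumes each backup can be modeled as a mapping of block number to data, with None marking a block deleted since the previous run. The function name synthesize_full and the sample data are hypothetical and do not reflect any vendor's actual implementation.

```python
def synthesize_full(base_full, incrementals):
    """Build a new full backup image without re-reading production data.

    base_full:    dict of block number -> bytes from the last real full.
    incrementals: list of dicts (oldest first) holding only the blocks that
                  changed in each run; a value of None marks a deleted block.
    """
    synthetic = dict(base_full)  # copy so the original full stays intact
    for incremental in incrementals:
        for block_id, data in incremental.items():
            if data is None:
                synthetic.pop(block_id, None)  # block was removed
            else:
                synthetic[block_id] = data     # newer version wins
    return synthetic


# Hypothetical example: one weekend full plus two nightly incrementals.
base = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}
monday = {1: b"bbbb"}               # block 1 changed
tuesday = {2: None, 3: b"DDDD"}     # block 2 deleted, block 3 added

full = synthesize_full(base, [monday, tuesday])
print(full)
```

The point of the sketch is that the merge happens entirely on the backup side: production hosts only ever ship the changed blocks, and the "full" is assembled from data the backup system already holds.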
The death watch for traditional backup has already begun. Securing data in a location other than where it was created, and getting that data restored and allocated for use, is becoming easier every day. Veeam recognized this new fact of life with their Veeam Explorer functionality. Veritas also launched the Velocity platform in response to this shift in their key markets.
You are probably thinking, "Why does it matter?"
“Who cares if I'm doing ‘backup’ or ‘enterprise data management’?”
“It's all the same, right?”
Well, yes and no. It's admittedly a fine line that both sides of the argument are trying to blur more every day. As time progresses and data grows, the ability of legacy methods to effectively protect your data gradually wanes.
Data is expected to be online, available, and at full operation at all times, making the production data all but impossible to touch and backup windows a thing of the past. The promise of block-level change protection, whether through application or storage integration, is the future for protecting data at scale. NDMP (particularly with the newer snap diff functionalities), VMware CBT, and Oracle Incremental Merge help streamline traditionally slow processes, but the future is direct storage integration. The new face of the industry will come from data protection technologies replicating changes as they happen within the host, or calling granular CBT storage snapshots and performing catalog operations on those changes as they exist within the source storage. Today’s robust scale-out storage technologies (I’m looking at you, Cohesity, Infinidat, and SwiftStack) eliminate much of the risk in keeping all of your data copies in a single storage array, negating the requirement for “a whole separate box” just for your backups (although a second copy via replication is never a bad idea).
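The block-level change protection described above can be illustrated with a small Python sketch. Note the assumption: real CBT implementations record writes as they happen in the hypervisor or array, whereas this stand-in detects changes after the fact by comparing per-block hashes. The names block_digests and changed_blocks and the tiny block size are hypothetical, chosen only for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the volume."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous_digests, data, block_size=BLOCK_SIZE):
    """Return ({block index: bytes} for new or modified blocks, new digests).

    Truncated volumes are not handled in this sketch.
    """
    current = block_digests(data, block_size)
    changed = {}
    for idx, digest in enumerate(current):
        if idx >= len(previous_digests) or previous_digests[idx] != digest:
            changed[idx] = data[idx * block_size:(idx + 1) * block_size]
    return changed, current


# Hypothetical volume contents at two points in time.
yesterday = b"AAAABBBBCCCC"
today = b"AAAAbbbbCCCCDDDD"

baseline = block_digests(yesterday)
delta, new_baseline = changed_blocks(baseline, today)
print(delta)
```

Only blocks 1 and 3 need to be shipped and cataloged here; the unchanged blocks are never read across the wire, which is what makes this approach workable at scale.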
If you'd like to learn more about how ReluTech is helping customers address the problems of today's petabyte-scale storage environments and the data protection challenges they bring, we're here to help!