How to review and test backup procedures to ensure data restoration
By routinely and frequently testing the full backup and recovery process for every backup technology in use, you get early warning when you are not prepared to absorb data loss and disruption. You can then fix the issues before data gremlins such as hard drive failures, natural disasters, or ransomware destroy your IP and PPI data hoards. “This will help you avoid regulatory investigations, potential litigation, and lost business through brand damage,” explains M. Scott Koller, counsel at BakerHostetler.
Though the process will vary from organization to organization, the common thread in testing backups and restores is to perform a complete restoration of every last file to a clean system, says Koller. Frequent tests should confirm that the technologies and procedures in question backed the information up successfully, restored it successfully, and targeted and captured all the data that should be backed up, Koller explains.
This process will most certainly include careful comparisons of the restored data and its state to that of the original data stores and files. “File counts and cryptographic hash values should be able to verify that there has not been any data corruption. If possible, hot swap the servers to ensure the new systems work in the old systems’ stead and that backup was seamless,” advises Koller.
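The comparison Koller describes can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function names (`hash_file`, `snapshot`, `compare_restore`) and the choice of SHA-256 are assumptions made for the example, and a real environment would likely also compare permissions, timestamps, and application-level consistency.

```python
import hashlib
from pathlib import Path


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its hash."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hash_file(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_restore(original_root, restored_root):
    """Compare file counts and hashes between original and restored trees."""
    original = snapshot(original_root)
    restored = snapshot(restored_root)
    missing = sorted(set(original) - set(restored))
    unexpected = sorted(set(restored) - set(original))
    corrupted = sorted(
        name
        for name in set(original) & set(restored)
        if original[name] != restored[name]
    )
    return {
        "original_count": len(original),
        "restored_count": len(restored),
        "missing": missing,        # in the original, absent after restore
        "unexpected": unexpected,  # present after restore, not in the original
        "corrupted": corrupted,    # same path, different hash
        "ok": not (missing or unexpected or corrupted),
    }
```

Calling `compare_restore("/path/to/original", "/path/to/restored")` (both paths are placeholders) returns a small report. An `ok` value of `False`, together with the `missing` and `corrupted` lists, points to the files that need investigation.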
Microsoft TechNet publishes tips for testing backup and restore procedures in its “Testing Backup and Restore Procedures” topic, under the heading “Developing Backup and Restore Procedures”.
These tips include steps such as testing restores against as many simulated hardware, software, and service failures as are possible in the real world, ensuring that you test every restore option you may use, ensuring that all backup and restore instructions work as prescribed, testing again after any changes such as those that require change management, and verifying that existing backup media such as tapes work with new backup hardware and software.
According to the TechNet guidance, regularly testing backup and restore procedures also lets you keep confirming that the process completes within the desired backup window.
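One simple way to track the window is to time each test run and compare the result against a target. The sketch below assumes a hypothetical helper, `time_against_window`, and treats the backup or restore job as any Python callable. The injectable `clock` parameter exists only to make the helper easy to test.

```python
import time


def time_against_window(operation, window_seconds, clock=time.monotonic):
    """Run operation() and report whether it finished inside the window."""
    start = clock()
    operation()  # the backup or restore job under test
    elapsed = clock() - start
    return {
        "elapsed_seconds": elapsed,
        "window_seconds": window_seconds,
        "within_window": elapsed <= window_seconds,
    }
```

For example, `time_against_window(run_nightly_restore, window_seconds=4 * 3600)` would check a job against a four-hour window, where `run_nightly_restore` stands in for whatever function launches your own process.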
It’s important to test often enough and to test well. But how often is often enough? What is and is not quality testing? “The factors that should affect how often you test backup and restore capabilities fall under Governance, Risk, and Compliance (GRC) and include regulatory constraints, data retention periods / data criticality, risk assessment, policy, audit preparation, and strategic planning,” says Adam Gordon, CSO, New Horizons Computer Learning Centers of South Florida. How often companies actually test is another matter. Some only find out whether their backups work when they need them, so the success or failure of a restore in the middle of a crisis is the only test of their adequacy. This is certainly not often enough.
“Some companies progressively perform proactive tests at varying degrees of time throughout the year. However, the issue with these companies is that the testing may or may not be consistent and standardized, and most likely is not well documented as part of a formal BC/DR review and testing process,” says Gordon. In these instances, the issue is not the frequency of the tests but their quality. What is missing, according to Gordon, is mature testing that relies on industry standards and best practices to achieve consistently accurate and comparable test results.
Desirable or not, change comes often in IT. While you could consider each potential instance of sudden backup and restore testing that results from unplanned changes, this is a harrowing task given how often change happens. “The better way to think of this issue is to pose the opposite question: ‘What activities and events do not carry with them the likelihood of unplanned testing of backup and restore system capabilities?’” says Gordon.
To answer that question for your organization, you need a formal change management system. Such a system will ensure an awareness of change, its potential effects and consequences, and the need to prepare for these ahead of time, since something could go wrong during even planned change, according to Gordon. “The lack of a well-developed change management system is at the heart of issues leading to unplanned backup and restore events in most cases,” says Gordon. A good change management system will tell you whether and when the current change brings consequences that require testing backups and restores anew.
You should use commonly occurring real-life data disaster scenarios to simulate what your backups will and won’t do in a crisis. These scenarios include the loss of a single file or data chunk, the partial loss of a data set, the loss of a full data set, and the loss of multiple data sets, says Gordon. You can test backups and restores against any of these scenarios using either manual or automated processes. A manual test requires one or two technicians to execute the restore process on sample data and confirm that the restore succeeded, while an automated test uses scheduled tasks and scripts on sample data.
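An automated test of these four scenarios might look something like the following sketch. It is illustrative only: `shutil.copytree` stands in for whatever backup and restore tooling you actually use, the data set and file names are invented sample data, and everything runs inside a temporary directory so that no real data is touched.

```python
import shutil
import tempfile
from pathlib import Path


def make_sample(root, n_files=4):
    """Create a small sample data set (hypothetical file names)."""
    root.mkdir(parents=True)
    for i in range(n_files):
        (root / f"file{i}.dat").write_bytes(f"sample {root.name} {i}".encode())


def contents(root):
    """Map each relative file path under root to its bytes."""
    return {
        p.relative_to(root).as_posix(): p.read_bytes()
        for p in root.rglob("*")
        if p.is_file()
    }


def lose_single_file(data_sets):
    (data_sets[0] / "file0.dat").unlink()


def lose_partial_set(data_sets):
    for name in ("file0.dat", "file1.dat"):
        (data_sets[0] / name).unlink()


def lose_full_set(data_sets):
    shutil.rmtree(data_sets[0])


def lose_multiple_sets(data_sets):
    for data_set in data_sets:
        shutil.rmtree(data_set)


SCENARIOS = {
    "single_file": lose_single_file,
    "partial_set": lose_partial_set,
    "full_set": lose_full_set,
    "multiple_sets": lose_multiple_sets,
}


def run_scenarios():
    """Simulate each loss scenario, restore, and record whether data matches."""
    results = {}
    for name, lose in SCENARIOS.items():
        with tempfile.TemporaryDirectory() as tmp:
            live = Path(tmp) / "live"
            backup = Path(tmp) / "backup"
            data_sets = [live / "set_a", live / "set_b"]
            for data_set in data_sets:
                make_sample(data_set)
            shutil.copytree(live, backup)  # stand-in for the real backup job
            expected = contents(live)
            lose(data_sets)                # simulate the data disaster
            shutil.copytree(backup, live, dirs_exist_ok=True)  # stand-in restore
            results[name] = contents(live) == expected
    return results
```

Run from a scheduled task, `run_scenarios()` returns a dictionary such as `{"single_file": True, ...}` that can be logged for each test cycle. Note that `dirs_exist_ok` requires Python 3.8 or later.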
“Both manual and automated testing must be done in a non-production environment that is isolated from production to ensure that data corruption does not take place in the production data,” says Gordon.
You need to account for instances where you might test backups and restores differently than you typically would. “It is impossible to say with any certainty what the specifics of a scenario’s response will be until we know the details of that scenario, and play it out in real time to assess the moment by moment impact of decisions,” says Gordon.
According to Adam Gordon, CSO at New Horizons Computer Learning Centers of South Florida, there are rules of thumb for a couple of prominent scenarios today including the cloud and Bring-Your-Own-Cloud:
While no panacea, these tips can remind you to look at backup and restore testing from many angles, so that you can eventually address all knowable testing risks.
Note: If you’re really serious about testing backup and restore functions, why not celebrate your commitment by acknowledging and taking part in World Backup Day, which arrives March 31?