Seven Ways to Make Sure You Can Restore a Backup

A disaster recovery plan is only effective if you can restore the data. However, despite the growing risks, particularly those linked to ransomware, not all companies are certain of being able to recover their data from their backups.

When it comes to backup and recovery, regular and rigorous testing should be an integral part of any plan. But there are other steps IT managers can take, such as auditing backup processes, following the 3-2-1 rule, and verifying the integrity of backup files.

Backup testing must go hand in hand with a thorough understanding of the most critical systems and data, and how systems depend on each other in the production environment.

In this article, we summarize some of the key questions that IT leaders and business continuity teams should be asking themselves.

What are the keys to reliable recovery from backup?

Businesses need to know that their backups work, that they can recover data and restore systems with minimal interruption and without data loss or corruption.

This topic breaks down into several interrelated elements. Each company’s backup and recovery plan defines the recovery time objective (RTO), i.e. how quickly data should be recovered, and the recovery point objective (RPO), i.e. how far back the company is willing to go to find the last usable copy of the data.

These metrics define what successful recovery looks like for the business. In the case of ransomware, there is another key parameter: the ability for the company to cleanly restore its data.
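
As a rough illustration of how these two metrics can be checked after a recovery exercise, the following Python sketch compares a measured data-loss window and restore duration against the targets. The function name and all figures are hypothetical and chosen only for the example.

```python
from datetime import datetime, timedelta


def meets_objectives(last_good_backup, incident_time, restore_duration, rpo, rto):
    """Return (rpo_ok, rto_ok) for a single recovery exercise.

    The data-loss window is the gap between the last usable backup
    and the incident; the restore duration is how long recovery took.
    """
    data_loss_window = incident_time - last_good_backup
    return data_loss_window <= rpo, restore_duration <= rto


# Hypothetical figures, for illustration only.
incident = datetime(2024, 5, 1, 12, 0)
last_backup = datetime(2024, 5, 1, 6, 0)

rpo_ok, rto_ok = meets_objectives(
    last_good_backup=last_backup,
    incident_time=incident,
    restore_duration=timedelta(hours=5),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
)
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this example the last usable copy is six hours old against a four-hour RPO, so the RPO is missed even though the five-hour restore fits within the RTO.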

There is no point restoring systems after a cyberattack if doing so re-infects them with ransomware code. And the RPO for, say, a power outage may differ from the RPO for a ransomware attack. It all depends on the company’s risk tolerance.

The reliability of the recovery also depends on the integrity of the recovered data. Are the recovered files working as they should, or was some data corrupted or not restored at all?

Businesses also need to consider the order in which they restore data. Some systems are critical or must be restored first due to their dependence on other applications. A recovery test should verify that the systems are restored in the correct order.
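
One way to reason about restore order is to treat the dependencies as a graph and sort it topologically. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the system names and the dependency map are invented for illustration, and a real environment would need its own inventory.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists what it depends on.
dependencies = {
    "web_app": {"database", "auth_service"},
    "auth_service": {"directory"},
    "database": {"storage"},
    "directory": set(),
    "storage": set(),
}


def restore_order(deps):
    """Return a sequence in which every system comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def order_is_valid(observed, deps):
    """Check that an observed restore sequence respects the dependency map."""
    position = {name: index for index, name in enumerate(observed)}
    return all(
        position[dep] < position[system]
        for system, needed in deps.items()
        for dep in needed
    )


planned = restore_order(dependencies)
print(planned)
print(order_is_valid(planned, dependencies))
```

The `order_is_valid` helper can be applied to the sequence actually observed during a recovery test, which is the check the paragraph above describes.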

This in turn depends on access to media. Backups to the cloud require bandwidth, while local copies require backup systems to be up and running. Offsite backup media must be retrieved and brought onsite, or uploaded to a backup system or the cloud.
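
To give a sense of why bandwidth matters for cloud restores, here is a back-of-the-envelope estimate in Python. The backup size, link speed and the 70% efficiency factor are assumptions made for the example, not measured values.

```python
def transfer_hours(backup_size_gb, link_mbps, efficiency=0.7):
    """Estimate the hours needed to pull a backup over a network link.

    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    size_megabits = backup_size_gb * 8 * 1000  # decimal gigabytes to megabits
    seconds = size_megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical case: a 2 TB backup over a 1 Gbit/s link.
hours = transfer_hours(backup_size_gb=2000, link_mbps=1000)
print(f"{hours:.1f} hours")  # prints "6.3 hours"
```

Comparing an estimate like this with the RTO shows quickly whether a cloud-only restore is realistic for a given system.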

Companies should also verify that backup or failover systems are operational as intended. This includes cloud capacity and disaster recovery facilities, if necessary.

Finally, can the company access the support services it needs to recover the data? These include power and cooling, communications and key personnel. It’s not enough to verify that the backup software worked as expected.

How do you audit backup processes?

A backup audit – or backup and recovery audit – is a formalized process to verify that backup and recovery are working as they should.

Backup audits should include checks on the location of data and the applications it supports, the protection of existing data, and the location of backup targets. This includes data held in and backed up to the cloud.

The audit will then focus on data recovery, including compliance with RPO and RTO objectives, and review the company’s backup and recovery policies and procedures. This covers the technical criteria as well as the designation of the person who will manage the recovery process.

The result will be a report with recommendations for action.

What is the 3-2-1 backup rule?

The 3-2-1 rule has long been used to ensure adequate data protection. It states that companies must keep three copies of their data, on at least two types of media or storage systems. A copy of the data must be off-site.

The 3-2-1 rule is much easier to follow now that the market offers a wide range of cloud backup services. However, in many sectors, off-site physical backup remains essential, particularly to protect against ransomware.

All parts of the 3-2-1 rule must be verified to ensure effective recovery and data integrity.
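
The three conditions can be checked mechanically against an inventory of copies. The following Python sketch assumes a simple, hypothetical record per copy (`BackupCopy`); real inventories will carry more detail, and this only checks that the copies are declared, not that they are restorable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media_type: str  # e.g. "disk", "tape", "cloud"
    offsite: bool


def check_3_2_1(copies):
    """Report which of the three conditions of the 3-2-1 rule are satisfied."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({copy.media_type for copy in copies}) >= 2,
        "one_offsite": any(copy.offsite for copy in copies),
    }


# Hypothetical inventory, for illustration only.
inventory = [
    BackupCopy("production", "disk", offsite=False),
    BackupCopy("local_nas", "disk", offsite=False),
    BackupCopy("cloud_bucket", "cloud", offsite=True),
]
result = check_3_2_1(inventory)
print(result)
print(all(result.values()))
```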

How do you test the integrity of a backup?

A backup is useless if it cannot be restored properly. It may seem obvious, but testing the integrity of backups is an essential part of any backup and recovery or business continuity plan.

Files can become corrupted or infected. Physical media such as magnetic tapes can degrade over time, become inaccessible or even be destroyed in a disaster. Cloud services may become unavailable or degrade, affecting the ability to retrieve sensitive data in the correct order.

Backup software uses tools such as checksum validation and hashing to verify that recovered data is logically intact. Vendors have also introduced AI-powered features that look for unusual patterns in data to spot ransomware and other forms of corruption.
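
As a minimal sketch of what checksum validation involves, the Python example below hashes a source file and its restored copy with SHA-256 and compares the digests. It is not how any particular backup product works; the function names and file contents are invented for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original, restored):
    """Return True when the restored file is byte-identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source.bin"
    restored = Path(tmp) / "restored.bin"
    source.write_bytes(b"payroll records")
    restored.write_bytes(b"payroll records")
    intact = verify_restore(source, restored)
    restored.write_bytes(b"payroll recorts")  # simulate corruption
    damaged = verify_restore(source, restored)

print(intact, damaged)  # prints "True False"
```

A matching digest only shows that the bytes are identical. It does not prove that the application using the data will start and behave correctly.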

The only sure way to test the integrity of a backup is to try to restore it. This poses practical problems, especially when it comes to restoring data on critical production systems that are in constant use. IT teams may need to test recovery one system at a time, or across virtual machines.

Some suppliers have developed alternatives. Commvault, for example, offers a “clean room” recovery product that will allow customers to restore data to a virtual replica of their environment in the cloud.

But it remains essential to also test recovery on physical hardware, especially for older systems that cannot be easily replicated on cloud technologies.

Why is it important to test backup and recovery procedures?

Testing procedures is as important as testing technology, but it is easy to overlook.

Much of backup testing rightly focuses on technical aspects, such as whether the backup software works as expected and whether backup files can be recovered and restored.

But often, when the restore fails, it’s for non-technical reasons. In a typical disaster recovery situation or ransomware attack, staff are under pressure, lines of communication are disrupted, and it is difficult to maintain control.

Backup and recovery procedures should define what needs to be done, when, and who is responsible for it. A clear plan and solid procedures will be a huge help when the worst happens. But this means that procedures must be tested as realistically as possible.

In this way, possible weaknesses can be identified and corrected before the procedures are used in anger. Can backups be found and the recovery systems activated? Are the systems recovering in the correct order? Is the disaster recovery environment, whether physical or cloud-based, working as expected? And does everyone know their role?

Disaster recovery is one of those cases where it really is about tools, processes and people. All three elements must be stress-tested.

What are the objectives of backup testing?

The main goal of backup testing is to ensure that files can be restored from backup copies to production systems.

Testing should ensure that production systems perform as they should after restoration. If a company plans to switch to a standby configuration in its own data center, with a disaster recovery provider, or in the cloud, it should verify that the switchover works and, above all, that the standby can take over from the production system when the time comes.

However, backup testing is not limited to the purely technical question of whether the backup works. As we have seen, companies need to test their general procedures to ensure that plans are executed in the correct order, that communications and control work as intended, and that everyone knows their role.

Going deeper, comprehensive backup testing can reveal much more about an organization’s readiness and resilience. Are RPOs and RTOs, for example, being met? And if they are, are they right for the business? Businesses evolve and an RTO that was acceptable five years ago may no longer be acceptable today.

Businesses must also consider regulatory requirements around business continuity and downtime.

How often should you test restores from backups?

The answer is simple: “as often as possible”. Large-scale backup and recovery testing is disruptive and potentially expensive, and may only be performed once a year. Other tests may be more frequent. This could be spot checks on critical applications, or integrating testing as part of application updates, for example.

Some systems may be tested daily, but this will depend on the criticality of the system, the importance of its data and, of course, the company’s view of risk.