What is Disaster Recovery Planning (DRP) and Why Do I Need It?
A disaster recovery plan (DRP) is a documented, structured approach with instructions for responding to unplanned incidents.
This step-by-step plan consists of the precautions to minimise the effects of a disaster so the organisation can continue to operate or quickly resume mission-critical functions. Typically, disaster recovery planning involves an analysis of business processes and continuity needs. Before generating a detailed plan, an organisation often performs a business impact analysis (BIA) and risk analysis (RA), and it establishes the recovery time objective (RTO) and recovery point objective (RPO).
A disaster recovery strategy should start at the business level and determine which applications are most important to running the organisation. The RTO describes the target amount of time a business application can be down, typically measured in hours, minutes or seconds. The RPO describes the point in time to which an application's data must be recovered, which in effect sets the maximum amount of data loss the organisation can tolerate.
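To make the two objectives concrete, the following minimal sketch checks a single incident against an RTO and an RPO. The function name, the timestamps and the target values are all hypothetical and chosen only for illustration; a real assessment would draw these figures from the BIA and from monitoring and backup records.

```python
from datetime import datetime, timedelta

def meets_objectives(outage_start, service_restored, last_good_backup, rto, rpo):
    """Return (rto_met, rpo_met) for one incident.

    Downtime is compared with the RTO; the gap between the last usable
    backup and the outage (the data that would be lost) is compared
    with the RPO.
    """
    downtime = service_restored - outage_start
    data_loss_window = outage_start - last_good_backup
    return downtime <= rto, data_loss_window <= rpo

# Hypothetical incident: 3.5 hours of downtime, last backup 30 minutes before the outage.
outage = datetime(2024, 1, 10, 9, 0)
restored = datetime(2024, 1, 10, 12, 30)
backup = datetime(2024, 1, 10, 8, 30)

rto_met, rpo_met = meets_objectives(
    outage, restored, backup,
    rto=timedelta(hours=4),      # assumed target, not from the article
    rpo=timedelta(minutes=15),   # assumed target, not from the article
)
print(rto_met, rpo_met)  # True False
```

In this example the application came back within its RTO, but the most recent backup was older than the RPO allows, so the backup frequency rather than the recovery procedure would need attention.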
Recovery strategies define what an organisation plans to do in response to an incident, while disaster recovery plans describe in detail how it should carry out that response.
In determining a recovery strategy, organisations should consider such issues as:
- Resources — people and physical facilities
- Management’s position on risks
Management approval of recovery strategies is important. All strategies should align with the organisation’s goals. Once disaster recovery strategies have been developed and approved, they can be translated into disaster recovery plans.
Disaster recovery planning steps
The disaster recovery plan process involves more than simply writing the document.
In advance of the writing, a risk analysis and business impact analysis help determine where to focus resources in the disaster recovery planning process. The BIA identifies the impacts of disruptive events and is the starting point for identifying risk within the context of disaster recovery. It also generates the RTO and RPO. The RA identifies threats and vulnerabilities that could disrupt the operation of systems and processes highlighted in the BIA. The RA assesses the likelihood of a disruptive event and outlines its potential severity.
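One common way to combine the likelihood and severity that a risk analysis produces is a simple score, likelihood multiplied by severity, which is then used to rank threats. The sketch below assumes that convention and a 1-to-5 scale for both values; the article does not prescribe either, and the threats and ratings shown are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    threat: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    severity: int    # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def score(self) -> int:
        # Likelihood x severity is one common convention, not the only one.
        return self.likelihood * self.severity

def rank_risks(risks):
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)

# Hypothetical entries for illustration only.
risks = [
    Risk("Power outage", likelihood=4, severity=3),
    Risk("Regional flood", likelihood=1, severity=5),
    Risk("Ransomware", likelihood=3, severity=5),
]

ranked = rank_risks(risks)
for risk in ranked:
    print(f"{risk.threat}: {risk.score}")
```

A ranking like this is only a starting point for deciding where to focus planning effort; a low-likelihood event with catastrophic severity may still deserve attention regardless of its score.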
A DR plan checklist includes the following steps:
- Establishing the scope of the activity;
- Gathering relevant network infrastructure documents;
- Identifying the most serious threats and vulnerabilities, and the most critical assets;
- Reviewing the history of unplanned incidents and outages, and how they were handled;
- Identifying the current DR strategies;
- Identifying the emergency response team;
- Having management review and approve the disaster recovery plan;
- Testing the plan;
- Updating the plan; and
- Implementing a DR plan audit.
Disaster recovery plans are living documents. Involving employees — from management to entry-level — helps to increase the value of the plan.
Creating a disaster recovery plan
An organisation can begin its DR plan with a summary of vital action steps and a list of important contacts, so the most essential information is quickly and easily accessible.
The plan should define the roles and responsibilities of disaster recovery team members and outline the criteria to launch the plan into action. The plan then specifies, in detail, the incident response and recovery activities.
Other important elements of a disaster recovery plan template include:
- Statement of intent and DR policy statement;
- Plan goals;
- Authentication tools, such as passwords;
- Geographical risks and factors;
- Tips for dealing with media;
- Financial and legal information and action steps; and
- Plan history.
Scope and objectives of DR planning
A disaster recovery plan can range in scope from basic to comprehensive. Some DRPs can be upward of 100 pages long.
Disaster recovery budgets can vary greatly and fluctuate over time. Organisations can take advantage of free resources available online.
A disaster recovery plan checklist of goals includes identifying critical IT systems and networks, prioritising the RTO, and outlining the steps needed to restart, reconfigure and recover systems and networks. The plan should at least minimise any negative effect on business operations. Employees should know basic emergency steps in the event of an unforeseen incident.
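As a rough illustration of prioritising by RTO, the sketch below orders systems so that those with the shortest RTO are restored first. The system names and RTO values are hypothetical, and a real plan would also need to account for dependencies between systems, which this example ignores.

```python
from datetime import timedelta

# Hypothetical systems and RTOs, for illustration only.
rto_by_system = {
    "payments-db": timedelta(minutes=30),
    "intranet-wiki": timedelta(hours=24),
    "email": timedelta(hours=4),
}

def recovery_order(rto_by_system):
    """Return system names sorted by ascending RTO (most urgent first)."""
    return sorted(rto_by_system, key=rto_by_system.get)

order = recovery_order(rto_by_system)
print(order)  # ['payments-db', 'email', 'intranet-wiki']
```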
Distance is an important, but often overlooked, element of the DR planning process. A disaster recovery site that is close to the primary data centre may seem ideal — in terms of cost, convenience, bandwidth and testing — but outages differ greatly in scope. A severe regional event can destroy the primary data centre and its DR site if the two are located too close together.
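A quick way to sanity-check site separation is to compute the great-circle distance between the two locations and compare it with a minimum the organisation has chosen. The sketch below uses the haversine formula; the coordinates and the 150 km threshold are assumptions for illustration, since the appropriate distance depends on the regional risks identified in the risk analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def sites_far_enough(primary, dr_site, min_km):
    """True if the DR site is at least min_km from the primary site."""
    return haversine_km(*primary, *dr_site) >= min_km

# Hypothetical site locations (approximate city-centre coordinates).
primary_site = (51.5074, -0.1278)   # London
dr_site = (53.4808, -2.2426)        # Manchester

distance = haversine_km(*primary_site, *dr_site)
print(f"{distance:.0f} km apart")
print(sites_far_enough(primary_site, dr_site, min_km=150))  # threshold is an assumption
```

Straight-line distance is only one factor: two sites far apart can still share a power grid, a flood plain or a network provider, so a check like this complements the risk analysis rather than replacing it.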
Specific types of disaster recovery plans
DR plans can be specifically tailored for a given environment.
- Virtualised disaster recovery plan. Virtualisation provides opportunities to implement disaster recovery in a more efficient and simpler way. A virtualised environment can spin up new virtual machine (VM) instances within minutes and provide application recovery through high availability. Testing can also be easier to achieve, but the plan must include the ability to validate that applications can be run in disaster recovery mode and returned to normal operations within the RPO and RTO.
- Network disaster recovery plan. Developing a plan for recovering a network gets more complicated as the complexity of the network increases. It is important to detail the step-by-step recovery procedure, test it properly and keep it updated. The data in this plan will be specific to the network, such as its performance and its networking staff.
- Cloud disaster recovery plan. Cloud-based disaster recovery can range from a file backup in the cloud to a complete replication. Cloud DR can be space-, time- and cost-efficient, but maintaining the disaster recovery plan requires proper management. The manager must know the location of physical and virtual servers. The plan must also address security, a common concern in the cloud that testing can help alleviate.
- Data centre disaster recovery plan. This type of plan focuses exclusively on the data centre facility and infrastructure. An operational risk assessment is a key element in data centre DR planning, and it analyses key components such as building location, power systems and protection, security and office space. The plan must address a broad range of possible scenarios.
Types of disasters
A disaster recovery plan protects an organisation from both human-made and natural disasters. There is not one specific way to recover from all kinds of disasters, so a plan should tackle a range of possibilities. A natural disaster may seem unlikely, but if it can happen in the organisation’s location, the DR plan should address it. Disasters also vary widely in scope, from the failure of a single application to events that affect several countries:
- Application failure
- VM failure
- Host failure
- Rack failure
- Communication failure
- Data centre disaster
- Building disaster
- Campus disaster
- Citywide disaster
- Regional disaster
- National disaster
- Multinational disaster
Testing your disaster recovery plan
DR plans are substantiated through testing, which identifies deficiencies and provides opportunities to fix problems before a disaster occurs. Testing can offer proof that the plan is effective.