When we plan for business continuity, there are various factors we need to look at. In my earlier article about BCP, I mentioned the importance of including cyber security threats in the Business Continuity Plan. As we all know, we normally consider disruptive events such as hurricanes and floods. If we build our own datacenters, both primary and backup, we need to do a lot of research to find locations that are not prone to natural disasters, and we also need to consider how readily the backup site can take over and recover operations following a disaster.
Going to cloud – escape from earthquakes and floods
The best approach, in my view, is to have a Disaster Recovery site in the cloud. Haha, this may sound like hosting our application in the sky to get away from all the disasters on earth. The benefit of this approach (Disaster Recovery as a Service) lies in how it deals with the risks involved: rapid recovery following a disaster, the ability to test, and cost. Most of these models follow a pay-per-use policy, which means we pay only when the service is used. For small or medium sized organizations, this is attractive because there is no upfront investment, unlike a private backup DR site.
Testing makes us Stronger
When we prepare BCP and DR plans, the most important factor to consider is testing. Only through testing can we find the gaps or issues in the plan, update it and release new versions. If you are not testing, you will run into unforeseen issues during a disaster, regardless of whether the setup is cloud based or on-premise. I admit that testing is a lengthy process and that there are costs involved, depending on the type of test you perform. With the DRaaS (Disaster Recovery as a Service) approach, the testing burden falls on your service provider. We only need to validate that our application works as it is supposed to. Testing also lets us identify how long it takes to recover an application and how that application performs when it is not running from the primary site.
No more tape transport
If we go with the traditional model, we copy data to tape and transport those tapes to an offsite location for safekeeping; they are shipped back if they are needed. This is time consuming, slow, costly and difficult to test. The traditional approach of shipping backup tapes is no longer effective when we look at disaster recovery in a broader sense. The events of September 11, 2001 changed the overall outlook of businesses, which started to take disaster recovery seriously. Enterprises started to think of backup sites from which they can run the show in case of a disaster. We also need to understand that backup and disaster recovery are two different services.
The story of RPOs and RTOs
We need to look closely at RPOs and RTOs when we decide between backup site options. The RPOs and RTOs of your mission critical assets are determined through Business Impact Analysis (BIA).
RPO stands for Recovery Point Objective. It is the amount of data you can afford to lose (measured in time) if an application has to be recovered following a disaster. If you have a scheduled backup that runs every day at 4 AM and you have a disaster or application crash at 2 PM, you are going to lose the valuable data created between 4 AM and 2 PM, which is 10 hours of work. This is where you should consider what sort of RPO you need to plan for. If you can't tolerate much data loss, you should go for a narrow RPO. But the narrower your RPO is, the more expensive your data protection solution will be. Recovery Time Objective (RTO) is the maximum time your application or system can stay offline. If your RTO is 2 hours, you need to be prepared to bring your application or system back online within 2 hours, and for that you need to look at options like a redundant site or a hot site, which are costly. If you have a long RTO, you can think about less costly options like a warm site or a cold site. The point I am trying to make here is that the choice of your backup site should be based on your RPOs and RTOs, and those are determined through Business Impact Analysis.
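To make the arithmetic above concrete, here is a minimal Python sketch. The function names, the calendar date and the RTO thresholds used to pick a site type are all my own illustrative assumptions; in practice these figures would come out of your Business Impact Analysis and not from a fixed rule.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return how much data (measured in time) is lost when restoring from last_backup."""
    if failure < last_backup:
        raise ValueError("failure must not precede the last backup")
    return failure - last_backup


def meets_rpo(last_backup: datetime, failure: datetime, rpo: timedelta) -> bool:
    """True if the data lost since the last backup stays within the RPO."""
    return data_loss_window(last_backup, failure) <= rpo


def suggest_site(rto: timedelta) -> str:
    """Map an RTO to a site type. Thresholds are hypothetical, not an industry standard."""
    if rto <= timedelta(hours=4):
        return "hot site"
    if rto <= timedelta(hours=48):
        return "warm site"
    return "cold site"


# The example from the text: backup at 4 AM, crash at 2 PM (the date is arbitrary).
last_backup = datetime(2024, 1, 15, 4, 0)
failure = datetime(2024, 1, 15, 14, 0)

print(data_loss_window(last_backup, failure))                # 10:00:00
print(meets_rpo(last_backup, failure, timedelta(hours=1)))   # False
print(suggest_site(timedelta(hours=2)))                      # hot site
```

With a daily backup, the worst case loss approaches 24 hours, so an RPO of 1 hour would call for much more frequent backups or continuous replication.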
Backup site – Now a possibility for SMBs
For a small or medium sized business, a hot site or redundant site would be a costly option.
This is where most small and medium sized enterprises start to see the value and importance of the cloud. They can opt for the "Disaster Recovery as a Service" approach, as it requires no upfront cost compared to the other options. Here we do not own the hardware or the technology. Most of the solutions offer a pay-per-use policy, so we pay only when we fail over to the cloud. Failing over enables enterprises to resume operations much more quickly than the traditional DR approach. More than just a data recovery option, most Disaster Recovery as a Service (DRaaS) solutions replicate infrastructure and applications to help ensure full continuity of business operations.
The DRaaS Architecture
DRaaS service providers work with their customers to deploy a replication technology, mostly software based, which continually copies the data changes in the customer environment to the cloud. Images of the customer's servers are created and stored as standby virtual machines. All the other important decisions, such as the boot order of the servers, their IP addressing scheme and VLANs, are made upfront and applied when a recovery is needed. Following a disaster, the provider uses pre-built automation to boot the virtual machines. After that, the customers get access to their environment through a portal that provides console access. With this approach, there is no need to physically transport backups to the disaster recovery site or to manually rebuild the servers. Our data and applications are stored and mirrored offsite, and the recovery is managed completely by our DRaaS service provider.
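The idea of deciding boot order, IP addressing and VLANs upfront can be pictured as a small recovery plan. The Python sketch below is only an illustration under my own assumptions: the `StandbyVM` record, the server names and the addresses are hypothetical and do not reflect any particular provider's format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyVM:
    """One standby virtual machine in a hypothetical recovery plan."""
    name: str
    boot_order: int   # lower numbers boot first
    ip_address: str
    vlan: int


def recovery_sequence(vms):
    """Group VM names by boot order; each group starts after the previous one."""
    groups = {}
    for vm in vms:
        groups.setdefault(vm.boot_order, []).append(vm.name)
    return [sorted(groups[order]) for order in sorted(groups)]


# Decisions made upfront, long before any disaster (all values are made up).
plan = [
    StandbyVM("web-01", 3, "10.0.30.10", 30),
    StandbyVM("db-01", 1, "10.0.10.10", 10),
    StandbyVM("app-01", 2, "10.0.20.10", 20),
    StandbyVM("app-02", 2, "10.0.20.11", 20),
]

for step, names in enumerate(recovery_sequence(plan), start=1):
    print(f"step {step}: boot {', '.join(names)}")
```

Here the database comes up first, then the application servers, then the web tier, so each layer finds its dependencies already running when it boots.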
Sungard, IBM, NTT, Axcient, Virtustream, Dell, VMware, Verizon and Acronis are some of the leading DRaaS providers.
Challenges to be discussed in Part 2
Still, most enterprises are hesitant to trust a service provider, primarily due to concerns about availability, performance, security and compliance. I too have a lot of concerns about this. We shall look at the challenges in Part 2 of this series.
Authored by Aju Nair