Disaster Recovery Planning
Disasters happen – and some are unavoidable. So, what can we do to make sure that our company assets will be safe?
In this article, we will cover 4 main topics:
- What is disaster recovery?
- Planning Theory and Implementation
- Disaster Recovery with Therefore™
- Customer Story
What is Disaster Recovery?
First, we need to define what a disaster is.
A disaster can be defined as a “sudden or great misfortune” or simply “any unfortunate event.” More precisely, a disaster is “an event whose timing is unexpected and whose consequences are seriously destructive.”
These definitions identify an event that includes three elements:
- Suddenness
- Unexpectedness
- Significant destruction and/or adverse consequences
It’s also important to remember that lack of foresight or planning for disasters can make a disaster’s impact so much worse than it needs to be. (https://www.umsystem.edu/ums/fa/management/records/disaster-guide-disaster)
Disaster recovery comprises the methods and procedures for returning a data center to full operation after a catastrophic interruption, including the recovery of lost data. It sometimes uses alternative network circuits to re-establish communications channels if the primary channels are disconnected or malfunctioning. (https://www.gartner.com/it-glossary/dr-disaster-recovery)
Why is this so important?
- 93% of companies that lost their data center for 10 days or more due to a disaster filed for bankruptcy within one year of the disaster
- 94% of companies suffering from a catastrophic data loss do not survive – 43% never reopen and 51% close within two years
- 77% of companies that tested their tape backups found backup failures
- 7 out of 10 small firms that experience a major data loss go out of business within a year
Planning Theory and Implementation
Let’s look at the differences between a disaster recovery plan and a security recovery plan.
Both plans focus on what must be done after a disaster, but a DR plan is focused on general business continuity after a disruption, while a security recovery plan focuses specifically on information asset protection after a data breach has occurred.
The type of response required by each type of planning also differs. A DR plan includes plans for external communications, for example to customers and shareholders, while a security plan is much more inward facing, with the main goal of carefully analyzing the root cause of the issue and collecting evidence.
The general focus of the two plans also differs. A DR plan is mostly concerned with recovering operations and processes ASAP, while a security recovery plan places a larger emphasis on longer-term recovery, improvement, and future loss prevention.
Finally, the two plans differ in how they change over time. Cybersecurity threats evolve daily and can be very unpredictable, so a security recovery plan needs to be actively maintained with the latest knowledge. A DR plan, on the other hand, tends to change more slowly because the scenarios it prepares for, such as natural disasters, have more predictable outcomes and consequences. Even so, it’s essential that both plans be revisited and updated frequently; neither should be treated as fixed.
The key to effective DR planning is to:
- Document, manage, and test plans
- Develop a common governance, communication, and escalation methodology
This minimizes confusion and decreases the time to recover from events, allowing you to maintain service level agreements.
Business continuity plans are another important component of a healthy disaster recovery plan. The two are very closely associated and can be thought of as two sides of the same coin: intertwined, but with different focuses.
Disaster recovery planning is the process, or the set of actions, by which you resume business after a major disruptive event has occurred. It’s focused on “rolling up your sleeves” and doing everything possible to restore normal operations. Business continuity planning, on the other hand, is more theoretical, focusing on how the business will continue to function even after smaller disruptions.
Disaster recovery plans should be very factual and realistic, focusing on what types of disasters could happen and their potential effects on systems and people. They should also address which actions are required to return to normal operations and how, exactly, the situation is communicated externally, for example to shareholders, the media, or the authorities.
Business continuity plans should focus on more subjective topics, such as which information and systems are most vital to continue business. I say subjective here because these are generally topics people could have differing opinions about. For example, some may say it’s more important to restart a production line ASAP, while others would argue that re-establishing the facility’s telecommunications network has the highest priority. Business continuity plans should also answer questions about communication structures and who needs to access them in case of an emergency.
By keeping these core tenets in mind, you’re laying the foundation for an effective strategy that will minimize confusion and decrease recovery time when disaster strikes.
Disaster Recovery with Therefore™
If we look at this chart, we can see hardware/system failures are still the most common cause of data loss. So, what does this mean? If you want to make sure your information is safe, you need to have a system to manage it. This means that a properly backed up information management system is an essential component of an effective disaster recovery plan.
Here at Therefore, we generally recommend Therefore™ Online for those looking for peace of mind. Therefore™ Online runs on the Microsoft Azure platform, a leading cloud infrastructure platform. This allows us to offer the service globally, in many different data centers in different geographies. These data centers are also geo-redundant, which means each one is paired with another physical data center in the same geography but in a different location. If a natural disaster should affect one data center, the paired backup data center is far enough away that the risk is massively reduced, if not eliminated altogether. Therefore™ Online also offers high availability, with an SLA of 99.95% uptime. The system is scalable to meet performance needs, complies with many global standards, and is highly secure: besides the innate security that comes with Therefore™ in general, Azure offers additional layers of encryption both at the file system level and while data is in transit. In short, Therefore™ Online provides the best protection and backup possible in case of disasters, and should be the preferred option for technical implementations that require high resiliency.
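To put the 99.95% figure in perspective, it helps to convert it into a downtime budget. The short sketch below is plain arithmetic, not part of any Therefore™ or Azure tooling; the function name and the 30-day month are assumptions made for illustration, and how an SLA actually measures downtime depends on its own terms.

```python
# Illustrative arithmetic only: converts an uptime percentage into a downtime budget.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Return the maximum downtime (in hours) that still meets the uptime target."""
    return period_hours * (1 - uptime_percent / 100)


yearly_hours = allowed_downtime_hours(99.95, HOURS_PER_YEAR)
monthly_minutes = allowed_downtime_hours(99.95, 30 * 24) * 60

print(f"Per year: {yearly_hours:.2f} hours")                  # Per year: 4.38 hours
print(f"Per 30-day month: {monthly_minutes:.1f} minutes")     # Per 30-day month: 21.6 minutes
```

In other words, a 99.95% target leaves room for roughly four and a half hours of downtime over a full year.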
However, we understand that cloud solutions are not for everyone. There are still many, many businesses and organizations that choose to run their applications on site using Therefore™ On-Premise solutions. This is where good disaster recovery planning becomes essential, since on-premise installations are the responsibility of the party running them alone. Fortunately, there’s a lot you can do with Therefore™ to help you not only plan, but recover after disaster strikes.
Therefore™ On-Premise and Disaster Recovery Planning
3 Key Concepts:
- Secure storage
- Automatic and manual backups
- Failover concept
Documents in Therefore™ are made up of two components: the actual files themselves, which are packaged in a secure proprietary format, and the index data, which is used to classify and later find them in the system. The document itself is saved to what we call a category in Therefore™, which is a dedicated space for related documents. For example, you can have categories for things like invoices, statements, employee files, reports, or whatever type of document you may have. At the same time, the metadata of the document, which we call index data, is saved to the database, and a link is maintained between the two.
Now that we understand how documents are stored, we can discuss how they’re backed up.
Primary and backup storage is a standard feature offered in every Therefore™ system, regardless of version. The storage locations are simply selected when the system is configured the first time. It’s highly recommended that the backup storage be in a different physical location than the primary storage. This could be a disk in another part of the building, a different building completely, or even in another part of the world. All that’s needed is network connectivity so the Therefore™ Server can connect to it.
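Since testing is a key part of DR planning, it can be worth checking from time to time that a backup location really mirrors primary storage. The sketch below is a generic, file-level comparison using SHA-256 checksums. It is not a Therefore™ utility and assumes nothing about the proprietary storage format; the function names, folder layout, and file names are hypothetical and exist only for the demonstration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(primary: Path, backup: Path) -> list[str]:
    """Return relative paths that are missing from backup or differ in content."""
    problems = []
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        copy = backup / relative
        if not copy.is_file() or sha256_of(source) != sha256_of(copy):
            problems.append(relative.as_posix())
    return problems


# Demonstration with throwaway folders (hypothetical file names).
with tempfile.TemporaryDirectory() as root:
    primary, backup = Path(root, "primary"), Path(root, "backup")
    primary.mkdir()
    backup.mkdir()
    (primary / "invoice.bin").write_bytes(b"invoice")
    (backup / "invoice.bin").write_bytes(b"invoice")
    (primary / "report.bin").write_bytes(b"report")  # never copied to backup
    mismatches = find_mismatches(primary, backup)

print(mismatches)  # ['report.bin']
```

An empty result means every file found in primary storage has an identical copy in the backup location. The check runs in one direction only, so it does not report extra files that exist only in the backup.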
The Therefore™ system can automatically replicate the data to both primary and backup storage without any user intervention at all. This data transfer is determined by a completely configurable migration schedule, so the system administrator has the power to determine the delta, or amount of risk they’re willing to tolerate in case of a disaster. For example, they may say that documents should be migrated every single day at midnight so the most they can lose in case the system is inoperable would be one day’s worth of information.
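The relationship between the migration schedule and the amount of data at risk can be worked out with a few lines of code. This is a minimal sketch that assumes a single daily run which always succeeds and completes instantly; the function names and dates are made up for illustration and are not part of Therefore™.

```python
from datetime import datetime, time, timedelta


def last_scheduled_run(failure: datetime, run_at: time) -> datetime:
    """Return the most recent daily run at or before the moment of failure."""
    candidate = datetime.combine(failure.date(), run_at)
    if candidate > failure:
        candidate -= timedelta(days=1)  # today's run has not happened yet
    return candidate


def exposure(failure: datetime, run_at: time) -> timedelta:
    """Time span of documents not yet migrated to backup storage at failure time."""
    return failure - last_scheduled_run(failure, run_at)


midnight = time(0, 0)
afternoon_outage = exposure(datetime(2024, 1, 10, 17, 30), midnight)
late_outage = exposure(datetime(2024, 1, 10, 23, 59), midnight)

print(afternoon_outage)  # 17:30:00
print(late_outage)       # 23:59:00
```

With a nightly schedule, the exposure never reaches a full day. Running the migration more often shrinks that window, at the cost of more frequent data transfers.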
The standard backup utilities provided by database manufacturers like Microsoft can be used to make backup copies of the Therefore™ database and, in the case of a disaster, restore it to recreate your Therefore™ system with either no or only minimal data loss.
Finally, Therefore™ also offers some options for manual backup and recovery. Even if the Therefore™ server is destroyed or inoperable, it’s very easy to get up and running again. If you’ve stuck to your DR plan and backed up your documents and database, you can simply fire up some new hardware, reinstall Therefore™, and be up and running in very little time. Depending on your license type, you may need Therefore™ support to reset it for your new hardware.
There are also utilities like the Export Utility and Document Recovery Utility that you can use in emergency scenarios where data is only partially backed up. Therefore™ as a product is focused on keeping your information safe even in a worst-case scenario, so we offer the tools and expertise you need to make sure you can recover and get back to business as quickly as possible.
The failover concept refers to the ability to set up a highly available system that’s meant to continue operating even in the event of a disaster.
While the storage and backup features of Therefore™ ensure that you can recover and get going after the fact, the failover features are meant to keep you going without any interruption to your business.
So, how does this work in Therefore™? Therefore™ technology is based on and is compatible with many Microsoft technologies, including high-availability clustering. The main objective of a clustered system is to provide protection and fallback in case a resource, in this case a Therefore™ server, goes offline.
A cluster is made up of 2 nodes, in this case Therefore™ servers, that are joined together within a private network. The system is set up in such a way that it can “listen” for both nodes and knows how to distribute the load between them. So if, for example, there’s a fire in the server room and one node is destroyed, the system recognizes the failure of that node and automatically “fails over” to the other. Users experience no interruption because the redundant resource provides everything needed for the system to continue operating normally. In general, the nodes are in separate locations to avoid the risks we mentioned, such as physical destruction or the inability to access the first resource for any reason.
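The “listening” and failing over described above can be illustrated with a toy model. The sketch below is a deliberately simplified active/standby simulation based on heartbeat timestamps. It does not model load distribution and is not how Microsoft clustering or Therefore™ is implemented; the class names, node names, and the 5-second timeout are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # seconds on a shared clock


class TwoNodeCluster:
    """Toy model: one active node, one standby, health judged by heartbeat age."""

    def __init__(self, primary: Node, secondary: Node, timeout: float = 5.0):
        self.nodes = [primary, secondary]
        self.active = primary
        self.timeout = timeout

    def is_alive(self, node: Node, now: float) -> bool:
        return now - node.last_heartbeat <= self.timeout

    def check(self, now: float) -> Node:
        """Keep the active node if it is alive; otherwise switch to the standby."""
        if not self.is_alive(self.active, now):
            standby = next(n for n in self.nodes if n is not self.active)
            if not self.is_alive(standby, now):
                raise RuntimeError("no healthy node available")
            self.active = standby
        return self.active


server_a = Node("server-a", last_heartbeat=10.0)
server_b = Node("server-b", last_heartbeat=10.0)
cluster = TwoNodeCluster(server_a, server_b)
print(cluster.check(now=12.0).name)  # server-a

server_b.last_heartbeat = 20.0       # only server-b still reports in
print(cluster.check(now=20.0).name)  # server-b
```

The second check fails over because server-a has been silent for longer than the timeout while server-b is still reporting in.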
So, now we’ve seen the different types of tools and options that Therefore™ offers you so you can leverage the software as part of an effective disaster recovery strategy.
Customer Story
GEV Austria, located in Salzburg, operates as a warehouse for spare parts for domestic goods such as kitchen appliances: ovens, dishwashers, refrigerators, and so on.
A torrential rain caused a river near the warehouse to swell and flood the surrounding area. Unfortunately for GEV, their warehouse was close enough to the river to suffer heavily from the flooding.
At the flood’s highest point, the office was covered in about 90 centimeters, or 3 feet, of water. As you can imagine, this caused massive damage. Everything in the basement and at ground level, including a slew of Canon MFPs unfortunately, was completely destroyed by the water. Furthermore, half a million euros’ worth of inventory was destroyed or damaged beyond repair.
Because they were prepared and kept a cool head, they were able to remove the Therefore™ server from the premises and bring it to a safe location shortly before the warehouse flooded. Once the floodwaters had retreated and the building was pumped out, the system was rebooted with no issues.
GEV was able to get back to business very quickly. Just 3 days after the disaster, they were able to fulfill their usual number of daily orders once again, leaving their suppliers and customers astounded at the speed of their recovery. Because they were saving all documents digitally in Therefore™, they had no physical files to lose when there was no time to remove them. Everything was digital, so it was safe. Even all their pending orders were saved in Therefore™ and thus recovered, meaning that in the end they didn’t even lose any sales. Since they trusted in Therefore™, their data was preserved, and no business-critical information was lost.
The general manager of GEV Austria, Mr. Friedrich Staller, had this to say about the incident:
Even though it hurt to lose our inventory, the goods can be replaced. Thanks to our electronic archive, we didn’t lose a single business document and therefore we didn’t lose a single order. By saving our document management system we also saved our company.
We’re very happy to hear that GEV Austria was able to prepare for this possibility by investing in an information management system like Therefore™, which did wonders in making a terrible situation more tolerable.
See our prerecorded webinar on Disaster Recovery here: https://www.youtube.com/watch?v=MHivqJnqjEI
Check out our upcoming webinars here: https://therefore.net/webinars/