The demands on your Backup & DR system have changed considerably over the last few years. You rely far more on your IT systems than you ever did. Telling a customer to "come back later because our computer system is down" is no longer tolerated. Instead of getting patience, you're more likely to lose the customer and have your name plastered over social media.
It wasn't that long ago that doing a backup once a day (overnight) was fine, the occasional backup failure was OK (because you had the prior day's backup) and extended downtime for a recovery was a pain, but manageable. Doing a full DR test was hard (because you didn't have the hardware resource, or the time), but as long as you were recovering the odd file now and again, that proved data was being backed up successfully.
When it comes to RPO/RTO (there’s an option on this page for a FREE RPO/RTO Worksheet), there's always a chasm between what the business believes, what IT thinks is possible and the actual reality. To narrow that chasm, you need some great, solid features at your disposal. Here are eleven of them…
1. Rapid Recovery
You need systems to be back in minutes from even a reasonably severe disaster (e.g. complete failure of a server). It can't be hours or days. In the event of something massive (e.g. you lose your premises), recovery can stretch to a handful of hours (because you'll need to relocate your staff), but no longer. This is your RTO - Recovery Time Objective.
2. Minimal Data Loss
Recreating data ranges from hard to impossible. While some businesses might have repeatable processes that involve keying data, there's a lot of freestyling that is purely digital e.g. emails, updating records, writing notes (CRM) etc. People will never remember what they changed yesterday, the day before, or probably even this morning. Achieving zero data loss is expensive. Reducing the loss to a few minutes is within the scope of most businesses. Backups should run at least every few hours and, for critical, constantly changing data, every 5-15 minutes. This is your RPO - Recovery Point Objective.
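As a rough illustration of checking an RPO, the sketch below finds the longest stretch with no completed backup and compares it with the target. The function names and the sample timestamps are made up for this example; a real backup product would report this from its own job history.

```python
from datetime import datetime, timedelta

def worst_gap(backup_times, now):
    """Longest period with no completed backup, measured up to `now`."""
    if not backup_times:
        raise ValueError("no backups recorded, so the gap is unbounded")
    points = sorted(backup_times) + [now]
    return max(later - earlier for earlier, later in zip(points, points[1:]))

def meets_rpo(backup_times, now, rpo):
    """True if a failure at any moment would have lost at most `rpo` of data."""
    return worst_gap(backup_times, now) <= rpo

# Hypothetical job history: backups finished 50, 35, 20 and 5 minutes ago.
now = datetime(2024, 1, 1, 12, 0)
backups = [now - timedelta(minutes=m) for m in (50, 35, 20, 5)]

print(worst_gap(backups, now))                          # 0:15:00
print(meets_rpo(backups, now, timedelta(minutes=15)))   # True
```

One skipped or failed job doubles the gap, which is why a single missed backup can quietly break a 15 minute RPO.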
3. Image Based Backups
In the event of losing an entire server, having only file based backups means you first of all have to install an operating system, then all your applications, then configure them (you've got that all documented - right?) and then restore the files. You could be looking at anything from a few hours for a basic file/print server, to days for a complex ERP system. With an image based backup, you get a snapshot of the whole disk partition (operating system, applications, configuration and data) that you can bring back into use quickly.
4. Automated Testing
You need to know that the images you backed up will actually work. That needs to be checked by the backup system, not left to someone doing it manually. Manual testing quickly gets cast aside due to lack of time and is eventually forgotten. If your backup results in a bootable virtual machine (VM), that machine should be getting booted after every backup.
5. On-Premises (local) and Remote (cloud) Backups
Using removable media and manual processes to get data offsite suffers from the same problem as doing manual tests. Time gets in the way. Having a cloud only backup opens you up to very lengthy recovery times. Bringing vast volumes of data back from the cloud can take days. When you have a disaster, you ideally want to have local access to your data and systems. If you lose your premises, then you ideally want to resort to running your systems in the cloud. Pulling data from the cloud is probably of no benefit because you don't have the hardware to use it. Having your data purely onsite (even if it's a remote site, connected by VPN) leaves you highly vulnerable to malware (see Protocol Gap below).
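To put a number on "days", here is a back-of-the-envelope sketch of how long a cloud restore might take. The 5 TB data size, the 100 Mbit/s link and the 80% usable-bandwidth figure are illustrative assumptions, not measurements.

```python
def restore_hours(data_tb, link_mbps, efficiency=0.8):
    """Estimate hours to download `data_tb` terabytes over a `link_mbps` link.

    `efficiency` is an assumed fraction of the link that is really usable
    once protocol overhead and other traffic are taken into account.
    """
    bits = data_tb * 1e12 * 8                   # decimal terabytes to bits
    usable_bps = link_mbps * 1e6 * efficiency   # usable bits per second
    return bits / usable_bps / 3600

hours = restore_hours(5, 100)
print(round(hours, 1))        # 138.9 hours
print(round(hours / 24, 1))   # 5.8 days
```

Even on a perfect link (efficiency of 1.0), the same restore would take about 111 hours, which is well outside any RTO measured in minutes or a handful of hours.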
6. Hardware You Can Recover To Rapidly
If you are backing up to a local device, being able to use that device as a server to run your backed-up systems from gives you the fastest recovery time possible. Having to move hundreds of gigabytes or terabytes of data from the backup device to new equipment, even with fast networking, can take hours. Far better if you can operate it in situ. This sort of system is also great for testing software updates. You can test them in isolation on your backup device before committing them to your production critical live machines.
7. Backing Up While In Recovery Mode
Being able to recover a failed server quickly is fabulous - it gets everyone back and running quickly. However, they're still creating data and that data still needs to be backed up. Ideally you want this to just happen automatically.
8. Recovering Back While Still in Recovery Mode
At some point you need to get your recovered server back onto new or repaired hardware. If you then need downtime of tens of hours to copy data back, all you did was postpone the disaster to a point where it was perhaps more manageable. However, not every business can manage to shut systems down overnight or over the weekend, and you then need people available to manage that recovery. Far better if you can essentially back up your live recovery machine to new hardware while it's servicing its users, and then just arrange a few minutes of downtime while you shut down the recovery machine and bring up the new live one.
9. Protocol Gap
You need some sort of technology gap between your day to day systems and the systems that manage your backups. Think of it like a moat or trench that surrounds where your backups are stored. If the technology we use day-to-day to access and manipulate data can be abused by malware, you don't want that technology to have any access to your backups. Too many companies have been unable to recover from a ransomware attack because the malware encrypted their backups too.
10. Rapid Ransomware Recovery
A good backup system sees every file write operation. It's a good place to not only monitor for malicious ransomware activity, but also to see which files the ransomware encrypted. Being able to ask your backup software to compare the live system against the last good backup (which might only be minutes old) and replace the files which have been maliciously changed means you can not only ignore the ransom, but get your company back and running quickly.
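The idea of comparing the live system against the last good backup can be sketched with file hashes. This is a simplified illustration, not how any particular backup product works: the function names are invented for the example, and it flags every changed or missing file, so legitimate edits would show up alongside malicious ones.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use low."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to `root`) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }

def changed_since(manifest, live_root):
    """Files in the last good manifest that are now altered or missing."""
    live = build_manifest(live_root)
    return sorted(
        name for name, digest in manifest.items()
        if live.get(name) != digest
    )
```

In this sketch you would call `build_manifest` when a backup is known to be good, then call `changed_since` after an attack to get the list of files to restore. Files that the ransomware renamed appear as missing, so they are included in the list.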
11. Under Maintenance
You can't afford to be running backup software that's out of date and no longer supported by the vendor. When something isn't restoring during a DR, nobody wants to turn to Google.
These eleven things are the difference between a poor/good backup system and one that's epic. Recovering from a disaster shouldn't be a challenge. It should simply be "oh, one of these again - we practice these regularly".
“We chose Datto for ourselves some five years ago, having spent almost three years looking for something that was better than good. Like our customers, we rely massively on our IT systems in order to function. In leading by example, our customers elected to set that same benchmark too.”
In our nearly thirty years in business, we've had our fair share of heinous disasters to recover from. Years ago, some of them were unavoidable because the technology to do better either didn't exist, or was too expensive. Today, it's available. Yes, it's more expensive than a bad solution, but you're likely paying business insurance for things that have a tiny chance of happening. With the current cyber landscape the way it is and getting worse, a disaster is more likely a "when" than an "if" scenario.
Get in touch. We can help get you ready for an uneventful disaster recovery that should be so boring, there's no shock-factor story to tell.