Not only does well-configured monitoring alert you to impending disaster, it also provides the peace of mind that all systems are nominal.
“An ounce of prevention is worth a pound of cure.”
It’s all well and good to have your servers and backup systems in place, but they can’t be expected to cry for help without some form of monitoring and notification. Knowing that your disaster recovery plan will be there when you need it is crucial, and that’s where proactive monitoring and testing come in.
There are many reliable, effective and low-cost monitoring solutions available, offering everything from daily email notifications to comprehensive real-time alerting via email, SMS and dedicated monitoring screens.
What your business requires depends, again, on your particular infrastructure, but several vital system components should always be included in even the most basic monitoring solution:
- Hardware: Notification of failed or failing hard disks (or SSDs) in both your production and backup systems is of utmost importance. Your systems should be designed to cope with one or more drive failures, but knowing the state of the disks lets you act on an alert before it leads to a system failure.
- Software: Alerts and warnings from the backup software itself are just as important. They tell you whether the jobs are running and whether they succeeded or failed, and in the case of a failure they should include a useful error report. Never assume backups have been running without incident: you do not want to face a system failure and only then discover that your backups have not been successful for weeks.
- Operating Systems: To monitor the production environment itself, you can install agents on the servers that report in real time when critical services or hardware components experience issues or go offline. On virtual platforms, you should at the very least schedule scripts that interrogate the hypervisor’s hardware sensors and raise alerts accordingly.
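As a rough illustration of the backup-software check described above, here is a minimal Python sketch that flags jobs which failed, never ran, or have not succeeded recently. The `BackupJob` record, the `backup_alerts` function and the 26-hour threshold are all hypothetical names and values chosen for this example; in practice the job data would come from whatever your backup software reports, and the resulting messages would be handed to your notification channel of choice.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BackupJob:
    """Hypothetical record of a backup job's most recent run."""
    name: str
    last_run: Optional[datetime]  # None if the job has never run
    succeeded: bool               # outcome of the most recent run
    error: str = ""               # error report from the backup software, if any


def backup_alerts(jobs: List[BackupJob], now: datetime,
                  max_age: timedelta = timedelta(hours=26)) -> List[str]:
    """Return one alert message per job that never ran, failed, or is stale.

    max_age is an assumed threshold: a daily job with a little slack.
    """
    alerts = []
    for job in jobs:
        if job.last_run is None:
            alerts.append(f"{job.name}: has never run")
        elif not job.succeeded:
            reason = job.error or "no error reported"
            alerts.append(f"{job.name}: last run failed ({reason})")
        elif now - job.last_run > max_age:
            stamp = job.last_run.strftime("%Y-%m-%d %H:%M")
            alerts.append(f"{job.name}: last successful run was {stamp}")
    return alerts
```

Run on a schedule, a check like this turns the silent "backups have been failing for weeks" scenario into a message you see the next morning. The same pattern (collect status, compare against a threshold, emit alerts) could be applied to disk health or hypervisor sensor readings.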
When it comes to a comprehensive disaster recovery plan, best practice is not to settle for only one or two sides of the triangle. If you are implementing a well-designed backup solution, spend the time and resources to set up adequate monitoring as well. If you neglect monitoring alongside your backup system, you are undermining your own efforts to stave off disaster.