Part of cyber resilience is considering what to do when the worst happens. And that worst-case scenario is, sadly, likely to be inevitable at some point. It will take the form of a significant incident, a disaster from which the school or college needs to recover, and in planning for this a Disaster Recovery (DR) plan should have been created. But what should such a plan look like?
I have given this quite a bit of thought. Should the disaster recovery plan be a long and detailed document, or something much simpler and more digestible?
On the one hand, we might want the long document with all the details, as in the event of a disaster we will want as much information as possible to help us first isolate and manage the incident and then, later, recover from it. The issue is that when the fire has been lit under the IT Services team due to an IT incident, the last thing anyone wants to do is wade through a long and complex document. I have seen a disaster plan which included lots of Gantt charts with estimated timelines for different parts of the recovery, but how can we predict this with any accuracy against the multitude of different potential scenarios? Additionally, the information you will actually need is likely to depend very much on the nature of the incident.
The flip side is the much more manageable document, which is easier to digest and refer to in a high-stress crisis situation, but whose brevity means it will lack some of the detail you may want. That said, a shorter document will be easier to rehearse with when running simulated and desktop incidents, so that staff remember the structure and are largely able to act without needing to refer too often to the supporting DR plan. It is also more likely to be applicable across a wider range of scenarios.
The above, however, suggests only two options, detail or brevity and ease of use, but my thinking on DR has led me to believe we need both. We need a brief incident plan which is general and fits almost all possible incidents. It should consider how an incident might be called and which roles will need to be filled, including contact details for the various people who might fill each of the roles. It should consider the initial steps only: getting the incident team together so they can then respond to the specific nature of the incident in hand. It is the outline process for calling an incident and for its initial management.
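To illustrate just how little this brief plan needs to hold, here is a minimal sketch in Python. It is only one possible way of laying things out, not a recommended format. The role titles, names, phone numbers and steps are all hypothetical placeholders I have made up for the example, and your own context will differ.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    name: str
    phone: str


@dataclass
class Role:
    title: str
    # Ordered list: the first person is the primary, the rest are deputies.
    candidates: list[Contact] = field(default_factory=list)


@dataclass
class IncidentPlan:
    who_can_call: list[str]   # who is allowed to declare an incident
    roles: list[Role]         # roles to fill once an incident is called
    initial_steps: list[str]  # first steps only, not the full recovery

    def roles_without_deputy(self) -> list[str]:
        """Return the titles of roles that have fewer than two named people."""
        return [r.title for r in self.roles if len(r.candidates) < 2]

    def call_sheet(self) -> str:
        """Return a one-line-per-role contact list for use in a crisis."""
        lines = []
        for role in self.roles:
            people = ", ".join(f"{c.name} ({c.phone})" for c in role.candidates)
            lines.append(f"{role.title}: {people}")
        return "\n".join(lines)


# Hypothetical example data: every name and number here is invented.
plan = IncidentPlan(
    who_can_call=["Director of IT", "Bursar"],
    roles=[
        Role("Incident lead", [Contact("A. Example", "01632 960001"),
                               Contact("B. Example", "01632 960002")]),
        Role("Communications lead", [Contact("C. Example", "01632 960003")]),
    ],
    initial_steps=[
        "Declare the incident",
        "Contact the incident team using the call sheet",
        "Agree where and how the team will meet",
    ],
)

print(plan.call_sheet())
print("Roles with no deputy:", plan.roles_without_deputy())
```

Even a simple check such as `roles_without_deputy` can highlight a gap, in this case a role which depends on a single person, before an incident rather than during one.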
Then we need the reference information which will aid in the identification and management of, and eventual recovery from, an incident. Most of this should already exist in proper documentation of systems, setup and processes; however, this documentation is often missing. When things are busy, it's often about setting things up, deploying technology or fixing issues, and documenting activities, configurations, etc. is put off for another day, a day which often never comes. I think the creation of this documentation may actually be key.
The specifics of a DR plan will vary with your context, so I don’t think there is a single solution. For me there are three key factors.
- Having a basic, well-understood plan for calling an “incident” and for the initial phases of managing it. This needs to be clear and accessible so as to be useful in a potentially high-stress situation.
- Having documentation for your systems and setup to aid recovery. This is often forgotten during setup or when changes are made; however, when responding to an incident, detailed documentation can be key.
- Testing your processes to build familiarity, to ensure they work as intended, and to adjust them as needed.
DR planning is critical, as we increasingly need to consider an incident as inevitable. The better prepared we are, the greater our chance of minimising the impact of the incident on our school or college.