Written by: Daniel J. Poucher, CBCP
The pandemic proved that companies could react swiftly to a rapidly evolving situation. It showed that there really is a solution to every problem. During a crisis, people prove time after time that when they come together, many things are possible, even things we never considered before. Deploying a remote workforce with little warning and successfully navigating the constant challenges of the required response was an impressive feat. However, did the pandemic leave many with an “illusion of preparedness,” a false sense of readiness? Many companies are saying, “We don’t need to enhance our business continuity plans (BCPs). We lived through the biggest BCP event of our time and never experienced downtime with our technologies. Our plans are just fine!” Not so fast, perhaps.
Disaster recovery planning (DRP) emerged in the 1970s and 1980s, when organizations became increasingly dependent on computer systems and data centers to run their businesses. Since then, it has evolved into enterprise-wide business continuity planning. Although continuity of systems access is still a foundational element, much more contributes to an organization’s ability to operate, and ultimately all of those factors must be given equal consideration in resiliency efforts.
Business Impact Analysis (BIA)
A business impact analysis (BIA) is the process of identifying and prioritizing an organization’s functions and the resources that support them. Resources include personnel, hardware and system requirements, non-technological requirements such as supplies, forms, and reports, and even critical outside relationships.
The business areas of the organization must drive the process, not the technology group. Yes, that group must be included and sign off on the requested resources, but there are often critical dependencies a department relies on that Information Technology (IT) is not aware of or has not considered in backup strategies. The BIA must start with the business, and IT must conduct a gap assessment afterwards to ensure that the recovery strategy can meet the expectations of the business.
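The gap assessment described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names, the hour figures, and the field names (`required_hours`, `it_capability_hours`) are all hypothetical, and a real BIA would capture far more than a single recovery time per function.

```python
from dataclasses import dataclass


@dataclass
class BusinessFunction:
    name: str
    required_hours: float        # recovery time requested by the business area
    it_capability_hours: float   # recovery time IT estimates it can achieve today


def find_recovery_gaps(functions):
    """Return (name, shortfall_hours) for every function IT cannot
    recover as fast as the business requires, largest shortfall first."""
    gaps = [
        (f.name, f.it_capability_hours - f.required_hours)
        for f in functions
        if f.it_capability_hours > f.required_hours
    ]
    return sorted(gaps, key=lambda gap: gap[1], reverse=True)


# Hypothetical BIA entries, for illustration only.
bia = [
    BusinessFunction("Payroll", required_hours=24, it_capability_hours=48),
    BusinessFunction("Customer support", required_hours=4, it_capability_hours=2),
    BusinessFunction("Order processing", required_hours=8, it_capability_hours=12),
]

gaps = find_recovery_gaps(bia)
for name, shortfall in gaps:
    print(f"{name}: IT recovery exceeds the business requirement by {shortfall} hours")
```

The business supplies the required figures first, and IT's estimates are compared against them afterwards, which mirrors the order of work the BIA calls for.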
Personnel
It can be argued that people are the backbone of an organization. Let’s face it, even if systems are up, how much good does it do us if employees are not available to manage the operations that those systems support? One of the biggest areas of concern that has come from the pandemic is the concept of personnel continuity.
As a result of the pandemic, many companies have looked at the success of their remote work capabilities and have decided to make remote work a strategic initiative and incorporate it into future business models. If it was successful for months or even years, why not make it a permanent way of conducting business? Sure, there are jobs and positions that are most successful on-site, but online meeting tools have significantly reduced our dependency on in-person meetings. If budgets can be cut substantially by reducing office overhead, travel, and other expenses, why not improve margins or put that money into new, innovative business ideas? In cases where organizations are opting to go back on-site, many are now considering remote working as the backup plan for personnel if the production facility becomes inaccessible. This saves the cost and resources of maintaining multiple locations for use in interruption scenarios, whether in-house or outsourced to a third-party provider.
This makes incredible sense, as long as it doesn’t come with an unforeseen price tag when recovery plans are implemented. If organizations are downsizing real estate footprints and moving to an increasingly remote workforce, or using remote working as the backup for personnel if displaced from their primary location, are continuity plans considering what will need to happen if remote working is not possible?
Regional events have repeatedly caused widespread power outages lasting hours or days. Hurricanes Katrina (2005) and Sandy (2012), the northeast US blackout (2003), and the northeastern “Halloween Nor’easter” of 2011 all produced extended power outages, caused by weather in some cases and infrastructure failure in others. BCPs must include considerations for regional events; it can’t be assumed that employees have generators.
If the event is location-specific and only affects primary sites, working from home is a viable backup option. However, if employees are working from home as their production location or are expected to work from home as a backup, what happens in regional outages where this is not possible? Organizations must ensure a backup plan exists that does not rely on employees working remotely. This may include maintaining a corporate headquarters with a full-scale generator or contracting with an outside provider that can provide seats where employees without power can continue critical operations. If a company has a generator, it is essential to understand the generator’s capabilities. Many support only data centers and emergency power, which means they cannot power desktops and other workstation equipment.
The presence and movement of your personnel need to be documented within your plan. Document what they will need to do, where they will go, and who can back up a process or procedure if the primary person is not available. You can recover your systems, but if you don’t recover your people, you won’t be able to start working again. After all, what good are running systems if no one is available to use them?
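One way to sanity-check the documented backups is to ask which processes would have nobody available in a given scenario. The sketch below assumes a simple mapping from each process to its primary and backup people; the process names, people, and the `uncovered_processes` helper are hypothetical and exist only to illustrate the idea.

```python
def uncovered_processes(coverage, unavailable):
    """Return the processes for which neither the primary person
    nor any documented backup is available, in alphabetical order."""
    unavailable = set(unavailable)
    return sorted(
        process
        for process, people in coverage.items()
        if all(person in unavailable for person in people)
    )


# Hypothetical plan data: the first name is the primary, the rest are backups.
coverage = {
    "wire transfers": ["Alex", "Blake"],
    "payroll run": ["Casey"],
    "vendor payments": ["Blake", "Dana"],
}

# Scenario: a regional outage leaves Casey and Blake unable to work.
at_risk = uncovered_processes(coverage, unavailable=["Casey", "Blake"])
print(at_risk)
```

A process with no one listed at all is also reported as uncovered, which is the conservative choice for a continuity plan.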
Third Parties
Today, third-party providers are central to our operations, whether for technology, other products and supplies, or even services. The loss of a critical vendor can halt operations. Throughout the recovery planning process, it is essential that an organization identify critical vendors and implement a strong third-party risk management program that includes due diligence, assessment of risks, and monitoring. It is also vital to decide what third-party assets your company will utilize. Understanding new and existing third-party relationships and their potential risks is pivotal to this process.
Due diligence should include an assessment of your providers’ continuity plans. Validating that your vendors have plans providing continuity in a variety of scenarios is important to ensuring your ability to serve your customers. Following the pandemic, it is imperative to ensure that a provider’s strategy includes proper backup and recovery practices, pandemic protocols, and considerations for geographically specific impacts resulting from a widespread regional or global event. Several companies were crippled by localized hot spots of illness, which had worldwide impacts. Requesting proof of successful plan testing and information about subcontracting is a critical part of validating your vendors’ readiness.
Testing Backup and Recovery Strategies
To ensure that your strategy remains viable, you must perform ongoing testing of data backups, recovery, and failover to backup data centers. Although the pandemic was documented as a test of recovery abilities, it must be considered a scenario specific to working remotely. In other words, it was not a test of application recovery abilities, because systems were not affected. Traditional disaster recovery testing must continue to be a part of validating that an organization can have systems, functions, and other dependent resources up and running within the documented recovery time objectives (RTOs)/maximum allowable downtimes (MADs) to minimize downtime and subsequent loss.
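Comparing test results against documented RTOs can be sketched as follows. This is an illustrative example under stated assumptions: the system names, hour values, and the `evaluate_recovery_tests` helper are hypothetical, and a system that was never exercised in a test is flagged separately rather than assumed to pass.

```python
def evaluate_recovery_tests(measured_hours, rto_hours):
    """Classify each system with a documented RTO as 'met', 'missed',
    or 'untested' based on the recovery time measured in testing."""
    status = {}
    for system, rto in rto_hours.items():
        if system not in measured_hours:
            status[system] = "untested"   # no evidence either way
        elif measured_hours[system] <= rto:
            status[system] = "met"
        else:
            status[system] = "missed"
    return status


# Hypothetical documented RTOs and measured recovery times, in hours.
rto_hours = {"core banking": 4, "email": 24, "data warehouse": 72}
measured_hours = {"core banking": 3.5, "email": 30}

status = evaluate_recovery_tests(measured_hours, rto_hours)
for system, result in status.items():
    print(f"{system}: {result}")
```

Treating "untested" as its own outcome reflects the point made above: an event that never touched a system says nothing about whether that system can be recovered in time.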
Conclusion
The last couple of years have dramatically changed the way we conduct business and how recovery strategies are structured. Organizations have enhanced their resiliency posture in many ways. However, caution must be exercised when a plan for remote access to systems is treated as a complete business continuity plan. The pandemic is one scenario of many, and it is still considered less likely to occur than the scenarios that more commonly halt operations. Therefore, impacts on facilities, technologies, and other resources must be evaluated to ensure readiness for any type of situation.