Reasons for invoking business continuity, invocation statistics and technology and industry influences for DR & BC.
Daisy delivered a webinar during Business Continuity Awareness Week (BCAW) to illustrate how frequently business continuity services are used, the reasons for invocations, and why investing in resilience is so important. It covered our latest invocation statistics and a look at the business continuity invocation trends from the last 20 years.
Watch the full webinar below:
As the UK’s leading business continuity provider, Daisy has monitored and collected a wealth of data over time, illustrating how and why our business continuity services are used. Daisy’s Head of Availability Services Les Price and Platforms & Infrastructure Manager George Wignall live and breathe continuity and resilience service provision, and in this session, they share their unique industry-provider outlook and experience gained from responding to three customer incidents every week, year on year (on average).
Les and George present our latest customer invocation statistics and review business technology and business continuity milestones over the last 20 years.
- The key milestones in business technology over the last 30 years
- The evolution of the business continuity industry and key milestones in business continuity services over the last 20 years
- An understanding of how and why business continuity services are used, including reasons for invocation and frequency (shown over time)
- A business continuity provider’s perspective delivered by senior staff within operations and service delivery
Les Price, Head of Availability Services, Daisy Corporate Services
Les has been responsible for all operations and service delivery to Daisy’s Business Continuity customer base for the last 20 years. His commitment to customer service has helped Daisy achieve and maintain outstanding customer service levels for rehearsals (testing) and invocations throughout this time.
George Wignall, Platforms & Infrastructure Manager, Daisy Corporate Services
George has worked in IT for more than 30 years, both as a customer using recovery services and in the industry providing them. Heading up a team of recovery specialists, George has overall responsibility for the platforms and infrastructure underpinning Daisy’s multiple-award-winning availability services, including backup, replication, recovery, archiving, storage, cloud and our Shadow-Planner software.
So today we are going to look back over the last 20 years and how the delivery of Disaster Recovery Services has changed through Business Continuity into modern Business Resilience services to UK customers.
We will be reviewing actual data and telling a true and factual story of how it has been for myself and my teams over this time.
We will talk about the ongoing threats as well as the new and emerging threats that businesses are now having to prepare for. And we will look at how Business Continuity services are having to change shape to meet the current-day needs.
I am going to be a little self-indulgent, but I hope you will find this journey through time interesting and informative.
Just as a point of clarity on some of the terminology that we will be using:
- When we refer to an ‘incident’, this is any unplanned event that has the potential to impact a customer’s normal business operations in some way
- We always actively encourage our customers to put our services on ‘standby’ using our 24×7 dedicated helpline should there be a potential for their business to be impacted by an incident. That way, we can start the preparation to deliver the services needed
- An ‘invocation’ is when that ‘unplanned’ event is impacting normal operations, and so the customer has decided to ‘invoke’ or activate the services within the Business Continuity contract
- Quite often, an ‘incident’ occurs without any forewarning, and so customers will go straight to the invocation of the services
- The other term we will use is ‘DR Test’, which more recently became widely known as a Business Continuity rehearsal. This is when the customer can test that the services being provided will meet the expectations of the business at the time of disaster
I make no apologies for the amount of data we are about to share with you. Please bear with us as we go through it, as there is a lot to cover. We hope you will find it of interest, and we hope you will have some questions for us at the end.
I am assisted today by colleague George Wignall.
Good morning everyone, I’m George Wignall, and I’m the Platforms and Infrastructure manager for Availability Services at Daisy.
I was ‘seconded’ to a group company that provided Disaster Recovery and Business Continuity services in the UK, to help it prepare and be ready to support any of its customers that fell victim to the millennium bug. That company was called ICM.
So what was the millennium bug that was going to bring the ‘end of the world’, according to the press anyway?
The issue was simple: back then, software and hardware were using two digits for the year instead of four. This would impact all programs using time-based calculations when the digits went from 99 (1999) to 00 (2000). Time would be resetting backwards, not forwards.
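As a rough illustration of the two-digit year problem described above, the sketch below compares an elapsed-time calculation using two-digit years with the same calculation using four-digit years. The function names are hypothetical and chosen for this example only; real legacy systems stored and compared dates in many different ways.

```python
def two_digit_year(year: int) -> int:
    """Keep only the last two digits of a year, as many older systems did."""
    return year % 100


def elapsed_two_digit(start_year: int, end_year: int) -> int:
    """Elapsed years when both years are stored as two digits."""
    return two_digit_year(end_year) - two_digit_year(start_year)


def elapsed_four_digit(start_year: int, end_year: int) -> int:
    """Elapsed years when both years are stored in full."""
    return end_year - start_year


if __name__ == "__main__":
    # A record created in 1995, checked in 1999 and again in 2000.
    for now in (1999, 2000):
        print(now, elapsed_two_digit(1995, now), elapsed_four_digit(1995, now))
```

With two-digit storage, the step from 99 to 00 makes the elapsed time negative, which is the sense in which time appeared to run backwards.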
So prepared as we could be, we all waited with bated breath for the midnight hour.
I stayed with the DR and Business Continuity provider as it was proving interesting – I’m still here now… George is going to tell us about the technology changes over the years.
This opened IT to everyone in the workplace. Office applications became hugely popular with VisiCalc popularising spreadsheets, WordStar word processing and Microsoft Office the office suite.
This drove the decentralisation of IT systems from completely central systems out to the branches, all of which needed to be protected.
While tape capacities had increased to cope with the growth in IT use, tape’s main problems remained. Backup failures were a regular occurrence – either with the tapes themselves or with backups not completing because tapes filled and no one was there to insert the next tape.
Worse, tape failures were not always evident – only being noticed when a restore was attempted – and most of the time, backups weren’t verified.
Physical recoveries were the norm as virtualisation was still in its infancy – so was standardised hardware, so most recoveries were technically complex and not undertaken by the customer.
- Disaster Recovery back then started as an extension to a maintenance contract (break/fix), delivering replacement computers to customers who had suffered an event that invalidated the maintenance contract, such as fire or flood damage
- The service extended to the use of mobile computer rooms
- Back then, less than 30% of customers performed annual DR testing
- Up to the millennium, there weren’t many drivers for business to adopt business continuity; those that existed could be ignored and were largely not enforced
We had three Workarea sites, with a plan to open up a nationwide network of Workarea sites taking the service to within one hour of all major cities in the UK.
Doing a DR test was like a ‘jolly boys’ outing’: a few days out of the normal business watching tape recoveries, stumbling around with little or no preparation. More often than not, they would run out of time and comment, “we’ll fix it next year”, hopefully…
As you can see, we were delivering one or two DR Tests each week with on average one IT invocation per week.
Workarea invocations were very rare, on average one every six months or so. In those days, it had to be a major business interruption for an extended period before a business would undertake the relocation of its staff and get services working at the recovery site.
This, with the apparent threat of the millennium bug and corporate failures such as Enron and visible systems failures within the Public sector, all contributed to business continuity becoming adopted as a mainstream business practice for major organisations.
This, in turn, led to the larger organisations looking at the supply chain and the risks that existed.
The consequence of this was that those smaller organisations that may not previously have been directly affected by regulation or PLC corporate governance were now facing pressure from their major clients to show proof of continuity as part of general trading terms.
This development was crucial to the growth of mid-market opportunities for businesses such as ours as for the first time it created a driver for organisations to invest in a BC plan as it could be the difference between winning or retaining a client – or not.
The Civil Contingencies Act 2004
The Higgs report in 2003 on corporate governance, following the collapse of Enron amongst others
The FSA expansion of scope to mandatory compliance in some financial sectors
BCI and BSI Standards issued for Business Continuity Management leading to BS25999 in 2007
The number of DR or BC providers in the UK continued to reduce with mergers and takeovers taking place
And we at ICM opened up new Workarea Recovery sites across the UK with uptake seeing significant growth in our business and our delivery teams and activity
We were now completing six to eight DR tests each week, with rising numbers of Workarea tests but still low numbers of Workarea invocations – it just took too long to spin up services and move people. Most incidents were utility failures, but denial of access was also becoming more prominent, some of it due to terror-related incidents
In the main, invocations were still based on the traditional IT DR services with equipment failures of some sort being the main reason customers invoked.
But Disaster Recovery was changing into Business Recovery, and this presented new challenges in how to protect a business, not just technology.
But technology and people recovery were still disparate, with very little joined-up testing and low commitment to using Workarea services for short-term business interruptions.
The final piece was virtualisation becoming the common server platform, allowing standardised server recovery and the introduction of automation. This led to service tiering becoming the norm – servers and applications could be recovered in related groups, streamlining recoveries.
This not only streamlined their recoveries but also allowed them to be recovered by priority and not by their position on a tape.
The number of providers in the UK market was continuing to decline.
All the supply chain and regulatory requirements were now driving business into serious BC at all levels. It was no longer a tick in the box for the IT manager; rehearsals were now expected to be successful first time and every time, and the technical complexity of an IT DR test increased significantly.
As the new combined business, we were now delivering 100 rehearsals a month, supporting over 300 incidents a year and delivering 180 invocations a year, 15% of which were now Workarea, as maturity and confidence in the services gained through testing were making a difference.
Customers with a pedigree of testing Workarea were more likely to invoke as confidence in the service had grown, and technology had made it simpler and quicker to initiate.
The first recognised modern cyber-attack was in 1988, when the Morris worm spread across ARPAnet, the forerunner of today’s Internet, infecting thousands of connected computer systems.
In general terms, until 2010 cybercrime was mainly focused on hacking government secrets – from espionage and civil rights to meeting ET. It was hard to imagine then how quickly cybercrime would develop into the threat that it is today.
Many companies are now happy to invoke for less than a day, as we can securely connect directly into their environments.
There is strong take-up of new products for online backup and replication that bridge on-prem, hosted and cloud-based services.
This is coupled with an increased focus on cyber-aware protection, with a focus on AV integration and ransomware recoveries.
Tests were becoming increasingly complex with data coming from different sources, hence the change in terminology to a ‘rehearsal’.
Demands from the business increased, putting pressure on IT to deliver successful recoveries meeting RTO and RPO demands. BC Managers were now running the rehearsal, not a business manager in a part-time role.
As Phoenix we were now completing over 800 rehearsals a year, supporting some 230 incidents and delivering 80 invocations a year with WAR invocations now up to 20% of all invocations.
Continued mergers see Business Continuity suppliers consolidated down to four major players in the UK – Phoenix, Sungard, IBM and HP.
The rise of zero-day attacks and botnets suddenly became a reality, hitting the mainstream in 2017 with the worldwide WannaCry ransomware attack, which affected over 150 countries and 200,000 computer systems and cost billions of dollars to fix. In the UK, over 70,000 devices were affected.
The NHS was surprised by the impact on non-traditional IT systems, affecting MRI, theatre and blood storage systems, to name a few.
Outside the UK, it’s estimated it cost Merck shipping between $600M and $800M and FedEx $300M.
While the direct monetary costs are easy to quantify, the indirect people, reputation, productivity costs, etc. are incalculable.
Cybercrime is now big business – for both the good guys and the bad guys.
In less than ten years, cybercrime has evolved to become a normal, daily business threat.
We’ve also seen an uptake in self-recovery, rather than solely assisted recoveries.
This also means that rehearsals can be used by IT to enhance business value and reduce service risk – allowing patching, training, rehearsing changes, etc., to be performed in a safe, sandboxed environment
The rise of public cloud and “as-a-Service” has meant that a business’s IT is now a web of on-prem, hosted and public cloud.
Rehearsals and invocations now involve multiple services, organisations, teams and locations. There are now more points of failure than ever before, and IT is no longer a single chain, but a complex web of integrated services.
While technology may be all-pervasive, few business continuity providers are multi-disciplinary, covering Workplace, networking, services, systems, applications, cloud, InfoSec, etc.
The Business Continuity suppliers that can provide true national coverage for UK businesses are now down to Daisy and Sungard.
Workarea now accounts for nearly 50% of all customer invocations each year.
Rehearsals are now complex mini projects with desired outcomes and no room for failure from the supplier or the customer.
The volume of rehearsals reduced in recent years due to various contributing factors: the complexity, the wide mix of skills needed and customer commitment.
In 20 years, a total of over 12,000 rehearsals have been delivered.
Now more than 70% of all rehearsals include or are exclusively Workarea.
The yellow line here shows the % conversion of incidents to invocation; we can see that back in the early days, an IT incident was more likely to turn into an invocation. With the maturity of risk management and BC planning over the years we can see that customers now are more likely to put the service on standby for any potential risk to their BAU activities and so there is a lower conversion rate to actual invocations.
The blue and the green lines show the reducing trend of ITDR invocations due to technology changes and the increasing trend of Workarea invocations over the 20 years.
Workarea incidents and invocations are now starting to see the impact of cyber attacks, although not all customers want to talk about being the subject of an attack, or realise they could use the service they have in place to help them.
The fact that cyber attacks are being managed by Information Security teams rather than Business Continuity management could account for the low numbers at the moment, but we are seeing a change in this regard.
Sadly there has been a growth in terror-related incidents, such as the 2001 Birmingham bombing and the 2005 London bombings, but these are rare.
2004 saw widespread disruption following a fire in a BT cable duct in Manchester that destroyed 130,000 business and residential phone lines, leaving companies without telephone links, fax lines and access to their networks.
2007 saw a number of major floods, including Tewkesbury (pictured). These events led to a significant uptake in Workarea services as businesses lost access to their premises. These significant events aside, most invocations were still IT failure related, with a growing number due to utility and environmental failures in buildings.
We have supported some 1,340 incidents over the seven years, with 428 invocations being 32% of the recorded incidents, with 128 being Workarea recovery invocations – some 30% of the total. But let’s look at the changes over that time.
What we are seeing now is that about 30% of incidents turn into invocations, some 50% of which are not IT failure related.
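The seven-year percentages quoted above can be reproduced from the raw counts. The snippet below is a minimal sketch of that arithmetic; the `share` helper and the variable names are illustrative only and are not taken from the webinar.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be a positive count")
    return 100.0 * part / whole


# Seven-year totals as quoted in the talk.
incidents = 1340
invocations = 428
workarea_invocations = 128

# Share of incidents that converted into invocations.
conversion_rate = share(invocations, incidents)

# Share of invocations that were Workarea recoveries.
workarea_rate = share(workarea_invocations, invocations)

if __name__ == "__main__":
    print(f"Incident-to-invocation conversion: {conversion_rate:.1f}%")
    print(f"Workarea share of invocations: {workarea_rate:.1f}%")
```

Rounded to whole numbers, these come out at 32% and 30%, matching the figures quoted.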
Hardware Failure is now classified together with software failure, including any data loss or corruption that is not cyber or virus related – 23 incidents
Planned Maintenance now includes any pre-planned activity, including platform or communications upgrades and physical office moves. These are standby incidents that, should they turn into an invocation, become one of the other categories – 44 incidents
Environmental Failure now includes flooding caused by a failure of some type in the building’s water supply – 22 incidents
Power Failure is any incident that has caused an interruption to the electrical mains supply to a building, including a UPS failure – 30 incidents
Communications Failure is any voice, data or internet access issue causing business interruption – 28 incidents
Adverse Weather now includes flooding due to extreme weather – 2 incidents
Fire can be to the customer’s building or any adjacent building that causes a denial of access to their building – 7 incidents
Denial of Access is any other event that stops a customer having access to their normal site – 1 incident
Civil Unrest or Terrorism can include any other unlawful activity such as theft of computers – 1 incident
Cyber Attack/Virus is any form of cyber activity, such as DDoS – 4 incidents
- Hardware Failure – 12 (33% of invocations)
- Planned Maintenance – 0 as the invocation becomes one of the other categories
- Environmental Failure – 4 (11% of invocations)
- Power Failure – 4 (11% of invocations)
- Communications Failure – 11 (31% of invocations)
- Adverse Weather – 0 invocations in the last 12 months
- Fire – 1 invocation (3% of invocations)
- Denial of Access – 0
- Civil Unrest or Terrorism – 0
- Cyber Attack – 4 (11% of invocations)
- Workarea invocations and IT invocations are now almost a 50-50 split
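For readers who want to check the breakdown, the sketch below recomputes each category’s share from the invocation counts listed above. It assumes the ten listed categories account for all invocations in the period; the dictionary and variable names are illustrative. Because each share is rounded independently, a figure may differ by a point from a hand-rounded list.

```python
# Invocation counts per category, as listed in the talk.
invocations_by_category = {
    "Hardware Failure": 12,
    "Planned Maintenance": 0,
    "Environmental Failure": 4,
    "Power Failure": 4,
    "Communications Failure": 11,
    "Adverse Weather": 0,
    "Fire": 1,
    "Denial of Access": 0,
    "Civil Unrest or Terrorism": 0,
    "Cyber Attack": 4,
}

# Assumption: the listed categories cover every invocation in the period.
total_invocations = sum(invocations_by_category.values())

# Percentage share of each category, rounded to the nearest whole number.
shares = {
    category: round(100 * count / total_invocations)
    for category, count in invocations_by_category.items()
}

if __name__ == "__main__":
    print(f"Total invocations: {total_invocations}")
    for category, pct in shares.items():
        print(f"{category}: {invocations_by_category[category]} ({pct}%)")
```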
Unfortunately, the risk of terrorism is still present today, with the official UK threat status still being ‘substantial’ (“attack is highly likely”).
Sadly we’ve seen this rise to ‘critical’ since then on two occasions, both times in 2017 following the Manchester Arena bomb and Finsbury Park mosque attacks.
Communications failures for businesses are increasing, especially with the reliance on access to the internet, the cloud and the rise of the Internet of Things (IoT). Now that virtually everything can be connected to the internet, from vehicles, toasters and fridges to more traditional CCTV, the rise in cybercrime is inevitable.
The ongoing power and environmental risks will always be present – more variable weather patterns also seem to be putting an increasing load on power distribution systems.
The reality now, though, is that a proven BC plan can be activated and be effective within hours (not days, as it was in the early years), and the reluctance to displace personnel to a recovery site is significantly reduced in this era of 24/7 everything.
We are providing flexible and adaptable Workarea solutions, using the investment already made in our infrastructure to deliver a newer brand of services that meets the requirements of the modern era, where people are always working and infrastructure is always on. Cybercrime is all around us, and most of it we never even get to know about, but organisations need to know there is a service that will maintain business operations in spite of such an attack. We have already delivered a number of these invocations using our Safe Haven product. I expect that we will be delivering many more in the future.
Hopefully, that has given you an insight into our world as a BC provider.