[BreachExchange] Recovering from ransomware: a conversation with Veritas CEO Greg Hughes

Fri Dec 3 10:21:54 EST 2021

https://www.mckinsey.com/industries/technology-media-and-telecommunications/our-insights/recovering-from-ransomware-a-conversation-with-veritas-ceo-greg-hughes

As CEO of data protection and availability company Veritas, Greg Hughes
stands on the frontlines of the ransomware battles, which are growing in
number and sophistication. From his vantage point of serving close to 90
percent of Fortune Global 500 companies, Hughes has seen firsthand how
attackers use tactics that are increasingly difficult to detect and defend
against. The threat has grown markedly over the past year and a half for
several reasons, elevating from one that was the domain of IT leaders to
one that executives and boards must understand thoroughly.

Hughes spoke with McKinsey’s Paul Roche, a senior partner who leads the
firm’s software practice, and Jim Boehm, a partner specializing in digital
risk and cybersecurity, about the growing threat of ransomware and data
leakage; why it now needs to be a C-suite and board-level priority; and how
organizations can best develop reliable backups and prepare to recover from
an attack. Their edited conversation appears below.

McKinsey: Ransomware has become an increasingly important topic in the
context of cybersecurity over the last 18 months. Given your company’s
visibility across the industry, what are you hearing from customers,
especially recently? I imagine there is a bunch of anxiety and lots of
conversation about it.

Greg Hughes: I spend a lot of my time talking to CIOs as part of my job,
and ransomware has risen very quickly over the past six months to become
the number one issue that I’m hearing about from them. The Colonial
Pipeline cyberattack in May of this year really is only the tip of the
iceberg. There are a number of companies that have had near-death
experiences related to ransomware, so it’s a critical topic.

I’m aware of one 350-year-old company which was almost bankrupted by a
ransomware attack because they had to stop their whole business operation.
This company is older than the United States, it’s been through multiple
wars and survived. But ransomware nearly brought it to bankruptcy, which
gives you a sense of what a threat it is. There’s just a ton of top-down
pressure around it. CEOs don’t want to wake up and find out that their
whole business operation just shut down.

McKinsey: The threat has really evolved from targeting big businesses to
also targeting small and medium-sized businesses. I have a family member,
for instance, who runs a medical practice that was hit with a ransomware
attack via an IT provider. Can you talk about the types of solutions you
have to configure to serve all these types of companies?

Greg Hughes: The threat extends up and down the corporate ladder, and
targets small-to-medium businesses, county agencies, even city agencies.
When you think of the threat in healthcare—small healthcare providers or
hospitals—it’s scary stuff. In many cases, those kinds of organizations are
the least prepared and the most vulnerable.

Countering the dynamic threat
McKinsey: The nature of the threat is evolving, with data exfiltration and
poisoning of backups. Negotiations with attackers now really center on
preventing data that’s already been exfiltrated from being released. And of
course, there are multi-spectrum attacks where you’re being taken offline.
What kind of conversations are you having with your customers about the
full spectrum of the threat and their concerns?

Greg Hughes: In our position as a backup provider, we take primary data and
move a copy to secondary storage. That’s fundamentally what backup
does, both in and out of the public cloud. Companies will ask us what they
should do around their primary data, and it really comes down to two
important things:

First, all sensitive data should be encrypted. There’s just no reason to
have anything in clear text anymore. Second, there are data loss prevention
(DLP) tools that will check the perimeter and indicate whether the most
sensitive data, such as confidential or regulated information, is leaving
your site. Those are two minimum components of a proper data leakage plan.

McKinsey: What are some of the best backup practices that you really
emphasize?

Greg Hughes: The most important point, which may be kind of obvious, is
that your backup is your insurance policy. Backup data is what allows you
to restore your primary system, so it plays a key role in responding to a
ransomware attack.

The first step is to make sure that your backup application, like all your
other applications, is upgraded to the latest version. This almost
shouldn’t have to be said, but don’t try to fight today’s ransomware issues
with yesterday’s technology.

Second, redundancy is good insurance and it’s inexpensive. There’s a
concept known as “three, two, one”—three different copies of your data on
two different media, one stored offline. That’s really a minimum standard.
Storage is very cheap these days, so make sure you have enough redundancy.

The third component is immutable storage—you need to have a backup copy in
immutable storage, meaning that it’s tamper-proof. It can’t be altered.
There are a lot of different immutable storage options now. It can be on
disk or in the cloud, and there’s always the old standby of tape, which is
immutable, and can be taken offline too.

The final principle is to make sure your whole backup solution is
secured, tightly secured, end-to-end, with zero-trust access, intrusion
detection, intrusion prevention, two-factor authentication, and role-based
access control. It’s very important to make sure that you have segregation
of duties and responsibilities. The folks who touch primary systems
shouldn’t be able to touch your backup systems. Advanced anomaly detection
is another layer of security.
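
The "three, two, one" rule described above can be written down as a simple
check. The following is a minimal sketch, not a Veritas feature: the
BackupCopy type, its field names, and the sample copies are all hypothetical,
and it assumes every copy you want counted is listed explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical record, for illustration only)."""
    name: str       # label for the copy
    media: str      # media type, e.g. "disk", "tape", "cloud"
    offline: bool   # True if the copy is stored offline


def meets_three_two_one(copies: list[BackupCopy]) -> bool:
    """Check for at least 3 copies, on at least 2 media types, with at least 1 offline."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offline = any(c.offline for c in copies)
    return enough_copies and enough_media and has_offline


# Hypothetical example inventory of copies.
copies = [
    BackupCopy("primary-site-disk", "disk", offline=False),
    BackupCopy("cloud-archive", "cloud", offline=False),
    BackupCopy("vault-tape", "tape", offline=True),
]
print(meets_three_two_one(copies))
```

The check treats the rule as a minimum standard, as the interview does; it
says nothing about immutability or access control, which are separate
principles.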

The cloud and other keys to backup
McKinsey: Cloud-based backup makes the “three-two-one” backup principle you
mentioned accessible to many more companies. Why does ransomware make it so
important to have a cloud-enabled backup as a part of your overall backup
and recovery strategy?

Greg Hughes: The cloud has been a major area of innovation in backup, to
the point that you can now think of the cloud as another target for backup.
There are many different storage types in the cloud, including cheap
storage that can be used for archives, or immutable storage. Also, in the
case of disaster recovery, instead of having a devoted and expensive data
center as a secondary backup, you can effectively spin up a data center on
demand using the cloud. That’s a powerful concept.

And then, of course, you need to back up your data in the cloud. One common
misconception is that the cloud provider will take care of ransomware. The
cloud service providers are very clear to say that backing up your data in
the cloud is your responsibility, so you must use the same techniques that
you’d use on-premises to protect your data in the cloud.

McKinsey: In addition to technology solutions, what else do companies need
to work on to build a good backup strategy, especially when it comes to
accessing backups after a ransomware attack?

Greg Hughes: An area where we often get drawn into a conversation with our
customers focuses on operational resiliency rather than just perimeter
security. The reason you really need operational resiliency is that the
primary threat vector—which is spear phishing—works. It preys on human
vulnerability, so the bad guys are going to get in. The malware’s going to
get in.

Most advanced enterprises are trying to figure out how you handle a
worst-case event—what I would call a “cyber wildfire”—that just wipes out
your data center or your data that’s in the cloud. The key to resiliency is
a multilayered plan, with no single point of failure.

The National Institute of Standards and Technology (NIST) cybersecurity
framework is very good. It’s a five-point framework, but three of the
elements have to do with resiliency: protect, detect, recover. How do you
protect your data and your systems? How do you detect when an attack is
happening as quickly as possible? And how do you recover from that attack
as quickly as possible? It’s not just about backup. It’s about that whole
process.

McKinsey: What do organizations need to do from a process point of view to
make sure that the backups they are doing are actually helpful and
alleviate the problem?

Greg Hughes: The “recover” component is where so much of the planning is
essential, because restarting applications and business services requires
so much coordination across so many different stakeholders. You’ve got
compute, storage, network, applications. You need a digital run book,
really, a ransomware run book that coordinates all those pieces. And the
first step is to make sure you’re recovering from a known, good copy of
your backup. That’s where we use anomaly and malware detection to help us
determine that last known, good copy.

You also want to scan the data using good malware scanners before using it
for recovery. You want to have that in an isolated recovery environment so
it’s not touching your primary systems.
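
Choosing the "last known, good copy" can be illustrated with a short sketch.
It assumes each backup already carries the result of a malware or anomaly
scan; the BackupSnapshot type, the scan_clean field, and the sample dates are
hypothetical, and real products determine cleanliness in ways not described
in this conversation.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class BackupSnapshot:
    """A backup plus its scan verdict (hypothetical record, for illustration)."""
    taken_at: datetime
    scan_clean: bool  # True if scanning found no malware or anomalies


def last_known_good(snapshots: list[BackupSnapshot]) -> Optional[BackupSnapshot]:
    """Return the most recent snapshot that scanned clean, or None if there is none."""
    clean = [s for s in snapshots if s.scan_clean]
    return max(clean, key=lambda s: s.taken_at, default=None)


# Hypothetical history: the two most recent backups are flagged.
snapshots = [
    BackupSnapshot(datetime(2021, 11, 29), scan_clean=True),
    BackupSnapshot(datetime(2021, 11, 30), scan_clean=True),
    BackupSnapshot(datetime(2021, 12, 1), scan_clean=False),
    BackupSnapshot(datetime(2021, 12, 2), scan_clean=False),
]
best = last_known_good(snapshots)
print(best.taken_at.date() if best else "no clean backup found")
```

Returning None rather than falling back to a flagged backup reflects the
point made above: recovery should start only from a copy believed to be
clean, and that copy should still be scanned in an isolated environment.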

Preparing for a full recovery
McKinsey: We encounter a lot of companies that are confident about their
preparation, because they run tabletops and war game situations. However,
if you ask, “Well, how long does it take you to restore from backup,” they
say, “I don’t know.” And we’re hearing that recovery is often incomplete
for companies that get attacked by ransomware. Sometimes critical systems
are left out or become corrupted during the recovery process, and it just
takes a really long time. What do you see in your work when it comes to
recovery? How can companies make sure they stay ahead of those kinds of
problems?

Greg Hughes: The main thing we believe about recovery is that you’ve got to
test your recovery plan. A plan is only as good as your ability to test it,
and how frequently you test it. Also, the volume of applications that need
recovering is going up a lot.

The other thing that’s happening is that boards are starting to ask, “Do we
have everything protected?” They’re reading the news and they’re thinking,
“Do we have all of our applications and data protected?” That’s a
surprisingly challenging question to answer. One problem we see is very low
visibility. Make sure you have good reporting that lets you see that all
your applications, all your virtual and physical machines, and all your data
are protected.

McKinsey: I’ve seen a board ask that exact question. It’s a question that
boards should be asking. And they don’t want to hear an answer other than
“Yes.”

Greg Hughes: Exactly, there’s only one right answer to the question they’re
asking. But we have also seen cases where, unfortunately, the enterprise
finds out that they have only backed up, say, 20 percent of their data
after they’ve been hit by a ransomware attack. That’s the worst time to
find out.
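
The visibility question ("Do we have everything protected?") amounts to
comparing an asset inventory against the set of assets that actually have
backups. The sketch below is illustrative only: the function name, the asset
names, and the idea that both lists are available as simple sets are all
assumptions.

```python
from __future__ import annotations


def backup_coverage(inventory: set[str], protected: set[str]) -> tuple[float, set[str]]:
    """Return (fraction of inventory that is protected, unprotected assets).

    Assets in `protected` that are not in `inventory` are ignored.
    An empty inventory is reported as fully covered (nothing to protect).
    """
    if not inventory:
        return 1.0, set()
    unprotected = inventory - protected
    fraction = len(inventory & protected) / len(inventory)
    return fraction, unprotected


# Hypothetical inventory in which only one of five assets is backed up.
inventory = {"erp-db", "crm-db", "file-share", "mail-server", "hr-app"}
protected = {"erp-db"}

coverage, missing = backup_coverage(inventory, protected)
print(f"coverage: {coverage:.0%}")
print("unprotected:", sorted(missing))
```

Running a report like this on a schedule, rather than after an attack, is
the point of the 20 percent example above.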

Innovation in cyber risk solutions
McKinsey: Let’s talk about innovation going on in this area. What do you
see coming down the road six, 18, or 60 months from now? What potential
changes are generating collective excitement in the industry?

Greg Hughes: First of all, the cloud is a major area of innovation. The
cloud offers scalability and elasticity, which is significant because backup
and recovery by their very nature scale up and down over time. Also, we
need to optimize for the variety of workloads in the cloud, across
containers and different databases, while also working across multiple
clouds to make it easier for enterprises to protect their data with a
single policy across any cloud provider.

A second area of innovation is applying artificial intelligence (AI),
machine learning (ML), and data analytics techniques in and around this
space. One of the big questions there is: how do you identify the last
known, good copy of backup as quickly as possible? The last good copy is
the most current backup without malware. There is a lot of innovation going
on now in the backup and recovery space.

McKinsey: Two things that we’re also hearing from clients are that when
ransomware spreads, it spreads very, very quickly. And it impacts multiple
systems, multiple stacks of technology. How capable do you think today’s
recovery solutions are of handling that level of complexity?

Greg Hughes: This is a big and ongoing area of investment for us, as is
making sure that we’re backing up all the different workloads optimally. A
large enterprise may have dozens of technology stacks. Some of them go back
years, some of them are the most modern container-based cloud technology
stacks, and everything in between. And it’s not like they’re going to rip
out all the old stuff, so a backup provider needs to support all these
technology stacks. It’s a significant investment to keep up and requires
compatibility with hundreds and hundreds of different workload types.

Choosing priorities for a recovery operation
McKinsey: The business risk problem, especially when it comes to
ransomware, is the operational component of getting systems back online.
What is the typical timeline for getting fully back online? I’ve heard
everything from days to weeks or months. And what are some best-practice
timelines that companies serious about testing recovery operations should
be shooting for?

Greg Hughes: This really comes down to the classic model of thinking about
recovery, which is that you want to tier your business services—tier zero,
tier one, tier two, and tier three—in terms of prioritization. Then, you
want to tie that to the applications and infrastructure that support those
business services. This way, you know what’s the highest priority to lowest
priority. And then you assign service level objectives with recovery time
objectives (RTOs) and recovery point objectives (RPOs) to each of those
tiers.
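
The tiering model above can be sketched as a small data structure. The
service names, tier assignments, and RTO/RPO values below are hypothetical
and are not recommended targets; the sketch only shows how tiers and
objectives might drive recovery order and a test of whether a drill met its
RTO.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BusinessService:
    """A business service with its tier and objectives (illustrative only)."""
    name: str
    tier: int        # 0 is the highest priority
    rto: timedelta   # recovery time objective: maximum tolerable downtime
    rpo: timedelta   # recovery point objective: maximum tolerable data loss


def recovery_order(services: list[BusinessService]) -> list[BusinessService]:
    """Order services by tier, then by tightest RTO within a tier."""
    return sorted(services, key=lambda s: (s.tier, s.rto))


def missed_rto(service: BusinessService, measured_recovery: timedelta) -> bool:
    """Return True if a measured recovery time exceeded the service's RTO."""
    return measured_recovery > service.rto


# Hypothetical services and objectives.
services = [
    BusinessService("internal-wiki", 3, timedelta(days=7), timedelta(days=1)),
    BusinessService("order-processing", 0, timedelta(hours=4), timedelta(minutes=15)),
    BusinessService("payroll", 1, timedelta(hours=24), timedelta(hours=4)),
    BusinessService("customer-portal", 0, timedelta(hours=8), timedelta(hours=1)),
]

for svc in recovery_order(services):
    print(f"tier {svc.tier}: {svc.name} (RTO {svc.rto}, RPO {svc.rpo})")
```

Comparing measured times from recovery tests against these objectives, with
missed_rto or something like it, is one way to answer the "how long does it
take you to restore from backup" question raised earlier.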

That conversation has been led primarily by IT, but given the threat of
ransomware, it’s important to bring that conversation to the business. It’s
got to be a top management, CEO- and board-level conversation, so that when
there’s a malware or ransomware attack, people are prepared for the length
of time it will take for the services to respond.

McKinsey: That’s a good perspective. Instead of asking, “What’s my
target?”, companies should ask themselves, “What should your target be?” It
should almost be a question that you pose back to the business. And then,
what’s your willingness to achieve that target—how much time and money are
you willing to invest in solutions, in preparation, in testing?

So, the goal for recovery operations really should be a dialogue as with
any other security or business resilience solution. You want to focus on
the highest priorities first.

Greg Hughes: That’s a good point. It’s a dialogue between IT and the
business, between the CEO and the board. It’s a dialogue to look at what
the competitive standards are if you can figure that out. And then,
finally, for a lot of regulated industries, it’s a dialogue with the
regulators as well.

The critical role of vendors–before and during a crisis
McKinsey: Once an attack does happen, what are some of the things that are
key to managing the crisis well? What should CEOs be thinking about that
will help them get back online and operating as efficiently as possible?

Greg Hughes: There are several elements to the answer for that question.
The first is to remember that your vendors want to help, so quickly getting
them on the phone and explaining what’s going on in a regular cadence is
important.

The second point is anybody who’s gone through an attack will probably say
it’s the most challenging professional experience they’ve ever faced. It’s
going to be a 24 hours a day, seven days a week kind of thing. You need to
be prepared for that.

McKinsey: You mentioned communicating with your vendors, and that is a big
oversight we see often. Too many companies fail to bring their vendors to
the table, especially when they’re doing testing, running their playbooks,
doing tabletop exercises. Vendors are partners with you, not just to
provide a product or a service but to make sure your business is
successful, so they need a seat at the table.

Also, an attack is going to constrain your ability to make decisions,
potentially even some of the decisions that a CEO or business leader would
make, and you won’t fully understand those constraints unless you have key
stakeholders in that conversation. If you don’t have your vendors there
during testing to understand how they will be able to help you, you might
be practicing making decisions that might not be yours to make or might not
be possible.

Greg Hughes: That’s true. One way to look at it is that your vendors should
know your ransomware run book—read it, provide feedback, advise where it
could be improved, and keep it updated.

McKinsey: Are there good resources you recommend for people interested in
learning more?

Greg Hughes: There are a lot of good resources out there to help (see
sidebar). One place I’d point to is the US Cybersecurity and Infrastructure
Security Agency (CISA), which publishes guides about ransomware resiliency
and how to recover from ransomware. It’s a terrific resource, very
comprehensive, and covers the whole spectrum of activities you want to
launch very quickly if you’ve been attacked.