How AI might change the data protection space
AI is making waves throughout IT, including in backup and disaster recovery. Learn where AI is being used for data protection and where it might be headed.
It is no secret that everybody is talking about AI today, including generative AI. I recently sat down with Mike Leone, principal analyst at TechTarget's Enterprise Strategy Group. I was curious to get his views on how AI changes data protection strategies, and on where it fits in and how it should be used by backup and recovery tools.
I'll summarize our initial views here, but this topic deserves more coverage, so stay tuned.
To get us started, I was specifically interested in the current role of AI in the context of backup and recovery: Does it create new data protection or compliance needs? What can it do from an operational efficiency perspective? What can AI do to help with recovery?
AI as an anti-ransomware tool
AI in ransomware prevention and preparedness is about helping operations teams complete their day-to-day tasks more efficiently. These tasks might include automated monitoring and management, as well as a faster response when something comes up. Let's take the example of cyber-resiliency.
AI enables you to respond more intelligently to a ransomware attack, for example, by detecting it early and by proactively applying predictive analytics to existing data. AI can also help do all of this at scale, which is key.
There are a couple of different ways that organizations can use AI to help prepare for a ransomware attack or respond after an attack has occurred. Recovery readiness is one aspect, but in reality, organizations don't want to see a successful attack in the first place -- they want to prevent it from happening.
To take preparedness to the next level, the ability to test adversarial reactions and to use data to simulate different types of attacks can be very powerful.
This highlights the critical component of baseline modeling and monitoring. Today, that's where anomaly detection and prediction come into play. They help organizations understand the normal patterns and behaviors within their data. Organizations are analyzing a lot of data, but they must also keep pace with the fact that some of these attacks are going to be very intelligent.
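As a rough illustration of baseline modeling, here is a minimal Python sketch that flags days on which a backup job's changed-data volume departs sharply from its normal pattern. It is a simple statistical stand-in, not a description of how any vendor's AI works: the function name `find_anomalies`, the sample volumes and the 3.5 threshold are all assumptions made for the example. It uses the modified z-score, which is built on the median and the median absolute deviation so that a single extreme day does not distort the baseline.

```python
from statistics import median


def find_anomalies(changed_gb, threshold=3.5):
    """Return the indices of days that deviate strongly from the baseline.

    Scores each day with the modified z-score (median and median absolute
    deviation). The threshold of 3.5 is a common rule of thumb, assumed here.
    """
    baseline = median(changed_gb)
    mad = median(abs(x - baseline) for x in changed_gb)
    if mad == 0:
        # No spread in the data, so this simple score cannot be computed.
        return []
    return [
        day
        for day, x in enumerate(changed_gb)
        if 0.6745 * abs(x - baseline) / mad > threshold
    ]


# Hypothetical daily changed-data volumes (GB) for one backup job.
daily_changed_gb = [42, 40, 45, 43, 41, 44, 39, 310, 42, 43]

anomalous_days = find_anomalies(daily_changed_gb)
print(anomalous_days)  # prints [7]
```

A sudden jump in changed data, like the one on day 7, can be a sign of mass encryption, but it can also be a legitimate migration, so a flag like this is a prompt for investigation rather than a verdict.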
Watch out for data deluge
Unfortunately, bad actors are going to be using AI themselves. Generative AI can be used here in two ways: first, to understand existing data patterns and behaviors, and second, to generate synthetic data sets with slight variations to simulate and test the algorithms used for monitoring. Does this mean more data deluge?
Most organizations are already dealing with a data deluge, and AI use in general might actually be another source of massive data creation. Those AI models are going to generate a lot of data, especially when a business uses custom large language models to deliver private and compliant ChatGPT-like experiences. Organizations are going to use more and more data to support AI. The more mature the implementation, the more use cases, the more data. And you need to make sure that all the data is protected and recoverable to be able to pass an audit. The compliance component is going to become incredibly important in this space.
This opens up numerous questions and issues on the data protection side: Is it all important? Should all of it be backed up? It's going to be a whole new opportunity for the data protection space to come up with a backup plan for data that's being generated by these large language models within organizations.
Intelligent data recovery
We know it's a matter of when, not if, you'll need to recover and get back on your feet. We believe that there is a place for AI to help accelerate recovery and to make it easier to generate a recovery plan. That capability may need to be fully integrated with the security processes and teams -- for example, the ability to use an algorithm to make a recovery faster and more accurate by really understanding the relationships and dependencies of the data that is backed up.
Once the relationships between the data are understood, machine learning models can map the dependencies and prioritize what needs to be recovered first. This is essentially the next stage of recovery; let's call it "intelligent data recovery." This stage is about reducing the time it takes to restore critical systems and minimizing downtime.
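To make the idea of dependency-aware prioritization concrete, here is a minimal Python sketch of how a recovery order could be derived once the dependencies are known. It assumes the dependency map and the business priorities have already been produced, whether by a model or by hand; the system names, the priority numbers and the `recovery_plan` function are hypothetical. The sketch uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later) to group systems into waves, where everything in a wave can be restored in parallel because its dependencies are already back.

```python
from graphlib import TopologicalSorter

# Hypothetical systems, each mapped to the systems it depends on.
depends_on = {
    "dns": [],
    "identity": ["dns"],
    "database": ["identity"],
    "file-share": ["identity"],
    "order-app": ["database"],
}

# Hypothetical business priority: a lower number means restore sooner.
priority = {
    "dns": 1,
    "identity": 1,
    "database": 1,
    "order-app": 1,
    "file-share": 3,
}


def recovery_plan(depends_on, priority):
    """Return restore waves that respect dependencies.

    Within a wave, systems are ordered by business priority, then by name.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda s: (priority[s], s))
        waves.append(ready)
        sorter.done(*ready)
    return waves


plan = recovery_plan(depends_on, priority)
for number, wave in enumerate(plan, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

In this example, the file share waits behind the database within the third wave, and the order application is restored only after the database it depends on. The hard part in practice is producing an accurate dependency map, which is where we see machine learning contributing.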
Using backup data to do more
With more use of AI, including generative AI capabilities, we expect to see expanded use cases that go beyond current use cases, such as the ransomware and predictive examples we mentioned. The ability to better understand the relationships between data components will help not only with intelligent recovery, but also with compliance-type capabilities such as e-discovery. The ability to use AI for bot-like features will likely open a new era in simple, plain-language reporting.
The bigger picture of AI governance and responsible AI is also looming large in this conversation. With access to essentially all of an organization's data from the backup and archive infrastructure, what should you allow users to do? Are there some limitations to the (private) models you can build?
We expect to see more on this topic from backup and recovery vendors in the next few quarters, but we also have a word of caution: Don't just "AI wash" your message. Articulate your views, strategies, use cases and governance guardrails.