Xplore_AI: Into the Breach!

Written by: Chris Stephenson

April 2024

One of the fascinating things I’ve noticed after thirty-five years in the IT space is how some things just haven’t changed, despite the eye-watering technological advancements we’ve made. Chief among the unchanged items is the ever-present data challenge. Year after year I seem to have the same “Groundhog Day” conversations with customers and partners about both the importance and complexities of managing data. The conversation always ends with the same rhetorical shrug and some half-hearted jokes about the low cost of data storage, followed by nervous laughter. I’m starting to understand how my dentist feels when we have our semi-annual flossing talk. We both know nothing’s going to change, but that doesn’t stop him from bringing it up.

Well, it now appears this stubborn data ambivalence may be starting to wane, catalyzed by continuous and unrelenting data growth, regulatory crackdowns on data over-retention, increasing costs (and emissions) for storing data, and an alarming rise in cyber incidents around the world.

The good news is this confluence of factors is helping business leaders recognize the rapidly expanding risks and vulnerabilities associated with their data. Evidence of this shift can be seen in the forecasts showing that the $4B data governance market will quadruple over the next 5-7 years (20% CAGR). One critical first step, however, regardless of the size or complexity of an organization, is getting to know your data.
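For readers who like to check the math, here is a minimal sketch of the compound-growth arithmetic behind that forecast. It assumes simple annual compounding from the $4B starting point at the quoted 20% CAGR; the function names are my own, chosen purely for illustration.

```python
import math


def grow(start_value: float, cagr: float, years: float) -> float:
    """Return the value after compounding annually at `cagr` for `years` years."""
    return start_value * (1 + cagr) ** years


def years_to_multiply(multiple: float, cagr: float) -> float:
    """Return how many years of compounding at `cagr` it takes to grow by `multiple`."""
    return math.log(multiple) / math.log(1 + cagr)


if __name__ == "__main__":
    # Assumed inputs taken from the forecast quoted above: $4B market, 20% CAGR.
    for years in (5, 7):
        print(f"{years} years at 20%: ${grow(4.0, 0.20, years):.2f}B")
    print(f"Years to quadruple at 20%: {years_to_multiply(4, 0.20):.1f}")
```

Under these assumptions, $4B grows to roughly $10B after five years and a little over $14B after seven, and a full quadrupling to $16B takes between seven and eight years. In other words, the "quadruple" figure sits at the long end of the quoted range.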

Embracing Data Enlightenment

According to IDC, roughly 68% of enterprise data remains invisible or unused, hence the moniker “dark data”. Most organizations have been so focused on creating, collecting, storing, and moving their data around that they have missed the opportunity to exploit its value.

Data enlightenment goes far beyond the traditional approaches of metadata scans, regular expressions, and keyword search and tagging. Truly knowing your data requires comprehensive cognitive AI-powered solutions that can index, interpret, score, and prioritize data based on a wide range of attributes such as record type, topic, concept, risk level, or sentiment. Only from this granular level of understanding do we have a chance of truly leveraging our data and, more importantly, mitigating the myriad risks it contains.
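To make the contrast concrete, here is a minimal sketch of the traditional regular-expression approach. The two patterns are deliberately simple illustrations of my own, not production-grade detectors and not a depiction of how any Nuix product works.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete PII rule set).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for every pattern hit in `text`."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return hits


if __name__ == "__main__":
    samples = [
        "Contact jane.doe@example.com, SSN 123-45-6789",    # formatted identifiers are found
        "Order ref 123-45-6789 shipped Tuesday",             # an order number is flagged as an SSN
        "Patient reports chest pain; diagnosis attached.",   # sensitive content, but no pattern hit
    ]
    for sample in samples:
        print(scan(sample), "<-", sample)
```

The first sample is caught, the second is a false positive, and the third, which is arguably the most sensitive, slips through entirely. Pattern matching sees format, not meaning, which is the gap that interpretation, scoring, and prioritization are meant to close.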

Why Cognitive AI?

Although much of what has been written over the last 18 months has focused on Generative AI (GenAI), this promising and relatively new flavor of AI is not designed to address most of the use cases we discuss with our customers. Data Breach Readiness and Reporting is a glaring example of this. Data breaches do not require the generation of new artificial data. Rather, these disruptive and stressful incidents require technology that can rapidly interpret, score, and prioritize breached data as a human would, except with enhanced speed, scale, and precision. This requires the intuitive and interpretive capabilities of Cognitive AI (CogAI), which is what drives the Nuix Neo Data Privacy solution, supporting data breach use cases.

Nuix Neo Data Privacy surfaces and scores a broad range of sensitive, personal, or otherwise ‘risky’ and relevant artifacts. The immediate result is an automated report that shows the scope of a given breach and offers detailed insight into each discovered item: personally identifiable information or personal health information (PII/PHI), confidential information, and intellectual property, as well as content category, record type, sentiment, and so on. This orchestrated, automated process allows customers to act in record time and helps them remain in compliance with relevant regulations. Yet while this is a market-leading breach solution, what I have just described is a purely reactive use case. This technology can also be used prior to a breach as a proactive measure.

The ‘Pre-Breach’ Mindset

Cyber breaches are rising at alarming rates around the world. In fact, on the heels of a steady increase in 2023, the number of data incidents in the U.S. nearly doubled in the first quarter of 2024, amounting to 841 in total (including 642 cyberattacks and 85 compromises). For added context, Q1 is typically the period with the fewest reported compromises each year.

The unfortunate truth is that data incidents have become an inevitable part of our hyperconnected, fast-paced digital world. It is no longer a matter of ‘if’ but ‘when’ you will be breached. This even includes air-gapped environments since, according to researchers at Stanford University, 88% of data breaches are caused by employee mistakes.

Knowing your data can drastically reduce risk exposure by providing detailed intelligence about where your ‘crown jewels’ are stored and how much private and sensitive data you have. A pre-emptive posture enables you to maximally protect what matters most to the business, and adeptly respond to incidents when they occur.

Given the realities and implications at play, consider viewing your data as always being in one of two states: pre-breach or post-breach. Taking this binary view, as part of the natural course of business, changes the operational mindset and helps generate clarity around data awareness, privacy, prioritization, and management. From this perspective, new questions come to light:

  • When a breach does occur, how will we know what was exposed?
  • Do we know where our most valuable and most sensitive data exists?
  • What kinds of PII and PHI do we store…how much, who owns it, who would we need to notify, and what jurisdictional rules would apply?
  • What is our data risk exposure level today?
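To illustrate why these questions are far easier to answer when the data is already classified, here is a minimal sketch. It assumes an inventory already exists in which each item carries a location, sensitivity categories, a jurisdiction, and a risk score. The `DataItem` structure, its field names, and the 0-100 scoring scale are all hypothetical; they are not drawn from any Nuix product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """One classified item in a hypothetical data inventory."""
    location: str                # where the item is stored, e.g. a file share name
    categories: tuple[str, ...]  # sensitivity labels such as "PII" or "PHI"
    jurisdiction: str            # whose notification rules would apply
    risk_score: int              # hypothetical 0-100 score from a classifier


def exposure_summary(inventory: list[DataItem], breached_locations: set[str]) -> dict:
    """Summarize what was exposed if the given locations were breached."""
    exposed = [item for item in inventory if item.location in breached_locations]
    category_counts = Counter(c for item in exposed for c in item.categories)
    return {
        "exposed_items": len(exposed),
        "by_category": dict(category_counts),
        "jurisdictions": sorted({item.jurisdiction for item in exposed}),
        "max_risk": max((item.risk_score for item in exposed), default=0),
    }


if __name__ == "__main__":
    inventory = [
        DataItem("hr-share", ("PII",), "US-CA", 80),
        DataItem("hr-share", ("PII", "PHI"), "EU", 95),
        DataItem("marketing", (), "US-NY", 10),
    ]
    print(exposure_summary(inventory, {"hr-share"}))
```

With an inventory like this in place before an incident, "what was exposed, and who do we notify?" becomes a lookup rather than a weeks-long investigation.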

Note how this one simple shift in perspective changes how we think about our data. It helps move an organization from a purely reactive state to a proactive state of readiness. Imagine having intelligence at your fingertips that could show you the magnitude of a breach’s impact on your organization. Fortunately for Nuix customers, this is now a reality.


It has famously been said that sunlight is the best disinfectant. Leveraging cognitive AI-enabled solutions illuminates your dark data to optimize breach preparedness and reduce your organization’s data risks. Unfortunately, this does not yet solve the dental hygiene challenge, but I remain hopeful.


Chris Stephenson
Head of AI Strategy & Operations