The Context of Privacy Compliance: Data Morphology
Hopefully by now you are well on your way toward EU General Data Protection Regulation (GDPR) or California Consumer Privacy Act (CCPA) compliance. If you’re not, like the 60% of you who aren’t there yet, you’ve probably realized that other privacy regulations do apply to you, and you’re working toward those.
Either way, one of the first steps to getting there is figuring out what personal data you maintain and where it is—something that Nuix is very good at helping with. It’s the first step in the seven-stage methodology we encourage for regulatory compliance.
Classification with Nuance
In the process of discovering what personal data you have, you will occasionally find some very nuanced situations that can create an interesting investigative process. Data privacy isn’t always easy or straightforward, and data doesn’t always fit neatly into clear and distinct ‘buckets.’ Look at nature if you don’t believe me—the duck-billed platypus comes to mind.
Not only are data classifications not always clear, they can change over time. Data can morph from one classification to another based on circumstances. In other words, the content does NOT change, but the context does—what we call data morphology.
Let’s look at some examples of content whose value changes as its context changes.
Unused, Outdated, Useless
The first time I came across this was during a network drive cleanup project (aka defensible disposition). The client asked, “Why are so many people creating so much garbage on our file systems?”
The answer is that nobody ever intentionally creates useless content. The value of the content changes over time and circumstances. The value of a piece of content decreases over time, when somebody leaves, when the purpose for its creation disappears, when security structures change, when new versions come out, and so on.
In all cases, the content doesn’t change, only the context. I have another blog coming on garbage files soon—stay tuned!
That Invoice Is Suddenly Important
Similarly, an invoice is an invoice is an invoice—unless it becomes evidence of fraud or some other crime. The invoice didn’t change, but now it is evidence and should not be disposed of according to your retention schedule.
This means the function of this piece of content is different. The invoice could also become part of a training program, a security model, or a project summary. In each case, its function changes, but not its content.
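The invoice example can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular product: the `Document` class, its `legal_hold` flag, and the `may_dispose` function are hypothetical names invented here. The idea is that the retention decision depends on a context flag stored alongside the content, while the content itself never changes.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Document:
    """Hypothetical record: content metadata plus one context flag."""
    name: str
    created: date
    retention_days: int
    legal_hold: bool = False  # context, set by an investigation, not by the content


def may_dispose(doc: Document, today: date) -> bool:
    """Return True only if the retention period has passed and no hold applies."""
    if doc.legal_hold:
        return False  # evidence overrides the retention schedule
    return today >= doc.created + timedelta(days=doc.retention_days)


invoice = Document("invoice-001.pdf", date(2015, 1, 1), retention_days=365)
before_hold = may_dispose(invoice, date(2020, 1, 1))

invoice.legal_hold = True  # same invoice, new context
after_hold = may_dispose(invoice, date(2020, 1, 1))
```

Here `before_hold` is `True` and `after_hold` is `False`, even though nothing about the invoice itself was edited.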
Hey, That’s Mine
Personal data, when viewed through a data privacy lens, can also morph in importance. A photograph of a person may not be considered ‘personal’ because there is no way, just by looking at it, to know who it is. Once you add a name to it, even though the picture doesn’t change, the game changes.
Alternatively, what happens if you can identify someone just by their picture, but they change their appearance significantly? Maybe that guy grows a beard and shaves his head? Or that woman cuts 16 inches off her long hair, colors it red, and starts wearing glasses. The picture didn’t change—but good luck identifying the person by it!
Identification and privacy are trickier than you’d expect. According to the UK’s Information Commissioner’s Office (ICO), “A name is the most common means of identifying someone. However, whether any potential identifier actually identifies an individual depends on the context. By itself the name John Smith may not always be personal data because there are many individuals with that name. However, where the name is combined with other information (such as an address, a place of work, or a telephone number) this will usually be sufficient to clearly identify one individual. (Obviously, if two John Smiths, father and son, work at the same place then the name, John Smith, and company name alone will not uniquely identify one individual, more information will be required).”
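The John Smith scenario in that quote can be turned into a small Python sketch. The records, field names, and the `singles_out` function below are all made up for illustration. The sketch treats a set of known attributes as identifying only when exactly one record matches all of them.

```python
def singles_out(records, **known):
    """Return True if the known attributes match exactly one record."""
    matches = [
        r for r in records
        if all(r.get(field) == value for field, value in known.items())
    ]
    return len(matches) == 1


# Hypothetical population: three people who share a name.
people = [
    {"name": "John Smith", "employer": "Acme", "born": 1960},
    {"name": "John Smith", "employer": "Acme", "born": 1990},  # father and son
    {"name": "John Smith", "employer": "Globex", "born": 1975},
]

name_only = singles_out(people, name="John Smith")
name_and_globex = singles_out(people, name="John Smith", employer="Globex")
name_and_acme = singles_out(people, name="John Smith", employer="Acme")
acme_with_birth_year = singles_out(
    people, name="John Smith", employer="Acme", born=1990
)
```

The name alone does not single anyone out, and neither does the name plus "Acme", because father and son both work there. The name plus "Globex" does, as does adding a birth year to the Acme query. The same name is or is not identifying depending on what it is combined with.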
Accept Nuance and Embrace Context
The point is that merely identifying a number through a regular expression or being able to search for a name may not be enough.
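As a rough illustration of why a pattern match alone falls short, the Python sketch below pairs a regular expression with a simple look at the surrounding text. The pattern (a US Social Security number shape), the context terms, the window size, and the labels are all assumptions chosen for the example, not a recommended rule set.

```python
import re

# Assumed pattern: digits shaped like a US Social Security number.
CANDIDATE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Assumed context terms that suggest the number really is an identifier.
CONTEXT_TERMS = ("ssn", "social security")


def classify_matches(text, window=40):
    """Label each pattern match using the text within `window` characters of it."""
    lowered = text.lower()
    results = []
    for match in CANDIDATE.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        nearby = lowered[start:end]
        if any(term in nearby for term in CONTEXT_TERMS):
            label = "likely personal data"
        else:
            label = "needs review"
        results.append((match.group(), label))
    return results


hr_note = classify_matches("Employee SSN: 123-45-6789")
parts_list = classify_matches("Part number 123-45-6789 in stock")
```

Both strings contain the same digits and both match the pattern, but only the first is labeled "likely personal data". The second is labeled "needs review" because nothing around it suggests an identifier.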
Accounting for data morphology in the pursuit of data privacy compliance, whether in the wake of GDPR or ahead of CCPA, means running a variety of searches in a repeatable fashion against a full and comprehensive index of both content and context data. It sounds overwhelming, but it can be accomplished with the right tools and approach.