EDRM Enron Data Set
The EDRM Enron Data Set Cleansed of Private, Health and Financial Information
The Enron data set originally published by EDRM has served for many years as an industry-standard collection of email data for electronic discovery training and testing. Since this data set was published, it has been an open secret that it contained many instances of private, health and financial data about the company’s former employees.
Cleansing the Data
Nuix specialists cleansed the EDRM Enron data set of private information. We identified and removed more than 10,000 items, including:
- 60 containing credit card numbers, among them departmental contact lists that each held hundreds of individual credit card numbers
- 572 containing Social Security or other national identity numbers, covering thousands of individuals' identity numbers in total
- 292 containing individuals' dates of birth
- 532 containing information of a highly personal nature such as medical or legal matters.
Many items contained multiple instances and types of information. This included departmental contact list spreadsheets with dates of birth, credit card numbers, Social Security numbers, home addresses and other private details of dozens of staff members.
In removing these items and making the cleansed data set available to the community, we hope to protect the privacy of hundreds of individuals.
Nuix is also pleased to offer the legal and investigative community the methodology we used to identify personal and financial data in corporate data sets.
- Download the EDRM Enron data set case study: "Removing PII from the EDRM Enron Data Set: Investigating the prevalence of unsecured financial, health and personally identifiable information in corporate data" for a detailed methodology.
- Download the cleansed EDRM Enron data set. (v1.3, last updated July 29, 2013)
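As a rough illustration of what identifying one category of financial data can involve, the sketch below scans text for candidate credit card numbers and keeps only those that pass the Luhn checksum. This is not the Nuix methodology, which is described in the case study; the pattern, the function names (luhn_valid, find_card_numbers) and the 13-to-16-digit length range are assumptions chosen for the example.

```python
import re

# Assumed candidate pattern: 13 to 16 digits, optionally separated by
# single spaces or hyphens, and not embedded in a longer run of digits.
CARD_CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,15}(?!\d)")


def luhn_valid(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return Luhn-valid card number candidates found in text (hypothetical helper)."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test number, not a real card.
    print(find_card_numbers("Please charge 4111 1111 1111 1111 for the booking."))
```

A checksum filter like this only reduces false positives from arbitrary digit strings; a real review would still need human verification and separate checks for identity numbers, dates of birth and other categories.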
What Risks Lie in Your Data?
Although the EDRM Enron data set is more than 10 years old, most organizations still face significant risks relating to private information stored in their systems.
- Using Nuix Investigator tools and the methodology outlined in our case study, you can identify inappropriately stored private, health and financial data and take immediate steps to remediate the risks involved.
- Nuix also offers information governance products and solutions to locate and remediate these risks in email, file shares, and archives.