Building Unstructured Data Solutions for Today and Tomorrow: Part 2 – Endpoint & Cloud Collection
Most organizations that deal with investigations or litigation typically need a way to collect and preserve data stored on endpoints across their enterprise. While collection is generally not the most expensive stage of the EDRM, it can add up over time. For example, manual collections can range from $250 to $500 per hour, which accounts for roughly 4% of total expenses per case (see pages 159-160). That doesn’t sound like a lot, but with today’s disparate data sets and growing data volumes, those expenses can grow quickly.
This process may also be evolving: some organizations are looking to bring more collections in-house, while others may not have the right tools. Perhaps the existing tools are over-collecting and can’t provide precise collection capabilities. There is also the scenario where too many tools are acquired and create what I like to call the “Frankenstein effect.” Having too many tools causes a disjointed process and does not support a repeatable workflow, especially one that is likely to change over time.
While there’s nothing wrong with this approach, there may be larger business-driven objectives that are impacted by not having a more streamlined solution in place. At the end of the day, more stakeholders could be served if the right workflows and/or proper repositories are leveraged.
Before diving deeper into the available collection options, I want to highlight just a few of the most common business-driven objectives that I come across. They usually start with: “We need to…”
- “…be able to perform self-collections and reduce external costs…”
- “…consolidate the number of agents we use due to agent proliferation…”
- “…standardize on a best-of-breed but usable platform that doesn’t require forensic training…”
- “…decentralize collections for self-service across geographies, with less disruption over the network…”
- “…create defensible business process around eDiscovery, investigations, and regulatory response for courts and compliance…”
- “…support targeted collections to minimize data going into downstream processing, hosting, analytics, and hourly attorney review…”
Building on a Solid Foundation
Aside from being able to solve the business objectives laid out above, I think the most important fact is that collecting data is the foundational component to providing the 'end-to-end' solution, and thus breaking down the “Frankenstein effect” is a huge advantage.
Here are a few things to consider about our approach:
- Defensibly and repeatedly collect from endpoint data sources using error-free templates, with minimal training and disruption to users
- Targeted collection reduces irrelevant junk entering attorney review
- Meet court deadlines for even the largest discovery requests; prevent spoliation, sanctions, or regulatory fines
- Part of our 'end-to-end,' best-of-breed software suite, which is trusted by corporations, global law enforcement, and regulators.
How Do We Do It?
Our endpoint-based collection technology provides an intuitive interface built on top of extremely powerful, robust, and extensible technology. The best part is that within minutes or hours after installation, you can start using the software and seeing results.
Some of the key features Nuix Enterprise Collection Center offers include:
- Multi-OS collections, including Windows, Linux, and Mac (yes, even devices with T2 chips)
- Collection of deleted files from NTFS and FAT file systems
- Forensically defensible disk images (E01 & DD) and targeted collections (MFS01)
- Volatile system information collection, including processes, RAM, and network packets
- File/folder deletion along with the ability to scrub deleted file space
- Portable collections delivered on a USB device for air-gapped or hard-to-reach systems
- Reports and audit log of actions undertaken for complete defensibility, including email alerts.
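To make the defensibility point concrete, here is a minimal Python sketch of one common building block: hashing every collected file into a manifest so that later changes can be detected. This is an illustration only and not how Nuix Enterprise Collection Center works internally; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) and the manifest layout are assumptions made for this example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large evidence files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, collector: str) -> dict:
    """Record path, size, and SHA-256 for every file under a collection folder."""
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "size": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {
        "collector": collector,  # hypothetical field: who ran the collection
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "files": entries,
    }


def verify_manifest(root: Path, manifest: dict) -> List[str]:
    """Return relative paths that are missing or whose hash no longer matches."""
    mismatches = []
    for entry in manifest["files"]:
        target = root / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            mismatches.append(entry["path"])
    return mismatches
```

Calling `build_manifest` at collection time and `verify_manifest` before production gives a simple, repeatable check that the collected files have not changed in between.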
Collections with a Purpose
With this endpoint collection technology, you can streamline and collect only what is required. Robert O’Leary, Head of Investigations at Nuix USG and former Detective with the New Jersey State Police, loves how easy it is to browse file systems in real-time.
When asked how valuable this capability is to investigations, Robert said, “The ability to browse the endpoint for specific information that you may not have anticipated is amazing. For instance, perhaps you are looking for a specific Windows User Account, but you inadvertently realize other user accounts of interest have profiles on the endpoint.”
Like the file system browse feature, there are many other features that provide users complete flexibility and control over how the collection is performed, what specifically is collected, and where the collection is stored.
- Built-in and customizable filters for document types, time stamp, hashes, and custodians
- The ability to use plain text or regular expression searches to further reduce the data set
- Inspecting complex container files such as PST or ZIP files and only collecting the responsive items
- Browsing file systems in real-time for precise, targeted collections
- Various pre- and post-collection tasks can be sequenced so manual work can be avoided, such as deploying scripts, running executables, and shifting collected data from one location to another
- Collected data can be stored locally, to the network, to a staging location, or even directly to Amazon S3
- Built-in fault tolerance provides the ability to resume incomplete collections.
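The filtering ideas in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the product's implementation: `CollectionFilter` and `responsive_items` are hypothetical names, only ZIP containers are inspected (PST parsing is out of scope), and each file is read fully into memory, which a real collector would avoid.

```python
import re
import zipfile
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Iterator, Optional, Pattern, Set


@dataclass
class CollectionFilter:
    """Hypothetical filter: file type, modified date, and a regex over content."""
    extensions: Set[str]                            # lowercase, e.g. {".txt"}
    modified_after: Optional[datetime] = None
    content_pattern: Optional[Pattern[bytes]] = None

    def matches(self, name: str, modified: datetime, data: bytes) -> bool:
        if Path(name).suffix.lower() not in self.extensions:
            return False
        if self.modified_after and modified < self.modified_after:
            return False
        if self.content_pattern and not self.content_pattern.search(data):
            return False
        return True


def responsive_items(root: Path, rule: CollectionFilter) -> Iterator[str]:
    """Yield only items that pass the filter, looking inside .zip containers."""
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if path.suffix.lower() == ".zip" and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                for info in archive.infolist():
                    if info.is_dir():
                        continue
                    modified = datetime(*info.date_time)
                    if rule.matches(info.filename, modified, archive.read(info)):
                        # "container!member" marks an item found inside an archive
                        yield f"{rel}!{info.filename}"
        else:
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            if rule.matches(rel, modified, path.read_bytes()):
                yield rel
```

For example, `CollectionFilter({".txt"}, datetime(2020, 1, 1), re.compile(rb"falcon", re.IGNORECASE))` would select only text files modified since 2020 that mention the term, including matching entries inside ZIP archives, while leaving everything else behind.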
Nuix even offers solutions for modern data sources, including cloud data repositories such as Office365 or Amazon S3. Using our traditional processing framework (we will talk more about this in my next blog about data processing and preparation), we can extend our cloud connectors to be used for traditional collections.
With this solution, you can target modern data sources using a “point-in-time” download. Once this collection is executed, the data in the targeted repository is downloaded exactly how it exists. Any data not visible to the end-user can also be downloaded if necessary.
With the 'one-two punch' of endpoint and cloud collections, the first step to tackling business requirements for various unstructured data initiatives can become a reality.
Collecting with Confidence
Based on what I described above, here is a recap of a few ways Nuix Endpoint solutions can help with these objectives:
- Bring collections in-house or replace multiple tools with one
- Create a repeatable and defensible collections process
- Automate complicated collections
- Create forensically defensible evidence files and preserve the chain of custody
- Delete files and folders securely while maintaining a defensible log of actions
- Locate and process critical or sensitive files for collection or deletion
- Move quickly from collections to processing and beyond.
All in all, having a collection solution in place today is the most critical and foundational component of your downstream workflow. Again, there is no right or wrong answer. Carefully consider the approach your organization is taking and determine whether it meets your needs for today and, more importantly, for tomorrow.
To underscore that point, I must go back to my colleague Robert here again. When we talked about collections in preparation for this article, he said, “having a collection solution is certainly a must, but more important is one that can provide a comprehensive view into the data including things like user activity. Being able to do this quickly and effectively is something organizations need in order to decrease the time to triage.”
I’m sure that has given you a bit to think about! In part three of this series, I’ll continue to expand upon the idea of this holistic solution, moving from collections right into my favorite topic—data processing and preparation.
Until next time!