Using ten dimensions of data to make fact-based decisions

Written by Stephen Stewart


No matter what kind of organization you work for or with, you’re challenged to make fact-based decisions every day. This holds as true for incident responders looking into potential misuse of company resources as it does for litigation support personnel delving into evidence collected for pending litigation.

Nuix offers powerful data collection, processing, search, and analytical capabilities for all sorts of unstructured data, whether the data is located behind your firewall on enterprise endpoints, mobile devices, file shares, legacy email archives, or in cloud repositories like MS O365, Amazon S3, or others. In a nutshell, Nuix handles more data and more file types across more locations at ridiculously fast speeds.

To best describe Nuix’s core strengths, I like to focus on the three “Vs”: volume, velocity, and variety.

  • Nuix can process an extraordinary volume of data—from gigabytes to terabytes—and can create up to petabyte-scale data lakes.
  • Nuix’s parallel processing technology can manage any data size at great velocity.
  • The variety of file formats Nuix can process is world-class.



In my opinion, the variety of supported data sources is one of the most distinctive things about Nuix. The depth and breadth of natively supported file types never cease to amaze me. It would be impossible to list them all on a single page, and quite frankly, you would probably stop reading this blog after a while if I tried.

But I’ll ask you to humor me for a few minutes. We’ve been talking for a while now about classifying data into the ‘Ten Dimensions of Data.’



Here’s a quick breakdown of each dimension with some micro examples:

  • Human-generated
    • Email files and databases: PST, OST, NSF, EML
    • Documents: PDF, DOC, XLS, PPT
    • Images: JPG, TIFF, PNG, BMP
    • Container files: ZIP, TAR, ISO, GZ
  • Digital & Mobile Forensic Data
    • Forensic images: E01, L01, AD1, DD
    • Mobile images: Cellebrite, MSAB, Oxygen
    • System files: EXE, DLL, LNK
    • File system artifacts: $LogFile, $UsnJrnl, PLists
  • Network Data
    • Network captures: PCAP
  • User Data
    • User and endpoint behaviors: DNS, Keystrokes, NetFlow, Print
    • Location data: Image Geolocation, IP Geolocation
  • Real-time Feeds
    • Third-party intelligence feeds: CRITs, STIX, YARA
    • Social media feeds: Facebook dumps, Twitter feeds
  • Enterprise & Cloud Repositories
    • Archive systems: Enterprise Vault, EmailXtender, SourceOne, EAS
    • Cloud repositories: MS Office 365, Amazon S3, Box, Dropbox
    • Virtual machine images: Parallels, VDK, VMDK
  • Communication Data
    • Patterns: Email, Call Records, Chats/Messages
  • Multimedia
    • Audio & video files
  • Log Data
    • Log files: Weblogs, Event logs, CSV/TSV
  • Structured
    • Databases: MS SQL, Oracle, SQLite
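
For readers who think in code, here is one way to picture the grouping above: a minimal Python sketch that maps a few of the listed file extensions to their dimension. The `DIMENSION_BY_EXTENSION` table and the `classify` function are hypothetical names made up for this illustration. They are not part of Nuix or its API, and real file identification relies on far more than a file extension.

```python
from pathlib import PurePath

# Hypothetical lookup built from a handful of the examples listed above.
# Illustrative only: this is not how Nuix identifies file types.
DIMENSION_BY_EXTENSION = {
    "pst": "Human-generated",
    "eml": "Human-generated",
    "pdf": "Human-generated",
    "zip": "Human-generated",
    "e01": "Digital & Mobile Forensic Data",
    "ad1": "Digital & Mobile Forensic Data",
    "lnk": "Digital & Mobile Forensic Data",
    "pcap": "Network Data",
    "vmdk": "Enterprise & Cloud Repositories",
    "csv": "Log Data",
    "tsv": "Log Data",
    "sqlite": "Structured",
}


def classify(filename: str) -> str:
    """Return the dimension for a file name, based only on its extension."""
    # Take the final suffix, drop the leading dot, and ignore case.
    extension = PurePath(filename).suffix.lstrip(".").lower()
    return DIMENSION_BY_EXTENSION.get(extension, "Unclassified")


if __name__ == "__main__":
    for name in ["Mailbox.PST", "laptop.E01", "capture.pcap", "notes"]:
        print(f"{name}: {classify(name)}")
```

A plain dictionary keeps the sketch short. Anything it does not recognize falls through to "Unclassified" rather than being guessed at.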

As you can see, these ten dimensions cover all of the file types that are part of any eDiscovery, investigation, cybersecurity, or governance, risk, and compliance (GRC) use case. Which dimension is most important to your organization? Is there a certain type of data your organization struggles to capture or get value from? You’re probably not alone … and Nuix can help.



While each organization may answer these questions differently, Nuix offers a seamless workflow across many use cases. The graphic below illustrates how Nuix can consume these various data sources and allow organizations to make fact-based decisions. Our platform even provides a layer of analytics, including machine learning, which can help cut costs and reduce time in large-scale document reviews.