Getting Started with Scripting on the Nuix Engine – Part 1
For those who aren’t programmers ... square brackets often represent an array. You could read the introduction above as “Hip Hip Array!”
Don’t be too scared, that’s the last dad joke, I promise!
Scripting with the Nuix Engine
The Nuix Engine can be applied in many ways to many problems. When I try to explain the Nuix Engine to people who are not from the legal or eDiscovery sectors, I often use the analogy of a universal unzip tool that will not only extract the binaries of child objects, but also their text and metadata. It will also identify what those extracted items are, optionally storing them in a case for further indexing and searching.
Considering the Nuix Engine supports many thousands of MIME types, including zips, it is most likely the most extensive data extraction tool in existence. The aim of this article is to introduce you to the main entry points for building a script that will allow you to tap into the power of the Nuix Engine.
Let’s start by arming you with the documentation you’ll need to get going!
All good dev posts include a link to documentation! You can access the documentation you need by visiting our download site (note, you’ll need a username and password to access this site) or by opening Nuix Workstation and going to Help → Help Topics.
I recommend you start your journey with the script console open (in Nuix Workstation, go to Scripts → Script Console) and the documentation nearby.
Where Can I Run Scripts?
You can run scripts a few ways with the Nuix Engine. Specifically, they can be run on the console, on the welcome screen, inside a case, at the worker or in the results view table. Before starting your script, think about what sort of automation you intend to create.
Let's start with the purely command line options. Open the terminal of your choice, navigate to the location where Nuix is installed and run the following command (in this case, assuming Nuix is installed on Windows inside Program Files\Nuix\Nuix 9.0):
cd "c:\Program Files\Nuix\Nuix 9.0"
nuix_console.exe -interactive
If you have trouble with this, you may need to add one more switch for workers. If you have multiple licenses, you’ll be prompted to select a license. The good news is that the switches the tool suggests can be copied, so you won’t get prompted next time. Stash them away and we’ll revisit them soon:
cd "c:\Program Files\Nuix\Nuix 9.0"
nuix_console.exe -interactive -licenceworkers 2
When the screen shows “irb”, you’re good to go.
This is the Interactive Ruby console. You can interact with the Nuix Engine directly here without any prewritten script. For example, we can run commands like this, which will output ‘Hello World’ with the license description.
puts("Hello World, you have selected:"
While this approach has its uses, it’s not the most convenient way to write a script. I’ve fallen in love with using the interactive console for MIME type detection on a file or running a quick test on Nuix Engine capabilities.
Command Line Script
The inevitable question that follows is “How do I save my script?” Writing code every time is a real pain!
Remember those switches we stashed away earlier? Bring them out and let's play. Remove the “-interactive” switch and append the licensing switches. The very last parameter is a prewritten script (Ruby, Python, ECMA) followed by the inputs.
cd "c:\Program Files\Nuix\Nuix 9.0"
nuix_console.exe -licenceworkers 2
Any parameters following the script will present themselves as the script arguments rather than Nuix arguments. Make sure your script is the last Nuix parameter, followed only by the inputs it needs. In the example above, "Cameron" will be passed along to the script.
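Here is a minimal sketch of capturing that input. It assumes the trailing arguments reach a Ruby script through the standard ARGV array, which is worth confirming against the console documentation for your version. The build_greeting helper is a hypothetical name used only for illustration.

```ruby
# Hypothetical helper: turns the script arguments into a greeting.
# Assumption: the values typed after the script path arrive in ARGV.
def build_greeting(args)
  name = args.first.to_s.strip
  return "Usage: pass a name after the script path" if name.empty?
  "Hello #{name}"
end

puts build_greeting(ARGV)
```

Keeping the logic in a small method that takes the argument list means it can be tried out in plain Ruby before spending a license on a console run.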
At this point, you may ask “What’s the point?” You already have the interactive console, and all that has changed is that the code now lives in a file. That’s not entirely true! You now have a reusable script, and you can write code to capture the inputs. Combine that with the switches that cover our licensing needs and the script can run completely headless. Headless scripts are super useful for scheduled jobs or automation pieces triggered by another source. They will spin up, consume a license, do their automation and then close.
Some of the crazy ideas I have seen using this approach include:
- A small Windows menu to right click a file and send the file to Nuix for processing automatically
- Conducting daily case audits in a directory
- Migrating cases and creating reports.
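To make the daily audit idea a little more concrete, here is a rough sketch of the file system half of such a job in plain Ruby. It assumes a case folder can be recognised by a marker file named case.fbi2 (check this against your own case folders), and audit_cases is a hypothetical helper. Opening each case and reporting on it through the Nuix API is deliberately left out.

```ruby
# Assumption: a Nuix case folder contains a marker file with this name.
CASE_MARKER = "case.fbi2"

# Returns one hash per case folder found at or beneath root.
def audit_cases(root)
  # Dir.glob treats backslashes as escapes, so normalise Windows paths.
  pattern = File.join(root.tr("\\", "/"), "**", CASE_MARKER)
  Dir.glob(pattern).sort.map do |marker|
    { path: File.dirname(marker), last_modified: File.mtime(marker) }
  end
end

# The directory to audit is taken from the first script argument,
# falling back to the current directory.
root = ARGV.first || Dir.pwd
audit_cases(root).each do |entry|
  puts "#{entry[:path]}\t#{entry[:last_modified]}"
end
```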
Nuix Workstation: Script Console/Script Menu
What if you want to have a license and run multiple scripts? Say hello to Nuix Workstation and the scripts menu/scripts console, which is available even before opening a case. Clicking the scripts menu shows a list of installed scripts. Start in the script console; it has a similar sort of feel to a script editor that allows you to write and execute scripts.
Script console is my favorite place to hang out when you get down to it. Drafting and testing scripts requires an iterative approach: Write, test, break, write, test, break. Console testing drives me nuts waiting for the Engine to spin up when I only want to work with a quick bit of code. Script console is by far the script writer’s preferred place to be with one major catch: There is no auto-drafting (there is an in-memory cache, but it only lasts while in session). If you accidentally close that window or case, your script goes poof. Ah man! What a bummer!
However, the experience gives you a cancelation ability, so if a bit of code is misbehaving you can easily cancel.
Introducing the Script Directory
Anything in the script directory is here to stay. Scripts in the directory can also be run on demand under the scripts menu, which is a nice user experience. You can even style these scripts by wrapping them in a ‘.nuixscript’ wrapper (see Scripting → Advanced Help in the changelog). However, once started a script cannot be cancelled.
Scripts can also have a small amount of header information in the script directory to control if they require selected items or a case open. This automatically disables the script so users aren’t tempted to click it.
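As an illustration, header comments of the following shape appear at the top of many published Nuix scripts. Treat the exact header names (Menu Title, Needs Case, Needs Selected Items) and the $current_case global as things to confirm against the scripting documentation for your version. The case_label helper is hypothetical.

```ruby
# Menu Title: Show Case Name
# Needs Case: true
# Needs Selected Items: false

# Assumption: Nuix Workstation exposes the open case as $current_case and
# the case object responds to getName. Outside Nuix the global is nil,
# so this sketch still runs in plain Ruby.
def case_label(current_case)
  current_case.nil? ? "(no case open)" : "Case: #{current_case.getName}"
end

puts case_label($current_case)
```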
Let's get serious now: users are not big fans of automation if they have no real ability to interact with it. Good news! You have access to everything that is in Java, so Swing, input boxes, JDialog and all that goodness is at your fingertips. If doing that is not up your alley, you can jump on our GitHub to see some examples or pull one of our utility jars down (my favorite is the nx.jar).
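For a taste of what that looks like, here is a small sketch of a Swing input box called from Ruby. It assumes the script runs under JRuby (which is what makes the Java classes reachable) with a display available, and it falls back to a default value anywhere else. The ask_user and gui_available? helpers and the custodian prompt are hypothetical.

```ruby
# Returns true only under JRuby with a display available.
def gui_available?
  return false unless defined?(JRUBY_VERSION)
  require "java"
  !java.awt.GraphicsEnvironment.isHeadless
end

# Asks the user for a value with a Swing input box. Returns the default
# when no GUI can be shown or when the user cancels the dialog.
def ask_user(message, default, gui: gui_available?)
  return default unless gui
  answer = javax.swing.JOptionPane.showInputDialog(nil, message, default)
  answer.nil? ? default : answer.to_s
end

custodian = ask_user("Custodian name:", "Unassigned")
puts "Using custodian: #{custodian}"
```

Passing gui: false forces the fallback, which is handy when the same script also needs to run headless from the command line.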
Even More Possibilities
We’ve now talked about how to interact directly with the Nuix Engine from a script and from within Nuix Workstation. There are two more ways to run scripts that are unique in their design.
Scripted metadata can be created via the metadata profile. Have you ever wanted a column formatted in just a certain way or to provide a combination of fields based on a condition? Scripted metadata is generated at the time of the results view being shown. It’s unique per row of the results view so be careful connecting with external resources (thread safe + multiple IO requests), but otherwise it can do some amazing things.
I wouldn’t recommend it because of the IO demands of the case, but you could even tag an item every time it is viewed or calculate the result on first view and then cache it as custom metadata for next time. If the calculation takes a long time this may be worthwhile, but personally I prefer to do a bulkAnnotater job!
The benefits of scripted metadata really come into play when you want to present a value to a user in a particular way based on a condition but only on demand. This can be instrumental in making an export profile look perfect or show only some details when relevant and not store them in the case. Lots of fields and data may bloat your case, so having data only on demand can shrink this down.
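As a rough sketch of "a value presented in a particular way based on a condition", the Ruby below formats a size only for large items and shows nothing otherwise. The formatter itself is plain Ruby. How the value is fed in and returned inside a metadata profile is an assumption here, so that part is left as a comment, and large_item_label and the 10 MB threshold are hypothetical.

```ruby
# Hypothetical threshold above which an item counts as "large".
LARGE_ITEM_BYTES = 10 * 1024 * 1024

# Shows a size in MB only for large items and an empty string for
# everything else, so the column stays quiet unless it is relevant.
def large_item_label(size_in_bytes)
  return "" if size_in_bytes.nil? || size_in_bytes < LARGE_ITEM_BYTES
  format("%.1f MB", size_in_bytes / (1024.0 * 1024.0))
end

# Inside Nuix the input would come from the row's item. The wiring below
# is an assumption (a $current_item global and a getFileSize method), so
# it is left commented out:
# large_item_label($current_item.getFileSize)

puts large_item_label(52_428_800) # prints "50.0 MB"
```

Because nothing is written back to the case, the value is computed again for each row every time the results view is shown.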
There are some examples of scripted metadata available on GitHub -> Scripted Metadata Profiles.
This simple scripted metadata example will display the datestamp of when the item was last processed:
Worker Side Scripts
These scripts are potentially the most versatile as they operate on the item as it gets processed by the workers. This means that without any developer effort we now have access to threads that can operate continuously across billions of items. With worker side scripts you can focus on filtering, hydrating, morphing and reporting on the data being processed, in-flight, prior to the item being written to the Nuix index.
Before I go any further, you can get the worker side script guide on our download site as well. As I mentioned earlier, you will need a username and password to access that site.
For example, what if I was provided with a flat list of CSV records? Utterly boring to process, right? Why bother? Well, if I was told they were call records and the client wanted to have them appear in communication searches and the timestamps of the communication could overlay the activities of the investigation, my opinion about the list would totally change.
With a worker side script, we can look at the properties brought in by the CSV, move each column into the appropriate communication field (morphing) and use an external source to map each phone number to an alias (hydrating). Once completed, it’s likely that billions of records would be deemed as too many to review (who would have thought, right?).
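The morphing and hydrating steps can be sketched in plain Ruby, independent of the worker API. Everything named here is hypothetical: the CSV column names, the target field names, the alias table and the sample numbers. In a real worker side script the row would come from the item's properties and the result would be written back using the calls described in the worker side script guide.

```ruby
# Hypothetical CSV columns and the communication fields they map to.
COLUMN_TO_FIELD = {
  "Caller"    => :from,
  "Recipient" => :to,
  "Call Time" => :date
}.freeze

# Hypothetical alias source. In practice this might be loaded from a
# file or a directory service before processing starts.
ALIASES = { "+61400000001" => "Alex (Finance)" }.freeze

# Morphing: rename the CSV columns to communication fields.
def morph(row)
  COLUMN_TO_FIELD.each_with_object({}) do |(column, field), out|
    out[field] = row[column]
  end
end

# Hydrating: attach an alias to each number, falling back to the raw
# number when no alias is known.
def hydrate(record, aliases = ALIASES)
  record.merge(
    from_alias: aliases.fetch(record[:from], record[:from]),
    to_alias:   aliases.fetch(record[:to], record[:to])
  )
end

row = { "Caller" => "+61400000001", "Recipient" => "+61400000002",
        "Call Time" => "2021-03-01T09:15:00Z" }
p hydrate(morph(row))
```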
Supplying a list of known ‘exempt’ internal numbers to another worker side script allows it to check for any of the exempt aliases calling external numbers not owned by a staff member, culling the records from billions to about 6,000 in one shot. This is much easier to review, made even easier by searching across dates and custodians with analytics.
For good measure we can also add some limited output for reporting, so when it comes to proving our methods we have a huge log file of skipped records, who they belonged to and why they were skipped.
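The culling and logging logic can be sketched the same way. This reads the rule above as "keep a record only when an exempt number calls a number no staff member owns", which is one interpretation. The number lists, the record shape and the helper names are all hypothetical. In a worker side script the skip step would use whatever the worker side script guide documents for excluding an item.

```ruby
require "set"

# Hypothetical inputs: exempt internal numbers and all staff-owned numbers.
EXEMPT = Set["+61400000001"].freeze
STAFF  = Set["+61400000001", "+61400000003"].freeze

# Decides whether a record is worth reviewing. Returns [keep, reason].
def review_decision(record, exempt = EXEMPT, staff = STAFF)
  return [false, "caller not exempt"] unless exempt.include?(record[:from])
  return [false, "recipient is staff"] if staff.include?(record[:to])
  [true, "exempt caller to external number"]
end

# Splits records into those kept for review and log lines for the rest.
def cull(records)
  kept = []
  skipped_log = []
  records.each do |record|
    keep, reason = review_decision(record)
    if keep
      kept << record
    else
      skipped_log << "#{record[:from]} -> #{record[:to]}: #{reason}"
    end
  end
  [kept, skipped_log]
end

records = [
  { from: "+61400000001", to: "+61400000009" }, # exempt to external
  { from: "+61400000005", to: "+61400000009" }, # caller not exempt
  { from: "+61400000001", to: "+61400000003" }  # exempt to staff
]
kept, skipped_log = cull(records)
puts "Kept #{kept.length} of #{records.length} records"
skipped_log.each { |line| puts line }
```

Recording a reason for every skipped record is what makes the method defensible later, at the cost of a large log file.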
This has been an extensive look at the scripting options available to developers working with the Nuix platform. In the second half of this series, I’ll cover some of the documentation and further possibilities.