Monday, December 23, 2013

Setting Up A Forensic Hash Server Using Nsrlsvr

When working a case involving media that contains operating system, application, and user data files, it is important to be able to efficiently and reliably differentiate the files that warrant examination from those that are likely normal system files. One effective way to do this is to set up a forensic hash server on your analysis network. A forensic hash server centralizes your repository of hash sets for known files and provides dedicated resources for managing hash queries. Thanks to the great work done by Rob Hansen with support from RedJack Security, we can easily set up our own hash server using his Nsrlsvr project.

Nsrlsvr Overview

Nsrlsvr is a C++ application that can be compiled on Linux or OS X. It takes its name from the National Software Reference Library (NSRL) project, which is maintained by NIST and supported by the Department of Homeland Security and other law enforcement agencies. The NSRL is an extremely large database of known/valid application files, their file hashes (MD5, SHA-1), and associated metadata. While the NSRL is a wonderful resource, its overall size (over 30 million entries) and flat text format make it unwieldy to run a large number of queries against (trust me - you don't want to grep or findstr against this). That is where Nsrlsvr comes in. Nsrlsvr loads this data set into memory and makes it easy to perform bulk hash lookups using standard open-source forensic tools (in particular md5deep).
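To get a sense of why the flat-file approach breaks down, consider what a single lookup against the raw reference file looks like (the hash below is just a placeholder, and the path is where the build described below drops the extracted reference file):

# every grep lookup means scanning the entire multi-gigabyte flat file
time grep -c "D41D8CD98F00B204E9800998ECF8427E" /usr/local/share/nsrlsvr/NSRLFile.txt

Nsrlsvr avoids that cost by holding the hash set in memory and answering queries over a TCP socket.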

Server Setup: Compiling Nsrlsvr

Compiling Nsrlsvr is not difficult provided you have enough disk space and a few gigabytes of RAM. Here are some basic setup instructions for getting it running under RŌNIN-Linux.

Step 1. Download the zip of the latest release of Nsrlsvr.

Step 2. Basic Compile
(Note: During the configuration stage, scripts will download the NSRL database and process it; this may take some time depending on your bandwidth and system resources.)

sudo apt-get install build-essential
unzip ./rjhansen-nsrlsvr*.zip
cd ./rjhansen-nsrlsvr*
./configure && make
sudo make install
At the end of the build, you should have an nsrlsvr binary (/usr/local/bin/nsrlsvr) as well as a master hash table extracted from the NSRL data set (/usr/local/share/nsrlsvr/NSRLFile.txt).
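An optional sanity check before moving on is to confirm that both artifacts are in place:

which nsrlsvr
ls -lh /usr/local/share/nsrlsvr/NSRLFile.txt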

Launching Nsrlsvr 
If you take a look at the man page for Nsrlsvr, you'll see that it is really easy to fire up following installation. To spawn an nsrlsvr daemon loaded with the NSRL reference data set, we can simply issue the following command (you can drop this into rc.local to run it on each boot):
nsrlsvr

(Note: The default TCP port for this process is 9120. Also, the nsrlsvr daemon consumes a good bit of RAM when loading the NSRL reference data set; the developer recommends 8 GB of RAM and a 64-bit OS for adequate performance.)
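One quick way to confirm the daemon came up and is bound to the expected port (9120 by default, as noted above) is to check the listening sockets:

# should show nsrlsvr listening on TCP 9120
sudo netstat -lntp | grep 9120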

Client Setup: Compiling Nsrllookup
Our analysis systems will also need software installed to be able to issue hash lookup queries to the Nsrlsvr daemon. To handle this we will install Nsrllookup.

Linux compile instructions are below. If your analysis systems run Windows, the developer also provides pre-compiled binaries (32-bit and 64-bit).

Step 1. Download the zip of the latest release of Nsrllookup.

Step 2. Basic Compile
sudo apt-get install build-essential
unzip ./rjhansen-nsrllookup*.zip
cd ./rjhansen-nsrllookup*
./configure && make
sudo make install

At the end of this build, you should have an nsrllookup binary (/usr/local/bin/nsrllookup).
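Before hashing a full image, it's worth a quick smoke test against the server (hashserverip below is a placeholder for your hash server's address; the default port of 9120 is assumed):

# hash a single file and query the server; a clean return with no
# connection errors means the client and server are talking
md5deep /bin/ls|nsrllookup -s hashserverip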

Performing Hash Lookups
Now that we have the server up and running and our client has a query tool installed, we can start performing hash lookups.  To do this we will use the md5deep utility to compute the hashes and Nsrllookup to issue queries against our hash server:
md5deep -r ./image_mount_point/|nsrllookup -K known_files.txt -U unknown_files.txt -s hashserverip

With this command, we are using md5deep to perform a recursive scan (-r) of all files contained within our image mount directory. We are piping the returned hash values to Nsrllookup, which in turn queries our central hash server. The flags (-K, -U) sort the queried files into two report files based on whether each file matches an entry in the NSRL reference data set (known) or does not (unknown). With these two reports, we can focus our review efforts on the entries/objects that were not located in the NSRL database.
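A quick way to gauge how much data reduction was achieved is to compare the line counts of the two report files:

wc -l known_files.txt unknown_files.txt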

Using Custom Hash Sets
Nsrlsvr is also capable of loading custom hash sets that you provide. This is useful because you can launch multiple nsrlsvr processes on different ports, allowing you to query against different hash sets.

If you're responsible for DFIR in a corporate or other enterprise computing environment, this function can be really useful for building and loading hashes from desktop and server gold build images. Another idea is to create a cron job that generates hash files (and/or piecewise hashes) for any malicious files (an in-house zoo), illegal images, or other content that you might want to sweep for early in an investigation (a sample cron entry follows the command below). To build a custom hash set, we use md5deep and perform some string manipulation to get the output into a format that Nsrlsvr will readily parse (see below).

md5deep -r -c /media/goldimage/|tr '[:lower:]' '[:upper:]'|tr "," "\n"|awk '{print $1}' > goldimage.hash


We can then fire up and background another nsrlsvr process as follows:
nohup nsrlsvr -S -f goldimage.hash -p 7070 2>&1 &
(This binds the new nsrlsvr process to TCP port 7070.)
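To confirm that both listeners are running (the ports assume the default 9120 and the custom 7070 used above):

netstat -lnt | grep -E ':(9120|7070)'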

To run a query against this new listener, we point nsrllookup on our client at the new port (7070) and print the files that match our custom hash set (-k for known):
md5deep -r /image_mount_point/|nsrllookup -k -s serverip -p 7070

We can also chain queries across these multiple nsrlsvr listeners. For example, if you want to list all files whose hash values do not match any entry (-u for unknown) in either the core NSRL data set or your custom data set, you can do something like this:
md5deep -r /image_mount_point/|nsrllookup -u|nsrllookup -u -s serverip -p 7070 

As we can see, Nsrlsvr and Nsrllookup are really useful resources to help with data reduction at the outset of an investigative case as well as for quick review of content that you want to flag.


