Wednesday, October 26, 2011

Evidence visualisation

I've been doing a load of research into ways of visualising digital forensic data, in the hope that patterns, frequencies and clusters would stand out easily. There are already excellent tools that do a great job, primarily for email, such as NUIX and Intella, but these are pretty expensive beasts. You can also look at software such as i2's Analyst's Notebook, but now we are talking stratospheric money, out of my league.

My mind was focused when a friend at the Met Police introduced me to a new tool called Bulk Extractor from Simson Garfinkel, which scans across an image and extracts data strings, very quickly, based on a plugin structure. I set out to run Bulk Extractor against a RAM image and had tremendous results. The tool will extract email addresses, URLs, search terms, credit card numbers, telephone numbers and others, and does so with aplomb. It generates a set of text files which can be analysed with the Bulk Extractor Viewer. You can run it against disk images, phone memory dumps and RAM. This is great, but when faced with a list of 10,000+ URLs, where do you start? This is where some visualisation help really comes in.
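Before reaching for a graph, a quick frequency count can already shrink a list like that. The following is a minimal Python sketch, not part of Bulk Extractor itself. It assumes the feature files are tab-separated lines of offset, feature and context, with comment lines starting with #; the sample lines and the count_features name are made up for illustration.

```python
from collections import Counter


def count_features(lines):
    """Count how often each feature appears in feature-file style lines.

    Assumed layout (an assumption, check your own files):
    offset <TAB> feature <TAB> context, with '#' comment lines.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment/banner lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        # Ignore anything that does not have at least offset and feature.
        if len(parts) < 2:
            continue
        counts[parts[1]] += 1
    return counts


# Made-up sample data, only to show the shape of the input.
sample = [
    "# example banner line",
    "1024\thttp://example.com/\tsome context",
    "2048\thttp://example.org/\tsome context",
    "4096\thttp://example.com/\tother context",
]

# Print the ten most frequent features, most common first.
print(count_features(sample).most_common(10))
```

For a real file you would pass an open file handle instead of the sample list, for example open("url.txt", encoding="utf-8", errors="replace"), where the file name is only a placeholder.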

After a lot of looking around I came back to a tool I have used many times, Maltego. Maltego is primarily used for the enumeration of Internet data, connecting IPs, WHOIS, email and domain information to enable the mapping of an online infrastructure. It also enables the importing and graphing of text/CSV files.
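To give a feel for the sort of file such an import can work from, here is a rough Python sketch that turns a list of URLs into a de-duplicated two-column CSV, one column for the URL and one for its host. The column names, the url_host_rows and to_csv helpers and the choice of pairing each URL with its host are all assumptions for illustration; the layout you actually need depends on how you map the columns at import time.

```python
import csv
import io
from urllib.parse import urlsplit


def url_host_rows(urls):
    """Return sorted, de-duplicated (url, host) pairs.

    Strings that do not parse as a URL with a host are skipped,
    which is handy for the noisy strings carved out of RAM.
    """
    rows = set()
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # urlsplit rejects some malformed inputs, e.g. bad brackets.
            continue
        if host:
            rows.add((url, host))
    return sorted(rows)


def to_csv(rows):
    """Render the pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["url", "host"])  # hypothetical column names
    writer.writerows(rows)
    return buf.getvalue()


# Made-up input, including a duplicate and a junk string.
rows = url_host_rows([
    "http://example.com/a",
    "http://example.com/a",
    "garbage",
])
print(to_csv(rows))
```

De-duplicating before the import keeps the graph smaller, at the cost of losing how often each pair occurred; if frequency matters, a third count column would be one option.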

I ran Bulk Extractor against an old 512 MB RAM dump and amongst other things it extracted URL links between over 3,000 IP addresses. Normally I would move on quietly(!); however, I tidied up the columns in Excel and imported them into Maltego, mapping the URL address columns. This is what I saw:-

Each little cluster represents URLs linking to a central URL in the hub. A quick look shows the most popular URLs at the top with many links. Straight away the list of 3,000 is somewhat more manageable if we are interested in popular links.
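The same "most popular hub" idea can be approximated without a graph at all by counting how many distinct sources link to each target. This is a minimal Python sketch, assuming the data has already been reduced to (source, target) pairs; the rank_hubs name and the sample pairs are hypothetical.

```python
from collections import defaultdict


def rank_hubs(edges):
    """Rank targets by the number of distinct sources linking to them.

    edges is an iterable of (source, target) pairs. The result is a list
    of (target, inbound_count) tuples, busiest first, ties broken by name.
    """
    inbound = defaultdict(set)
    for source, target in edges:
        # A set means a repeated pair only counts once.
        inbound[target].add(source)
    ranked = ((target, len(sources)) for target, sources in inbound.items())
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Made-up pairs: two distinct sources point at hub1, one at hub2.
edges = [
    ("a", "hub1"),
    ("b", "hub1"),
    ("a", "hub1"),
    ("c", "hub2"),
]
print(rank_hubs(edges))
```

A ranked list like this will not show the structure the graph does, but it is a quick way to decide which clusters to zoom into first.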

Zooming down we see:-

Although a tad tricky to see, there are little links between the nodes, with URL addresses linking to the primary URL. We simply draw around a cluster and then we see:-

Although the URLs linking in are hard to see, believe me they are there, showing all the URLs that link to the central URL. How cool is that?

Next I thought IP addresses would be fun, except we had over 10,000 entries from the one RAM dump. However, it mapped very well:-

Again there are some very obvious clusters which may be of interest.  Scrolling in we see a very definite structure:-

Scrolling in further we see all the interconnected IPs forming a very interesting structure, with clusters grouped together into super-clusters.

Further again and we see the individual addresses:-

Now we can see each individual connected IP and its port number. This is where Maltego really comes into its own. We select the centre of the cluster and run the Transform to reverse look up the domain and TLD. As if by magic the graph redraws this cluster and we get:-

We can now see that all of these IPs are referencing back to and it is a very popular cluster in the RAM dump.
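Outside Maltego, a rough stand-in for that reverse lookup step is a plain reverse DNS query. The sketch below is not how the Maltego Transform works internally; it simply uses Python's standard socket.gethostbyaddr, and the reverse_lookup name and its resolver parameter are my own additions so that the lookup can be swapped out or tested offline.

```python
import socket


def reverse_lookup(ips, resolver=socket.gethostbyaddr):
    """Map each IP address to its reverse DNS name, or None if there is none.

    resolver must behave like socket.gethostbyaddr: return a tuple whose
    first element is the host name, or raise OSError on failure.
    """
    names = {}
    for ip in ips:
        try:
            names[ip] = resolver(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses.
            names[ip] = None
    return names


if __name__ == "__main__":
    # 192.0.2.1 is a documentation-only address, so expect no name for it.
    print(reverse_lookup(["192.0.2.1"]))
```

Bear in mind that a live lookup reflects DNS as it is today, not as it was when the RAM was captured, and that querying addresses found in evidence may itself be visible to whoever controls them.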

Being able to 'see' data in this way can help the investigator to quickly zone in on the important areas, seeing, if you like, the wood for the trees.

I'm now doing work on mapping outputs from Volatility and will blog again in a few days.


Nick Furneaux