Saturday, December 17, 2011

Forensic visualization Part 2 - Court Case



Visualization gone serious

I blogged some weeks back on research I was doing around visualization of forensic data, which was well received with some very interesting comments from readers (both of you!).  However, the week after the posting I was asked to be involved in the prosecution of a man accused of various forms of grooming, sexual assault, voyeurism etc. involving several teenage girls at his community centre.

The case has now concluded and the man received 4 years in prison, so a good result.  However, I won't name the case, as I refer to the victims and they deserve as much anonymity as possible.

The case revolved around a large amount of Facebook chat between the accused and the girls, and between the girls themselves.  Some of the chat was quite damning and on the face of it, it was clear that he was trying to talk the girls, one in particular, out of coming forward with what had been happening using emotional blackmail.

His defense on the Facebook chats was that the girls had logged in as him and had chats between themselves, implicating him in wrongdoing. 

I was asked to consider the workings of Facebook: could they log in at the same time as him on a different computer, would he have a record on his own machine, and what were the ‘relationships’ between the parties involved?

The word ‘relationships’ got me thinking: could we visualize the data to ‘see’ the relationships, and would it be easier for a jury to understand and interpret?  Now, it is easy to map out Facebook ‘Friends’; the excellent Facebook Visualizer as well as the Facebook transform in Maltego will help with that task, but that doesn't really help us understand the activity that exists between those people.  Although I'm not much of a Facebook user I have loads of buddies on Skype, but some of them I haven't spoken to in years.  Just because the accused and Girls A, B and C were on each other's Facebook lists, and the fact that there was some chat, doesn't ‘a relationship make’!

I used IEF 4 (Internet Evidence Finder) to carve all the Facebook chats and fragments out of the 4 hard drives.  It even did a great job on the accused’s Mac hard drive, and I was left with 4 CSV files with thousands and thousands of chats.  Now to make some sense of it.

I tidied up the CSVs, removing the metadata that I didn't need and essentially leaving just the FROM, TO and CHAT columns.  Next I imported this data into Maltego as an edge-weighted graph.  I expected this to cluster the chats around the person who made them and it worked better than expected.
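For anyone without Maltego to hand, the weighting idea can be tried in a few lines of Python.  This is only a minimal sketch, not what Maltego does internally: it assumes a tidied CSV with FROM and TO headers as described above, the function name weigh_edges and the sample rows are made up for illustration, and it counts chats per pair rather than drawing one dot per chat.

```python
import csv
import io
from collections import Counter

def weigh_edges(csv_file, from_col="FROM", to_col="TO"):
    """Count how many chats passed between each pair of people."""
    weights = Counter()
    for row in csv.DictReader(csv_file):
        sender = row[from_col].strip()
        recipient = row[to_col].strip()
        if sender and recipient:
            # Sort the pair so A->B and B->A add to the same link.
            weights[tuple(sorted((sender, recipient)))] += 1
    return weights

# Made-up rows standing in for a tidied chat export.
sample = io.StringIO(
    "FROM,TO,CHAT\n"
    "Person A,Person B,hello\n"
    "Person B,Person A,hi\n"
    "Person A,Person C,football later?\n"
)
edge_weights = weigh_edges(sample)
for (a, b), count in edge_weights.most_common():
    print(f"{a} <-> {b}: {count}")
```

The heaviest pairs come out first, which is the same 'who does he talk to most' question the clusters answer visually.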

Fig 1 shows the recovered chats on the accused’s computer and who he was talking to.  Each orange dot is a person he has chatted with and the surrounding green dots are each individual chat.  The primary cluster, centre left, is the accused with all his chats; being his machine we would expect this to be the largest cluster.  As we can see there are many chats to many different people, however, our eye is quickly drawn to the 2nd largest cluster on the centre right.  This is a person he talks to more than anyone.  Rolling our mouse over the orange dot in the centre of the cluster, surprise, surprise, it is our 13-year-old Girl B.  The 3rd largest, at the bottom, is his best friend, and the next, top right, is Girl A.

Fig 1


This graph gives us an excellent tool, beyond just numbers and statistics, for showing who was important to him in a Facebook setting.  The suggestion that this was just a girl or girls with a crush, and that it was one way traffic, is quashed by this graph: Girl B and Girl A are the 1st and 3rd most frequently communicated with persons on his extensive Facebook buddy list.

Encouraged by the success I ran the same process on the machine of Girl B.  This time, as there were many different chat partners, I also removed the chats that only occurred once or twice (the boy at school saying Hi, a friend inviting her to a party etc.) and were not repeated with that person.  The results in Fig 2 are fascinating:-



Fig 2


The primary cluster is of course Girl B herself, but no prizes for guessing which cluster is the accused?  You’ve got it, the next biggest cluster, top left; in fact their chats are almost twice as many as with any other person.  Remember we are talking about a teenage girl here with lots of people to chat to, and he was chatting with her more than twice as much as her best friends at school.
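The pruning step used on Girl B's machine, dropping partners who only crop up once or twice, is easy to sketch too.  This is a rough illustration and not the exact process I used: the function name drop_rare_partners, the threshold of 3 and the sample data are all assumptions for the example.

```python
from collections import Counter

def drop_rare_partners(chats, owner, min_chats=3):
    """Keep only chats with partners the owner exchanged at least min_chats with."""
    def partner_of(chat):
        sender, recipient = chat
        # The partner is whoever is on the other end from the machine's owner.
        return recipient if sender == owner else sender

    per_partner = Counter(partner_of(chat) for chat in chats)
    keep = {name for name, count in per_partner.items() if count >= min_chats}
    return [chat for chat in chats if partner_of(chat) in keep]

# Made-up (sender, recipient) pairs: Friend 1 appears twice, Friend 2 five times.
chats = (
    [("Owner", "Friend 1")] * 2
    + [("Friend 2", "Owner")] * 3
    + [("Owner", "Friend 2")] * 2
)
kept = drop_rare_partners(chats, "Owner")
print(len(chats), "chats before,", len(kept), "after pruning")
```

Counting both directions towards the same partner matters here, otherwise a two-way conversation could be thrown away as two rare one-way ones.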

I then moved on to looking at the relationships with all those involved.  I again used Maltego and imported all the chats from all the machines but removed the actual chat.  This provided a link graph between the Girls and the accused and their friends, also showing connections between those friends.  I will not present that graph as it includes the names of the persons involved but it showed the accused front and centre with chat connections with all the girls involved and showed the connections between those girls and their friends. 

I felt this was very useful to a jury and so included it in my report to the prosecution barrister.  It went on to form part of the jury pack so I can say that my graphs have made it to Court.  Sadly, I was not called to give evidence on this occasion as the defense agreed all our findings and signed a statement to that effect.  Shame really as I was looking forward to presenting this data in open Court and judging the reaction from a jury.  Not that I am expecting wild applause and fist pumping whooping but it would be interesting all the same.

So far I’ve been using Maltego but have been given a heads up about other free tools that might do the same job.  The primary tool is Gephi (thanks @danmcquillan for the tip), a superb, free graphing application for Windows or Mac which supports many different output graphs.  So far I'm liking it, though it takes a little more preparation as you need to define your Nodes and Edges before it can successfully graph the links.  I’ve also had problems with the Preview and output elements, which keep crashing; I need to pop a message on the forums really.
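As a rough idea of what defining Nodes and Edges for Gephi can look like, here is a small Python sketch that turns pair counts into a node table and an edge table.  The column names (Id, Label, Source, Target, Type, Weight) are the ones I understand Gephi's spreadsheet import to look for, so do check them against your version; the function name gephi_tables and the sample counts are made up.

```python
import csv
import io

def gephi_tables(edge_weights):
    """Return (nodes_csv, edges_csv) text built from {(a, b): count} pairs."""
    names = sorted({name for pair in edge_weights for name in pair})

    nodes_out = io.StringIO()
    node_writer = csv.writer(nodes_out, lineterminator="\n")
    node_writer.writerow(["Id", "Label"])
    for name in names:
        node_writer.writerow([name, name])

    edges_out = io.StringIO()
    edge_writer = csv.writer(edges_out, lineterminator="\n")
    edge_writer.writerow(["Source", "Target", "Type", "Weight"])
    for (a, b), weight in sorted(edge_weights.items()):
        edge_writer.writerow([a, b, "Undirected", weight])

    return nodes_out.getvalue(), edges_out.getvalue()

# Made-up chat counts between pairs of people.
nodes_csv, edges_csv = gephi_tables({("Ed", "Nick"): 5, ("Nick", "Toby"): 2})
print(nodes_csv)
print(edges_csv)
```

Saving the two strings as separate CSV files gives one table for the nodes and one for the edges, which can then be imported one after the other.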


A Bump on the Node


Just for your information, the visualization industry seems to be dominated by research groups in Universities ‘visualizing’ everything that moves and then posting them on YouTube with no information about how it was done except the message ‘Aren’t we clever!’.

However, if you want to learn about it you appear to need a brain the size of a planet, a doctorate in statistics and a student card.  It is a very difficult area to start learning as a beginner.  For example, search Google for - What are Nodes and Edges.  Go on, try it.  The top link is Wikipedia, which presents you with a series of equations that make up graph theory.  It's a nightmare.

Anyway, for those of you out there with a shriveled 40-something brain like me, a Node is an element such as the person on my graphs and the Edges are the links between them. 

Eg

I am Nick Furneaux.
My friends are Ed, Toby and Chris
I talk to Ed and Toby
I never talk to Chris

The Nodes are:-

Nick
Ed
Toby
Chris

The Edges are:-

Nick - Ed
Nick - Toby

The graph would show links between me and Ed and Toby but Chris would be an unlinked orphan node floating around the graph on his own.  Sorry Chris.
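For those who prefer to see it as code, the same example fits in a few lines of Python.  The function name orphan_nodes is just something made up for this sketch.

```python
# Nodes are the people, Edges are the pairs who actually talk.
nodes = ["Nick", "Ed", "Toby", "Chris"]
edges = [("Nick", "Ed"), ("Nick", "Toby")]

def orphan_nodes(nodes, edges):
    """Return the nodes that do not appear in any edge."""
    linked = {name for edge in edges for name in edge}
    return [name for name in nodes if name not in linked]

print(orphan_nodes(nodes, edges))  # prints ['Chris']
```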

Clear?  Good.

Here endeth the lesson!

Wednesday, October 26, 2011

Evidence visualisation

I've been doing a load of research on trying to easily visualize digital forensic data with the hope that patterns, frequencies and clusters would stand out easily.  There are already excellent tools that do a great job, primarily for email, such as NUIX and Intella, but these are pretty expensive beasts.  You can also look at software such as i2's Analyst's Notebook but now we are talking stratospheric money, out of my league.

My mind was focused when a friend at the Met Police introduced me to a new tool called Bulk Extractor from Simson Garfinkel which scans across an image and extracts data strings, very quickly, based on a plugin structure.  I set out to run Bulk Extractor against a RAM image and had tremendous results.  The tool will extract email addresses, URLs, search terms, credit card numbers, telephone numbers and others, and does so with aplomb.  The tool generates a set of text files which can be analyzed with the Bulk Extractor Viewer.  You can run it against disk images, phone memory dumps and RAM.  This is great, but when faced with a list of 10,000+ URLs where do you start?  This is where some visualisation help really comes in.

After a lot of looking around I came back to a tool I have used many times, Maltego.  Maltego is primarily used for the enumeration of Internet data, connecting IPs, WHOIS, email and domain information to enable the mapping of an online infrastructure.  It also enables the importing and graphing of text/csv files.

I ran Bulk Extractor against an old 512 MB RAM dump and amongst other things it extracted URL links between over 3,000 IP addresses.  Normally I would move on quietly(!), however, I tidied up the columns in Excel and imported into Maltego, mapping the URL address columns.  This is what I saw:-


Each little cluster represents URLs linking to a central URL in the hub.  A quick look shows the most popular URLs at the top with many links.  Straight away the list of 3,000 is somewhat more manageable if we are interested in popular links.
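If all you want is the 'most popular' part without the graph, a quick count does a first pass.  The sketch below assumes the Bulk Extractor feature file layout as I understand it (tab-separated offset, feature and context, with comment lines starting with #); the function name top_hosts and the sample lines are invented for the example.

```python
from collections import Counter
from urllib.parse import urlsplit

def top_hosts(feature_lines, limit=10):
    """Count how often each host name appears in URL feature lines."""
    hosts = Counter()
    for line in feature_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        try:
            host = urlsplit(fields[1]).hostname
        except ValueError:
            continue  # carved URLs are often mangled, so skip the broken ones
        if host:
            hosts[host] += 1
    return hosts.most_common(limit)

# Made-up lines in the assumed offset<TAB>feature<TAB>context layout.
sample = [
    "# sample feature file\n",
    "100\thttp://www.mozilla.org/a\tcontext\n",
    "200\thttp://www.mozilla.org/b\tcontext\n",
    "300\thttp://example.com/\tcontext\n",
]
print(top_hosts(sample))
```

With a real file you would pass in the open file object instead of the sample list.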

Zooming down we see:-


Although a tad tricky to see, there are little links between the nodes, with URL addresses linking to the primary URL.  We simply draw around a cluster and then we see:-


Although the URLs linking in are hard to see, believe me they are there, showing all the URLs that link to the central Mozilla.org URL.  How cool is that?

Next I thought IP addresses would be fun, except we had over 10,000 entries from the one RAM dump.  However, it mapped very well:-


Again there are some very obvious clusters which may be of interest.  Scrolling in we see a very definite structure:-


Scrolling in further we see all the interconnected IPs with a very interesting structure with clusters grouped together into super-clusters.

Further again and we see the individual addresses:-


Now we can see each individual connected IP and their port numbers.  Now Maltego really comes into its own.  We select the centre of the cluster and select the Transform to reverse look up the domain and TLD.  As if by magic the graph redraws this cluster and we get:-


We can now see that all of these IPs are referencing back to Yahoo.com and it is a very popular cluster in the RAM dump.

Being able to 'see' data in this way can help the investigator to quickly zone in on the important areas, seeing, if you like, the wood for the trees.

I'm now doing work on mapping outputs from Volatility and will blog again in a few days.

Cheers

Nick Furneaux

Wednesday, September 14, 2011

Downloading files on your iPhone

I just cannot believe how long it's been since a blog post, there are just not enough hours in a day.  Then, when I do pop a post up it's nothing to do with forensics, great!

I wondered if you have ever had the issue of browsing on your iPhone when you find just the file you are looking for, perhaps a tar, zip, dmg or some other file type that the iPhone does not let you download but that you don't want to browse away from and risk losing for good.  I've found a simple way to achieve it.

If you download the Dropbox app it becomes an option to 'Open with' when browsing the web.  Simply:-

1.  Browse to the file you want to download




2.  Select Open in Dropbox from the screen and it will copy the file from the site to your Dropbox account, letting you access it from your computer later.

It's already proving to be very handy indeed. Give it a go.

One other small thing, if you hold down shift on your Mac whilst minimising or maximising a window it does it in cool slowmo!  Who knew!

Thursday, March 31, 2011

Intel SSDs have default AES encryption - worried?


Intel have announced the 320 range, their new SSDs with a range of security and data stability tools. They include sizes from 40 GB to 600 GB (if you have the money!) and my experience is that they are crazy fast. Putting your OS on one of these would make a huge difference to the speed of the overall machine.

However, Intel state that they come with a default AES 128 full disk encryption system which apparently strikes a successful trade-off between speed and encryption/decryption. The thought of new machines coming already set up with an AES flavour is enough to make the average digital investigator hang up his mouse and go stack shelves in Salisbury's (small print - other supermarkets also offer shelf stacking opportunities). Should we be worried?

No.

It is true that the disk comes out of the box running an AES 128 key providing full disk encryption. However, plug the disk into your machine and it will run with seemingly no encryption involved at all. How so? Simply because there is no user key set up as default. To make the encryption 'work' as a security layer the user has to set up an ATA BIOS user password to secure the encryption key. Don't set up a BIOS password, no useful encryption. Excellent!

You can check out the security document here.

Knowing bad guys, and most of us have the misfortune of knowing their computers rather well, they are notoriously mistrusting of encryption and it is unlikely that the computer they buy will come with a big sticker saying how vital it is that they set a BIOS password. Indeed, many people believing that they are experts will read the drive specs, see AES 128 and believe that they are more secure than NASA. All of which makes me think I should delete this blog post? Ah well, no one reads it!

Friday, March 4, 2011

Exif and GPS data on a Mac

I was kicking around yesterday looking for a decent Exif viewer for the Mac. I found one or two but they didn't support extraction of GPS data. Turns out my time was wasted, as OS X itself supports and reports Exif data including GPS location data.

Step 1. Open your image in Preview.

Step 2. Press Cmd-I to open the Inspector

Step 3. Click the 'i' tab and select Exif or GPS button


It even has a 'Locate' button to fire the coordinates up in Google maps. Simple and brilliant.

Although there isn't an export feature, the dialogue does allow you to copy and paste the data out into a text program.

Gotta love your Mac!

Wednesday, February 16, 2011

Volatility 1.4

This is just an initial post about the beta availability of Volatility 1.4. I've been teaching 1.3 as part of my Advanced Live Forensics course for 18 months or so but it only supports XP SP2 and SP3 RAM images. The devs and helpers at www.volatilesystems.com have been toiling over the new 1.4 version for some while and it's great to at last have a play with it.

First things first, you can find proper 'how to' resources at http://code.google.com/p/volatility/ but downloads are currently limited to svn. If this is new to you it's easy enough. If you are using a Mac with Snow Leopard just open a terminal and type 'svn checkout http://volatility.googlecode.com/svn/branches/Volatility-1.4_rc1'. This will download the 1.4 version and put the Volatility files in a Volatility-1.4_rc1 folder inside whichever directory the terminal is in, which is your home folder by default.

Once downloaded just 'cd Volatility-1.4_rc1'. Anyone used to the old version will see a small difference in the running of the commands. Instead of-

python volatility pslist -f [pathtoRAM]

..you have quite a different syntax. It breaks down like this-

python vol.py [plugin] --profile=[PROFILE] -f [image]

vol.py replaces the old volatility framework command
plugin is the command such as pslist, psscan2 etc
profile is completely new but a vital component of the new framework. All RAM images except those from Windows XP SP2 x86 should have the profile defined with the --profile switch. The BasicUsage document lists them as:-

PROFILES
--------
VistaSP0x86 - A Profile for Windows Vista SP0 x86
VistaSP1x86 - A Profile for Windows Vista SP1 x86
VistaSP2x86 - A Profile for Windows Vista SP2 x86
Win2K8SP1x86 - A Profile for Windows 2008 SP1 x86
Win2K8SP2x86 - A Profile for Windows 2008 SP2 x86
Win7SP0x86 - A Profile for Windows 7 SP0 x86
WinXPSP2x86 - A Profile for Windows XP SP2
WinXPSP3x86 - A Profile for windows XP SP3

So running a basic pslist against myram.dd imaged from a Windows XP SP3 box would look like this-

python vol.py pslist --profile WinXPSP3x86 -f myram.dd

In the previous version, outputting the results to a text file could be achieved by using '>' or '>>', such as -

python volatility pslist -f myram.dd >> pslist.txt

However, in 1.4 we have many more options, by adding -

--output= you can specify numerous output types if the module being invoked supports it. This includes -

--output=text
--output=html
--output=csv

To check what a module/plugin supports just check help - python vol.py pslist --h and look for the output section.

You can add -

--output-file=myoutputfile.csv to name your output file. So our previous command line could look like this -

python vol.py pslist --profile WinXPSP3x86 -f myram.dd --output=text --output-file=myfile.txt
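If you find yourself typing the same switches for plugin after plugin, a small wrapper can build the command lines for you.  This is only a sketch using the switches shown above: the function name vol_command is made up, it only prints the commands rather than running them, and whether a given plugin honours --output depends on the plugin.

```python
import shlex

def vol_command(plugin, image, profile="WinXPSP3x86", output="text", output_file=None):
    """Build the argument list for one Volatility 1.4 style plugin run."""
    args = ["python", "vol.py", plugin, f"--profile={profile}", "-f", image,
            f"--output={output}"]
    if output_file:
        args.append(f"--output-file={output_file}")
    return args

# Build one command per plugin, each writing to its own text file.
for plugin in ("pslist", "psscan2"):
    cmd = vol_command(plugin, "myram.dd", output_file=f"{plugin}.txt")
    print(shlex.join(cmd))
```

Keeping the arguments as a list means the same result could be handed straight to subprocess.run from inside the Volatility folder when you do want to execute it.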

That should get you started.

There are also some exciting new modules to play with such as bioskbd, a plugin based on Andreas Schuster's work. It enables the reading of input text from the BIOS area of memory which can include the BIOS password or even Full Disk Encryption passwords. Check out the link to Andreas's site for more information. This plugin has apparently been around for a while but I'd completely missed it. If you do check it out take note that some RAM dumping tools don't image that area of RAM. For example if you are using Matthieu Suiche's win32dd tool you need to add '-t 1' to grab page zero.

Also there are some exciting malware analysis plugins such as svcscan which can list Windows services from both usermode and kernelmode and also ldrmodules for detecting unlinked DLLs.

Anyway, that's all for now, I'll try and post more in due course once I've had a proper play.

Nick

Thursday, January 20, 2011

Mac Ram Dumps

Well it's finally happened, at last a tool to dump RAM from OS X. Big thanks to ATC-NY for their Mac Memory Reader which can be downloaded for free here.

The tool is very easy to use, simply unpack and open a terminal.

cd to the folder MacMemoryReader (For newbies something like - cd /Users/name/Desktop/MacMemoryReader)

Run - sudo ./MacMemoryReader filename

..where the 'filename' is the path to a connected storage device

You will be prompted for your admin password and off it will go.

Remember to check that your connected storage has enough space for the entire RAM dump.
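That space check can be done with a couple of lines of Python before you start.  This is just a sketch: the function name has_room_for_dump and the 5% safety margin are my own assumptions, and you supply the RAM size yourself.

```python
import shutil

def has_room_for_dump(target_dir, ram_bytes, margin=0.05):
    """Return True if target_dir has space for a dump of ram_bytes plus a margin."""
    free = shutil.disk_usage(target_dir).free
    return free >= ram_bytes * (1 + margin)

# Example: is there room in the current folder for an 8 GB RAM dump?
ram_bytes = 8 * 1024 ** 3
print(has_room_for_dump(".", ram_bytes))
```

Point target_dir at the mounted storage device rather than the current folder when doing it for real.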

If you want to feel part of the action you can throw a -g into the command line and it will provide a percentage notifier.

The program outputs a Mach-O raw file which should respond well to data carvers and the like. I've only conducted a couple of tests but Photorec and Foremost do a cracking job of getting at the files. They both successfully retrieved HTML, jpg, zips and a whole variety of other files including web pages going back 3 months. My 8 GB of RAM offered up over 38,000 files. Many of them were fairly uninteresting txt files so you need to wade through to find the good stuff.

If you are trying Foremost just bear in mind the 3Gig limit, perhaps take a look at Scalpel.

The next step is to start looking for running process information, fairly critical in basic RAM analysis. I'm away teaching next week so will have some evening time to play.

I'll try and blog again soon