Monday, April 8, 2013

Maltego Machines and other stuff

Once again it has been several lifetimes of certain moths since I wrote a blog post.  I have been trying to write the text for my new web site whilst also writing a book.  That's right, loyal follower, I am writing a book!  The working title is Weaponizing Open Source Intelligence.  Obviously for those of you in the UK it will be Weaponising!  It should be pretty interesting, covering not only advanced Open Source techniques but also how the data can be 'weaponised' into an attack against you or your organisation.  Should be good!

Anyway, 2 weeks back I taught the first Advanced Open Source Course to international acclaim and applause, well, all the students thought it was epic and enjoyed it.  The highlight seemed to be the real-world exercises where you do everything from hunting down bad guys to planning an attack on a company, loads of fun.

A good chunk of the course is focused on the tools from Maltego, CaseFile and primarily Radium, which, frankly, rocks.  If you haven't seen the tool before, take a look at Paterva's YouTube channel at http://www.youtube.com/user/PatervaMaltego.  It is essentially a graphing tool to assist with 'automated' Open Source Intel gathering.



One of the interesting things about Radium is the ability not only to write your own Transforms (searches) but also to code up your own Machines, which essentially daisy-chain commands together so that they run automatically.

During the course we had a segment given online by Social Engineering guru Chris Hadnagy, where we discussed identifying key people within an organisation as targets for phishing and the like.  It can also be useful to identify people who may know each other, for the same purpose.  Obviously we are not teaching this to be able to carry out an actual attack but rather to identify vectors that could be used by an attacker against us.

I thought it would be interesting to create a Radium Machine that would accept a Domain as input, extract 50 or so documents and then rip out the metadata in the documents, hopefully giving us real names, email addresses and the like.  Then we can remove any data that only appears once, working on the principle that we would like to ID people who had authored many documents.  I had a good go at writing it, and Andrew at Paterva kindly tidied it up and made sure it worked properly.

If you have a version of Radium simply click the Machines tab, Manage Machines, New Machine.  You can type any old rubbish into the dialogue as it will be overwritten by this code anyway.  The code looks like this, simply cut and paste into the code window and press the 'tick' button to compile:-

--------------------------------------

machine(
    "MetadataMachine",
    displayName:"Metadata Machine",
    author:"Nick Furneaux (thanks to Andrew)",
    description: "Finds documents and their metadata for a domain and then deletes any documents where the metadata is not found in more than one document"
    )
{


    start {
           
       
        /* Find all documents and then their Metadata */
       
       
        // Get Documents
        status("Searching for Documents")
        log("Finding Documents....",showEntities:false)
        run("paterva.v2.DomainToDocument_SE",slider:100)
       
        // Get Metadata from Documents
        status("Extracting metadata")
        log("Extracting metadata",showEntities:false)
        run("paterva.v2.DocumentToPersonEmail_Meta")
       

     
        /* Remove all entities that have less than 2 links incoming to the entity*/
       


        //now we select any people,phrases and email addresses
        type("maltego.Person", scope:"global")
        incoming(lessThan:2)
        delete()

        type("maltego.Phrase", scope:"global")
        incoming(lessThan:2)
        delete()

        type("maltego.EmailAddress", scope:"global")
        incoming(lessThan:2)
        delete()
       
       
       
        /* Remove any remaining documents that no longer have children */
       
       
        type("maltego.Document", scope:"global")
        outgoing(0)
        delete()
       
        /* Ask if you would like more work to be done on any extracted email addresses */
       
        type("maltego.EmailAddress", scope:"global")
        userFilter(title:"Choose Email Addresses",heading:"Email",description:"Please select the email addresses you want to do more research on.",proceedButtonText:"Next>")
        run("paterva.v2.EmailAddressToPerson_SamePGP")
       
       
       

    }
}


-------------------------------

The first command that runs looks at the Domain you have supplied and goes looking for Office or PDF documents posted to that Domain.

     run("paterva.v2.DomainToDocument_SE",slider:100)

Next these documents have their metadata extracted.

     run("paterva.v2.DocumentToPersonEmail_Meta")
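
To give a feel for the kind of metadata this step relies on, here is a minimal Python sketch.  It is not how the Paterva transform works internally (I have not seen that code); it only shows that a modern Office file (.docx, .xlsx, .pptx) is a zip archive whose docProps/core.xml part can hold the author's name.  The function name docx_people is made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def docx_people(data):
    """Return the creator and last-modified-by names found in an Office file.

    `data` is the raw bytes of a .docx/.xlsx/.pptx file.  Hypothetical helper:
    raises KeyError if the archive has no docProps/core.xml part.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    names = []
    for path in ("dc:creator", "cp:lastModifiedBy"):
        node = root.find(path, NS)
        # Skip fields that are missing or empty
        if node is not None and node.text:
            names.append(node.text.strip())
    return names
```

Feed it the bytes of a downloaded document, e.g. docx_people(open("report.docx", "rb").read()), where report.docx is a placeholder file name.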

Then we remove any metadata that has fewer than 2 links to it, in other words anything that turned up in only one document.

        //now we select any people,phrases and email addresses
        type("maltego.Person", scope:"global")
        incoming(lessThan:2)
        delete()

        type("maltego.Phrase", scope:"global")
        incoming(lessThan:2)
        delete()

        type("maltego.EmailAddress", scope:"global")
        incoming(lessThan:2)
        delete()
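
If the graph logic is easier to follow outside Maltego, the same pruning idea can be sketched in plain Python.  This is only an illustration under my own assumptions: the graph is modelled as (document, entity) pairs, and the function names and sample data are invented for the example.

```python
from collections import defaultdict


def prune(links, min_incoming=2):
    """Keep only (document, entity) links whose entity is linked from
    at least `min_incoming` different documents."""
    links = set(links)  # ignore duplicate links
    sources = defaultdict(set)
    for document, entity in links:
        sources[entity].add(document)
    return {(d, e) for d, e in links if len(sources[e]) >= min_incoming}


def documents_with_children(documents, kept_links):
    """Drop documents that no longer point at any surviving entity."""
    linked = {d for d, _ in kept_links}
    return [d for d in documents if d in linked]


# Made-up sample data: three documents and the names found in their metadata
links = [
    ("report.pdf", "Alice Example"),
    ("minutes.doc", "Alice Example"),
    ("minutes.doc", "Bob Example"),
    ("flyer.pdf", "Carol Example"),
]
kept = prune(links)
print(sorted({entity for _, entity in kept}))
# ['Alice Example']
print(documents_with_children(["report.pdf", "minutes.doc", "flyer.pdf"], kept))
# ['report.pdf', 'minutes.doc']
```

Only Alice appears in two documents, so she survives, and flyer.pdf is dropped because nothing it pointed to is left.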


Lastly, we display any email addresses and ask if you want more work done.  At the moment it just looks at a PGP server and tries to extract the registered name for that email address, which could be useful.  We could do a web search for sites containing that address too.

        userFilter(title:"Choose Email Addresses",heading:"Email",description:"Please select the email addresses you want to do more research on.",proceedButtonText:"Next>")
        run("paterva.v2.EmailAddressToPerson_SamePGP")


As code goes, this is pretty simple and can help to automate tasks that you run regularly.  Interestingly the code also enables you to set timers to run the script every minute, hour or whenever.  This could be very useful for monitoring a specific Domain for new activity etc.
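
The monitoring idea boils down to comparing what one run finds with what the previous run found.  Here is a rough Python sketch of that comparison only; it says nothing about how Maltego schedules or stores its runs, and the function name and addresses are invented.

```python
def new_since_last_run(previous, current):
    """Return items seen in the current run but not in the previous one,
    sorted so the output is stable."""
    return sorted(set(current) - set(previous))


# Made-up results from two consecutive runs against the same Domain
last_run = {"alice@example.com", "bob@example.com"}
this_run = {"alice@example.com", "bob@example.com", "carol@example.com"}
print(new_since_last_run(last_run, this_run))
# ['carol@example.com']
```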

That's all for now.  If you want to learn more about the Advanced Open Source Intelligence Course you can download a syllabus here - www.csitech.co.uk/Advanced_OSI_Syllabus.pdf.