Guest Diary: Xavier Mertens - Integrating VirusTotal within ELK

Published: 2015-07-28. Last Updated: 2015-07-28 16:01:26 UTC
by Alex Stanford (Version: 1)


Visualisation is key when you need to keep control of what’s happening on networks that carry tons of malicious files every day. virustotal.com is a key player in the daily fight against malware. Not only can you submit and search for samples on their website, but they also provide an API to integrate virustotal.com into your software or scripts. A few days ago, Didier Stevens posted some SANS ISC diaries about the integration of VirusTotal into the Microsoft Sysinternals tools (here, here and here). The most common API call is to query the database for a hash. If the file was already submitted by someone else and successfully scanned, you’ll get back interesting results, the best known being the file score in the form “x/y” (x antivirus engines flagged the file out of the y engines that scanned it). The goal of my setup is to integrate virustotal.com within my ELK setup. To feed VirusTotal, hashes of interesting files must be computed. I’m getting interesting hashes via my Suricata IDS, which inspects all the Internet traffic passing through my network.
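To make the hash lookup concrete, here is a minimal Python sketch. It assumes the VirusTotal public API v2 "file/report" endpoint and its "response_code", "positives" and "total" response fields. The helper names (md5_of_bytes, fetch_report, score) are hypothetical and not part of any tool mentioned in this diary.

```python
import hashlib
import json
import urllib.parse
import urllib.request

# Assumption: the public API v2 file report endpoint.
VT_REPORT_URL = "https://www.virustotal.com/vtapi/v2/file/report"


def md5_of_bytes(data):
    """Return the hex MD5 digest of a file's content."""
    return hashlib.md5(data).hexdigest()


def fetch_report(md5_hash, api_key):
    """Ask VirusTotal for the report of an already-submitted file (network call)."""
    query = urllib.parse.urlencode({"apikey": api_key, "resource": md5_hash})
    with urllib.request.urlopen(VT_REPORT_URL + "?" + query, timeout=30) as resp:
        return json.load(resp)


def score(report):
    """Build the "x/y" score, or None when the hash is unknown to VirusTotal."""
    # Assumption: response_code == 1 means a finished report is available.
    if report.get("response_code") != 1:
        return None
    return "{}/{}".format(report["positives"], report["total"])
```

With a report such as {"response_code": 1, "positives": 3, "total": 56}, score() returns "3/56": three engines flagged the file out of 56 that scanned it. Only fetch_report() touches the network and consumes API quota.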

The first step is to configure MD5 hash support in Suricata. The steps are described here. Suricata logs are processed by a Logstash forwarder, and the MD5 hashes are stored and indexed in the field ‘fileinfo.md5’:

[Screenshot: an indexed event showing the ‘fileinfo.md5’ field]

Note: It is mandatory to configure Suricata properly to extract complete files from network flows. Otherwise, the MD5 hashes won’t be correct. It’s like using a snaplen of ‘0’ with tcpdump to capture full packets: Suricata must see the whole file to hash it. In Suricata, have a look at the inspected response body size for HTTP requests and the stream reassembly depth. These settings can also have an impact on performance, so fine-tune them to match your network behavior.

To integrate VirusTotal within ELK, a Logstash filter already exists, developed by Jason Kendall. The code is available on github.com. To install it, follow this procedure:

# cd /data/src
# git clone https://github.com/coolacid/logstash-filter-virustotal.git
# cd logstash-filter-virustotal
# gem2.0 build logstash-filter-awesome.gemspec
# cd /opt/logstash
# bin/plugin install /data/src/logstash-filter-virustotal/logstash-filter-virustotal-0.1.1.gem

Now, create a new filter which will call the plugin and restart Logstash.

filter {
    if ( [event_type] == "fileinfo" and
         [fileinfo][filename] =~ /(?i)\.(doc|pdf|zip|exe|dll|ps1|xls|ppt)/ ) {
        virustotal {
            apikey => ''    # your VirusTotal API key goes here
            field => '[fileinfo][md5]'
            lookup_type => 'hash'
            target => 'virustotal'
        }
    }
}

The filter above queries virustotal.com for the MD5 hash stored in ‘fileinfo.md5’ if the event contains file information generated by Suricata and if the filename contains an interesting extension. Of course, you can adapt the filter to your own environment and match only specific file formats using ‘fileinfo.magic’ or a minimum file size using ‘fileinfo.size’. If the conditions match a file, a query is performed using the virustotal.com API and the results are stored in a new ‘virustotal’ field:

[Screenshot: an event enriched with the new ‘virustotal’ field]
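Before restarting Logstash with an adapted condition, it can help to preview which Suricata events the condition would select. The following Python sketch mirrors the conditional of the filter shown earlier and adds an optional minimum size based on ‘fileinfo.size’. The function name should_lookup and the min_size parameter are hypothetical, and the sketch assumes events are already parsed into dictionaries with the same field names as in the filter.

```python
import re

# Same pattern as in the Logstash conditional; like "=~", search() is unanchored.
INTERESTING_EXT = re.compile(r"(?i)\.(doc|pdf|zip|exe|dll|ps1|xls|ppt)")


def should_lookup(event, min_size=0):
    """Return True when the event would trigger a VirusTotal lookup."""
    if event.get("event_type") != "fileinfo":
        return False
    info = event.get("fileinfo", {})
    if not INTERESTING_EXT.search(info.get("filename", "")):
        return False
    # Hypothetical extra condition: ignore files smaller than min_size bytes.
    return info.get("size", 0) >= min_size


if __name__ == "__main__":
    sample = {"event_type": "fileinfo",
              "fileinfo": {"filename": "/dl/setup.exe", "size": 2048}}
    print(should_lookup(sample))
    print(should_lookup(sample, min_size=4096))
```

Because the pattern is not anchored to the end of the name, a file called "notes.exe.txt" also matches. Append "$" to the expression if only the final extension should count.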

Now, it’s up to you to build your Elasticsearch queries and dashboards to detect suspicious activity in your network. During the implementation, I noticed that too many requests sent in parallel to virustotal.com might freeze my Logstash (mine is 1.5.1). Also, keep an eye on your API key consumption so that you don’t exceed your request rate or daily/monthly quota.
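One way to stay within a request rate is to space out lookups and never query the same hash twice. The Python sketch below illustrates the idea for a standalone script. It is not a fix for the Logstash plugin, the class name LookupThrottle is hypothetical, and the default interval of 15 seconds assumes a limit of four requests per minute, so check the limits attached to your own API key.

```python
import time


class LookupThrottle:
    """Space out lookups and reuse results for hashes already queried."""

    def __init__(self, min_interval=15.0, clock=time.monotonic, sleep=time.sleep):
        # Assumption: 15 s between calls, i.e. at most four requests per minute.
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None
        self._cache = {}

    def lookup(self, md5_hash, query):
        """Call query(md5_hash) at most once per hash, respecting the interval."""
        if md5_hash in self._cache:
            return self._cache[md5_hash]  # no API call, no quota consumed
        if self._last_call is not None:
            wait = self.min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        result = query(md5_hash)
        self._cache[md5_hash] = result
        return result
```

The query argument is any function that takes a hash and returns a result, so the throttle can be tested without network access. The in-memory cache grows without bound, so a long-running process would need an eviction policy.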


Comments

Another VirusTotal message! I was not born knowing what the VirusTotal "x/y" notation means, although all mine are y=56 the last time I checked. My e-mail client (Microsoft's outlook.exe) and a few others I "trust" have a nonzero "x" count. I browsed the VirusTotal web site, and could not find the definition. I browsed the Sysinternals documentation, and I could not find the definition. I'm not saying the information's not there; I am saying it did not jump out at me in my cursory scans. Today's VirusTotal message trumpeted the "x/y" notation as being the "VirusTotal interesting result" that is "most known." Hmm. What, does everyone in the world know what the VirusTotal "x/y" means except me? So I broke down and googled: I asked what the "x/y in VirusTotal" meant. I discovered that Google bought VirusTotal. In other words, if we choose to do so, we are uploading our computer RAM images to Google.
Interesting point. Thank you for sharing that.
