Obfuscation and Repetition
The obfuscated payload of a maldoc submitted by a reader can be quickly extracted with the "strings method" I explained in diary entry "Quickie: String Analysis is Still Useful".
This is a very long string (more than 1000 characters) and is most likely the payload we are looking for.
It looks like this is just a sequence of repeating strings, but if you take a close look, you’ll see that there are characters between the occurrences of the repeating string hui12t7gGG7&^6272 gasg671. I have highlighted this repeating string in red here:
You can see individual letters between the occurrences of the repeating string: p, o, w, e, r, …
I’m sure you can now guess where this is going: powershell …
This is an obfuscation method I’ve seen several times: obfuscate the payload by inserting a long string of characters between each character of the payload.
Here is an example.
Say that our payload is "powershell payload". We obfuscate it by inserting the character . between each pair of adjacent characters of the payload, like this:
"p.o.w.e.r.s.h.e.l.l. .p.a.y.l.o.a.d"
In this example, the payload is still easily recognizable.
But what if we use "Internet_Storm_Center" as repeating string? Then we get this:
"pInternet_Storm_CenteroInternet_Storm_CenterwInternet_Storm_CentereInternet_Storm_CenterrInternet_Storm_CentersInternet_Storm_CenterhInternet_Storm_CentereInternet_Storm_CenterlInternet_Storm_CenterlInternet_Storm_Center Internet_Storm_CenterpInternet_Storm_CenteraInternet_Storm_CenteryInternet_Storm_CenterlInternet_Storm_CenteroInternet_Storm_CenteraInternet_Storm_Centerd"
And in this example, the payload is not so easy to recognize.
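The two examples above can be reproduced with a few lines of Python. This is only a minimal sketch to illustrate the idea; the function names obfuscate and deobfuscate are made up for this illustration and do not come from any tool mentioned in this diary entry.

```python
def obfuscate(payload: str, filler: str) -> str:
    """Insert the filler string between each pair of adjacent payload characters."""
    return filler.join(payload)


def deobfuscate(obfuscated: str, filler: str) -> str:
    """Remove every occurrence of a KNOWN filler string.

    Caveat: this also removes the filler if it happens to occur inside
    the payload itself (for example a payload containing "." with filler ".").
    """
    return obfuscated.replace(filler, "")


payload = "powershell payload"

simple = obfuscate(payload, ".")
print(simple)

harder = obfuscate(payload, "Internet_Storm_Center")
print(harder)

# Removing the filler gives back the original payload in both cases.
print(deobfuscate(simple, "."))
print(deobfuscate(harder, "Internet_Storm_Center"))
```

Removing the filler is trivial once you know it. The hard part with a real sample is finding out what the filler is.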
The trick to decoding the obfuscated payload is to find the repeating string and remove it. As this can sometimes be tricky, I wrote a small program that automates this task: deobfuscate-repetitions.py.
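To give an idea of how a repeating string could be found automatically, here is a rough heuristic in Python. It is not the algorithm used by deobfuscate-repetitions.py (I make no claim about how that tool works internally). The function name find_repeating_string, its parameters max_len and min_count, and the scoring rule (number of occurrences multiplied by length) are all assumptions made for this sketch.

```python
from collections import Counter


def find_repeating_string(text: str, max_len: int = 64, min_count: int = 3) -> str:
    """Guess the filler: the substring whose repetitions cover most of the text.

    For every candidate length, take the most frequent substring of that
    length and score it as (non-overlapping occurrences) * (length).
    Returns "" when no substring occurs at least min_count times.
    """
    best, best_score = "", 0
    for length in range(1, min(max_len, len(text) // 2) + 1):
        # Count every window of this length (overlapping windows included).
        windows = Counter(text[i:i + length] for i in range(len(text) - length + 1))
        candidate, _ = windows.most_common(1)[0]
        count = text.count(candidate)  # str.count counts non-overlapping matches
        score = count * length
        if count >= min_count and score > best_score:
            best, best_score = candidate, score
    return best


# Rebuild the example from above, then recover the filler and the payload.
obfuscated = "Internet_Storm_Center".join("powershell payload")
filler = find_repeating_string(obfuscated)
print(repr(filler))
print(obfuscated.replace(filler, ""))
```

A heuristic like this can pick the wrong string, for example when the payload itself is very repetitive or when the filler is longer than max_len. That is why it helps to look at several candidates and check which one produces something meaningful.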
When run on our sample, the tool finds several repeating strings, but there’s one repeating string that results in a decoded payload starting with powersheLL:
We can then use option -f to search for the string "power" and have the complete payload decoded:
This can then be decoded with base64dump.py:
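For readers who want to see what such a decoding step amounts to, here is a small Python sketch. It is not how base64dump.py works; it simply picks the longest BASE64-looking token from a command line and decodes it. The function name decode_longest_base64 and the demo command are made up for this illustration. The demo assumes the text was encoded as UTF-16LE, which is what PowerShell's -EncodedCommand option expects; whether a given sample uses that encoding has to be checked case by case.

```python
import base64
import re


def decode_longest_base64(command: str, min_len: int = 16) -> bytes:
    """Decode the longest run of BASE64 characters found in command.

    Raises ValueError when no run of at least min_len characters is found.
    """
    pattern = r"[A-Za-z0-9+/]{%d,}={0,2}" % min_len
    tokens = re.findall(pattern, command)
    if not tokens:
        raise ValueError("no BASE64-looking token found")
    return base64.b64decode(max(tokens, key=len))


# Hypothetical command line, built here only to have something to decode.
script = "Write-Output 'hello'"
encoded = base64.b64encode(script.encode("utf-16-le")).decode("ascii")
command = "powersheLL -e " + encoded

decoded = decode_longest_base64(command).decode("utf-16-le")
print(decoded)
```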
Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com DidierStevensLabs.com