Generic Unpacking with r2pipe
I’ve been playing around with r2pipe lately and thought I would do a bit of a write-up on how I automated unpacking a Locky sample using r2 over r2pipe.
Basically, Locky unpacks itself using VirtualAlloc to allocate a buffer for the unpacked file, followed by a call to RtlDecompressBuffer to decompress the data.
The method I used was to break on calls to VirtualAlloc, then watch for a write to the allocated buffer and check for a PE header. This is the same method that @struppigel describes very well in this YouTube video: https://www.youtube.com/watch?v=h9RiBJ06MAQ. I will use the same Locky sample that he used, which can be found here if you would like to follow along.
The Algorithm
The basic algorithm I will be using is as follows:
- Set a breakpoint on VirtualAlloc
- Run until breakpoint is hit
- Step out of VirtualAlloc and record the address of the allocated buffer, which is the return value of VirtualAlloc and can be found in eax
- Step out of the calling function (which during my tests I found is usually also the function that writes to the allocated buffer, more on this later)
- Check the first 2 bytes of the buffer for the PE MZ header
- If PE header is found, dump the buffer to a file and quit, otherwise repeat the process
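The header check in the last two steps can be sketched on its own, independent of any debugger. The first function is the two-byte MZ test described above. The second is a stricter variant that also follows the e_lfanew field to the PE signature; that stricter test is an addition for illustration and not something the script in this post does. The function names are made up for this sketch.

```python
import struct

MZ_MAGIC = b"MZ"          # bytes 0x4d 0x5a at the very start of a PE file
PE_SIGNATURE = b"PE\0\0"  # signature located at the offset stored in e_lfanew


def has_mz_magic(buf):
    """The check from the list above: do the first two bytes spell 'MZ'?"""
    return buf[:2] == MZ_MAGIC


def looks_like_pe(buf):
    """Stricter, optional check (not part of the original script)."""
    if not has_mz_magic(buf) or len(buf) < 0x40:
        return False
    # e_lfanew is a little-endian 32-bit offset stored at 0x3c in the DOS header
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    # An out-of-range offset yields an empty slice, which simply fails the test
    return buf[e_lfanew:e_lfanew + 4] == PE_SIGNATURE
```

The two-byte test is cheap but will also match any buffer that merely happens to start with MZ; the stricter variant trades one extra read for fewer false positives.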
I chose to step out of the function calling VirtualAlloc instead of setting a hardware breakpoint on writes to the buffer, because the hardware breakpoint tended to land me somewhere in system code, which I would then have to step out of until I got back to user code to be sure the buffer was fully written. I am lazy, and the simple solution of just stepping out of the calling function turned out to be more reliable in my tests.
The Setup
Because I dislike developing on Windows and much prefer a Unix environment for running and testing my Python r2pipe script, I set up a KVM virtual machine running a 64-bit installation of Windows 7 Professional. To aid in testing I installed my favourite Windows debugger (x64dbg) and radare2. On the host OS I had a Python virtual environment with r2pipe installed, which I used to run the script, connecting over HTTP to an instance of r2 running on the Windows VM. You can run the script directly on Windows or use one of the other remote connection methods r2 supports, but this is the only setup I tested.
I started r2 with the following command:
This will start r2 in debug mode (-d) and instruct it to listen on all network adapters for an HTTP connection on port 1337. I also turned off the HTTP sandbox with http.sandbox = false, as debugging features are disabled by default when starting an HTTP listener for security reasons.
The Code
Alright, now we can get into the fun part, the actual code.
First we import r2pipe, open a connection to our VM and set up a few helper functions to simplify our lives:
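A minimal sketch of what such a setup could look like, assuming r2pipe's HTTP transport is selected by passing an http:// URL to r2pipe.open. The helper names (connect, get_reg, read_hex, log) are made up for this sketch, and the VM address in the comment is a placeholder.

```python
def connect(url):
    """Open an r2pipe session against r2's HTTP listener."""
    # Third-party package (pip install r2pipe). Imported here rather than at
    # module level so the helpers below can be exercised without it installed.
    import r2pipe
    return r2pipe.open(url)


def get_reg(r2, name):
    """Return one register value as an int, read from the JSON output of drj."""
    return r2.cmdj("drj")[name]


def read_hex(r2, addr, count):
    """Return count bytes at addr as a lowercase hex string, using p8."""
    return r2.cmd("p8 {} @ {:#x}".format(count, addr)).strip().lower()


def log(msg):
    """Print a progress message with a recognisable prefix."""
    print("[*] " + msg)


# Hypothetical usage; replace the placeholder with the address of your VM:
#   r2 = connect("http://<vm-address>:1337")
#   log("eax = {:#x}".format(get_reg(r2, "eax")))
```

Keeping every r2 interaction behind cmd and cmdj means the helpers work with any object that offers those two methods, which makes them easy to try out against a stub before pointing them at a live debugger.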
Next we reopen the file in debug mode with the r2 command doo. This ensures we can run the script multiple times and that the debugged program (Locky) is restarted each time. After that we continue execution to allow some standard libraries to load, set a breakpoint on entry0 and continue until we hit it.
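The restart sequence just described could be wrapped up roughly as follows. The function name is made up, and r2 is assumed to be an open r2pipe session (anything with a cmd method will do).

```python
def restart_to_entry(r2):
    """Restart the debugged program and run it up to its entry point."""
    r2.cmd("doo")        # reopen in debug mode, which restarts the target
    r2.cmd("dc")         # continue once so the standard libraries get loaded
    r2.cmd("db entry0")  # breakpoint on the program entry point
    r2.cmd("dc")         # continue until that breakpoint is hit
```

Calling this at the top of the script is what makes repeated runs start from a clean process rather than from wherever the previous run stopped.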
We can now analyze the program with aaa, as all of the imports have been loaded.
This will name our functions for us so that if we wish we can have a look around. We don’t actually need this level of analysis for this script but again, I’m lazy and this file is small so it doesn’t take very long.
Next on the list of things to do is get the address of VirtualAlloc and set a breakpoint on it. I also print out some debug messages throughout the code to help me see what is going on.
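One way to sketch this step is to let r2 evaluate an expression with ?v and set the breakpoint on the result. Which expression resolves to VirtualAlloc in a given session (a flag name or a raw address) depends on the r2 version and on what the analysis produced, so it is left as a parameter here rather than guessed. The function names are made up for this sketch.

```python
def resolve(r2, expr):
    """Evaluate an r2 expression with ?v and return the result as an int."""
    # ?v is assumed to print a single hex value such as 0x76c11856
    return int(r2.cmd("?v " + expr).strip(), 16)


def break_on(r2, expr):
    """Resolve expr, set a breakpoint there and return the address."""
    addr = resolve(r2, expr)
    if addr == 0:
        # r2 evaluates unknown names to 0, so treat that as "not found"
        raise ValueError("could not resolve " + expr)
    r2.cmd("db {:#x}".format(addr))
    print("[*] breakpoint on {} at {:#x}".format(expr, addr))
    return addr
```

Returning the resolved address is useful later on, since the main loop needs something to compare the program counter against when a breakpoint is hit.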
The final part of the script is the main loop which carries out the algorithm we described above. It looks like this:
The code works as follows:
- We continue execution until we hit a breakpoint
- Double check we are actually stopped at VirtualAlloc
- If so, print out some messages and get the address esp currently points to
- Get the size of the buffer (esp+8) being allocated, which is the second argument to VirtualAlloc. Values can be retrieved in r2 with the pv command (print value)
- Step out of VirtualAlloc
- Read the address of the allocated buffer from eax using drj, which returns a JSON representation of all register values.
- We can now read 2 bytes from the start of the buffer using p8 2 and compare them with the MZ string at the start of all PE headers, in the hex string form 4d5a.
- If it matches, dump the allocated buffer to file with r2’s handy wt (write to) command.
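The steps above can be sketched as a single loop. This is a reconstruction under assumptions, not the original listing. Stepping out is done here with dcr (continue until ret) followed by ds (single step), the output of pv is assumed to be a number that int(..., 0) can parse, and wt is used as named above (some r2 builds spell the write-to-file command wtf, so check wt? on yours). The names unpack, step_out and max_hits are made up.

```python
def step_out(r2):
    """Leave the current function: run to its ret, then step over the ret."""
    r2.cmd("dcr")
    r2.cmd("ds")


def unpack(r2, valloc_addr, out_path="dump.bin", max_hits=100):
    """Repeat the break, step out, check cycle until a buffer starts with MZ.

    Returns the buffer address on success, or None if nothing was found.
    """
    for _ in range(max_hits):
        r2.cmd("dc")                      # run until the next breakpoint
        regs = r2.cmdj("drj")
        if regs["eip"] != valloc_addr:    # stopped somewhere else, keep going
            continue
        # At entry [esp] is the return address; [esp+8] is the size argument
        size = int(r2.cmd("pv @ {:#x}".format(regs["esp"] + 8)).strip(), 0)
        step_out(r2)                      # leave VirtualAlloc
        buf = r2.cmdj("drj")["eax"]       # return value: the buffer address
        step_out(r2)                      # leave the calling function
        magic = r2.cmd("p8 2 @ {:#x}".format(buf)).strip().lower()
        print("[*] buffer {:#x} size {:#x} starts with {}".format(buf, size, magic))
        if magic == "4d5a":               # 'MZ' in hex
            r2.cmd("wt {} {} @ {:#x}".format(out_path, size, buf))
            return buf
    return None
```

The max_hits bound is there so the loop ends even if the sample exits or never writes a PE image into a buffer allocated this way.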
Putting this all together we get the following:
The script takes about 2 minutes to run and yields an unpacked version of Locky named dump.bin in the working directory from which you launched radare2.exe. I’m sure there are many improvements I could make to this script, but this is my first time playing around with r2pipe, so please send me suggestions; I would love to hear them.
This script can also be found as a gist.