Diving Into Radare2
I’ve been improving my reverse engineering skills lately and decided to have a go at using radare2 after a recommendation on an IRC channel I frequent. After reading through some blog posts and the radare2 book (which is awesome, by the way) I decided to reverse a small shellcode using only r2 to see how easy it would be to get used to.
First I picked a random shellcode from exploit-db and settled on this one, which promised to contain some XOR encoding. I figured that would give me some semi-complicated operations to carry out using only r2: at the very least I would have to manually XOR some bytes and disassemble the output to read the shellcode’s final payload.
First, let’s compile the following payload:
With the following simple command:
Then open it with r2:
As can be seen from the r2 prompt, we are currently positioned at offset 0x7fef42cacaf0. Let’s analyse the whole file and seek to the main subroutine:
We can see that r2 has analysed our binary and named the shellcode subroutine for us, and that the program puts the address of the shellcode into edx and then calls it. Let’s take a look at the shellcode now.
Hmm, it looks like r2 hasn’t recognized this part of the code as a function. Let’s jump into visual disassembly mode by entering V to get into visual mode and pressing p to cycle to the disassembly view. You may need to jump back to this position by pressing o and then typing obj.shellcode; this is because when you first enter visual mode r2 will seek to the current instruction pointer.
After a quick look at this we can see that the shellcode jumps down to 0x6008bf, which is an instruction to call... a string? It looks like r2 has mistaken that line of code for data, but that’s okay because we can fix it. Let’s scroll down to that section by pressing j once and mark it as code by pressing dc (a good mnemonic for this is "define code"). After doing that we get the following:
Much better. Now let’s take a look at what this code is doing, shall we?
We can see that after the call instruction at 0x6008bf is executed we pop an address from the stack into rsi. The call instruction puts the next instruction’s address onto the stack, which means rsi is now pointing to 0x006008c4, which looks like a lot of junk code. Remember that this is an XOR’d shellcode, so this is not surprising.
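The address arithmetic behind this call/pop trick can be checked by hand. The following is a minimal sketch, assuming the instruction at 0x6008bf is the usual 5-byte near call (an E8 opcode followed by a 32-bit displacement), which is consistent with rsi ending up at 0x006008c4. The function name is made up for illustration.

```python
# Assumption: a near call is encoded as E8 + 4-byte relative displacement.
CALL_REL32_LENGTH = 5


def return_address(call_address: int, call_length: int = CALL_REL32_LENGTH) -> int:
    """Return the address a call pushes: the instruction right after it."""
    return call_address + call_length


if __name__ == "__main__":
    # The call sits at 0x6008bf, so the pushed (and then popped) address is:
    print(hex(return_address(0x6008BF)))  # 0x6008c4
```

Since the encoded bytes are placed directly after the call, the pushed return address doubles as a pointer to the data.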
Next the code zeroes out rcx by XORing it with itself, sets the counter (cl) to 0x31 and the XOR key (dl) to 0x90, and then zeroes out rax. This all sets up a loop that walks through 0x31 bytes of data starting at rsi, XORing each byte with the value 0x90 and pushing it onto the stack.
At 0x006008bd execution is passed to the newly decoded instructions at rsp (the top of our stack). We need to somehow decode this ourselves so we can have a look at it, keeping in mind that the code was pushed onto the stack backwards.
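To make the decode-then-reverse idea concrete, here is a rough Python model of what the loop does. It is only a sketch under stated assumptions: it assumes the loop stores exactly one decoded byte per iteration at a decreasing stack address, the function names are hypothetical, and the sample bytes are placeholders rather than the real payload from this shellcode.

```python
def decode_pushed(encoded: bytes, key: int = 0x90) -> bytes:
    """XOR every byte with `key`, then reverse to model the stack layout.

    Assumption: one decoded byte is stored per loop iteration at a
    decreasing address, so the first encoded byte ends up last in memory.
    """
    decoded_in_loop_order = bytes(b ^ key for b in encoded)
    return decoded_in_loop_order[::-1]


def encode_for_push(payload: bytes, key: int = 0x90) -> bytes:
    """Hypothetical inverse: reverse the payload, then XOR each byte."""
    return bytes(b ^ key for b in payload[::-1])


if __name__ == "__main__":
    sample = b"\x48\x31\xc0"  # placeholder bytes, not the article's payload
    blob = encode_for_push(sample)
    print(blob.hex())                     # 50a1d8
    print(decode_pushed(blob) == sample)  # True
```

XOR with a single-byte key is its own inverse, which is why the same operation both encodes and decodes; only the byte order needs separate handling.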
We could take advantage of r2’s write mode by turning it on with e io.cache = true and then XOR the code with the wox command and analyse the output, but then we would also need to reverse the byte order (the data is pushed onto the stack, so it ends up backwards relative to how it is stored in the file) and we don’t really want to complicate things. Instead we should take advantage of r2’s debugging abilities.
Let’s quit r2 (press q until it closes completely) and reopen our shellcode in debug mode:
Then enter aaa to analyse the file again, seek to our obj.shellcode flag and go through the same process of defining that string as code from visual mode. We should now be looking at the same screen as before.
Like in vim, we can enter a command mode without exiting visual mode in r2 by pressing : (colon). From here we can execute normal r2 commands without needing to jump back to the r2 shell. Because we are responsible people and never run code that we haven’t read yet, we will set a breakpoint on the instruction at address 0x006008bd, which transfers control to the decoded instructions pointed to by rsp, by entering the following command:
Now let’s run the program by entering the dc command (debugger continue) and then seek to the location that rsp currently points to with s rsp. We should now be looking at the following:
Now this code is purposely confusing. First it zeroes out both eax and rdx and pushes them onto the stack, growing it. We then move some values into that space on the stack and move the current stack pointer into rdi. We then do the same thing again, making space as we go by pushing a zeroed-out rax, and save the new value of the stack pointer to rsi. Afterwards we can see that we are adding 0x3b to rax (which is 0) and executing a syscall. 0x3b is 59 in decimal, so we are calling syscall 59, which has the following call signature:
The shellcode is calling sys_execve, which starts a process! This means that the code must be pointing rdi to the filename and rsi to the arguments for the call. Keep in mind that by convention the first element in the arguments to sys_execve should be the same as the filename. Let’s place a breakpoint before the syscall and take a look at the stack:
Now let’s take a closer look at the values themselves. Keep in mind that rsi is pointing to an array of strings and that we need to reverse the byte order (the bytes are stored little-endian) to get the address it points to. First we will use p8 8 @ rsi to print 8 bytes at rsi, then we will reverse those bytes and print the value that they point to. Finally we will print the second argument.
From this we can see that this shellcode will simply start a shell by launching /bin/sh with the argument -i (which will force the shell to launch in interactive mode).
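Two bits of arithmetic from the walkthrough above are easy to double-check outside r2: turning the 8 bytes printed by p8 into the pointer they encode, and converting the syscall number from hex. The sketch below assumes a little-endian 64-bit target; the helper name and the example dump are hypothetical and are not output from this session.

```python
import struct


def pointer_from_p8(hex_dump: str) -> int:
    """Interpret 8 bytes printed by `p8 8` as a little-endian 64-bit pointer."""
    raw = bytes.fromhex(hex_dump)
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    (value,) = struct.unpack("<Q", raw)  # "<Q": little-endian unsigned 64-bit
    return value


if __name__ == "__main__":
    # Hypothetical dump for illustration only, not taken from the article.
    example = "f0e1ffffff7f0000"
    print(hex(pointer_from_p8(example)))  # 0x7fffffffe1f0

    # The syscall number loaded into rax, shown in decimal and octal.
    print(0x3B, oct(0x3B))  # 59 0o73
```

The resulting address can then be used with r2’s string printing commands to read each entry of the argument array.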
I hope this will be of some help to someone as a simple intro to using radare2 as a debugger. As someone who lives in the terminal as much as possible I am loving using r2, but hopefully this will convince others that it is actually not any harder to use than a visual decompiler / debugger.
PS: I would like some feedback as to what people would prefer to see used for the examples in my articles. Would you prefer the text be placed in a plaintext code block instead of images? I am aware that some people hate posts with too many images and I’ve been meaning to step up my game and actually start writing more posts. You can either let me know in the comment section or on twitter or email me (contact info can be found in the footer of this blog).
2016-06-24: After a comment emailed to me by Otto Ebeling I have made an edit to the article where the call to sys_execve is explained. I got a bit wrong where I was using rsi to address the parameters as a string, but it is actually pointing to an array of strings. Thanks for pointing that out :)