The following story took place on a Friday evening, with the sun long gone from the horizon, in a dimly lit room where a few hackers teamed up to do what they do best: fuck shit up.
A former security researcher at KU Leuven took his fifth coffee this hour. Or was it the sixth? He didn't care. Today there was only one thing on his mind: breaking into the nuclear power plants of those Cyberdyne sons o' bitches. It seemed easy enough at first sight. Even his ex-colleague, whom not always had the brightest moments, was able to connect to the power plants using netcat. A nice little menu greeted us:
Our hackers were already familiar with the menu. New nuclear plants could be added and existing ones could be listed. Even the configuration of each power plant, which consists of the number of elements present in the power plant, could easily be updated. And last but not least there were the options to get and set a self-destruct code. This self-destruct code was of course the flag they were after - the reason they were already awake for 16 hours. Unfortunately the self-destruction code can only be accessed if you know a rather lengthy password. But rest assured that such pity defences don't stop true hackers.
By bribing the right people they managed to obtain the binary code responsible for displaying the menu. With this information in hands, it didn't take them long to find the first weak spot in the Cyberdyne system: a format string vulnerability [1]. After a plant is created, or after editing an existing nuclear plant, the amount of uranium was checked by the following function:
The code above was generated by IDA Pro and the Hex-Rays Decompiler, both state of the art tools used by the bad and the good guys. For our hackers it was clear: if the level of uranium is equal or lower to zero, a format string vulnerability is triggered (line 10), potentially enabling them to take over the system. Using the control panel it wasn't possible to edit the plant such that it had a non-positive uranium level, so it wasn't clear how they could force the system to execute this code. But a quick look at the code responsible for adding a new power plant changed the situation:
After staring at this code and getting another coffee, it all became clear. At least for now. Our hackers eagerly discussed it with each other in precise detail: The code first calls get_random_num a few times to initialize the levels of all elements present in the power plant (oxygen, carbon, boron, zirconium and uranium). Uranium is one of those elements. Then the name that has been given to the new power plant (line 6) is copied to the memory region containing all information about the power plant (line 13). This memory region begins with the name of the plant, and ends with the levels of each element. However the string copy function possibly overwrites the levels of each element! This is because all information about a plant is saved in only 112 bytes (see line 4), and the string copy function copies at most 0x70 bytes (line 6). Recall that 0x70 is the hexadecimal representation of 112, meaning the name of the plant potentially overwrites all the other information in the memory region.
After the plant name is copied, the function check_secondary_elems_level checks that the levels of the secondary elements (oxygen, carbon, boron and zirconium) are between, and including, 1 and 999. Our hackers concluded that entering a long name will overwrite the levels of each elements, and if done properly will assure that the vulnerable printf statement gets executed. In other words the vulnerability has been confirmed, and it's time to use it to destroy those Cyberdyne scums.
After the plant name is copied, the function check_secondary_elems_level checks that the levels of the secondary elements (oxygen, carbon, boron and zirconium) are between, and including, 1 and 999. Our hackers concluded that entering a long name will overwrite the levels of each elements, and if done properly will assure that the vulnerable printf statement gets executed. In other words the vulnerability has been confirmed, and it's time to use it to destroy those Cyberdyne scums.
A Second Entry Point
Amazingly the not-so-bright ex-colleague found another vulnerability in the code. Looking again at the code to add a power plant, and this time focussing on where the data is saved, we get the following:
He saw that there are no checks whether there still is enough memory left when adding a new power plant! The plantid variable, which is used to determine where in memory to save the next plan, is always increased by one without checking whether there's actually space left. Argument a1 is pointing to the memory where the information will be stored and is allocated on the stack by the calling function. Essentially all nuclear plants are stored in one big "list" on the stack, and eventually there won't be enough space to save them all, making the stack go BOOM. And when the stack goes boom, the hackers gain control. Unfortunately it took our colleague a while to find this, and in the end this vulnerability wasn't used. We all smiled though: typical Cyberdyne, they always make more than one mistake.
For hackers the backdoor is always open.
"....that's what she said!"
Bypassing ASLR and NX
Cyberdyne knew the power plant software was not to be trusted. To remedy this they hired some of the best security researchers around the world to create some defences. Unfortunately Cyberdyne was too incompetent to actually implement them, making them ditch all the excellent research. Instead they settled for two other (already well known) protection mechanisms: ASLR and NX.
One of the older hackers of the group was reminiscing about exploits and vulnerabilities back in the good old days. The days where all programs, even the operating system, lived in the same memory region. Ah, it was just too easy to exploit programs then. He also recalled how information leakage vulnerabilities were considered useless. These are vulnerabilities that don't let you take over a system, but only make the system reveal some information, like where certain data is stored. The problem is that even though you know where certain data was stored, you can't necessarily read it, making people ignore these attacks. But it was clear to him that such information leakage attacks would be essential to bypass ASLR and eventually attack Cyberdyne.
The older hacker begins his explanation to the youngers ones: The format string exploit can be used as an information leakage attack to defeat ASLR. When using %10$p%11$p as a plant name it will display the saved frame pointer (ebp) followed by the return address (eip). Remark that the plant name must be followed by data so the appropriate number of elements will be overwritten in order to reach the vulnerable printf call. Though the
binary isn't position independent, and hence can't fully benefit from
ASLR, this information leakage defeats ASLR. The absolute position of
libraries can be derived from the pointers in the .GOT section,
defeating ASLR for loaded libraries as well. More precisely, the first 4 bytes of the plant name will contain a pointer to the .GOT entry of fork() followed by the string %22$s. This will print out the string located at the .GOT entry of fork() and hence prints out the address of fork (assuming there are no zeros in the address)! Finally, the address of execve() can be found by adding a predictable offset to that of fork(). This teaches us all the addresses we will need when constructing an exploit. The eyes of the other hackers opened wide open - he's right! Even with ASLR we can take this fucker down!
Bypassing protections like a boss.
The second protection mechanism to defat is NX: the stack and heap of the binary are not executable, meaning we can't copy over shellcode and execute it. One of the hackers suggested using a return-oriented programming approach. Quickly they all agreed on this, but simplified the idea to an easier to execute return-to-libc attack. To accomplish this feat they have to control the value of the stack pointer (esp) before executing a return instruction. This can be done by overwriting the saved frame pointer used in the prologue and epilogue of a function. Let's look at the disassembled code of printf to better understand what our hackers are going to do:
Here the register ebp is used to contain the frame pointer, and its value is saved on the stack at the start of the function and restored at the end of the function. Using the format string exploit we can overwrite the saved ebp value, and hence we can control the value of ebp at the end of the printf function. This is done using traditional format string exploits and the %hn format specifier. However, we need to control esp and not ebp! Patience. Let's look at the disassembled code of check_uranium_level:
The next to last instruction is leave, which is equivalent with mov %ebp, %esp followed by pop %ebp. Aha! Here the register esp is set to ebp, and we control ebp since we overwrote it in the call the printf! With control over esp we can do a traditional return-to-libc attack, and in our case we will construct a fake stack frame so the program will execute execve("/bin/sh", NULL, NULL).
The hackers are ready to attack Cyberdyne.
Writing the Exploit
Time to get to business. First the location of the stack and the code (link to code) is extracted using
%10$p:ENDEBP:%11$p:ENDRET:Then we extract the location of execve() (link to code). This is done by reading the .GOT entry of fork() and adding the appropriate offset so we end up with the address of execve(). The given input is of the form:
<addr .GOT entry fork>%22$s:ENFORK:We continue by storing the stack frame for the return-to-libc attack. To construct it we use the addresses extracted in the previous two steps. The code to construct the stack frame and store it is:
Important to note is that the memory region where the power plant info is saved is first initialized to zero. Hence the first few bytes after the plant name also contains zeros, and thus the argv and envp arguments are pointers to NULL.
Finally we trigger the return-to-libc exploit by overwriting the saved frame pointer (ebp) in the print function:
When combining all the parts we get the following beautiful result (link to code):
There is no right and wrong. There's only fun and boring.
Exploit Reliability
A final note is due about the reliability of the current exploit. It assumes there are no whitespace characters (space, tab, newline, vertical-tab, form-feed characters, or terminating zero) within any of the (extracted) addresses. If there are the exploit will fail as not all data of the payload and/or exploit will be read during the scanf call. However this is only a limitation of the current exploit: we can write any byte using the format specifier %hn and read any byte using %s. This allows us to create an exploit which will work under any environment.
Background
The UCSB International Capture The Flag (iCTF) is a computer security challenge where participants have to attack other systems and defend their own. This year we participated with our KU Leuven hacknamstyle research team.
To get a feel of the atmosphere of a security CTF, the video below shows the rwthCTF from the perspective of the organizers (and of course we also participated in that CTF).
For the iCTF our team successfully exploited 4 challenges:
nuclearboom, water, pesticides and traintrain. This got us quite some
attack points, though we didn't focus on defence, making us end on a
respectable 37 place out of 98 participating teams.
In this post I focused on one challenge: nuclearboom. I've already encountered a few write-ups on nuclearboom [codezen, lifayk], but those only explain how to steal the flag, whereas my exploit actually make it spawn a (remote) shell. To do this we have bypassed both ASLR and NX, both widely used protection mechanisms in modern operating systems and programs. It turned out that this challenge was perfect to show how information leakage defeats ASLR, and how a return-oriented-programming techniques can defeat NX.
Footnotes
[1] After accepting an incoming connection stdout, stdin and stderr are replaced with the socket file descriptor using dup2. This way the program can simply use sscanf and printf, knowing that everything actually happens over the TCP connection. So in this case the output of the printf call will be redirected over the TCP connection.