This is a pretty good writeup of a long-fixed Firecracker bug (CVE-2019-18960).<p>Firecracker is a KVM hypervisor, and so a Firecracker VM is a Linux process (running Firecracker). The guest OS sees "physical memory", but that memory is, of course, just mapped pages in the Firecracker process (the "host").<p>Modern KVM guests talk to their hosts with virtio, which is a common abstraction for a bunch of different device types that consists of queues of shared buffers. Virtio queues are used for network devices, block devices, and, apropos this bug, for vsocks, which are a sort of generic host-guest socket interface (vsock : host/guest :: netlink : user/kernel, except that Netlink is much better specified, and people just do sort of random stuff with vsocks. They're handy.)<p>The basic deal with managing virtio vsock messages is that the guest is going to fill in and queue buffers on its side expecting the host to read from them, which means that when the host receives them, it needs to dereference pointers into guest memory. Which is not that big of a deal; this is, like, some of the basic functioning of a hypervisor. A running guest has "regions" of physical memory that correspond to mapped pages in Firecracker on the host side; Firecracker just needs to keep a table of regions and their corresponding (host userland) memory ranges.<p>This table is usually pretty simple; it's 1 entry long if the VM has less than 3.5G, and 2 entries if more. Unless you're on ARM, in which case it's always 1 entry, and the bug wasn't exploitable.<p>The only tricky problem here for Firecracker is that we can't trust the guest --- that's the premise of a hypervisor! --- and a guest can try to create fucky messages with pointers into invalid memory, hoping that they'll correspond to invalid memory ranges in the host that Firecracker will dereference. 
And, indeed, in 2019, there was a case where that would happen: a guest could send a vsock message, which is a tuple (header, base, size), under these conditions:<p>1. The guest had more than 3.5G of memory, so that Firecracker would have more than one region table entry<p>2. The base address landed in some valid entry in the table of regions<p>3. base+size landed in some other valid entry in the table of regions<p>There are two bugs: first, a validity check on virtio buffers doesn't check to make sure that <i>both</i> base <i>and</i> base+size are in the same, valid region, and second, code that extracts the virtio vsock message does an address check on the buffer address with a size of 1 (in other words, just checking to see if the base address is valid, without respect to the size).<p>At any rate, because the memory handling code here deals with raw pointers, this was done in Rust `unsafe{}` blocks, and so this bug combination would theoretically let a guest trick Firecracker into writing into host memory outside of a valid guest memory range.<p>The hitch, which is as far as I know fatal to exploitation: there's nothing mapped in between regions in x86 Firecracker that you can write to: between a memory region and the no-man's-land outside it, there always happen to be PROT_NONE guard pages†, so an overwrite will simply kill the Firecracker process. Since the attacker here already controls the guest kernel, crashing the guest this way doesn't win you anything you didn't already have.<p>† <i>And now, post-fix, there are deliberately PROT_NONE guard pages around regions</i>
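<p>To make the two checks concrete, here's a minimal sketch of the difference between validating only the base address and validating the whole buffer against a single region. This is not Firecracker's code: `Region`, `find_containing_region`, and `base_only_check` are hypothetical names, and the addresses in the usage note are made up for illustration.

```rust
// Hypothetical types for illustration only; not Firecracker's actual API.
#[derive(Debug, Clone, Copy)]
struct Region {
    guest_base: u64, // first guest-physical address covered by the region
    len: u64,        // region length in bytes
}

/// Returns the index of the one region that contains every byte of
/// [base, base + size), or None if no single region does.
fn find_containing_region(regions: &[Region], base: u64, size: u64) -> Option<usize> {
    if size == 0 {
        return None; // treat empty buffers as invalid in this sketch
    }
    // Exclusive end of the buffer; reject arithmetic overflow outright.
    let end = base.checked_add(size)?;
    regions.iter().position(|r| match r.guest_base.checked_add(r.len) {
        // Both ends of the buffer must fall inside the same region.
        Some(region_end) => base >= r.guest_base && end <= region_end,
        None => false,
    })
}

/// The flawed pattern described above: the check runs with a size of 1,
/// so it only proves that the base address is valid.
fn base_only_check(regions: &[Region], base: u64) -> bool {
    find_containing_region(regions, base, 1).is_some()
}
```

With two regions separated by a gap, say `[0x0, 0x1000)` and `[0x2000, 0x3000)`, a buffer with base `0x0F00` and size `0x1200` passes `base_only_check` but is rejected by `find_containing_region`, because it starts in the first region and ends in the second.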