It's 2020. Quarantines are everywhere – and here I'm writing about one, too. But this quarantine is of a different kind.
In this article I'll describe the Linux Kernel Heap Quarantine that I developed for mitigating kernel use-after-free exploitation. I will also summarize the discussion about the prototype of this security feature on the Linux Kernel Mailing List (LKML).
Use-after-free in the Linux kernel
Use-after-free (UAF) vulnerabilities in the Linux kernel are very popular for exploitation. There are many exploit examples, some of them include:
UAF exploits usually involve heap spraying. Generally speaking, this technique aims to put attacker-controlled bytes at a defined memory location on the heap. Heap spraying for exploiting UAF in the Linux kernel relies on the fact that when kmalloc() is called, the slab allocator returns the address of memory that was recently freed:
So allocating a kernel object with the same size and attacker-controlled contents allows overwriting the vulnerable freed object:
Note: Heap spraying for out-of-bounds exploitation is a separate technique.
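The reuse behavior that heap spraying depends on can be illustrated with a toy user-space model. This is only a sketch, not kernel code: `ToySlabCache` and its slot numbering are hypothetical, and the model assumes a purely last-in-first-out freelist, which is a simplification of what the real slab allocators do.

```python
class ToySlabCache:
    """Toy model of a slab cache with a LIFO freelist (an assumption of this sketch)."""

    def __init__(self, num_slots):
        # Each slot stands in for one fixed-size kernel object.
        self.contents = [None] * num_slots
        # Free slots, used as a stack: the most recently freed slot is reused first.
        self.freelist = list(range(num_slots - 1, -1, -1))

    def alloc(self, data):
        slot = self.freelist.pop()
        self.contents[slot] = data
        return slot

    def free(self, slot):
        # The slot goes straight back to the freelist; its contents are not erased.
        self.freelist.append(slot)


cache = ToySlabCache(num_slots=8)
victim = cache.alloc("legitimate object")
cache.free(victim)                      # a dangling reference to `victim` remains
sprayed = cache.alloc("attacker data")  # same size, attacker-controlled contents

print(sprayed == victim)       # True
print(cache.contents[victim])  # attacker data
```

In this model, the very next allocation of the same size lands in the freed slot, so any code still holding the stale reference now sees the attacker's bytes.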
In July 2020, I got an idea of how to break this heap spraying technique for UAF exploitation. In August I found some time to try it out. I extracted the slab freelist quarantine from KASAN functionality and called it SLAB_QUARANTINE.
If this feature is enabled, freed allocations are stored in the quarantine queue, where they wait to be actually freed. So there should be no way for them to be instantly reallocated and overwritten by UAF exploits. In other words, with SLAB_QUARANTINE, the kernel allocator behaves like so:
On August 13, I sent the first early PoC to LKML and started deeper research of its security properties.
Slab quarantine security properties
For researching the security properties of the kernel heap quarantine, I developed two lkdtm tests (code is available here).
The first test is called lkdtm_HEAP_SPRAY. It allocates and frees an object from a separate kmem_cache and then allocates 400,000 similar objects. In other words, this test attempts the original heap spraying technique for UAF exploitation:
If CONFIG_SLAB_QUARANTINE is disabled, the freed object is instantly reallocated and overwritten:
If CONFIG_SLAB_QUARANTINE is enabled, 400,000 new allocations don't overwrite the freed object:
That happens because pushing an object through the quarantine requires both allocating and freeing memory. Objects are released from the quarantine as new memory is allocated, but only when the quarantine size is over the limit. And the quarantine size grows when more memory is freed up.
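The rule in the paragraph above can be sketched as a small user-space model. Everything here is an assumption made for illustration: `ToyQuarantineCache` is hypothetical, the limit is counted in objects rather than bytes, and there is a single FIFO queue instead of the batched quarantine taken from KASAN.

```python
from collections import deque


class ToyQuarantineCache:
    """Toy model: freed objects wait in a FIFO quarantine before being really freed."""

    def __init__(self, quarantine_limit):
        self.quarantine_limit = quarantine_limit  # counted in objects here, not bytes
        self.quarantine = deque()  # freed objects that are not yet reusable
        self.freelist = []         # really free objects, reused LIFO
        self.next_new = 0          # source of fresh object ids

    def alloc(self):
        # Objects leave the quarantine only during allocation, and only
        # while the quarantine is over its limit.
        while len(self.quarantine) > self.quarantine_limit:
            self.freelist.append(self.quarantine.popleft())
        if self.freelist:
            return self.freelist.pop()
        self.next_new += 1
        return self.next_new - 1

    def free(self, obj):
        # A freed object goes to the quarantine, not to the freelist.
        self.quarantine.append(obj)


def spray_only(limit, count):
    """Free one object, then only allocate. Is the freed object ever handed out?"""
    cache = ToyQuarantineCache(limit)
    victim = cache.alloc()
    cache.free(victim)
    return any(cache.alloc() == victim for _ in range(count))


def alloc_and_free(limit, max_rounds):
    """Free one object, then allocate and free in a loop until it is handed out."""
    cache = ToyQuarantineCache(limit)
    victim = cache.alloc()
    cache.free(victim)
    for rounds in range(1, max_rounds + 1):
        obj = cache.alloc()
        if obj == victim:
            return rounds
        cache.free(obj)
    return None


print(spray_only(limit=100, count=10_000))           # False
print(alloc_and_free(limit=100, max_rounds=10_000))  # 101
```

Allocation alone never grows the quarantine, so the freed object stays parked. Only the frees in the second loop push the quarantine over its limit, and in this deterministic model the object comes back after exactly `limit + 1` rounds.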
That's why I created the second test, called lkdtm_PUSH_THROUGH_QUARANTINE. It allocates and frees an object from a separate kmem_cache and then performs kmem_cache_alloc()+kmem_cache_free() for that cache 400,000 times.
This test effectively pushes the object through the heap quarantine and reallocates it after it returns back to the allocator freelist:
As you can see, the number of allocations needed for overwriting the vulnerable object is almost the same each time. That would be good for stable UAF exploitation and should not be allowed. That's why I developed quarantine randomization. This randomization required only very small, hackish changes to the heap quarantine mechanism.
The heap quarantine stores objects in batches. On startup, all quarantine batches are filled with objects. When the quarantine shrinks, I randomly choose and free half of the objects from a randomly chosen batch. The randomized quarantine then releases the freed object at an unpredictable moment:
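A rough user-space sketch of this randomization is shown below. It is not the patch itself: the function name, the batch size, the object-count limit, and the way new objects are appended to the newest batch are all assumptions chosen to keep the model short.

```python
import random


def rounds_until_reuse(seed, limit=128, batch_size=16, max_rounds=100_000):
    """Count alloc+free rounds until a freed 'victim' object is handed out again."""
    rng = random.Random(seed)
    # Start with a full quarantine of filler objects (ids 1..limit), in batches.
    batches = [list(range(i, min(i + batch_size, limit + 1)))
               for i in range(1, limit + 1, batch_size)]
    size = limit
    freelist = []
    next_id = limit + 1
    victim = 0

    def put(obj):
        # Freed objects join the newest batch; a full batch starts a new one.
        nonlocal size
        if not batches or len(batches[-1]) >= batch_size:
            batches.append([])
        batches[-1].append(obj)
        size += 1

    def reduce():
        # Release a random half (rounded up) of a randomly chosen batch.
        nonlocal size
        index = rng.randrange(len(batches))
        batch = batches[index]
        rng.shuffle(batch)
        count = (len(batch) + 1) // 2
        for _ in range(count):
            freelist.append(batch.pop())
        size -= count
        if not batch:
            del batches[index]

    def alloc():
        nonlocal next_id
        while size > limit:
            reduce()
        if freelist:
            return freelist.pop()
        # Fallback: hand out a fresh object if nothing has been released.
        next_id += 1
        return next_id - 1

    put(victim)  # free the victim into the already full quarantine
    for rounds in range(1, max_rounds + 1):
        obj = alloc()
        if obj == victim:
            return rounds
        put(obj)
    return None


results = [rounds_until_reuse(seed) for seed in range(20)]
print(results)
```

With a fixed seed the run is reproducible, but across seeds the round in which the victim is reallocated varies, which is the property the randomization is meant to provide.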
However, this randomization alone would not stop the attacker: the quarantine stores the attacker's data (the payload) in the sprayed objects! This means the reallocated and overwritten vulnerable object contains the payload until the next reallocation (very bad!).
This makes it important to erase heap objects before placing them in the heap quarantine. Moreover, filling them with zeros gives a chance to detect UAF accesses to non-zero data for as long as an object stays in the quarantine (nice!). That functionality already exists in the kernel: it's called init_on_free. I integrated it with CONFIG_SLAB_QUARANTINE as well.
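The effect of erasing objects on free can be shown with a toy example. This is a user-space sketch under stated assumptions: the helper names, the object layout (an 8-byte pointer-like field followed by payload bytes), and the pointer value are all hypothetical.

```python
def free_into_quarantine(quarantine, obj):
    """Erase a freed object, then park it in the quarantine."""
    obj[:] = bytes(len(obj))  # zero the bytes in place, in the spirit of init_on_free
    quarantine.append(obj)


def read_pointer(obj, offset=0):
    """Read an 8-byte little-endian 'pointer' field from an object."""
    return int.from_bytes(obj[offset:offset + 8], "little")


quarantine = []
# Hypothetical object: a callback pointer followed by attacker-supplied bytes.
obj = bytearray((0xFFFF888012345678).to_bytes(8, "little") + b"payload!")
dangling = obj  # stale reference kept after the free, as in a UAF bug

free_into_quarantine(quarantine, obj)

print(read_pointer(dangling) == 0)  # True
print(any(dangling))                # False
```

A stale read through `dangling` now returns zero instead of the old pointer or the payload. In kernel terms, one plausible outcome is a null pointer dereference, which is much easier to notice than code silently following attacker-controlled data.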
During that work I found a bug: in CONFIG_SLAB, init_on_free happens too late. Heap objects go to the KASAN quarantine while still "dirty." I provided the fix in a separate patch.
For a deeper understanding of the heap quarantine's inner workings, I provided an additional patch, which contains verbose debugging (not for merge). It's very helpful; see the output example:
The heap quarantine PUT operation you see in this output happens during kernel memory freeing. The heap quarantine REDUCE operation happens during kernel memory allocation, if the quarantine size limit is exceeded. The kernel objects released from the heap quarantine return to the allocator freelist – they are actually freed. In this output, you can also see that on REDUCE, the quarantine releases some part of a randomly chosen object batch (see the randomization patch for more details).
What about performance?
I made brief performance tests of the quarantine PoC on real hardware and in virtual machines:
1. Network throughput test using iperf
server: iperf -s -f K
client: iperf -c 127.0.0.1 -t 60 -f K
2. Scheduler stress test
hackbench -s 4000 -l 500 -g 15 -f 25 -P
3. Building the defconfig kernel
time make -j2
I compared the vanilla Linux kernel in three modes:
- init_on_free=off
- init_on_free=on (upstreamed feature)
- CONFIG_SLAB_QUARANTINE=y (which enables init_on_free)
Network throughput test with iperf showed that:
- init_on_free=on gives 28.0 percent less throughput compared to init_on_free=off.
- CONFIG_SLAB_QUARANTINE gives 2.0 percent less throughput compared to init_on_free=on.
Scheduler stress test:
- With init_on_free=on, hackbench worked 5.3 percent slower versus init_on_free=off.
- With CONFIG_SLAB_QUARANTINE, hackbench worked 91.7 percent slower versus init_on_free=on. Running this test in a QEMU/KVM virtual machine gave a 44.0 percent performance penalty, which is quite different from the results on real hardware (Intel Core i7-6500U CPU).
Building the defconfig kernel:
- With init_on_free=on, the kernel build was 1.7 percent slower compared to init_on_free=off.
- With CONFIG_SLAB_QUARANTINE, the kernel build was 1.1 percent slower compared to init_on_free=on.
As you can see, the results of these tests are quite diverse and depend on the type of workload.
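Because each CONFIG_SLAB_QUARANTINE figure above is relative to init_on_free=on rather than to init_on_free=off, the two steps have to be combined to estimate the total cost against the baseline. The sketch below does that arithmetic. It assumes that the percentages compose multiplicatively and that "N percent slower" means the run time grew by a factor of 1 + N/100; the function names are hypothetical, and the outputs are derived estimates, not additional measurements.

```python
def combined_throughput_loss(first_loss_pct, second_loss_pct):
    """Total throughput loss when the second loss is relative to the first result."""
    remaining = (1 - first_loss_pct / 100) * (1 - second_loss_pct / 100)
    return (1 - remaining) * 100


def combined_slowdown(first_slow_pct, second_slow_pct):
    """Total slowdown when the second slowdown is relative to the first result."""
    factor = (1 + first_slow_pct / 100) * (1 + second_slow_pct / 100)
    return (factor - 1) * 100


# Inputs are the percentages reported above.
iperf_total = combined_throughput_loss(28.0, 2.0)
hackbench_total = combined_slowdown(5.3, 91.7)
build_total = combined_slowdown(1.7, 1.1)

print(round(iperf_total, 1))      # 29.4
print(round(hackbench_total, 1))  # 101.9
print(round(build_total, 1))      # 2.8
```

Under these assumptions, the quarantine together with init_on_free costs roughly 29 percent of iperf throughput, about doubles the hackbench run time on real hardware, and slows the kernel build by under 3 percent, all relative to init_on_free=off.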
Sidenote: There was NO performance optimization for this version of the heap quarantine prototype. My main effort was put into researching its security properties. I decided that performance optimization should be done further on down the road, assuming that my work is worth pursuing.
My patch series got feedback on the LKML. I'm grateful to Kees Cook, Andrey Konovalov, Alexander Potapenko, Matthew Wilcox, Daniel Micay, Christopher Lameter, Pavel Machek, and Eric W. Biederman for their analysis.
And the main kudos go to Jann Horn, who reviewed the security properties of my slab quarantine mitigation and created a counter-attack that re-enabled UAF exploitation in the Linux kernel.
Amazingly, that discussion with Jann happened during Kees's Twitch stream in which he was testing my patch series (by the way, I recommend watching the recording).
Quoting the mailing list:
I shared that in Kees's Twitch stream chat right away, and Kees adapted my PUSH_THROUGH_QUARANTINE test to implement this attack. It worked. Bang!
Jann proposed another idea for mitigating UAF exploitation in the Linux kernel. Kees, Daniel Micay, Christopher Lameter, and Matthew Wilcox commented on it. I'll give a few quotes from consecutive messages here to describe the idea. However, I recommend reading the whole discussion.
Prototyping a Linux kernel heap quarantine and testing it against use-after-free exploitation techniques was a quick and interesting research project. It didn't turn into a final solution suitable for the mainline, but it did give us useful results and ideas. I've written this article as a way to summarize these efforts for future reference.
And for now, let me finish with a tiny poem that I composed several days ago before going to sleep:
Author: Alexander Popov, Positive Technologies