Buffer Issues

Lecture Notes

This lecture presents an overview of issues involving buffers that are overflowed, underflowed, or otherwise used incorrectly.

Practical tasks

This guide will complement the lecture slides and present code and descriptions to enable exploitation of buffer vulnerabilities.

Memory Structure and Variables

Languages such as C and C++ (among others) allow great flexibility in the use of variables. From a type perspective, these languages are not fully type-safe: the safety guarantees for variable types are limited and are commonly circumvented by developers. Recent compilers provide many diagnostics that help developers avoid type errors, but that does not prevent developers from casting variables to incompatible types, especially when dealing with pointers to structures (the cast is valid, but the layout of the content is not compatible).

Then there is the notion of memory safety, which is simply not present in these languages. A language that is not memory safe allows developers to access memory with great freedom, exposing the allocated virtual address space to the program.

The following program, not_type_safe.c, gives some insight into the problem. Compile it and run it on your computer. Try to explain the values printed.

int aux = 42; // Integer
int *value = &aux; // Pointer to Integer

// Correct usage
printf("%d\n", *value);

// Reading memory after the variable
printf("%d\n", *(value + 4));

// Reading memory before the variable
printf("%d\n", *(value - 4));

// Cast to variable with different storage
printf("%f\n", *((double*) &value));

// Cast to variable with different size
printf("%llu\n", *((unsigned long long*) &value));

The result is something like the following, which includes the value 42 but also other, less predictable values.

42
32693
1
0.000000
140737456555452

A similar program, not_memory_safe, specifically explores memory safety with dynamic memory. Check its output.

char* buffer = (char*) malloc(10);  // Buffer with 10 bytes
char* str = buffer;                 // Pointer to buffer

free(buffer);                       // Free buffer!


// Write after free (and write beyond buffer)
memcpy(str, "Hello World!!!!", 15);
// Read after free (and read beyond buffer)
printf("%s\n", str);

Different kinds of variables are allocated in different memory areas. This layout is intrinsic to each program and is broadly defined by how the program is compiled. When the program is loaded, the addresses may change, but they will still respect this notion of areas.

The following program mem.c will print the address of several variables that you may find in a program. Some variables are local, some are global, some are static, some are dynamic. Then you also have program arguments and functions. During the creation of a program the programmer will decide how to declare a variable, and this will have some impact on where the variable is placed in memory.

const char cntvar[]="constant";
static char bssvar[4];
...

int main(int argc, char** argv) {
        FILE* fd;
        char line[1024];
        unsigned int mask;
        unsigned int stack = (unsigned int) &argc;
        unsigned int heap = (unsigned int) malloc(sizeof(unsigned int));
        unsigned int bss = (unsigned int) bssvar;
        unsigned int cnst = (unsigned int) cntvar;
        unsigned int text = (unsigned int) &main;
        memset(&mask,0xff,sizeof(mask));
        mask ^= getpagesize() -1;
        printf("Internal Variables (Page = %u)\n", getpagesize());
        printf("&argc  = %08x -> stack = %08x\n", stack, stack & mask);
        printf("malloc = %08x -> heap  = %08x\n", heap, heap & mask);
        printf("bssvar = %08x -> bss   = %08x\n", bss, bss & mask);
        printf("cntvar = %08x -> const = %08x\n", cnst,cnst & mask);
        printf("&main  = %08x -> text  = %08x\n", text,text & mask);
}

You can compile the program with gcc -o mem mem.c.

Task:

  • Compile and run the program
  • Match the addresses printed with the different variable types
  • Change the location of a variable, or create others of the same type, and see how it affects the resulting address.

The program also allocates memory in the program stack, by calling a function recursively until all memory is exhausted.

void foo(int argc, unsigned int mask, unsigned int c, unsigned int m)
{
    char a[4096*0x100];
    unsigned int stack = (unsigned int) &argc;

    printf("foo [%03u]: &argc  = %08x -> stack = %08x\n",c,stack, stack & mask);
    if(c < m)
        foo(argc,mask,c+1, m);
}

Each call allocates a variable named stack holding the address of the argc argument (this could be avoided, and is here for clarity), and then a variable named a of size 4096 * 0x100 bytes. 4096 (or 0x1000) is the standard page size, while 0x100 sets the number of pages, so each call consumes about 1 MiB of stack. The larger this value, the quicker the program exhausts the stack.

A possible result would be:

foo [000]: &argc = bfeb8140 -> stack = bfeb8000
foo [001]: &argc = bfdb8110 -> stack = bfdb8000
foo [002]: &argc = bfcb80e0 -> stack = bfcb8000
foo [003]: &argc = bfbb80b0 -> stack = bfbb8000
foo [004]: &argc = bfab8080 -> stack = bfab8000
foo [005]: &argc = bf9b8050 -> stack = bf9b8000
foo [006]: &argc = bf8b8020 -> stack = bf8b8000
foo [007]: &argc = bf7b7ff0 -> stack = bf7b7000
foo [008]: &argc = bf6b7fc0 -> stack = bf6b7000

You should notice that stack allocation grows from higher addresses to lower addresses. Depending on your system configuration (for example, whether address space layout randomization is enabled), the addresses presented may be constant or may change between runs.

Tasks:

  • Take note of the addresses on your system and of how memory usage evolves.
  • Run it multiple times.

Variable allocation

Program state is considered to be ephemeral and resides in memory areas specifically allocated for this purpose. Each function call allocates a new stack frame holding local variables and, in some calling conventions, the arguments passed to the functions it calls. Although we use variables with specific names when developing an application, once the code is compiled variables are only memory locations. If the language has weak or no memory safety features, access may be totally unconstrained, and writing before the start or beyond the end of a variable may be a problem.

Consider the following program (also available here), which declares two variables, buffer and message. buffer is a char array with 5 bytes, while message is an array initialized to Hello World.

The for loop writes the value A to buffer, but instead of writing only 5 bytes, it writes 15 bytes. The question that arises is: where do these bytes go? The program also prints the variable message before and after the loop, which may help us find out.

To check what happens, save the code to bo.c, compile the program with gcc -o bo bo.c, and execute it with ./bo. What you will see is a basic overflow, but more on this later.

#include <stdio.h>

int main(int argc, char* argv[]){
    char message[] = "Hello World";
    
    char buffer[5];
    int i;

    printf("buffer=%s message=%s\n", buffer, message);
    
    for(i = 0; i < 15; i++) { 
        buffer[i] = 'A';
    }

    printf("buffer=%s message=%s\n", buffer, message);
}

Another available file also prints the values of several variables. It can be used to see how the location of a declaration affects the actual memory allocation.

Task:

  • Compile the program and execute it.
  • What can you conclude about the memory layout of these variables?
  • Instead of filling the buffer with A, fill it with a varying value (e.g. 'A' + i).

Buffer Overflows and TOCTOU

Many operations are not atomic. Specifically, when the interval between the Time Of Check and the Time Of Use is greater than zero, it may be possible to invalidate the check, or to change the value it produced, allowing access to additional resources.

The following example code is a crude demonstration of TOCTOU, which can be controlled through a buffer overflow. Specifically, the message variable can be used to overwrite the allowed variable, essentially bypassing the previous check.

int main() {
        char allowed = 0;
        char password[8];
        char username[8];
        char message[32];

        puts("username:");
        gets(username);
        puts("password:");
        gets(password);
        allowed = strcmp("admin", username) + \
			strcmp("topsecrt", password);

        puts("message:");
        gets(message);	// <-- Issue here

        printf("user=%s pass=%s result=%d\n", username, \
				password, allowed);

        if(allowed == 0)
                printf("Access granted. Message sent!\n");
        else
                printf("Access denied\n");

        return 0;
}

If you use gdb to analyse the memory, you can check the order of the variables and notice that message is stored at a lower address than allowed. Therefore, an overflow of message will write over allowed. The amount of data to write depends on the distance between the variables, which can be calculated. An attacker without access to the binary might need to brute-force it, but we won’t need to.

$ p &allowed
0x7ffffffedf2f

$ p &username
0x7ffffffedf1f

$ p &password
0x7ffffffedf27

$ p &message
0x7ffffffedef0

Tasks:

  • Compile the binary with these flags: gcc -g -O0 -fno-stack-protector -o prog_2 prog_2.c
  • Analyze the execution with different payloads
  • Determine:
      • What is the stack base address?
      • Where is the return information?
      • How many bytes can be entered into the message without overflow?
      • How many bytes can be written without damage?
      • What happens when an overflow is achieved?
      • How can the decision be subverted?

Format String Attacks

The Format String exploit occurs when the submitted data of an input string is evaluated as a command by the application. In this way, the attacker could execute code, read the stack, or cause a segmentation fault in the running application, causing new behaviors that could compromise the security or the stability of the system.

The attack could be executed when the application doesn’t properly validate the submitted input. In this case, if a Format String parameter, like %x, is inserted into the posted data, the string is parsed by the Format Function, and the conversion specified in the parameters is executed. However, the Format Function is expecting more arguments as input, and if these arguments are not supplied, the function could read or write the memory.

Consider the following example, which has this vulnerability. The password is random, but the username is printed directly with printf. In addition, an invalid check is made, which makes it possible to exploit the username variable to leak the password and then access the system.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <sys/random.h>

char *ref_user = "root";
char ref_pass[1024];

void init_pass(char *s, const int len) {
	getrandom(s, len, 0);
	static const char alphanum[] = "0123456789#$!ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    
	for (int i = 0; i < len; ++i) {
		s[i] = alphanum[((unsigned char) s[i]) % (sizeof(alphanum) - 1)];
	}

	s[len] = 0;
}


void check_user_pass(char* ref_user, char* ref_pass) {
	char username[64];
	char password[64];
	
	memset(username, 0, 64);
	memset(password, 0, 64);

	fprintf(stdout, "User: ");
	fgets(username, sizeof(username), stdin);
	printf(username);
    fprintf(stdout,"\n");
	
	if (strncmp(username, ref_user, strlen(ref_user)) != 0) {
		printf("Invalid user\n");
		exit(-1);
	}
	
    printf("Pass: ");
	
	fgets(password, sizeof(password), stdin);
	
	if (strncmp(password, ref_pass, strlen(password) != 0)) {
		printf("Invalid password.\n");
		exit(-2);
	}
}

int main()
{
	setbuf(stdout, NULL);
    setbuf(stdin, NULL);
    setbuf(stderr, NULL);

    printf("Generating Random Pass...");
	init_pass(ref_pass, 16);
	printf("Done\n");
    
	check_user_pass(ref_user, ref_pass);
	printf("Access Granted\n");
}

Tasks:

  • Compile the snippet and run it
  • Find how you can provide some payload besides the username
  • Using %x, %s or %p create a payload to get access to the system