CSCE 465 Lecture 5

Tuesday, January 29, 2013

Lecture Slides

Buffer Overflow

Program Security:

BO Basics
"Stack smashing"
Other vulnerabilities
BO Defense

Basics

What is a buffer overflow?

occurs when a program writes data outside the bounds of allocated memory.
can be exploited to overwrite values in memory to the advantage of the attacker.
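C does not check array bounds on writes, which is what makes this class of bug possible. As a minimal sketch (the helper name `checked_store` is hypothetical, not something from the lecture), a write can be guarded like this:

```c
#include <stddef.h>

/* Hypothetical helper: store `value` at buf[index] only if the index lies
 * inside the buffer. Returns 0 on success, or -1 if the write would land
 * outside the allocated memory (the buffer-overflow case). */
int checked_store(char *buf, size_t capacity, size_t index, char value) {
    if (index >= capacity) {
        return -1;          /* out of bounds: refuse instead of overflowing */
    }
    buf[index] = value;
    return 0;
}
```

With a 4-byte buffer, index 3 is accepted and index 4 is rejected; an unchecked `buf[4] = value` would silently write into whatever happens to sit next to the buffer.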

Impact

First widely seen in the Morris Worm (1988; about 6,000 machines affected), one of the first Internet worms
Still one of the most common sources of security vulnerabilities
SANS institute reports that 14/20 top vulnerabilities in 2006 were buffer overflow related
Also behind some of the most devastating worms and viruses in recent history:
- Zotob
- Sasser
- CodeRed
- Blaster
- SQL Slammer
- Conficker
- Stuxnet

goal: subvert the function of a privileged program so that the attacker can take control of that program and, if the program is sufficiently privileged, of the host as well
requirements: ensure malicious code is present in the program's address space (by injecting it or by using what's already there); transfer execution to that code.

Code Injection

Provide a string as input to the program, which the program stores in a buffer.
The string contains native CPU instructions for the platform being attacked.
Works with buffers stored anywhere

Existing Code

Code of interest already in part of program
Attacker just needs to set up the desired arguments and then jump to it
e.g. to acquire a shell when code already in some library contains a call to exec(char *arg), the attacker needs to pass a pointer to the string "/bin/sh" and jump to the exec call.

Jumping to attacker code:

activation records: each one stores the return address of a function; the attacker modifies this pointer to point to their code (called stack smashing)
function pointers: similar idea to modifying activation records, but the attacker overwrites an arbitrary function pointer instead.
longjmp buffers: the attacker corrupts a `setjmp`/`longjmp` buffer so that a later `longjmp` transfers control to the malicious code.
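The function-pointer case can be illustrated with a harmless model. This is only a sketch: the struct, the handler names, and the clamped `memcpy` are assumptions made for the demo, chosen so that the code stays well-defined C while still showing how bytes written past a buffer end up in an adjacent function pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a small buffer sitting right before a function pointer. */
struct record {
    char name[8];
    int (*handler)(void);
};

static int normal_handler(void)   { return 0; }  /* intended target */
static int attacker_handler(void) { return 1; }  /* stands in for attacker code */

/* Copies `len` input bytes over the record, starting at `name`.
 * The copy is clamped to the size of the whole record so the demo stays
 * well-defined; an unchecked strcpy into `name` would have no clamp at all. */
int run_with_input(const unsigned char *input, size_t len) {
    struct record rec;
    memset(&rec, 0, sizeof rec);
    rec.handler = normal_handler;
    if (len > sizeof rec) {
        len = sizeof rec;
    }
    memcpy(&rec, input, len);
    return rec.handler();          /* calls whatever the pointer now holds */
}

/* Builds an input that fills `name` and then overwrites `handler`. */
int demo_hijack(void) {
    unsigned char payload[sizeof(struct record)];
    int (*target)(void) = attacker_handler;
    memset(payload, 'A', sizeof payload);
    memcpy(payload + offsetof(struct record, handler), &target, sizeof target);
    return run_with_input(payload, sizeof payload);
}
```

Input that fits inside `name` leaves `normal_handler` in place; the oversized input built by `demo_hijack` redirects the call to `attacker_handler`.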

Vocabulary

buffer: data storage area inside computer memory (stack, heap, etc.)
- intended to hold a predefined amount of data; if more is stuffed into it, the excess spills into adjacent memory
- if executable code is supplied as "data", the victim's computer may be fooled into executing it (we'll see how)
- that code can then self-propagate or give the attacker control over the machine
first generation exploits: stack smashing
second generation: heaps, function pointers, off-by-one
third generation: format strings and heap management structures.

Stack Smashing

Additional information available in class handout ^[1]

Process memory is divided into three sections: text, data, and execution stack

Top of memory `0xFFFFFFFF`
Environment/Argument section	environment data, command-line data
Stack section	implements procedure abstraction
Data section (`.data` or `.bss`)	initialized and uninitialized data; static variables; global variables
Text/code section (`.text`)	instructions and read-only data; marked read-only, so modifications cause segmentation faults
Bottom of memory `0x00000000`

What happens when memory outside the buffer is accessed?

Stack Frame

Usually grows toward lower memory addresses
Composed of frames
Stack pointer points to the top of the stack (usually the last valid address)

Parameters
Return address
Saved frame pointer (caller's `%ebp`; the new frame pointer is then set from `%esp`)
Local variables
Current stack pointer (SP) `%esp`
↓ stack growth ↓

Suppose a web server contains this function

#include <string.h>

void func(char *str) {
    char buf[126];      // allocate local buffer (126 bytes reserved on stack)
    strcpy(buf, str);   // copy argument into local buffer
                        // DOES NOT CHECK WHETHER str FITS: anything longer than
                        // 125 chars (plus the terminating NUL) overflows buf
}

When this function is invoked, a new frame holding its local variables is pushed onto the stack.

before `strcpy`	after `strcpy`
top of stack	top of stack
frame of calling function	frame of calling function
arguments (str)	arguments (str)
return address	overflow (uh oh! this will be interpreted as the return address)
previous frame pointer	overflow
buf	buf
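The before/after picture can be reproduced with a simplified model of the frame. This is a sketch under stated assumptions: `struct frame_model`, the made-up address `ORIGINAL_RET`, and the clamped `memcpy` are illustrative only, and real frames are laid out by the compiler rather than by a struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of func's frame, fields in ascending address order. */
struct frame_model {
    char     buf[126];     /* local buffer (lowest address) */
    uint32_t saved_fp;     /* previous frame pointer */
    uint32_t ret_addr;     /* return address (highest address) */
};

#define ORIGINAL_RET 0x08048abcu  /* made-up address in the caller */

/* Copies `len` input bytes into the model starting at buf, the way an
 * unchecked copy would, and reports the return address afterwards.
 * The length is clamped to the model's size to keep the demo well-defined. */
uint32_t ret_after_copy(const unsigned char *input, size_t len) {
    struct frame_model f;
    memset(&f, 0, sizeof f);
    f.ret_addr = ORIGINAL_RET;
    if (len > sizeof f) {
        len = sizeof f;
    }
    memcpy(&f, input, len);
    return f.ret_addr;
}
```

Copying 126 bytes leaves the modeled return address untouched. Copying a longer run of 'A' characters replaces it with `0x41414141`, the byte pattern of the input itself.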

Suppose the buffer is filled with an attacker-created string: for example, str points to a string received from the network as input to some network service daemon.
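Because `str` may be attacker-controlled, one possible fix is to check its length before copying. The following is a minimal sketch of a bounds-checked variant; the name `func_checked`, the `BUF_SIZE` macro, and the error-return convention are assumptions, not part of the lecture.

```c
#include <string.h>

#define BUF_SIZE 126

/* Bounds-checked variant of func: returns 0 on success, or -1 if str
 * (plus its terminating NUL) would not fit in the local buffer. */
int func_checked(const char *str) {
    char buf[BUF_SIZE];
    size_t len = strlen(str);
    if (len >= sizeof buf) {
        return -1;                 /* reject oversized input */
    }
    memcpy(buf, str, len + 1);     /* copy including the NUL terminator */
    return 0;
}
```

A 125-character string is accepted (125 characters plus the NUL fill the buffer exactly), while anything longer is rejected instead of overwriting the saved frame pointer and return address.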

Shell code to execute:

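#include <stdlib.h>
#include <unistd.h>

int main(void) {
    char *name[2];
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);   // replaces this process with a shell
    exit(0);                       // reached only if execve fails
}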

Attack procedure

Compile the attack code
Extract the binary for the piece that actually does the work (the shell code)
Insert the compiled code into the buffer
Figure out where the overflow code should jump
Place that address in the buffer at the proper location so that the normal return address gets overwritten

The overflowing portion of the buffer must contain the correct address of the attack code in the return-address position. How do you get that?

If the address is wrong, the application will most likely crash with a segmentation violation.
The attacker must correctly "guess" where the code is.
This is not very difficult to do.

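The current stack address of a program can be read with a function such as this one (32-bit x86, GCC inline assembly):

unsigned long get_sp(void) {
    unsigned long sp;
    __asm__ volatile("movl %%esp, %0" : "=r"(sp));   // copy %esp into sp
    return sp;
}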

Guess the offset of the buffer with respect to the stack pointer.
Use a series of NOPs at the beginning of the overflowing buffer so that the jump does not need to be exactly precise (called a NOP sled)
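The benefit of the NOP sled can be put in numbers. As a sketch (the function name `guess_hits` and the example addresses are hypothetical), the check below models which guessed return addresses still reach the code when the buffer starts with `sled_len` NOP bytes followed immediately by the code:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if a guessed return address falls inside the NOP sled or
 * exactly on the first byte of the code that follows it, i.e. anywhere in
 * [buf_addr, buf_addr + sled_len]. Execution that lands on a NOP simply
 * slides forward until it reaches the code. */
bool guess_hits(uint32_t buf_addr, size_t sled_len, uint32_t guess) {
    return guess >= buf_addr && guess - buf_addr <= sled_len;
}
```

Without a sled (`sled_len` of 0) only the exact address works; with a 64-byte sled, 65 different guesses succeed, which is why the offset only needs to be approximately right.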

Footnotes

[1] Stack Smashing Handout