HowToFindStackBugs

Here's our example program that we'll debug.


$ cat memover.c
#include <stdio.h>

/* Deliberately buggy: the formatted string is far longer than buf. */
void a(void) {
  char buf[10];
  sprintf(buf,"$N gives %s a %s", "Dorkus", "big puking shiny sward o duum.");
}

int main(int argc, char** argv) {
  char c[10];
  a();
  return 0;
}

$ gcc -g -o memover memover.c

When we run it, it crashes and dumps core. The bug is obvious from the code: the string that sprintf writes into buf is much longer than the 10 bytes the array can hold.
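
For contrast, here is a minimal sketch of how the same message could be formatted without overrunning the buffer. The helper name format_give and its parameters are made up for illustration and are not part of memover.c; the only library call is the standard snprintf, which is told the buffer size and reports how long the full message would have been.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from memover.c): formats the same message as a(),
 * but never writes more than size bytes into buf.
 * Returns 1 if the whole message fit, 0 if it was truncated. */
int format_give(char *buf, size_t size, const char *who, const char *what)
{
    int needed;

    assert(buf != NULL && size > 0);
    /* snprintf writes at most size bytes, including the terminating NUL,
     * and returns the length the full message would have had. */
    needed = snprintf(buf, size, "$N gives %s a %s", who, what);
    return needed >= 0 && (size_t)needed < size;
}
```

With a 10-byte buffer and the arguments from memover.c, this returns 0 and leaves only the first 9 characters in the buffer. The full message is 48 characters long, so it needs at least 49 bytes.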


$ ./memover
Segmentation fault (core dumped)

$ gdb -c core
Core was generated by `./memover'.
Program terminated with signal 11, Segmentation fault.
#0  0x69622061 in ?? ()

(gdb) file memover
Reading symbols from memover...done.

The problem with memory overruns on the stack is that we can't rely on the frame pointers in our core dump, as they are commonly trashed:


(gdb) bt
#0  0x69622061 in ?? ()
Cannot access memory at address 0x2073756b

(gdb) i locals
No symbol table info available.
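
To see where those two odd addresses come from, here is a rough sketch that replays the overflow into a plain byte array and reads back the words that would sit in the saved EBP and return address slots. The layout is an assumption, not something taken from the compiler output: buf is taken to occupy 12 bytes (10 rounded up to a word boundary), followed directly by the saved EBP and then the return address. The names simulate_overflow and load_le32 are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout of a()'s frame on 32-bit x86, lowest address first:
 * 12 bytes for buf (10 rounded up), then saved EBP, then the return address. */
enum { BUF_SLOT = 12, EBP_OFF = BUF_SLOT, EIP_OFF = BUF_SLOT + 4, FRAME_BYTES = 64 };

/* Read a 32-bit little-endian word, the way an x86 CPU would. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Replays the overflow into a byte array large enough to hold it, then
 * reports what landed in the saved EBP and return address slots. */
void simulate_overflow(uint32_t *saved_ebp, uint32_t *saved_eip)
{
    unsigned char frame[FRAME_BYTES];

    assert(saved_ebp != NULL && saved_eip != NULL);
    memset(frame, 0, sizeof frame);
    snprintf((char *)frame, sizeof frame, "$N gives %s a %s",
             "Dorkus", "big puking shiny sward o duum.");
    *saved_ebp = load_le32(frame + EBP_OFF);
    *saved_eip = load_le32(frame + EIP_OFF);
}
```

Under that assumed layout the two slots come back as 0x2073756b and 0x69622061, the same values gdb complains about in the backtrace. Both the frame pointer and the return address are just pieces of the message text.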

Looks hopeless, doesn't it? The error above is obvious when you're looking at the source, but how do we find it when our core dump is screwed up? Well, first let's find out where our program code is.


(gdb) x _start
0x8048320 <_start>:     0x895eed31
(gdb) x _fini
0x804843c <_fini>:      0x53e58955

Our program code is found within the range 0x8048320-0x804843c.

How did I know to look for those symbol names? You find them by running the objdump utility on your executable.


$ objdump -t memover
memover:     file format elf32-i386

SYMBOL TABLE:
080480f4 l    d  .interp        00000000
08048108 l    d  .note.ABI-tag  00000000
08048128 l    d  .hash  00000000
08048158 l    d  .dynsym        00000000
080481c8 l    d  .dynstr        00000000
08048244 l    d  .gnu.version   00000000
08048254 l    d  .gnu.version_r 00000000
08048274 l    d  .rel.got       00000000
0804827c l    d  .rel.plt       00000000
0804829c l    d  .init  00000000
080482cc l    d  .plt   00000000
08048320 l    d  .text  00000000               ----------- our code is called .text
0804843c l    d  .fini  00000000               --------- end of code here
08048460 l    d  .rodata        00000000
...blah blah removed ....
08048320 g       .text  00000000              _start          --------- here's the symbol
08049598 g     O *ABS*  00000000              __bss_start
080483f4 g     F .text  00000011              main            ---------- our main
080482fc       F *UND*  00000105              __libc_start_main@@GLIBC_2.0
080494b8  w      .data  00000000              data_start
0804843c g     F .fini  00000000              _fini           -------- end of code
08049598 g     O *ABS*  00000000              _edata
080494d8 g     O .got   00000000              _GLOBAL_OFFSET_TABLE_
080495b0 g     O *ABS*  00000000              _end
080483d0 g     F .text  00000023              a                -------- ours too
08048464 g     O .rodata        00000004              _IO_stdin_used
0804830c       F *UND*  00000024              sprintf@@GLIBC_2.0  ------ library calls we called
...more blah blah removed ....

Why do we need to know that? Because we can now look at the registers in our core and pick out a few meaningful ones. Knowing the range our program is loaded at, and where the stack usually lives, gives us a sense of which values are valid.


(gdb) i registers
eax            0x30     48
ecx            0x40071d14       1074208020
edx            0xbffffbf8       -1073742856
ebx            0x4010b1ec       1074835948
esp            0xbffffcdc       -1073742628   ------ stack pointer is probably valid!
                                              ------ This should be a high address on gcc 2.95+ on linux that is...
                                              ------ systems that use alloca like old bsd and cygwin will often
                                              ------ have a low stack address below the program code
ebp            0x2073756b       544437611     ------ ebp the base frame pointer (this is trashed)
                                              ------ we know this because backtrace is fubared
esi            0x4000ae60       1073786464
edi            0xbffffd34       -1073742540
eip            0x69622061       1768038497    ------ eip is where we are executing (this is trashed too - ascii)
                                              ------- unscramble it right to left and it reads "a bi"  hmmm...
eflags         0x10282  66178
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x2b     43
gs             0x2b     43
cwd            0x0      0
swd            0x0      0
twd            0x0      0
fip            0x0      0
fcs            0x0      0
fopo           0x0      0
fos            0x0      0

Let's see what's on the stack that ESP points to. Display the memory at ESP as strings.


(gdb) x/10s 0xbffffcdc
0xbffffcdc:      "g puking shiny sward o duum."  ----------- wow a clue.  grep for it in the code.
0xbffffcf9:      "\375\377\277h8\001@\001"
0xbffffd02:      ""
0xbffffd03:      ""
0xbffffd04:      " \203\004\b"
0xbffffd09:      ""
0xbffffd0a:      ""
0xbffffd0b:      ""
0xbffffd0c:      "A\203\004\b(\203\004\b\001"
0xbffffd16:      ""
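
Once grep has led you to the sprintf call, the fragment also tells you roughly where the buffer began. This sketch uses an invented helper, bytes_before_fragment, to rebuild the message and count how many bytes come before the text found at ESP. It assumes the format string and arguments are exactly those in memover.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rebuild the message a() formats and report how many
 * bytes of it precede the fragment found at ESP. Returns -1 if the fragment
 * does not occur in the message. */
long bytes_before_fragment(const char *fragment)
{
    char message[128];
    const char *hit;

    assert(fragment != NULL);
    snprintf(message, sizeof message, "$N gives %s a %s",
             "Dorkus", "big puking shiny sward o duum.");
    hit = strstr(message, fragment);
    return hit ? (long)(hit - message) : -1;
}
```

For "g puking shiny sward o duum." the answer is 20. If ESP sits just past the popped return address, which is an assumption about this particular crash, the write began about 20 bytes below ESP, near 0xbffffcc8. That is enough room for a 12-byte buffer slot, the saved EBP and the return address.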

Okay, assuming we still didn't have a clue, let's display the stack as words:


(gdb) x /20w 0xbffffcdc
0xbffffcdc:     0x75702067      0x676e696b      0x69687320      0x7320796e
0xbffffcec:     0x64726177      0x64206f20      0x2e6d7575      0xbffffd00* --- looks good here (PUSH of ESP?)
0xbffffcfc:     0x40013868      0x00000001      0x08048320**    0x00000000
0xbffffd0c:     0x08048341      0x080483f4      0x00000001      0xbffffd34
0xbffffd1c:     0x0804829c      0x0804843c      0x4000ae60      0xbffffd2c

As we go up the stack we're hoping to find some data that looks valid. Bingo: 0x08048320 falls within our code range, so it looks like the last valid EIP of our program that is on the stack.
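
Picking candidates out of a word dump by eye gets tedious on a bigger stack. Here is a minimal sketch of the same filter in code, using the code range found earlier with x _start and x _fini. The helper name find_code_words is made up, and treating _fini as one past the end of the range is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code range taken from the x _start / x _fini output. */
#define CODE_START 0x08048320u
#define CODE_END   0x0804843cu   /* _fini: treated as one past the end */

/* Hypothetical helper: record the index of every stack word that falls in
 * [CODE_START, CODE_END). Stores at most max indices in hits and returns
 * the total number of matches found. */
size_t find_code_words(const uint32_t *words, size_t n, size_t *hits, size_t max)
{
    size_t i, found = 0;

    assert(words != NULL && hits != NULL);
    for (i = 0; i < n; i++) {
        if (words[i] >= CODE_START && words[i] < CODE_END) {
            if (found < max)
                hits[found] = i;
            found++;
        }
    }
    return found;
}
```

Run over the 20 words dumped above, it flags indices 10, 12 and 13. Those are 0x08048320, 0x08048341 and 0x080483f4, and the last one is the address of main in the objdump listing. The stack address of a hit is 0xbffffcdc plus four times its index.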


(gdb) disass 0x08048320
Dump of assembler code for function _start:
0x8048320 <_start>:     xor    %ebp,%ebp
0x8048322 <_start+2>:   pop    %esi
0x8048323 <_start+3>:   mov    %esp,%ecx
0x8048325 <_start+5>:   and    $0xfffffff8,%esp
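
gdb labels each address as a symbol plus an offset, and the same lookup can be done by hand from the objdump -t listing. The sketch below uses only the four code symbols visible in that listing. The real table is longer because the listing was abridged, so treat the result as a hint only. The struct and the helper nearest_symbol are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct symbol {
    uint32_t addr;
    const char *name;
};

/* A few .text/.fini symbols copied from the objdump -t listing.
 * The listing was abridged, so this table is incomplete. */
static const struct symbol symbols[] = {
    { 0x08048320u, "_start" },
    { 0x080483d0u, "a" },
    { 0x080483f4u, "main" },
    { 0x0804843cu, "_fini" },
};

/* Hypothetical helper: return the symbol with the highest address that is
 * still <= addr and store the distance from it in *offset.
 * Returns NULL if addr lies below every symbol in the table. */
const char *nearest_symbol(uint32_t addr, uint32_t *offset)
{
    const char *best = NULL;
    uint32_t best_addr = 0;
    size_t i;

    assert(offset != NULL);
    for (i = 0; i < sizeof symbols / sizeof symbols[0]; i++) {
        if (symbols[i].addr <= addr && symbols[i].addr >= best_addr) {
            best = symbols[i].name;
            best_addr = symbols[i].addr;
        }
    }
    if (best != NULL)
        *offset = addr - best_addr;
    return best;
}
```

With this table, 0x08048341 from the stack dump resolves to _start+33 and 0x080483f4 to main+0. The helper does no size check, so any address past _fini is still reported as _fini plus some offset.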

It's a shame, as that happens to be _start, which doesn't narrow things down. In a much bigger program with many more functions nested a mite deeper, this might have been helpful. Merc muds don't nest too deeply anyway, so depending on how bad your overflow was, it might not help there either.

Learn to use gdb, explore, play around, and learn how the real code works under the covers of the high-level language. I'm certain there are quicker approaches to solving this problem.


Edited November 22, 2004 5:28 am by JonLambert
All material on this Wiki is the property of the contributing authors.
©2004-2006 by the contributing authors.