History of HowToFindStackBugs



Revision 7 . . (edit) September 13, 2005 9:32 pm by JonLambert [Removed extra lines... have done this before what's the deal with this.]
Revision 3 . . November 22, 2004 5:28 am by JonLambert
  

Here's our example program that we'll debug.


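$ cat memover.c
#include <stdio.h>

/* Deliberate bug: the formatted string is far longer than buf. */
void a(void) {
    char buf[10];
    sprintf(buf, "$N gives %s a %s", "Dorkus", "big puking shiny sward o duum.");
}

int main(int argc, char** argv) {
    char c[10];   /* unused; just takes up room in main's frame */
    a();
    return 0;
}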

$ gcc -g -o memover memover.c


When we run it, it crashes and dumps core. The bug is obvious from the code: the string that sprintf formats into buf is much longer than the 10 bytes the array can hold.
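For comparison, here is a minimal sketch of a bounded version of the same formatting call. The helper names format_gift and was_truncated are made up for this illustration and are not part of memover.c. It relies only on the standard snprintf behaviour of returning the length the full message would need.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats the same message as a(), but never writes
 * more than `size` bytes into dst. Returns the length the complete
 * message needs (not counting the terminating NUL), as snprintf does. */
int format_gift(char *dst, size_t size, const char *who, const char *what)
{
    return snprintf(dst, size, "$N gives %s a %s", who, what);
}

/* Returns 1 if the message did not fit in `size` bytes (or formatting
 * failed), 0 if it fit whole. */
int was_truncated(int needed, size_t size)
{
    return needed < 0 || (size_t)needed >= size;
}
```

With the arguments from memover.c the full message needs 48 characters plus a terminating NUL, so a 10-byte buffer is overrun by 39 bytes when plain sprintf is used.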


$ ./memover
Segmentation fault (core dumped)

$ gdb -c core
Core was generated by `./memover'.
Program terminated with signal 11, Segmentation fault.
#0 0x69622061 in ?? ()

(gdb) file memover
Reading symbols from memover...done.


The problem with memory overruns on the stack is that we can't rely on the frame pointers in our core dump as they are commonly trashed:


(gdb) bt
#0 0x69622061 in ?? ()
Cannot access memory at address 0x2073756b

(gdb) i locals
No symbol table info available.


Looks hopeless, doesn't it? gdb has lost its way. The error above is obvious when you're looking right at it, but how do we find the offending source code in a large program when the core dump is this messed up? First, let's find out where our program code is.


(gdb) x _start
0x8048320 <_start>: 0x895eed31
(gdb) x _fini
0x804843c <_fini>: 0x53e58955


Our program code lies within the range 0x8048320-0x804843c.

How did I know to look for those symbol names?
You can find them by running the objdump utility on your executable.


$ objdump -t memover
memover: file format elf32-i386

SYMBOL TABLE:
080480f4 l d .interp 00000000
08048108 l d .note.ABI-tag 00000000
08048128 l d .hash 00000000
08048158 l d .dynsym 00000000
080481c8 l d .dynstr 00000000
08048244 l d .gnu.version 00000000
08048254 l d .gnu.version_r 00000000
08048274 l d .rel.got 00000000
0804827c l d .rel.plt 00000000
0804829c l d .init 00000000
080482cc l d .plt 00000000
08048320 l d .text 00000000 ----------- our code is called .text
0804843c l d .fini 00000000 --------- end of code here
08048460 l d .rodata 00000000
...blah blah removed ....
08048320 g .text 00000000 _start --------- here's the symbol
08049598 g O *ABS* 00000000 __bss_start
080483f4 g F .text 00000011 main ---------- our main
080482fc F *UND* 00000105 __libc_start_main@@GLIBC_2.0
080494b8 w .data 00000000 data_start
0804843c g F .fini 00000000 _fini -------- end of code
08049598 g O *ABS* 00000000 _edata
080494d8 g O .got 00000000 _GLOBAL_OFFSET_TABLE_
080495b0 g O *ABS* 00000000 _end
080483d0 g F .text 00000023 a -------- ours too
08048464 g O .rodata 00000004 _IO_stdin_used
0804830c F *UND* 00000024 sprintf@@GLIBC_2.0 ------ library calls we called
...more blah blah removed ....



Why do we need to know that? Because we can now look at the registers in our core and judge which of the meaningful ones are still valid, knowing the range our program is loaded at and where the stack usually lives.


(gdb) i registers
eax 0x30 48
ecx 0x40071d14 1074208020
edx 0xbffffbf8 -1073742856
ebx 0x4010b1ec 1074835948
esp 0xbffffcdc -1073742628 ------ stack pointer is probably valid!
------ This should be a high address on gcc 2.95+ on linux that is...
------ systems that use alloca like old bsd and cygwin will often
------ have a low stack address below the program code
ebp 0x2073756b 544437611 ------ ebp the base frame pointer (this is trashed)
------ we know this because backtrace is fubared
------ and because it should be pointing at the stack higher than ESP
------- (greater than 0xbffffcdc or thereabouts)
esi 0x4000ae60 1073786464
edi 0xbffffd34 -1073742540
eip 0x69622061 1768038497 ------ eip is where we are executing (this is trashed too - ascii)
------- unscramble it right to left and it reads "a bi" hmmm...
eflags 0x10282 66178
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x2b 43
gs 0x2b 43
cwd 0x0 0
swd 0x0 0
twd 0x0 0
fip 0x0 0
fcs 0x0 0
fopo 0x0 0
fos 0x0 0
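The "unscramble it" trick on EIP can be sketched in C. This is only an illustration, assuming a little-endian machine like the i386 box used here. The helper names word_to_ascii and looks_like_text are invented for the sketch.

```c
#include <stdint.h>

/* Hypothetical helper: unpack a 32-bit register value into the four bytes
 * it occupied in memory on a little-endian machine (low byte first).
 * Bytes outside printable ASCII become '.'; out must hold 5 chars. */
void word_to_ascii(uint32_t word, char out[5])
{
    for (int i = 0; i < 4; i++) {
        unsigned char byte = (unsigned char)((word >> (8 * i)) & 0xff);
        out[i] = (byte >= 0x20 && byte <= 0x7e) ? (char)byte : '.';
    }
    out[4] = '\0';
}

/* Returns 1 if all four bytes are printable ASCII, a strong hint that the
 * "pointer" is really a fragment of an overflowing string. */
int looks_like_text(uint32_t word)
{
    for (int i = 0; i < 4; i++) {
        unsigned char byte = (unsigned char)((word >> (8 * i)) & 0xff);
        if (byte < 0x20 || byte > 0x7e)
            return 0;
    }
    return 1;
}
```

Fed the values from the dump, EIP 0x69622061 decodes to "a bi" and EBP 0x2073756b to "kus ", both pieces of the message that sprintf wrote. ESP 0xbffffcdc does not decode as text, which fits it being a real stack address.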


Let's see what's on the stack where ESP points. Display the memory at ESP as strings.


(gdb) x/10s 0xbffffcdc
0xbffffcdc: "g puking shiny sward o duum." ----------- wow a clue. grep for it in the code.
0xbffffcf9: "\375\377\277h8\001@\001"
0xbffffd02: ""
0xbffffd03: ""
0xbffffd04: " \203\004\b"
0xbffffd09: ""
0xbffffd0a: ""
0xbffffd0b: ""
0xbffffd0c: "A\203\004\b\364\203\004\b\001"
0xbffffd16: ""
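If you have the raw stack bytes in a buffer rather than a live gdb session, the same hunt for readable text can be sketched in C. The function names printable_run and find_text are hypothetical, and the approach is a rough imitation of what the strings utility does, not what gdb does internally.

```c
#include <stddef.h>

/* Length of the printable-ASCII run starting at buf[start], stopping at
 * the first non-printable byte or at the end of the buffer. */
size_t printable_run(const unsigned char *buf, size_t len, size_t start)
{
    size_t n = 0;
    while (start + n < len &&
           buf[start + n] >= 0x20 && buf[start + n] <= 0x7e)
        n++;
    return n;
}

/* Offset of the first run of at least min_len printable bytes,
 * or len if there is no such run. */
size_t find_text(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t i = 0;
    while (i < len) {
        size_t n = printable_run(buf, len, i);
        if (n >= min_len)
            return i;
        i += n + 1;   /* skip the short run and the byte that ended it */
    }
    return len;
}
```

A long printable run sitting where saved registers and return addresses should be is the clue to grep for in the source.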


Okay, assuming we still didn't have a clue, let's display the stack in words.


(gdb) x /20w 0xbffffcdc
0xbffffcdc: 0x75702067 0x676e696b 0x69687320 0x7320796e
0xbffffcec: 0x64726177 0x64206f20 0x2e6d7575 0xbffffd00* --- looks good here (PUSH of prior EBP?)
0xbffffcfc: 0x40013868 0x00000001 0x08048320** 0x00000000
0xbffffd0c: 0x08048341 0x080483f4 0x00000001 0xbffffd34
0xbffffd1c: 0x0804829c 0x0804843c 0x4000ae60 0xbffffd2c


As we go up the stack, we're hoping to find some data that looks valid.
Bingo: 0x08048320 falls inside our code range, so it looks like the last valid EIP of our program that is on the stack.
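Picking code addresses out of a stack dump by eye gets tedious on a bigger stack, so here is a rough C sketch of the same filter. The range constants come from the _start and _fini addresses found earlier. Treating the range as half-open is an assumption of this sketch, and the names in_text and find_code_words are invented for it.

```c
#include <stdint.h>
#include <stddef.h>

/* Code range from the walkthrough: _start up to (not including) _fini. */
#define TEXT_START 0x08048320u
#define TEXT_END   0x0804843cu

/* Returns 1 if value could be an address inside our program's code. */
int in_text(uint32_t value)
{
    return value >= TEXT_START && value < TEXT_END;
}

/* Stores the index of every word that falls in the code range into hits[]
 * (at most max_hits of them) and returns how many were found in total. */
size_t find_code_words(const uint32_t *words, size_t count,
                       size_t *hits, size_t max_hits)
{
    size_t found = 0;
    for (size_t i = 0; i < count; i++) {
        if (in_text(words[i])) {
            if (found < max_hits)
                hits[found] = i;
            found++;
        }
    }
    return found;
}
```

Run over the 20 words shown above, this flags three candidates: 0x08048320 (_start), 0x08048341 (0x21 bytes past _start) and 0x080483f4 (main, per the symbol table). The word at index i sits at 0xbffffcdc + 4*i, so the first hit is at 0xbffffd04.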


(gdb) disass 0x08048320
Dump of assembler code for function _start:
0x8048320 <_start>: xor %ebp,%ebp
0x8048322 <_start+2>: pop %esi
0x8048323 <_start+3>: mov %esp,%ecx
0x8048325 <_start+5>: and $0xfffffff8,%esp


It's a shame that it happens to be _start, which doesn't narrow things down.
Had we had a much bigger program with many more functions nested a mite deeper,
that might have been helpful. Merc muds don't nest too deeply anyway, so
depending on how bad your overflow was, that might not help.

Learn to use gdb, explore, play around, and learn how the real code works under the covers
of the high-level language. I'm certain there are quicker approaches to solving this
problem.

All material on this Wiki is the property of the contributing authors.
©2004-2006 by the contributing authors.