Thursday, October 6, 2011

gdb tutorial


1. How do I use gdb?

When you compile your program, you must tell the compiler to produce a program that is compatible with the debugger. The debugger needs special information to run properly. To do this, you must compile your program with the debugger flag, -g. This step is critical. Without it, the debugger won't have the program symbol information. That means it won't know what your functions and variables are called, and it won't understand when you ask it about them.

1.1 How do I compile with debugging symbols?

Pass the -g flag to your compiler:
 
[sgupta@rhel6x64 socket]$ gcc -g program.c -o programname
 
NOTE: If you have a larger program with several files, each must be compiled with the -g flag, and it must also be set when you link.

1.2 How do I run programs with the debugger?

First start the debugger with your program name as the first argument.
 
[sgupta@rhel6x64 socket]$ gdb programname
 
Next use the run command in gdb to start execution. Pass your arguments to this command.
 
(gdb) run arg1 "arg2" ...

1.3 How do I restart a program running in the debugger?

Use the kill command in gdb to stop execution. Then you can use the run command as shown above to start it again.
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) run ...

1.4 How do I exit the debugger?

Use the quit command.
(gdb) quit
NOTE: You may be asked if you want to kill the program. Answer yes.
(gdb) quit
The program is running. Exit anyway? (y or n) y
[sgupta@rhel6x64 socket]$

1.5 How do I get help on debugger commands?

Use the help command. Gdb has a description for every command it understands, and there are many, many more than this tutorial covers. The argument to help is the command you want information about. If you just type "help" with no arguments, you will get a list of help topics similar to the following:

(gdb) help
List of classes of commands:

aliases -- Aliases of other commands
breakpoints -- Making program stop at certain points
data -- Examining data
files -- Specifying and examining files
internals -- Maintenance commands
obscure -- Obscure features
running -- Running the program
stack -- Examining the stack
status -- Status inquiries
support -- Support facilities
tracepoints -- Tracing of program execution without stopping the program
user-defined -- User-defined commands

Type "help" followed by a class name for a list of commands in that class.
Type "help" followed by command name for full documentation.
Command name abbreviations are allowed if unambiguous.


2. How do I watch the execution of my program?

Gdb functions somewhat like an interpreter for your programs. You can stop your program at any time by sending it signals. Normally this is done using key combinations like Ctrl-C for the interrupt signal SIGINT. Outside of gdb this would terminate your program. Gdb traps this signal and stops executing your program. Also, using breakpoints you can have your program stop executing at any line of code or function call. Once your program is stopped, you can examine 'where' it is in your code. You can look at the variables currently in scope, as well as your memory space and the cpu registers. You can also change variables and memory to see what effect it has on your code.

2.1 How do I stop execution?

You can stop execution by sending your program UNIX signals like SIGINT. This is done using the Ctrl-C key combination. In the following example, I pressed Ctrl-C after 'Starting Program...' appeared.

(gdb) run
 
Starting Program: /home/ug/ryansc/a.out

Program received signal SIGINT, Interrupt.
0x80483b4 in main(argc=1, argv=0xbffffda4) at loop.c:5
5   while(1){
...
(gdb)

2.2 How do I continue execution?

Use the continue command to restart execution of your program whenever it is stopped.

2.3 How do I see where my program stopped?

Use the list command to have gdb print out the lines of code above and below the line the program is stopped at. In the example below, the breakpoint is on line 8.
 
(gdb) list
3       int main(int argc, char **argv)
4       {
5         int x = 30;
6         int y = 10;
7    
8         x = y;
9    
10        return 0;
11      }


2.4 How do I step through my code line-by-line?

First stop your program by sending it signals or using breakpoints. Then use the next and step commands.

5   while(1){
(gdb) next
7   }
(gdb)
 
NOTE: the next and step commands are different. On a line of code that has a function call, next will go 'over' the function call to the next line of code, while step will go 'into' the function call.
The next command:

(gdb)
11     fun1();
(gdb) next
12 }
The step command:
(gdb)
11     fun1();
(gdb) step
fun1 () at loop.c:5
5    return 0;
(gdb)


2.5 How do I examine variables?

Use the print command with a variable name as the argument. For example, if you have int x and char *s:

(gdb) print x
$1 = 900
(gdb) print s
$3 = 0x8048470 "Hello World!\n"
(gdb)
 
NOTE: The output from the print command is always formatted $## = (value). The $## is a counter that numbers the entries in gdb's value history, so each value you print gets the next number.

2.6 How do I modify variables?

Use the set command with a C assignment statement as the argument. For example, to change int x to have the value 3:
 
(gdb) set x = 3
(gdb) print x
$4 = 3
 
NOTE: in newer versions of gdb, it may be necessary to use the command 'set var', as in 'set var x = 3'

2.7 How do I call functions linked into my program?

From the debugger command line you can use the call command to call any function linked into the program. This includes your own code as well as standard library functions. For example, if you wish to have your program dump core:

(gdb) call abort()

2.8 How do I return from a function?

Use the finish command to have a function finish executing and return to its caller. This command also shows you what value the function returned.

(gdb) finish
Run till exit from #0  fun1 () at test.c:5
main (argc=1, argv=0xbffffaf4) at test.c:17
17        return 0;
Value returned is $1 = 1

3. How do I use the call stack?

The call stack is where we find the stack frames that control program flow. When a function is called, it creates a stack frame that tells the computer how to return control to its caller after it has finished executing. Stack frames are also where local variables and function arguments are 'stored'. We can look at these stack frames to determine how our program is running. Finding the list of stack frames below the current frame is called a backtrace.

3.1 How do I get a backtrace?

Use the gdb command backtrace. In the backtrace below, we can see that we are currently inside func2(), which was called by func1(), which was called from main().

(gdb) backtrace
#0  func2 (x=30) at test.c:5
#1  0x80483e6 in func1 (a=30) at test.c:10
#2  0x8048414 in main (argc=1, argv=0xbffffaf4) at test.c:19
#3  0x40037f5c in __libc_start_main () from /lib/libc.so.6
(gdb)


3.2 How do I change stack frames?

Use the gdb command frame. Notice in the backtrace above that each frame has a number beside it. Pass the number of the frame you want as an argument to the command.
 
(gdb) frame 2
#2  0x8048414 in main (argc=1, argv=0xbffffaf4) at test.c:19
19        x = func1(x);
(gdb)


3.3 How do I examine stack frames?

To look at the contents of the current frame, there are 3 useful gdb commands. info frame displays information about the current stack frame. info locals displays the list of local variables and their values for the current stack frame, and info args displays the list of arguments.

(gdb) info frame
Stack level 2, frame at 0xbffffa8c:
eip = 0x8048414 in main (test.c:19); saved eip 0x40037f5c
called by frame at 0xbffffac8, caller of frame at 0xbffffa5c
source language c.
Arglist at 0xbffffa8c, args: argc=1, argv=0xbffffaf4
Locals at 0xbffffa8c, Previous frame's sp is 0x0
Saved registers:
 ebp at 0xbffffa8c, eip at 0xbffffa90
(gdb) info locals
x = 30
s = 0x8048484 "Hello World!\n"
(gdb) info args
argc = 1
argv = (char **) 0xbffffaf4

4. How do I use breakpoints?

Breakpoints are a way of telling gdb that you want it to stop your program at certain lines of code. You can also have it stop when your program makes specific function calls. Once the program is stopped, you can poke around in memory and see what the values of all your variables are, examine the stack, and step through your program's execution.

4.1 How do I set a breakpoint on a line?

The command to set a breakpoint is break. If you only have one source file, you can set a breakpoint like so:

(gdb) break 19
Breakpoint 1 at 0x80483f8: file test.c, line 19
If you have more than one file, you must give the break command a filename as well:
(gdb) break test.c:19
Breakpoint 2 at 0x80483f8: file test.c, line 19  

4.2 How do I set a breakpoint on a C function?
To set a breakpoint on a C function, pass its name to break.
 
(gdb) break func1
Breakpoint 3 at 0x80483ca: file test.c, line 10    


4.3 How do I set a breakpoint on a C++ function?

Setting a breakpoint on a C++ function is similar to setting a breakpoint on a C function. However, C++ allows functions to be overloaded, so you must tell break which version of the function you want to break on (even if there is only one). To do this, you give it the list of argument types.
 
(gdb) break TestClass::testFunc(int)
Breakpoint 1 at 0x80485b2: file cpptest.cpp, line 16.


4.4 How do I set a temporary breakpoint?

Use the tbreak command instead of break. A temporary breakpoint only stops the program once, and is then removed.

4.5 How do I get a list of breakpoints?

Use the info breakpoints command.
 
(gdb) info breakpoints
Num Type           Disp Enb Address    What
2   breakpoint     keep y   0x080483c3 in func2 at test.c:5
3   breakpoint     keep y   0x080483da in func1 at test.c:10

4.6 How do I disable breakpoints?


Use the disable command. Pass the number of the breakpoint you wish to disable as an argument to this command. You can find the breakpoint number in the list of breakpoints, as shown above. In the example below we can see that breakpoint number 2 has been disabled (there is an 'n' under the Enb column).

(gdb) disable 2
(gdb) info breakpoints
Num Type           Disp Enb Address    What
2   breakpoint     keep n   0x080483c3 in func2 at test.c:5
3   breakpoint     keep y   0x080483da in func1 at test.c:10

4.7 How do I skip breakpoints?

To skip a breakpoint a certain number of times, we use the ignore command. The ignore command takes two arguments: the breakpoint number to skip, and the number of times to skip it. 

(gdb) ignore 2 5
Will ignore next 5 crossings of breakpoint 2.

5. How do I use watchpoints?

Watchpoints are similar to breakpoints. However, watchpoints are not set for functions or lines of code. Watchpoints are set on variables. When those variables are read or written, the watchpoint is triggered and program execution stops.
It is difficult to understand watchpoint commands by themselves, so the following simple example program will be used in the command usage examples.

#include <stdio.h>

int main(int argc, char **argv)
{
   int x = 30;
   int y = 10;

   x = y;

   return 0;
}

5.1 How do I set a write watchpoint for a variable?

Use the watch command. The argument to the watch command is an expression that is evaluated. This implies that the variable you want to set a watchpoint on must be in the current scope. So, to set a watchpoint on a non-global variable, you must have set a breakpoint that will stop your program when the variable is in scope. You set the watchpoint after the program breaks.
 
NOTE: You may notice in the example below that the line of code printed doesn't match with the line that changes the variable x. This is because the store instruction that sets off the watchpoint is the last in the sequence necessary to do the 'x=y' assignment. So the debugger has already gone on to the next line of code. In the examples, a breakpoint has been set on the 'main' function and has been triggered to stop the program.

(gdb) watch x
Hardware watchpoint 4: x
(gdb) c
Continuing.
Hardware watchpoint 4: x

Old value = -1073743192
New value = 11
main (argc=1, argv=0xbffffaf4) at test.c:10
10      return 0;

5.2 How do I set a read watchpoint for a variable?

Use the rwatch command. Usage is identical to the watch command.
 
(gdb) rwatch y
Hardware read watchpoint 4: y
(gdb) continue
Continuing.
Hardware read watchpoint 4: y

Value = 1073792976
main (argc=1, argv=0xbffffaf4) at test.c:8
8         x = y;

5.3 How do I set a read/write watchpoint for a variable?

Use the awatch command. Usage is identical to the watch command.

5.4 How do I disable watchpoints?

Active watchpoints show up in the breakpoint list. Use the info breakpoints command to get this list. Then use the disable command to turn off a watchpoint, just like disabling a breakpoint.

(gdb) info breakpoints
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x080483c6 in main at test.c:5
       breakpoint already hit 1 time
4   hw watchpoint  keep y   x
       breakpoint already hit 1 time

6. Advanced gdb Features

6.1 How do I examine memory?

Use the x command to examine memory. The syntax for the x command is x/FMT ADDRESS. The FMT field is a count followed by a format letter and a size letter. There are many options here; use the help command 'help x' to see them all. The ADDRESS argument can either be a symbol name, such as a variable, or a memory address.
If we have
char *s = "Hello World\n";
some uses of the x command could be:

Examine the variable as a string:
 
(gdb) x/s s
0x8048434 <_IO_stdin_used+4>:    "Hello World\n"
Examine the variable as a character:
(gdb) x/c s
0x8048434 <_IO_stdin_used+4>:   72 'H'
Examine the variable as 4 characters:
(gdb) x/4c s
0x8048434 <_IO_stdin_used+4>:   72 'H'  101 'e' 108 'l' 108 'l'
Examine the first 32 bits of the variable:
(gdb) x/t s
0x8048434 <_IO_stdin_used+4>:   01101100011011000110010101001000
Examine the first 12 bytes of the variable in hex (three 32-bit words):
(gdb) x/3x s
0x8048434 <_IO_stdin_used+4>:   0x6c6c6548      0x6f57206f      0x0a646c72

6.2 How do I see what is in the processor registers?

Use the info registers command. The output of this command depends on the hardware architecture. The following is part of the output on an Intel machine:

(gdb) info registers
eax            0x40123460       1074934880
ecx            0x1      1
edx            0x80483c0        134513600
ebx            0x40124bf4       1074940916
esp            0xbffffa74       0xbffffa74
ebp            0xbffffa8c       0xbffffa8c
esi            0x400165e4       1073833444
...

6.3 How do I debug with a core file?

When your program segfaults and leaves a core dump file, you can use gdb to look at the program state when it crashed. Use the core command to load a core file. The argument to the core command is the filename of the core dump file, which is usually "core", making the full command core core.

[sgupta@rhel6x64 socket]$ myprogram
Segmentation fault (core dumped)
[sgupta@rhel6x64 socket]$ gdb myprogram
...
(gdb) core core
...

6.4 How do I step through my code at the instruction level?

There are two commands, nexti and stepi, that work similarly to next and step. See the usage of those commands for an idea of how to use these two.

6.5 How do I see the assembly code my program is running?


Use the disassemble command. The argument to this command is a memory address. Here is an example of the disassembly for the main function of a simple program on an Intel machine:
(gdb) disassemble main

Dump of assembler code for function main:

0x80483c0 <main>:       push   %ebp
0x80483c1 <main+1>:     mov    %esp,%ebp
0x80483c3 <main+3>:     sub    $0x18,%esp
0x80483c6 <main+6>:     movl   $0x0,0xfffffffc(%ebp)
0x80483cd <main+13>:    mov    0xfffffffc(%ebp),%eax
0x80483d0 <main+16>:    movb   $0x7,(%eax)
0x80483d3 <main+19>:    xor    %eax,%eax
0x80483d5 <main+21>:    jmp    0x80483d7 <main+23>
0x80483d7 <main+23>:    leave
0x80483d8 <main+24>:    ret   
End of assembler dump.

7.1 Example Debugging Session: Infinite Loop Example


We are going to use gdb to discover where the infinite loop in the following program is. It may be obvious to you on inspection, but it is instructive to use gdb to find it. The program should print out all the alphanumeric (letter and number) characters in its input.

1 : #include <stdio.h>
2 : #include <ctype.h>
3 :
4 : int main(int argc, char **argv)
5 : {
6 :   char c;
7 :
8 :   c = fgetc(stdin);
9 :   while(c != EOF){
10:
11:       if(isalnum(c))
12:          printf("%c", c);
13:       else
14:          c = fgetc(stdin);
15:   }
16:
17:   return 1;
18: }



The first step is to compile the program with debugging flags:
 
[sgupta@rhel6x64 socket]$ gcc -g inf.c
 
Now if we run this program and enter a few characters followed by a newline, we discover something is amiss. Note that you will have to press Ctrl-C to stop this program once the infinite loop starts!

[sgupta@rhel6x64 socket]$ a.out

bob
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
...
 

Obviously, we have a problem. Let's load up gdb:
 
[sgupta@rhel6x64 socket]$ gdb a.out
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb)
 

To find the problem, we'll set off the infinite loop, and then press Ctrl-C to send the program a SIGINT. Gdb will trap this signal and stop program execution.

(gdb) run
Starting program: /home/dgawd/cpsc/363/a.out
moo
mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
....
mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Program received signal SIGINT, Interrupt.
0x400d8dc4 in write () from /lib/libc.so.6
(gdb)
 

Now the program is stopped and we can see where we are. We will use the backtrace command to examine the stack. The output of this command depends on your C libraries and exactly where the program was when you sent it the SIGINT. Mine looks like this:

(gdb) backtrace
#0  0x400d8dc4 in write () from /lib/libc.so.6
#1  0x40124bf4 in __check_rhosts_file () from /lib/libc.so.6
#2  0x40086ee8 in _IO_do_write () from /lib/libc.so.6
#3  0x40086e46 in _IO_do_write () from /lib/libc.so.6
#4  0x40087113 in _IO_file_overflow () from /lib/libc.so.6
#5  0x40087de5 in __overflow () from /lib/libc.so.6
#6  0x40069696 in vfprintf () from /lib/libc.so.6
#7  0x40070d76 in printf () from /lib/libc.so.6
#8  0x80484c2 in main (argc=1, argv=0xbffffaf4) at inf.c:12
#9  0x40037f5c in __libc_start_main () from /lib/libc.so.6

From this output, we can see that the program stopped in the write() system call inside the C library. But what we really want to see is where we are in main, so we are going to switch to frame 8:

(gdb) frame 8
#8  0x80484c2 in main (argc=1, argv=0xbffffaf4) at inf.c:12
12                printf("%c", c);
From the output, we know that the value of c is probably 'm', but let's check anyway:
(gdb) print c
$1 = 109 'm'
 

Now we have to find the loop. We use several iterations of the 'next' command to watch what is happening. Note that we have to work our way up the call stack back to main(). The next command will exit any functions we can't debug (like C library functions). We could also use the finish command here.
 

(gdb) next
 

Single stepping until exit from function write,
which has no line number information.
 

0x40087778 in _IO_file_write () from /lib/libc.so.6
(gdb) next
Single stepping until exit from function _IO_file_write,
which has no line number information.
0x40086ee8 in _IO_do_write () from /lib/libc.so.6
.....
.....
(gdb)
Single stepping until exit from function printf,
which has no line number information.
main (argc=1, argv=0xbffffaf4) at inf.c:15
15        }
Ok, now we are inside main(). We run the next command several more times to watch the program execute.
(gdb) n
11              if(isalnum(c))
(gdb)
12                printf("%c", c);
(gdb)
15        }
(gdb)
11              if(isalnum(c))
(gdb)
12                printf("%c", c);
(gdb) n
15        }
(gdb)
11              if(isalnum(c))
(gdb)
12                printf("%c", c);
 

Notice a pattern? The same two lines of code are executing over and over. This means we are looping, inside the while loop on line 9. Now, the value of 'c' is not changed in these lines. Maybe we should look back at our program:

11:        if(isalnum(c))
12:          printf("%c", c);
13:     else
14:          c = fgetc(stdin);
 

The lines being executed are 11 and 12. The test is always passing because the character is never changing. So the program is only reading characters until it finds an alphanumeric character, after which it never reaches the fgetc. But we always want to read the next character. Removing the 'else' on line 13 will fix the bug.

7.2 Example Debugging Session: Segmentation Fault Example
 

We are going to use gdb to figure out why the following program causes a segmentation fault. The program is meant to read in a line of text from the user and print it. However, we will see that in its current state it doesn't work as expected...

1 : #include <stdio.h>
2 : #include <stdlib.h>
3 :
4 : int main(int argc, char **argv)
5 : {
6 :   char *buf;
7 :
8 :   buf = malloc(1<<31);
9 :
10:   fgets(buf, 1024, stdin);
11:   printf("%s\n", buf);
12:
13:   return 1;
14: }



The first step is to compile the program with debugging flags:
 
[sgupta@rhel6x64 socket]$ gcc -g segfault.c
 

Now we run the program:
 

[sgupta@rhel6x64 socket]$ a.out
Hello World!
Segmentation fault
[sgupta@rhel6x64 socket]$
 

This is not what we want. Time to fire up gdb:
 

[sgupta@rhel6x64 socket]$ gdb a.out
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb)
We'll just run it and see what happens:
(gdb) run
Starting program: /home/dgawd/cpsc/363/a.out
test string

Program received signal SIGSEGV, Segmentation fault.
0x4007fc13 in _IO_getline_info () from /lib/libc.so.6

So we received the SIGSEGV signal from the operating system. This means that we tried to access an invalid memory address. Let's take a backtrace:

(gdb) backtrace
#0  0x4007fc13 in _IO_getline_info () from /lib/libc.so.6
#1  0x4007fb6c in _IO_getline () from /lib/libc.so.6
#2  0x4007ef51 in fgets () from /lib/libc.so.6
#3  0x80484b2 in main (argc=1, argv=0xbffffaf4) at segfault.c:10
#4  0x40037f5c in __libc_start_main () from /lib/libc.so.6
We are only interested in our own code here, so we want to switch to stack frame 3 and see where the program crashed:
(gdb) frame 3
#3  0x80484b2 in main (argc=1, argv=0xbffffaf4) at segfault.c:10
10        fgets(buf, 1024, stdin);

We crashed inside the call to fgets. In general, we can assume that library functions such as fgets work properly (if this isn't the case, we are in a lot of trouble). So the problem must be one of our arguments. You may not know that 'stdin' is a global variable that is created by the stdio libraries. So we can assume this one is ok. That leaves us with 'buf':

(gdb) print buf
$1 = 0x0

The value of buf is 0x0, which is the NULL pointer. This is not what we want - buf should point to the memory we allocated on line 8. So we're going to have to find out what happened there. First we want to kill the currently-running invocation of our program:

(gdb) kill
Kill the program being debugged? (y or n) y
Now set a breakpoint on line 8:
(gdb) break segfault.c:8
Breakpoint 1 at 0x8048486: file segfault.c, line 8.
Now run the program again:
(gdb) run
Starting program: /home/dgawd/cpsc/363/a.out

Breakpoint 1, main (argc=1, argv=0xbffffaf4) at segfault.c:8
8         buf = malloc(1<<31);
We're going to check the value of buf before the malloc call. Since buf wasn't initialized, the value should be garbage, and it is:
(gdb) print buf
$2 = 0xbffffaa8 "Èúÿ¿#\177\003@t`\001@\001"
Now step over the malloc call and examine buf again:
(gdb) next
10        fgets(buf, 1024, stdin);
(gdb) print buf
$3 = 0x0
 

After the call to malloc, buf is NULL. If you were to go check the man page for malloc, you would discover that malloc returns NULL when it cannot allocate the amount of memory requested. So our malloc must have failed. Let's go back and look at it again:

8 :   buf = malloc(1<<31);
 

Well, the value of the expression 1 << 31 (the integer 1 left-shifted 31 times) is 2147483648, or 2GB (gigabytes). Very few machines have this kind of memory - mine only has 256MB. So of course malloc would fail. Furthermore, we are only reading in 1024 bytes in the fgets call. All that extra space would be wasted, even if we could allocate it. Change the 1<<31 to 1024 (or 1<<10), and the program will work as expected:
 

[sgupta@rhel6x64 socket]$ a.out
Hello World!
Hello World!
[sgupta@rhel6x64 socket]$
 
So now you know how to debug segmentation faults with gdb. This is extremely useful (I use it more often than I care to admit). The example also illustrated another very important point: ALWAYS CHECK THE RETURN VALUE OF MALLOC! Have a nice day.
How to Debug Using GDB

We are going to be using two programs to illustrate how GDB can be used to debug code.

Debugging a program with a logical error

The first sample program has some logical errors. The program is supposed to output the summation of (X^0)/0! + (X^1)/1! + (X^2)/2! + (X^3)/3! + (X^4)/4! + ... + (X^n)/n!, given x and n as inputs. However the program outputs a value of infinity, regardless of the inputs. We will take you step by step through the debugging process and trace the errors:

1. See the sample program broken.cpp
/*
Broken.cpp
*/
1.  #include <iostream>
2.  #include <cmath>
3.
4.  using namespace std;
5.
6.  int ComputeFactorial(int number) {
7.    int fact = 0;
8.
9.    for (int j = 1; j <= number; j++) {
10.     fact = fact * j;
11.   }
12.
13.   return fact;
14. }
15.
16. double ComputeSeriesValue(double x, int n) {
17.    double seriesValue = 0.0;
18.    double xpow = 1;
19.
20.    for (int k = 0; k <= n; k++) {
21.       seriesValue += xpow / ComputeFactorial(k);
22.       xpow = xpow * x;
23.    }
24.
25.    return seriesValue;
26. }
27.
28. int main() {
29.    cout << "This program is used to compute the value of the following series : " << endl;
30.
31.    cout << "(x^0)/0! + (x^1)/1! + (x^2)/2! + (x^3)/3! + (x^4)/4! + ........ + (x^n)/n! " << endl;
32.
33.    cout << "Please enter the value of x : " ;
34.
35.    double x;
36.    cin >> x;
37.
38.    int n;
39.    cout << endl << "Please enter an integer value for n : " ;
40.    cin >> n;
41.    cout << endl;
42.
43.    double seriesValue = ComputeSeriesValue(x, n);
44.    cout << "The value of the series for the values entered is "
45.         << seriesValue << endl;
46.
47.    return 0;
48. }


2. Compile the program and execute the program.

[sgupta@rhel6x64 socket]$ g++ -g broken.cpp -o broken
[sgupta@rhel6x64 socket]$ ./broken
Whatever the input, the output will be inf. The -g option is important because it enables meaningful GDB debugging.

3. Start the debugger

[sgupta@rhel6x64 socket]$ gdb broken

This only starts the debugger; it does not start running the program in the debugger.

4. Look at the source code and set a breakpoint at line 43

(gdb) b 43
which is
double seriesValue = ComputeSeriesValue(x, n);

5. Now, we start to run the program in the debugger.

(gdb) run

Note: If you need to supply the command-line arguments for the execution of the program, simply include them after the run command, just as normally done on the command line.

6. The program starts running and asks us for the input.

Let's enter the values as x=2 and n=3. The expected output value is 1 + 2 + 2 + 8/6, or about 6.33. The following is a snapshot of the program running in the debugger:

This program is used to compute the value of the following series :
(x^0)/0! + (x^1)/1! + (x^2)/2! + (x^3)/3! + (x^4)/4! + ........ + (x^n)/n!
Please enter the value of x : 2

Please enter an integer value for n : 3

Breakpoint 1, main () at broken.cpp:43
43  double seriesValue = ComputeSeriesValue(x, n);

Note that the program execution stopped at our first (and only) breakpoint.

7. Step into the ComputeSeriesValue() function

To step into a function call, we use the following command:

(gdb) step
ComputeSeriesValue (x=2, n=3) at broken.cpp:17
17  double seriesValue=0.0;

At this point, the program control is at the first statement of the function ComputeSeriesValue (x=2, n=3)

8. Next let's step through the program until we get into ComputeFactorial.

(gdb) next
18  double xpow=1;
(gdb) n
20  for (int k = 0; k <= n; k++) {
(gdb)
21    seriesValue += xpow / ComputeFactorial(k) ;
(gdb) s
ComputeFactorial (number=0) at broken.cpp:7
7  int fact=0;

Here we use the next command, which is similar to step except it will step over (instead of into) functions. The distinction doesn't matter for the first few lines since they contain no function calls. You may use the shortest, unambiguous spelling of a GDB command to save some typing. Here we use n and s instead of next and step, respectively. If the command is simply a repeat of the previous command, you can just hit return, which will execute the last command. Finally, we step (with s) into ComputeFactorial(). (If we'd used next, it would have stepped over ComputeFactorial.)

9. Where are we?

If you want to know where you are in the program's execution (and how, to some extent, you got there), you can view the contents of the stack using the backtrace command as follows:

(gdb) bt
#0  ComputeFactorial (number=0) at broken.cpp:7
#1  0x08048907 in ComputeSeriesValue (x=2, n=3) at broken.cpp:21
#2  0x08048a31 in main () at broken.cpp:43

10. Watching changes

We can step through the program and examine the values using the print command.

(gdb) n
9  for (int j = 0; j <= number; j++) {
(gdb) n
10    fact = fact * j;
(gdb) n
9  for (int j = 0; j <= number; j++) {
(gdb) print fact
$2 = 0
(gdb) n
13  return fact;
(gdb) quit

The print command (abbreviated p) reveals that the value of fact never changes. Note that the function is returning a value of 0 for the function call ComputeFactorial(number=0). This is an ERROR, since 0! is 1, and dividing xpow by this 0 is what makes the series come out as inf.
By taking a closer look at the values printed above, we realize that we are computing fact=fact * j where fact has been initialized to 0; fact should have been initialized to 1. We quit GDB with the quit command. Next we need to change the initialization of fact to:

int fact = 1;

Recompile the code and run it; you will get the expected output.

Debugging a program that produces a core dump

This program causes a core dump due to a segmentation fault. We will try to trace the reason for this core dump. Use the following program, testit.c:

#include <stdio.h>

int main()
{
        char *temp = "Paras";

        int i;
        i=0;

        temp[3]='F';

        for (i =0 ; i < 5 ; i++ )
               printf("%c\n", temp[i]);
}

1.  Compile the program using the following command.

    g++ testit.c -g -o testit 
2.  Run it normally; you should get the following result:

    Segmentation fault (core dumped)
 
3.  The core dump generates a file called core, which can be used for debugging. Since this program is really short, we will not need to set any breakpoints. Use the following command to start the debugger on the core file produced by testit.


gdb testit core

The output of the above command should look like this:

bash$ gdb testit core
GNU gdb 19991004
Copyright 1998 Free Software
Core was generated by `testit'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libstdc++-libc6.1-1.so.2...done.
Reading symbols from /lib/libm.so.6...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
#0  0x804851a in main () at testit.c:10
10              temp[3]='F';       

4.  As we can see from the output above, the core dump was produced as a result of execution of the statement on line 10:

temp[3]='F';

Take a closer look at the declaration of temp on line 5 :

Line 5          char *temp = "Paras";

We find that temp is a char* which has been assigned a string literal. String literals are typically stored in read-only memory, so we cannot modify the contents of the literal as on line 10. This is what is causing the core dump.
