Lab 2 - Debugging with gdb

Due: 5:00pm on Wednesday

The purpose of this lab is to learn how to use the gdb utility to debug common pointer errors and code that uses structures or classes.

The gdb utility is the debugger that comes with the GNU compilers such as gcc and g++. It allows you to either step line-by-line through your code or to examine a special file called a coredump. In class last week, I showed you a quick example of how to step line-by-line through your code to tell the difference between a Segmentation fault error caused by infinite recursion and one caused by bad pointer references. In lab today, we will examine a coredump file caused by a bad pointer reference.

A coredump file is a special file generated when your code crashes with an error such as a Segmentation fault. It contains enough information about the state of the program that you can usually use it to determine why the program crashed while running. This is particularly useful when your program only crashes under certain circumstances. By default on Sleipnir, a coredump file will NOT be generated when your code crashes. You must turn on the generation of a coredump file with the following command AFTER you have logged in to Sleipnir:

ulimit -c unlimited
For this lab, we will be using a variation of the StudentRecord structure example from class. The code wants to create a dynamic array of students, but never allocates the array with new. You can find this code in the file lab2_handout.cpp. You can copy this file over to your account on Sleipnir using either cut and paste (as done in Lab 1) or with the following command once you have logged in to Sleipnir and changed to your cs222 directory:
wget http://www.cs.csubak.edu/~melissa/cs222/lab2_handout.cpp
To use gdb, you have to compile with debugging info turned on. As discussed in class, you give the -g option to g++ to turn on debugging info. So compile and run the handout with the following commands:
g++ -g -o lab2 lab2_handout.cpp
./lab2
When you run lab2, you should see Segmentation fault (core dumped) after entering the first student ID. If you do not see (core dumped), rerun the ulimit command from above and then rerun lab2. Once you have the coredump file, start gdb with the following command:
gdb lab2 core
The first argument to gdb is the executable's name (lab2) and the second argument is the coredump's filename (core is the default filename). You will now be at the prompt for gdb. It should look something like the following:
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".

Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./lab2'.
Program terminated with signal 11, Segmentation fault.
#0  0x00002ae43bd962e5 in std::istream::operator>> ()
   from /usr/lib/libstdc++.so.6
(gdb) 
The line (gdb) is the gdb prompt. When at this prompt, you can issue various gdb commands to see the state of your program. Let's first try the backtrace command, which will show all the function calls that led up to the segmentation fault. This is the command I showed in class that would give you screens of function calls for infinite recursion but only a few lines for a bad pointer. Since this code has a bad pointer, we expect that the backtrace command will show us just a few lines of information. To run a backtrace, enter the following at the gdb prompt:
bt
This should show you something like the following:
#0  0x00002ae43bd962e5 in std::istream::operator>> ()
   from /usr/lib/libstdc++.so.6
#1  0x0000000000400b2e in getStudent (s=@0x400e00) at lab2_handout.cpp:66
#2  0x0000000000400cdd in main () at lab2_handout.cpp:49
The first field (#0, #1 and #2) is the frame number. Each frame is one function call that was still active when the program crashed, and the rest of the line names the function and, where debugging info is available, the source file and line number. So we see the error occurred in the input operator (>>) in frame #0. That operator was called from line 66 in the getStudent function (frame #1), and getStudent was in turn called from line 49 in main (frame #2).

We can select one of these frames to look at further. Usually it's not helpful to look at frames for the C++ libraries, so we can skip frame #0. To look at frame #1, give the following command at the gdb prompt:

frame 1
Now that we are in the scope of getStudent, we can look at the variables local to getStudent. If you refer back to lab2_handout.cpp, you'll see the only two variables local to getStudent are s and sum. To look at those two variables, give the following commands at the gdb prompt:
print s
print sum
You'll see when you print s that it says "Cannot access memory at address 0x0" which tells us that our pointer is pointing to NULL (address 0x0). So now we can see that the pointer is bad. This tells us that we either passed the pointer to the function incorrectly or the pointer was never set in main. You'll also notice when you print sum that a random large number is printed on the screen. That's because the code crashed before getting to the line in getStudent that set sum to 0.

Since we now know the pointer is bad in getStudent, our next step is to see if the pointer is also bad in main. So now we need to switch to main's frame by giving the following command at the gdb prompt:

frame 2
Again, we want to look at all the variables local to main. By looking at the code, we see those variables are c, size and i. Since none of these variables is a parameter to main (unlike s, which was a parameter to getStudent), we can use the following command to print out all three at once:
info locals
Note: info locals will not show you any parameters to a function. You have to use print for those.

In the output of info locals, you should see that the address for c is 0x0, which tells us that c is still a NULL pointer. This is why the code crashed. You cannot store values through a NULL pointer. Looking back at the code, we see that c was initialized to NULL at the top of main (you should get in the habit of doing this with all pointers you declare) and, as the comments indicate, new was never called to actually allocate the array of students for c.
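
The pattern described above can be sketched roughly as follows. This is not the handout's code: the field types in StudentRecord and the helper names allocateStudents and setID are assumptions made for illustration, and the nothrow form of new is just one common way to check for an allocation failure (it may differ from the if statement shown in class).

```cpp
#include <new>
#include <string>

// Hypothetical record layout: the field names follow the gdb commands used
// in this lab, but the types are assumptions.
struct StudentRecord {
    int ID;
    std::string name;
    int quiz[3];
    double grade;
};

// Returns a newly allocated array of `size` records, or nullptr (NULL in
// older code) if size is not positive or the allocation fails.
// The caller must release the array with delete[].
StudentRecord* allocateStudents(int size) {
    if (size <= 0) {
        return nullptr;
    }
    // The nothrow form of new returns a null pointer on failure
    // instead of throwing std::bad_alloc.
    return new (std::nothrow) StudentRecord[size];
}

// Stores an ID in slot `index` only when the array actually exists.
// Returns false instead of crashing when c is a null pointer.
bool setID(StudentRecord* c, int index, int id) {
    if (c == nullptr) {
        return false;
    }
    c[index].ID = id;
    return true;
}
```

With a pointer that was initialized to NULL and never allocated, setID reports failure rather than writing to address 0x0, which is the write that caused the Segmentation fault in this lab.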

Before continuing on to the rest of the assignment, exit gdb by giving the following command:

quit
As a side note, you can also use gdb to detect when you have infinite recursion. If your backtrace shows you pages upon pages of function calls, this is a very strong indication that there is infinite recursion occurring. You can then use the frame command to go into one of the function calls to see where in your code the infinite recursion is occurring.
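
To see why infinite recursion fills the backtrace, here is a minimal sketch using a hypothetical function named framesUsed (it is not part of the handout). Every call that has not yet returned occupies one frame, so the return value is the number of frames a backtrace would list at the deepest point.

```cpp
// Returns how many calls (stack frames) are needed to count n down to 0.
// Each call that has not yet returned is one frame in a backtrace.
int framesUsed(int n) {
    if (n <= 0) {   // base case: without this check the recursion never stops
        return 1;   // the final call still occupies one frame
    }
    return 1 + framesUsed(n - 1);
}
```

If the base case were removed, the calls would keep piling up until the stack ran out of space, and the resulting coredump would show the same function repeated for pages in the backtrace.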

Lab Assignment

Copy lab2_handout.cpp to lab2.cpp. Modify the code to have the correct call to new. Make sure to use the if statement shown in class to handle allocation failures. Recompile the program and run it in line-by-line mode in gdb by using the following commands:
g++ -g -o lab2 lab2.cpp
gdb lab2
Notice you do not include the second argument to gdb (core) this time since you are NOT debugging a coredump now.

You will now be at the gdb prompt. Since the program should run correctly this time, we want to force it to pause at the same point it had errors before. The error before happened in the getStudent function, so we want gdb to stop running the program when we reach the getStudent function. To do so, we need to set something called a "breakpoint" with the following command:

break getStudent
This tells gdb to stop executing the program and return to the gdb prompt whenever the program enters the getStudent function. You could set additional breakpoints for other function calls or even at specific line numbers if you wish by replacing getStudent in the above command with the name of the other function or the line number.

Once you have the breakpoint set, tell gdb to start executing the program with the following command:

run
The program will then execute until a breakpoint is reached, a crash occurs or the program exits. In this assignment, it should reach the getStudent breakpoint after prompting for the number of students. At this point, it will return you to the gdb prompt. You can then examine the variables local to getStudent with the same commands from above:
print s
print sum
Notice that now, since the array has been properly allocated, print s actually shows you all the fields for the structure. Right now, these fields are random values because we haven't actually executed any lines in the getStudent function yet. To execute a line in getStudent, issue the following command:
step
This will execute just one line, the cout statement. The output of the cout statement won't actually appear on the screen until you give the step command again. Give the step command a second time, then give the print s command again. Notice how the ID field is now whatever you entered at the student ID prompt. You can also print just the ID field of the structure using the following command:
print s.ID
Notice that we're using the dot operator to retrieve just the ID field out of the object named s.

Keep giving the step command until you reach the end of the getStudent function. Check the value of each field in the StudentRecord object after it is read in.
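
As a rough picture of what you are stepping through, the sketch below fills in a record one field at a time. It is an assumption-laden stand-in, not the handout's getStudent: the names readStudent and parseStudent are hypothetical, the field types are guesses, the prompts are omitted, and the input stream is passed as a parameter so the function can be tried without typing at the keyboard.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record layout; field names follow this lab, types are assumed.
struct StudentRecord {
    int ID;
    std::string name;
    int quiz[3];
    double grade;
};

// Reads one record from `in` into s. Each statement changes one field,
// which is what you observe when you print s between step commands.
void readStudent(std::istream& in, StudentRecord& s) {
    int sum;            // holds a garbage value until the next line runs
    sum = 0;
    in >> s.ID;
    in >> s.name;
    for (int i = 0; i < 3; i++) {
        in >> s.quiz[i];
        sum += s.quiz[i];
    }
    s.grade = sum / 3.0;   // 3.0 forces floating-point division
}

// Convenience wrapper for trying readStudent on a string of input.
StudentRecord parseStudent(const std::string& text) {
    std::istringstream in(text);
    StudentRecord s{};     // start from zeroed fields
    readStudent(in, s);
    return s;
}
```

For example, parseStudent("42 Ada 90 80 70") would yield a record whose ID is 42, whose name is Ada, and whose grade is the average of the three quiz scores.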

When you reach the end of the function, the step command will show something like the following (Note: the line number will be different depending on how many lines you took to implement the new command. The key thing is seeing } one step command after seeing the s.grade line):

(gdb) step
87        s.grade = sum / 3.0;
(gdb) step
88      }
This means you are at the end of the getStudent function. Give the step command one more time. Notice how you now return to the main function at the for loop. Now you can examine main's variables using info locals again. Notice that c now holds a hexadecimal address instead of 0x0. This is the base address of the chunk of memory that new allocated to you. You can now also access each student in the array (since the array now exists) using the following commands:
print c[0]
print c[0].ID
print c[0].name
print c[0].quiz[0]
print c[0].quiz[1]
print c[0].quiz[2]
print c[0].grade
print c[1]
Note that c[1] will be random values because you've only gone through getStudent for c[0] at this point. Also notice that the print command uses the subscript operator and dot operator to access certain portions of the structure just as you would within your code in main to access that portion of the structure.

You can now give the following command to continue the execution of the program to the next call of getStudent:

continue
When the breakpoint is reached, try the print s command. Note how it is the same thing as print c[1] from main's scope since this instance of getStudent is setting the values for c[1].
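
The two print commands agree because, assuming s is a reference parameter (which the dot syntax in print s.ID suggests), s is simply another name for the array element that main passed in. A minimal sketch with a trimmed-down, hypothetical StudentRecord and hypothetical helper names:

```cpp
// Trimmed-down hypothetical record; only the fields needed for the sketch.
struct StudentRecord {
    int ID;
    double grade;
};

// Returns the address of the object that the reference parameter names.
const StudentRecord* addressOf(const StudentRecord& s) {
    return &s;
}

// Writes through the reference, so the caller's object is modified directly.
void assignID(StudentRecord& s, int id) {
    s.ID = id;
}
```

Calling assignID(c[1], 5) changes c[1].ID itself, and addressOf(c[1]) returns the same address as &c[1], so no copy of the record is made.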

You may now either continue to play around with the step and continue commands or exit gdb with the quit command. Feel free to try other gdb commands listed in the Department's HowTo Guide.

Write a summary of what you saw during your gdb session (or cut and paste a log of your gdb session into an email). Email me your writeup (or log) in the body of your email and also attach your lab2.cpp source code.