Debugging M5
M5 is a complex piece of software that models complex hardware systems running other even more complex pieces of software (e.g., full operating systems). Debugging is thus a major issue. Although debugging is not guaranteed to be easy, M5 has several features designed specifically to help you figure out what's going on.
Trace-based debugging
The simplest method of debugging is to have M5 print out traces of what it's doing and then to examine these traces for problems. M5 contains numerous DPRINTF statements that print trace messages describing potentially interesting events. Each DPRINTF is associated with a trace category (e.g., Bus, Cache, Ethernet, Disk, etc.). To display the trace messages for a particular category, simply specify the category using the --debug-flags parameter. Multiple categories can be specified by giving a list of strings, e.g.:
build/ALPHA_FS/m5.opt --debug-flags=Bus,Cache configs/example/fs.py
Of course, you would use your own config file and arguments in place of fs.py. Note that the m5.fast binary does not support tracing; part of what makes it faster than m5.opt is that the DPRINTF code is compiled out.
The --debug-flags switch must come after the executable and before the config file.
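The ordering rule can be made explicit with a small helper that assembles the command line. This is only a sketch: build_m5_command and its parameters are hypothetical names introduced for illustration, not part of M5, and treating anything after the config file as arguments for the config script is an assumption.

```python
def build_m5_command(binary, config, debug_flags=(), config_args=()):
    """Return an argv list with --debug-flags placed before the config file.

    Hypothetical helper for illustration; it is not shipped with M5.
    """
    argv = [binary]
    if debug_flags:
        # Categories are comma-separated, e.g. Bus,Cache.
        argv.append("--debug-flags=" + ",".join(debug_flags))
    argv.append(config)
    # Assumption: anything after the config file belongs to the config script.
    argv.extend(config_args)
    return argv


cmd = build_m5_command(
    "build/ALPHA_FS/m5.opt",
    "configs/example/fs.py",
    debug_flags=["Bus", "Cache"],
)
print(" ".join(cmd))
```

Building the argument list in one place makes it harder to append the switch after the config file by accident.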
The complete list of trace-related options is given in the "Debugging Options" and "Trace Options" sections of the help printed with m5 --help:
Debugging Options
-----------------
--debug-break=TIME[,TIME]
                        Cycle to create a breakpoint
--debug-help            Print help on trace flags
--debug-flags=FLAG[,FLAG]
                        Sets the flags for tracing (-FLAG disables a flag)
--remote-gdb-port=REMOTE_GDB_PORT
                        Remote gdb base port (set to 0 to disable listening)

Trace Options
-------------
--trace-start=TIME      Start tracing at TIME (must be in ticks)
--trace-file=FILE       Sets the output file for tracing [Default: cout]
--trace-ignore=EXPR     Ignore EXPR sim objects
The complete list of trace flags can be seen by running m5 with the --debug-help option. Note that some flags are "compound flags", which are shorthand for a group of related base flags. The "-Flag" notation for disabling a flag is convenient when you want most but not all of the base flags set by a compound flag.
The Exec compound trace flag enables tracing of instruction execution in the CPU models. A number of base flags control detailed aspects of instruction tracing, all grouped under 'Exec'. By changing these flags, you can modify the behavior of the instruction trace. For example, you can disable the use of function symbol names in place of absolute PC addresses (if they're available) by clearing the ExecSymbol flag (e.g., --debug-flags=Exec,-ExecSymbol).
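To make the compound-flag and "-Flag" semantics concrete, here is a toy model of how a flag specification could be resolved into a set of base flags. It is a sketch, not M5's implementation: the left-to-right processing order is an assumption, and apart from ExecSymbol the members listed under Exec are placeholders rather than the real contents of that compound flag (use --debug-help for the real list).

```python
# Illustrative table only. ExecSymbol is named in the text above; the two
# "Placeholder" entries are invented stand-ins for the other base flags.
COMPOUND_FLAGS = {
    "Exec": ["ExecPlaceholderA", "ExecSymbol", "ExecPlaceholderB"],
}


def resolve_flags(spec, compound=COMPOUND_FLAGS):
    """Expand a spec such as 'Exec,-ExecSymbol' into a list of base flags.

    Items are applied left to right (an assumption); a leading '-' removes
    the named flag, or every member of the named compound flag.
    """
    enabled = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        disable = item.startswith("-")
        name = item[1:] if disable else item
        # A name that is not a compound flag stands for itself.
        for base in compound.get(name, [name]):
            if disable:
                if base in enabled:
                    enabled.remove(base)
            elif base not in enabled:
                enabled.append(base)
    return enabled


print(resolve_flags("Exec,-ExecSymbol"))
```

With the placeholder table above, the result contains every member of Exec except ExecSymbol, which is the "most but not all" case the "-Flag" notation is meant for.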
If some supposedly innocuous change has caused m5 to stop working correctly, you can compare trace outputs from two versions of m5 (either two different binaries, or the same binary with different options) using the tracediff script located in the src/util directory. Read the comments in the script for instructions on how to use it.
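The core idea behind comparing two traces is locating the first point where they diverge. The sketch below shows that idea on two lists of lines; it is not the tracediff script and does not reproduce its behavior, and the sample trace lines are invented for the example.

```python
from itertools import zip_longest


def first_divergence(lines_a, lines_b):
    """Return (line_number, a_line, b_line) for the first mismatch, or None.

    If one trace is a prefix of the other, the missing side is reported
    as None at the line where the shorter trace ends.
    """
    pairs = zip_longest(lines_a, lines_b)
    for number, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return number, a, b
    return None


# Invented sample lines, not real M5 trace output.
good = ["100: cpu: step 1", "200: cpu: step 2", "300: cpu: step 3"]
bad = ["100: cpu: step 1", "200: cpu: step 2", "350: cpu: step 3"]

print(first_divergence(good, bad))
```

The tick at the first mismatch is a natural starting point for narrowing down where the two versions begin to behave differently.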
If you find that events of interest are not being traced, feel free to add DPRINTFs yourself. You can add a new trace category simply by adding a TraceFlag() command to any SConscript file (preferably the one nearest to where you are using the new flag). The needed C++ code, including the Trace::Flags enum, is automatically generated from these SConscript commands.
Debugging with a debugger
If traces alone are not sufficient, you'll need to inspect what m5 is doing in detail using a debugger (e.g., gdb). You definitely want to use the m5.debug binary if you reach this point. Ideally, looking at traces has at least allowed you to narrow down the range of cycles in which you think something is going wrong. The fastest way to reach that point is to use a DebugEvent, which goes on M5's event queue and forces entry into the debugger when the specified cycle is reached (via SIGTRAP). (Of course, it's necessary to start m5 under the debugger or have the debugger attached to the m5 process first.)
You can create one or more DebugEvents when you invoke m5 using the "--debug-break=100" parameter. You can also create new DebugEvents from the debugger prompt using the schedBreakCycle() function. The following example session illustrates both of these approaches:
% gdb m5/build/ALPHA/m5.debug
GNU gdb 6.1
Copyright 2002 Free Software Foundation, Inc.
[...]
(gdb) run --debug-break=2000 configs/run.py
Starting program: /z/stever/bk/m5/build/ALPHA_SE/m5.debug --debug-break=2000 configs/run.py
M5 Simulator System
[...]
warn: Entering event queue @ 0. Starting simulation...

Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$1 = 2000
(gdb) c
Continuing.

(gdb) call schedBreakCycle(3000)
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$3 = 3000
(gdb)
Functions intended to be called from GDB
M5 includes a number of functions specifically intended to be called from the debugger (e.g., using the gdb 'call' command, as in the schedBreakCycle() example above). Many of these are "dump" functions which display internal simulator data structures.
- eventq_dump() -- displays the events scheduled on the main event queue.
- setTraceFlag(const char *flag) -- turns a trace/debug flag on from within the debugger. Takes a string argument specifying the name of a single trace category.
- clearTraceFlag(const char *flag) -- turns a trace/debug flag off from within the debugger. Takes the same argument as setTraceFlag().
- takeCheckpoint(Tick when) -- takes a checkpoint of all the simulated state.
- schedBreakCycle(Tick when) -- schedules an event that drops into the debugger at cycle 'when'.
Kernel debugging
M5 has built-in support for gdb's remote debugger interface. If you are interested in monitoring what the kernel on the simulated machine is doing, you can fire up kgdb on the host platform and have it talk to the simulated M5 system as if it were a real machine (only better, since M5 executions are deterministic and M5's remote debugger interface is guaranteed not to perturb execution on the simulated system). To use a remote debugger with M5, the most important requirement is a gdb compiled to work with an alpha-linux target. Such a gdb can be built on a different host, for example an x86 machine: just add the --target=alpha-linux option to configure when you compile gdb.
% wget http://ftp.gnu.org/gnu/gdb/gdb-6.3.tar.gz
--08:05:33-- http://ftp.gnu.org/gnu/gdb/gdb-6.3.tar.gz
=> `gdb-6.3.tar.gz'
Resolving ftp.gnu.org... done.
Connecting to ftp.gnu.org[199.232.41.7]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17,374,476 [application/x-tar]
100%[====================================>] 17,374,476 216.57K/s ETA 00:00
08:06:52 (216.57 KB/s) - `gdb-6.3.tar.gz' saved [17374476/17374476]
% tar xfz gdb-6.3.tar.gz
% cd gdb-6.3
% ./configure --target=alpha-linux
<configure output....>
% make
<make output...this may take a while>
The end result is the gdb/gdb binary, which will work for remote debugging.
When M5 is run, each CPU listens for a remote debugging connection on a TCP port. The first port allocated is generally 7000, though if a port is in use, the next port is tried.
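The "try the next port" behavior is a common pattern and can be sketched in a few lines. This is not M5's listener code: listen_on_first_free_port, its max_tries limit, and the use of the loopback address are all assumptions made for the example.

```python
import socket


def listen_on_first_free_port(base_port=7000, max_tries=100, host="127.0.0.1"):
    """Bind a listening TCP socket to the first free port at or above base_port.

    Hypothetical helper; returns (socket, port) or raises RuntimeError.
    """
    for port in range(base_port, base_port + max_tries):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port already in use (or otherwise unavailable): try the next one.
            sock.close()
            continue
        sock.listen(1)
        return sock, port
    raise RuntimeError("no free port found")


# Two listeners started from the same base end up on different ports.
first, port_a = listen_on_first_free_port()
second, port_b = listen_on_first_free_port()
print(port_a, port_b)
first.close()
second.close()
```

Because the port actually obtained depends on what else is running on the host, the port numbers reported at startup are the authoritative source for where each CPU is listening.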
To attach the remote debugger, it's necessary to have a copy of the kernel and of its source. Also, to view the kernel's call stack, you must make sure Linux was built with the necessary debug configuration parameters enabled. To run the remote debugger, do the following:
ziff% gdb-linux-alpha arch/alpha/boot/vmlinux
GNU gdb 5.3
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=alpha-linux"...
(no debugging symbols found)...
(gdb) set remote Z-packet on [ This can be put in .gdbinit ]
(gdb) target remote ziff:7000
Remote debugging using ziff:7000
0xfffffc0000496844 in strcasecmp (a=0xfffffc0000b13a80 "", b=0x0)
at arch/alpha/lib/strcasecmp.c:23
23 } while (ca == cb && ca != '\0');
(gdb)
The M5 simulator is already running, and the target remote command connects to the running simulator and stops it in the middle of execution. You can then set breakpoints and use the debugger to debug the kernel. It is also possible to use the remote debugger to debug console code and palcode. Setting that up is similar, but a how-to is left for future work.
If you're using both the remote debugger and the debugger on the simulator, you can trigger the remote debugger from the main debugger by doing a 'call debugger()'. Before you do this, you'll need to figure out which CPU (the cpu id) you want to debug and set current_debugger to that cpu id. If you only have one CPU, it will be cpu id 0; if there are multiple CPUs, you will need to match the cpu id with the corresponding port number for the remote gdb session. For example, using the following sample output from M5, calling the kernel debugger for cpu 3 requires the kernel debugger to be attached on port 7001.
% ./build/ALPHA_FS/m5.debug configs/example/fs.py
...
making dual system
Global frequency set at 1000000000000 ticks per second
Listening for testsys connection on port 3456
Listening for drivesys connection on port 3457
0: testsys.remote_gdb.listener: listening for remote gdb #0 on port 7002
0: testsys.remote_gdb.listener: listening for remote gdb #1 on port 7003
0: testsys.remote_gdb.listener: listening for remote gdb #2 on port 7000
0: testsys.remote_gdb.listener: listening for remote gdb #3 on port 7001
0: drivesys.remote_gdb.listener: listening for remote gdb #4 on port 7004
0: drivesys.remote_gdb.listener: listening for remote gdb #5 on port 7005
0: drivesys.remote_gdb.listener: listening for remote gdb #6 on port 7006
0: drivesys.remote_gdb.listener: listening for remote gdb #7 on port 7007
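Since the cpu ids and ports are not necessarily in the same order, it can help to extract the mapping from the startup output rather than guess. The following sketch parses listener lines of the form shown in the sample output; the function name gdb_ports is hypothetical, and the regular expression assumes the message format stays exactly as it appears there.

```python
import re

# Matches the "listening for remote gdb #N on port P" part of each line.
LISTENER_RE = re.compile(r"listening for remote gdb #(\d+) on port (\d+)")


def gdb_ports(log_text):
    """Return a {cpu_id: port} dict built from M5 startup output."""
    return {int(cpu): int(port) for cpu, port in LISTENER_RE.findall(log_text)}


# Lines copied from the sample output.
sample = """\
0: testsys.remote_gdb.listener: listening for remote gdb #0 on port 7002
0: testsys.remote_gdb.listener: listening for remote gdb #1 on port 7003
0: testsys.remote_gdb.listener: listening for remote gdb #2 on port 7000
0: testsys.remote_gdb.listener: listening for remote gdb #3 on port 7001
"""

ports = gdb_ports(sample)
print(ports[3])  # 7001
```

Looking up the port this way avoids assuming that cpu 0 is always on the base port, which the sample output shows is not the case.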
Other Debugging Tools
- @TODO: mention valgrind suppressions
- @TODO: tracediff