This post is the first of a series examining GHC’s support for DWARF debug information and the tooling that this support enables:
- Part 1 introduces DWARF debugging information and explains how its generation can be enabled in GHC.
- Part 2 looks at a DWARF-enabled program in gdb and examines some of the limitations of this style of debug information.
- Part 3 looks at the backtrace support of GHC’s runtime system and how it can be used from Haskell.
- Part 4 examines how the Linux perf utility can be used on GHC-compiled programs.
- Part 5 concludes the series by describing future work, related projects, and ways in which you can help.
DWARF debugging information
For several years now GHC has had support for producing DWARF debugging information. DWARF is a widely used format (adopted by Linux and several BSDs) for representing debug information, typically embedded in an executable, for consumption by runtime systems, profilers, and debugging tools. It can represent several kinds of information:
- line information mapping instructions back to their location in the source program (e.g. the instruction at address
- unwind information allowing call chains to be reconstructed from the runtime state of the execution stack (e.g. the program is currently executing f, which was called from g, which was called from
- type information, allowing debugging tools to reconstruct the structure and identity of values from the runtime state of the program (e.g. when the program is executing the instruction at address x, the value sitting in the $rax register is a pointer to a
Collectively, this information is what allows debuggers (e.g. gdb) and profiling tools (e.g. perf) to do what they do.
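To make these categories a little more concrete: in an ELF executable, line information typically lives in the .debug_line section, unwind information in .debug_frame (or .eh_frame), and type information in .debug_info. The following is a minimal sketch, assuming binutils' readelf is installed; the helper name dwarf_sections and the path ./my-program are placeholders for illustration, not part of any tool.

```shell
# List the distinct DWARF section names found in section-header output
# read from stdin (for example, the output of `readelf -S`).
dwarf_sections() {
  grep -o '\.debug_[a-z_]*' | sort -u
}

# Hypothetical binary path; replace with a real executable.
bin=./my-program
if [ -f "$bin" ] && command -v readelf >/dev/null 2>&1; then
  readelf -S "$bin" | dwarf_sections
fi
```

If the helper prints nothing for a given binary, that binary most likely carries no DWARF information at all.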
The effort to add DWARF support to GHC started with Peter Wortmann’s dissertation work, which introduced the ability for GHC to emit basic line and unwind information in its executables. This support has matured considerably over the past few years and should finally be ready for use with GHC 8.10.
There are a few potential use-cases for DWARF information:
- Use in native debugging tools (e.g.
- Dumping runtime call stacks to the console using the SIGQUIT signal; this is particularly useful in production
- Computing runtime call stacks from within the program (using the
- Statistical profiling using tools like perf.
- Capturing call-stacks in exceptions for reporting to the user
We will discuss all of these in this series of blog posts. The rest of this first post will examine how to compile a DWARF-enabled binary.
As of GHC 8.10.2, GHC HQ will provide DWARF-enabled binary distributions for Debian 9, Debian 10, and Fedora 27 (as of 8.10.1 only Debian 9 is provided). These binary distributions differ in two respects from the non-DWARF distributions:
- all provided libraries (e.g. unix, etc.) are built with debug information.
- the runtime system is built with a dependency on the libdw library (provided by the
As with other compilers, debug information support in GHC is enabled with the -g flag. This flag can be passed a numeric “debug level”, which determines the detail (and, consequently, the size) of the debug information that is produced. These levels are described in the GHC user guide.
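As a minimal sketch of invoking GHC directly with a debug level, the following writes a tiny program and compiles it with -g2. The file name Hello.hs and the output name hello-dwarf are placeholders chosen for this illustration, and the compile step is skipped if no ghc is found on the PATH.

```shell
# Hello.hs is a hypothetical file name used only for this sketch.
cat > Hello.hs <<'EOF'
main :: IO ()
main = putStrLn "Hello, DWARF!"
EOF

# Compile with debug level 2, if a GHC is available on PATH.
if command -v ghc >/dev/null 2>&1; then
  ghc -g2 Hello.hs -o hello-dwarf
fi
```

Note that this only annotates the code of Hello.hs itself; the libraries it links against carry debug information only if they were built with it too.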
When using native debug information we must keep in mind that all code linked into an executable (e.g. native libraries, Haskell libraries, and the code of the executable itself) must be built with debug information. Failure to ensure this will result in truncated backtraces.
To build a package with native debug information we can use the --enable-debug-info flag (or, below, its equivalent key in cabal.project). Here, we will use the vector testsuite as a non-trivial example:
    $ git clone https://github.com/haskell/vector
    $ cd vector
    $ cat >>cabal.project.local <<EOF
    allow-newer: base
    package vector
      tests: True
    package *
      debug-info: 2
    EOF
    $ cabal new-build vector-tests-O0
For the sake of demonstration we built the vector-tests-O0 testsuite (which builds vector’s tests without optimisation) since this provides slightly more interesting stacktraces. We chose debug level 2 as we will not be using the GHC-specific debug information emitted by debug level 3.
At this point we have a DWARF-annotated binary. This binary is functionally identical to a non-annotated build (apart from containing quite a few more bits, weighing in at over 150 megabytes). Most importantly, enabling debug information did not inhibit any optimisations.
In the next post we will begin to see what this extra 100 megabytes of debug information gives us.