Chromium Code Reviews

Side by Side Diff: docs/processor_design.md

Issue 2103273003: docs: clean up markdown
Base URL: https://chromium.googlesource.com/breakpad/breakpad.git@master
Patch Set: Created 4 years, 5 months ago
OLD | NEW
1 # Breakpad Processor Library 1 # Breakpad Processor Library
2 2
3 ## Objective 3 ## Objective
4 4
5 The Breakpad processor library is an open-source framework to access the the 5 The Breakpad processor library is an open-source framework to access the the
6 information contained within crash dumps for multiple platforms, and to use that 6 information contained within crash dumps for multiple platforms, and to use that
7 information to produce stack traces showing the call chain of each thread in a 7 information to produce stack traces showing the call chain of each thread in a
8 process. After processing, this data is made available to users of the library. 8 process. After processing, this data is made available to users of the library.
9 9
10 ## Background 10 ## Background
11 11
12 The Breakpad processor is intended to sit at the core of a comprehensive 12 The Breakpad processor is intended to sit at the core of a comprehensive
13 crash-reporting system that does not require debugging information to be 13 crash-reporting system that does not require debugging information to be
14 provided to those running applications being monitored. Some existing 14 provided to those running applications being monitored. Some existing
15 crash-reporting systems, such as [GNOME](http://www.gnome.org/)s Bug-Buddy and 15 crash-reporting systems, such as [GNOME](https://www.gnome.org/)'s Bug-Buddy and
16 [Apple](http://www.apple.com/)’s [CrashReporter] 16 [Apple](https://www.apple.com/)'s
17 (http://developer.apple.com/technotes/tn2004/tn2123.html), require symbolic 17 [CrashReporter](https://developer.apple.com/technotes/tn2004/tn2123.html),
18 information to be present on the end users computer; in the case of 18 require symbolic information to be present on the end user's computer; in the
19 CrashReporter, the reports are transmitted only to Apple, not to third-party 19 case of CrashReporter, the reports are transmitted only to Apple, not to third-party
20 developers. Other systems, such as [Microsoft](http://www.microsoft.com/)s 20 developers. Other systems, such as [Microsoft](https://www.microsoft.com/)'s
21 [Windows Error Reporting](http://msdn.microsoft.com/isv/resources/wer/) and 21 [Windows Error Reporting](https://msdn.microsoft.com/isv/resources/wer/) and
22 SupportSofts Talkback, transmit only a snapshot of a crashed process state, 22 SupportSoft's Talkback, transmit only a snapshot of a crashed process' state,
23 which can later be combined with symbolic debugging information without the need 23 which can later be combined with symbolic debugging information without the need
24 for it to be present on end users computers. Because symbolic debugging 24 for it to be present on end users' computers. Because symbolic debugging
25 information consumes a large amount of space and is otherwise not needed during 25 information consumes a large amount of space and is otherwise not needed during
26 the normal operation of software, and because some developers are reluctant to 26 the normal operation of software, and because some developers are reluctant to
27 release debugging symbols to their customers, Breakpad follows the latter 27 release debugging symbols to their customers, Breakpad follows the latter
28 approach. 28 approach.
29 29
30 We know of no currently-maintained crash-reporting systems that meet our 30 We know of no currently-maintained crash-reporting systems that meet our
31 requirements, which are to: * allow for symbols to be separate from the 31 requirements, which are to: * allow for symbols to be separate from the
32 application, * handle crash reports from multiple platforms, * allow developers 32 application, * handle crash reports from multiple platforms, * allow developers
33 to operate their own crash-reporting platform, and to * be open-source. Windows 33 to operate their own crash-reporting platform, and to * be open-source. Windows
34 Error Reporting only functions for Microsoft products, and requires the 34 Error Reporting only functions for Microsoft products, and requires the
35 involvement of Microsofts servers. Talkback, while cross-platform, has not been 35 involvement of Microsoft's servers. Talkback, while cross-platform, has not been
36 maintained and at this point does not support Mac OS X on x86, which we consider 36 maintained and at this point does not support Mac OS X on x86, which we consider
37 to be a significant platform. Talkback is also closed-source commercial 37 to be a significant platform. Talkback is also closed-source commercial
38 software, and has very specific requirements for its server platform. 38 software, and has very specific requirements for its server platform.
39 39
40 We are aware of Windows-only crash-reporting systems that leverage Microsofts 40 We are aware of Windows-only crash-reporting systems that leverage Microsoft's
41 debugging interfaces. Such systems, even if extended to support dumps from other 41 debugging interfaces. Such systems, even if extended to support dumps from other
42 platforms, are tied to using Windows for at least a portion of the processor 42 platforms, are tied to using Windows for at least a portion of the processor
43 platform. 43 platform.
44 44
45 ## Overview 45 ## Overview
46 46
47 The Breakpad processor itself is written in standard C++ and will work on a 47 The Breakpad processor itself is written in standard C++ and will work on a
48 variety of platforms. The dumps it accepts may also have been created on a 48 variety of platforms. The dumps it accepts may also have been created on a
49 variety of systems. The library is able to combine dumps with symbolic debugging 49 variety of systems. The library is able to combine dumps with symbolic debugging
50 information to create stack traces that include function signatures. The 50 information to create stack traces that include function signatures. The
51 processor library includes simple command-line tools to examine dumps and 51 processor library includes simple command-line tools to examine dumps and
52 process them, producing stack traces. It also exposes several layers of APIs 52 process them, producing stack traces. It also exposes several layers of APIs
53 enabling crash-reporting systems to be built around the Breakpad processor. 53 enabling crash-reporting systems to be built around the Breakpad processor.
54 54
55 ## Detailed Design 55 ## Detailed Design
56 56
57 ### Dump Files 57 ### Dump Files
58 58
59 In the processor, the dump data is of primary significance. Dumps typically 59 In the processor, the dump data is of primary significance. Dumps typically
60 contain: 60 contain:
61 61
62 * CPU context (register data) as it was at the time the crash occurred, and an 62 * CPU context (register data) as it was at the time the crash occurred, and an
63 indication of which thread caused the crash. General-purpose registers are 63 indication of which thread caused the crash. General-purpose registers are
64 included, as are special-purpose registers such as the instruction pointer 64 included, as are special-purpose registers such as the instruction pointer
65 (program counter). 65 (program counter).
66 * Information about each thread of execution within a crashed process, 66 * Information about each thread of execution within a crashed process,
67 including: 67 including:
68 * The memory region used for each threads stack. 68 * The memory region used for each thread's stack.
69 * CPU context for each thread, which for various reasons is not the same 69 * CPU context for each thread, which for various reasons is not the same
70 as the crash context in the case of the crashed thread. 70 as the crash context in the case of the crashed thread.
71 * A list of loaded code segments (or modules), including: 71 * A list of loaded code segments (or modules), including:
72 * The name of the file (`.so`, `.exe`, `.dll`, etc.) which provides the 72 * The name of the file (`.so`, `.exe`, `.dll`, etc.) which provides the
73 code. 73 code.
74 * The boundaries of the memory region in which the code segment is visible 74 * The boundaries of the memory region in which the code segment is visible
75 to the process. 75 to the process.
76 * A reference to the debugging information for the code module, when such 76 * A reference to the debugging information for the code module, when such
77 information is available. 77 information is available.
78 78
79 Ordinarily, dumps are produced as a result of a crash, but other triggers may be 79 Ordinarily, dumps are produced as a result of a crash, but other triggers may be
80 set to produce dumps at any time a developer deems appropriate. The Breakpad 80 set to produce dumps at any time a developer deems appropriate. The Breakpad
81 processor can handle dumps in the minidump format, either generated by an 81 processor can handle dumps in the minidump format, either generated by an
82 [Breakpad client handler](client_design.md) implementation, or by another 82 [Breakpad client "handler"](client_design.md) implementation, or by another
83 implementation that produces dumps in this format. The 83 implementation that produces dumps in this format. The
84 [DbgHelp.dll!MiniDumpWriteDump] 84 [DbgHelp.dll!MiniDumpWriteDump](https://msdn.microsoft.com/en-us/library/ms680360.aspx)
85 (http://msdn2.microsoft.com/en-us/library/ms680360.aspx) function on Windows 85 function on Windows produces dumps in this format, and is the basis for the
86 produces dumps in this format, and is the basis for the Breakpad handler 86 Breakpad handler implementation on that platform.
87 implementation on that platform.
88 87
89 The [minidump format] 88 The [minidump format](https://msdn.microsoft.com/en-us/library/ms679293%28VS.85%29.aspx)
90 (http://msdn.microsoft.com/en-us/library/ms679293%28VS.85%29.aspx) is 89 is essentially a simple container format, organized as a series of streams. Each
91 essentially a simple container format, organized as a series of streams. Each 90 stream contains some type of data relevant to the crash. A typical "normal"
92 stream contains some type of data relevant to the crash. A typical normal
93 minidump contains streams for the thread list, the module list, the CPU context 91 minidump contains streams for the thread list, the module list, the CPU context
94 at the time of the crash, and various bits of additional system information. 92 at the time of the crash, and various bits of additional system information.
95 Other types of minidump can be generated, such as a full-memory minidump, which 93 Other types of minidump can be generated, such as a full-memory minidump, which
96 in addition to stack memory contains snapshots of all of a process mapped 94 in addition to stack memory contains snapshots of all of a process' mapped
97 memory regions. 95 memory regions.
98 96
99 The minidump format was chosen as Breakpads dump format because it has an 97 The minidump format was chosen as Breakpad's dump format because it has an
100 established track record on Windows, and it can be adapted to meet the needs of 98 established track record on Windows, and it can be adapted to meet the needs of
101 the other platforms that Breakpad supports. Most other operating systems use 99 the other platforms that Breakpad supports. Most other operating systems use
102 core files as their native dump formats, but the capabilities of core files 100 "core" files as their native dump formats, but the capabilities of core files
103 vary across platforms, and because core files are usually presented in a 101 vary across platforms, and because core files are usually presented in a
104 platforms native executable format, there are complications involved in 102 platform's native executable format, there are complications involved in
105 accessing the data contained therein without the benefit of the header files 103 accessing the data contained therein without the benefit of the header files
106 that define an executable formats entire structure. Because minidumps are 104 that define an executable format's entire structure. Because minidumps are
107 leaner than a typical executable format, a redefinition of the format in a 105 leaner than a typical executable format, a redefinition of the format in a
108 cross-platform header file, `minidump_format.h`, was a straightforward task. 106 cross-platform header file, `minidump_format.h`, was a straightforward task.
109 Similarly, the capabilities of the minidump format are understood, and because 107 Similarly, the capabilities of the minidump format are understood, and because
110 it provides an extensible container, any of Breakpads needs that could not be 108 it provides an extensible container, any of Breakpad's needs that could not be
111 met directly by the standard minidump format could likely be met by extending it 109 met directly by the standard minidump format could likely be met by extending it
112 as needed. Finally, using this format means that the dump file is compatible 110 as needed. Finally, using this format means that the dump file is compatible
113 with native debugging tools at least on Windows. A possible future avenue for 111 with native debugging tools at least on Windows. A possible future avenue for
114 exploration is the conversion of minidumps to core files, to enable this same 112 exploration is the conversion of minidumps to core files, to enable this same
115 benefit on other platforms. 113 benefit on other platforms.
116 114
117 We have already provided an extension to the minidump format that allows it to 115 We have already provided an extension to the minidump format that allows it to
118 carry dumps generated on systems with PowerPC processors. The format already 116 carry dumps generated on systems with PowerPC processors. The format already
119 allows for variable CPUs, so our work in this area was limited to defining a 117 allows for variable CPUs, so our work in this area was limited to defining a
120 context structure sufficient to represent the execution state of a PowerPC. We 118 context structure sufficient to represent the execution state of a PowerPC. We
121 have also defined an extension that allows minidumps to indicate which thread of 119 have also defined an extension that allows minidumps to indicate which thread of
122 execution requested a dump be produced for non-crash dumps. 120 execution requested a dump be produced for non-crash dumps.
123 121
124 Often, the information contained within a dump alone is sufficient to produce a 122 Often, the information contained within a dump alone is sufficient to produce a
125 full stack backtrace for each thread. Certain optimizations that compilers 123 full stack backtrace for each thread. Certain optimizations that compilers
126 employ in producing code frustrate this process. Specifically, the frame 124 employ in producing code frustrate this process. Specifically, the "frame
127 pointer omission optimization of x86 compilers can make it impossible to 125 pointer omission" optimization of x86 compilers can make it impossible to
128 produce useful stack traces given only a stack snapshot and CPU context. In 126 produce useful stack traces given only a stack snapshot and CPU context. In
129 these cases, however, compiler-emitted debugging information can aid in 127 these cases, however, compiler-emitted debugging information can aid in
130 producing useful stack traces. The Breakpad processor is able to take advantage 128 producing useful stack traces. The Breakpad processor is able to take advantage
131 of this debugging information as supplied by Microsofts C/C++ compiler, the 129 of this debugging information as supplied by Microsoft's C/C++ compiler, the
132 only compiler to apply such optimizations by default. As a result, the Breakpad 130 only compiler to apply such optimizations by default. As a result, the Breakpad
133 processor can produce useful stack traces even from code with frame pointer 131 processor can produce useful stack traces even from code with frame pointer
134 omission optimizations as produced by this compiler. 132 omission optimizations as produced by this compiler.
135 133
136 ### Symbol Files 134 ### Symbol Files
137 135
138 The [symbol files](symbol_files.md) that the Breakpad processor accepts allow 136 The [symbol files](symbol_files.md) that the Breakpad processor accepts allow
139 for frame pointer omission data, but this is only one of their capabilities. 137 for frame pointer omission data, but this is only one of their capabilities.
140 Each symbol file also includes information about the functions, source files, 138 Each symbol file also includes information about the functions, source files,
141 and source code line numbers for a single module of code. A module is an 139 and source code line numbers for a single module of code. A module is an
142 individually-loadble chunk of code: these can be executables containing a main 140 individually-loadble chunk of code: these can be executables containing a main
143 program (`exe` files on Windows) or shared libraries (`.so` files on Linux, 141 program (`exe` files on Windows) or shared libraries (`.so` files on Linux,
144 `.dylib` files, frameworks, and bundles on Mac OS X, and `.dll` files on 142 `.dylib` files, frameworks, and bundles on Mac OS X, and `.dll` files on
145 Windows). Dumps contain information about which of these modules were loaded at 143 Windows). Dumps contain information about which of these modules were loaded at
146 the time the dump was produced, and given this information, the Breakpad 144 the time the dump was produced, and given this information, the Breakpad
147 processor attempts to locate debugging symbols for the module through a 145 processor attempts to locate debugging symbols for the module through a
148 user-supplied function embodied in a symbol supplier. Breakpad includes a 146 user-supplied function embodied in a "symbol supplier." Breakpad includes a
149 sample symbol supplier, called `SimpleSymbolSupplier`, that is used by its 147 sample symbol supplier, called `SimpleSymbolSupplier`, that is used by its
150 command-line tools; this supplier locates symbol files by pathname. 148 command-line tools; this supplier locates symbol files by pathname.
151 `SimpleSymbolSupplier` is also available to other users of the Breakpad 149 `SimpleSymbolSupplier` is also available to other users of the Breakpad
152 processor library. This allows for the use of a simple reference implementation, 150 processor library. This allows for the use of a simple reference implementation,
153 but preserves flexibility for users who may have more demanding symbol file 151 but preserves flexibility for users who may have more demanding symbol file
154 storage needs. 152 storage needs.
155 153
156 Breakpads symbol file format is text-based, and was defined to be fairly 154 Breakpad's symbol file format is text-based, and was defined to be fairly
157 human-readable and to encompass the needs of multiple platforms. The Breakpad 155 human-readable and to encompass the needs of multiple platforms. The Breakpad
158 processor itself does not operate directly with native symbol formats ([DWARF] 156 processor itself does not operate directly with native symbol formats
159 (http://dwarf.freestandards.org/) and [STABS] 157 ([DWARF](http://dwarf.freestandards.org/) and
160 (http://sourceware.org/gdb/current/onlinedocs/stabs.html) on most Unix-like 158 [STABS](https://sourceware.org/gdb/current/onlinedocs/stabs.html)
161 systems, [.pdb files] 159 on most Unix-like systems,
162 (http://msdn2.microsoft.com/en-us/library/yd4f8bd1(VS.80).aspx) on Windows), 160 [.pdb files](https://msdn.microsoft.com/en-us/library/yd4f8bd1(VS.80).aspx) on Windows),
163 because of the complications in accessing potentially complex symbol formats 161 because of the complications in accessing potentially complex symbol formats
164 with slight variations between platforms, stored within different types of 162 with slight variations between platforms, stored within different types of
165 binary formats. In the case of `.pdb` files, the debugging format is not even 163 binary formats. In the case of `.pdb` files, the debugging format is not even
166 documented. Instead, Breakpads symbol files are produced on each platform, 164 documented. Instead, Breakpad's symbol files are produced on each platform,
167 using specific debugging APIs where available, to convert native symbols to 165 using specific debugging APIs where available, to convert native symbols to
168 Breakpads cross-platform format. 166 Breakpad's cross-platform format.
169 167
170 ### Processing 168 ### Processing
171 169
172 Most commonly, a developer will enable an application to use Breakpad by 170 Most commonly, a developer will enable an application to use Breakpad by
173 building it with a platform-specific [client handler](client_design.md) 171 building it with a platform-specific [client "handler"](client_design.md)
174 library. After building the application, the developer will create symbol files 172 library. After building the application, the developer will create symbol files
175 for Breakpads use using the included `dump_syms` or `symupload` tools, or 173 for Breakpad's use using the included `dump_syms` or `symupload` tools, or
176 another suitable tool, and place the symbol files where the processors symbol 174 another suitable tool, and place the symbol files where the processor's symbol
177 supplier will be able to locate them. 175 supplier will be able to locate them.
178 176
179 When a dump file is given to the processors `MinidumpProcessor` class, it will 177 When a dump file is given to the processor's `MinidumpProcessor` class, it will
180 read it using its included minidump reader, contained in the `Minidump` family 178 read it using its included minidump reader, contained in the `Minidump` family
181 of classes. It will collect information about the operating system and CPU that 179 of classes. It will collect information about the operating system and CPU that
182 produced the dump, and determine whether the dump was produced as a result of a 180 produced the dump, and determine whether the dump was produced as a result of a
183 crash or at the direct request of the application itself. It then loops over all 181 crash or at the direct request of the application itself. It then loops over all
184 of the threads in a process, attempting to walk the stack associated with each 182 of the threads in a process, attempting to walk the stack associated with each
185 thread. This process is achieved by the processors `Stackwalker` components, of 183 thread. This process is achieved by the processor's `Stackwalker` components, of
186 which there are a slightly different implementations for each CPU type that the 184 which there are a slightly different implementations for each CPU type that the
187 processor is able to handle dumps from. Beginning with a threads context, and 185 processor is able to handle dumps from. Beginning with a thread's context, and
188 possibly using debugging data, the stackwalker produces a list of stack frames, 186 possibly using debugging data, the stackwalker produces a list of stack frames,
189 containing each instruction executed in the chain. These instructions are 187 containing each instruction executed in the chain. These instructions are
190 matched up with the modules that contributed them to a process, and the 188 matched up with the modules that contributed them to a process, and the
191 `SymbolSupplier` is invoked to locate a symbol file. The symbol file is given to 189 `SymbolSupplier` is invoked to locate a symbol file. The symbol file is given to
192 a `SourceLineResolver`, which matches the instruction up with a specific 190 a `SourceLineResolver`, which matches the instruction up with a specific
193 function name, source file, and line number, resulting in a representation of a 191 function name, source file, and line number, resulting in a representation of a
194 stack frame that can easily be used to identify which code was executing. 192 stack frame that can easily be used to identify which code was executing.
195 193
196 The results of processing are made available in a `ProcessState` object, which 194 The results of processing are made available in a `ProcessState` object, which
197 contains a vector of threads, each containing a vector of stack frames. 195 contains a vector of threads, each containing a vector of stack frames.
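
The processing flow described in this chunk (match each instruction to a module, look up a symbol, collect frames per thread) can be illustrated with a small self-contained sketch. Every type and function below (`Module`, `Frame`, `ThreadStack`, `ProcessResult`, `Resolver`, `Process`) is a hypothetical stand-in written for this illustration; none of them is Breakpad's real `MinidumpProcessor`, `SourceLineResolver`, or `ProcessState` API. The sketch also assumes that a function extends up to the start of the next one, which is a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not Breakpad's classes.
struct Module {
  std::string name;
  uint64_t base;  // first mapped address
  uint64_t end;   // one past the last mapped address
};

struct Frame {
  uint64_t instruction = 0;
  std::string module;    // empty if no loaded module covers the address
  std::string function;  // empty if no symbol was found
};

struct ThreadStack { std::vector<Frame> frames; };
struct ProcessResult { std::vector<ThreadStack> threads; };

// Per-module symbols: function start offset (relative to base) -> name.
using SymbolTable = std::map<uint64_t, std::string>;

class Resolver {
 public:
  void AddModule(const Module& m, SymbolTable symbols) {
    assert(m.base < m.end);  // a module must map a non-empty range
    modules_.push_back(m);
    symbols_[m.name] = std::move(symbols);
  }

  // Matches an instruction address to a module, then to a function.
  Frame Resolve(uint64_t instruction) const {
    Frame frame;
    frame.instruction = instruction;
    for (const Module& m : modules_) {
      if (instruction < m.base || instruction >= m.end) continue;
      frame.module = m.name;
      const SymbolTable& table = symbols_.at(m.name);
      // upper_bound yields the first function starting after the offset;
      // the entry just before it is assumed to contain the offset.
      auto it = table.upper_bound(instruction - m.base);
      if (it != table.begin()) frame.function = std::prev(it)->second;
      break;
    }
    return frame;
  }

 private:
  std::vector<Module> modules_;
  std::map<std::string, SymbolTable> symbols_;
};

// Takes one list of instruction addresses per thread (as a stack walker
// might produce) and returns a vector of threads, each holding frames.
ProcessResult Process(const Resolver& resolver,
                      const std::vector<std::vector<uint64_t>>& thread_pcs) {
  ProcessResult result;
  for (const auto& pcs : thread_pcs) {
    ThreadStack stack;
    for (uint64_t pc : pcs) stack.frames.push_back(resolver.Resolve(pc));
    result.threads.push_back(std::move(stack));
  }
  return result;
}
```

In this sketch a frame whose address falls outside every registered module keeps empty `module` and `function` fields, so a caller can still print the raw instruction address when no symbols are available.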
(...skipping 23 matching lines...)
221 219
222 The symbol file format can be extended to carry information about the locations 220 The symbol file format can be extended to carry information about the locations
223 of parameters and local variables as stored in stack frames and registers, and 221 of parameters and local variables as stored in stack frames and registers, and
224 the processor can use this information to provide enhanced stack traces showing 222 the processor can use this information to provide enhanced stack traces showing
225 function arguments and variable values. 223 function arguments and variable values.
226 224
227 On Mac OS X and Linux, we can provide tools to convert files from the minidump 225 On Mac OS X and Linux, we can provide tools to convert files from the minidump
228 format into the native core format. This will enable developers to open dump 226 format into the native core format. This will enable developers to open dump
229 files in a native debugger, just as they are presently able to do with minidumps 227 files in a native debugger, just as they are presently able to do with minidumps
230 on Windows. 228 on Windows.