Chromium Code Reviews| OLD | NEW |
|---|---|
| (Empty) | |
| 1 # A Crash Course in Debugging with about:net-internals | |
| 2 | |
| 3 This document is intended to help get people started debugging network errors | |
| 4 with about:net-internals, with some commonly useful tips and tricks. This | |
| 5 document is aimed more at how to get started using some of its features to | |
| 6 investigate bug reports, rather than as a feature overview. | |
| 7 | |
| 8 It would probably be useful to read "Life of a URLRequest" before this document. | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
Probably useful to include a link.
mmenke
2015/12/14 18:22:51
Done.
| |
| 9 | |
| 10 # What Data Net-Internals Contains | |
| 11 | |
| 12 about:net-internals provides a view of browser activity from net/'s perspective. | |
| 13 For this reason, it lacks knowledge of tabs, navigation, frames, resource types, | |
| 14 etc. | |
| 15 | |
| 16 The top level network stack object is the URLRequestContext. The Events View | |
| 17 has information for all Chrome URLRequestContexts that are hooked up to the | |
| 18 single, global, ChromeNetLog object. This includes both incognito and non- | |
| 19 incognito profiles, among other things. The Events View only shows events for | |
| 20 the period that net-internals was open and running, and is incrementally updated | |
| 21 as events occur. We do attempt to add a top level event for URLRequests that | |
| 22 were active when the tab was openned, to help debug hung requests, but that's | |
| 23 best-effort only, and only includes requests for the current profile. | |
| 24 | |
| 25 The other views are all snapshots of the current state of a particular | |
| 26 URLRequestContext's components, and are updated on a 5 second timer. These will | |
| 27 show objects that were created before about:net-internals was openned. Since | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
nit: opened.
mmenke
2015/12/14 18:22:51
Done.
| |
| 28 these tabs display information about a particular components of a particular | |
| 29 URLRequestContext, they only have information about the currently active | |
| 30 profile's main URLRequestContext. Most debugging is done with the Events tab | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
Suggestion: Collapse the distinction between "a pa
mmenke
2015/12/14 18:22:51
Done, removed second sentence.
| |
| 31 (Which will be all this document covers), but it's good to be aware of this | |
| 32 distinction. | |
| 33 | |
| 34 # Events vs Sources | |
| 35 | |
| 36 The Event View shows events logged by the NetLog. The NetLog model is that | |
| 37 long-lived network stack objects, called sources, emit events over their | |
| 38 lifetime. Some events have a beginning and end point (During which other | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
nit, possible style conflict: What I remember bein
mmenke
2015/12/14 18:22:51
You're right about the rule. I use a lot of paren
| |
| 39 subevents may occur), and some only occur at a single point in time. Generally | |
| 40 only one event can be occuring for a source at a time. If there can be multiple | |
| 41 events doing completely independent thing, we often introduce new sources to | |
| 42 represent the parallelism. | |
| 43 | |
| 44 "Sources" correspond to certain net objects, however, multiple layers of net/ | |
| 45 will often log to a single source. Here are the main source types (Excluding | |
| 46 HTTP2 [SPDY]/QUIC) and what they include: | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
nit, suggestion: move parenthetical to just before
mmenke
2015/12/14 18:22:51
Done.
| |
| 47 | |
| 48 * URL_REQUEST: This corresponds to the URLRequest object. It includes events | |
| 49 from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions, | |
| 50 HttpStreamFactoryImpl::Requests, HttpStream implementations, and | |
| 51 HttpStreamParsers used to service a response. If we follow HTTP redirects, it | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
nit, suggestion: Should we (:-}) be using "we" in
mmenke
2015/12/14 18:22:51
Done.
| |
| 52 includes each redirect. This is a lot of stuff, but generally only one of these | |
| 53 classes is doing work at a time. This event includes the full URL and generally | |
| 54 includes the request / response headers (Except when the cache handles the | |
| 55 response). | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
tangent: This is actually a frustration to me--I'v
mmenke
2015/12/14 18:22:51
Patches welcome! I agree that we should improve t
| |
| 56 | |
| 57 * HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
Why isn't this part of the URL_REQUEST source? It
mmenke
2015/12/14 18:22:51
Because we may create two of them (One for QUIC).
| |
| 58 one Request can have multiple Jobs). It also includes its proxy and DNS | |
| 59 lookups. One of the final event of these indicate how they created an | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
nit: events. I also find "of these" to be awkward
mmenke
2015/12/14 18:22:51
Done. Also fixed SpdySession / QuicSession to use
| |
| 60 HttpStream (Reusing an existing SOCKET / SpdySession / QuicSession, or creating | |
| 61 a new one). | |
| 62 | |
| 63 * CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket | |
| 64 pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a | |
| 65 lot by job type. Their main event is generally either to create a socket, or | |
| 66 request a socket from another socket pool (Which creates another CONNECT_JOB) | |
| 67 and then do some extra work on top of that - like establish an SSL connection on | |
| 68 top of a TCP connection. | |
| 69 | |
| 70 * SOCKET: These correspond to TCPSockets, but may also have other classes | |
| 71 layered on top of them (Like an SSLClientSocket). This is a bit different from | |
| 72 the other classes, where the name corresponds to the topmost class, instead of | |
| 73 the bottommost one. This is largely an artifact of the fact the socket is | |
| 74 created first, and then SSL (Or a proxy connection) is layered on top of it. | |
| 75 SOCKETs may be reused between multiple requests, and a request may end up | |
| 76 getting a socket created for another request. | |
| 77 | |
| 78 * HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The | |
| 79 include information about how long the lookup was queued, each DNS request that | |
| 80 was attempted (With the platform or built-in resolver) and all the other sources | |
| 81 that are waiting on the job. | |
| 82 | |
| 83 When one source depends on another, we generally log an event with | |
| 84 "source_dependency" value to both sources, which lets you jump between the two | |
| 85 related events. | |
| 86 | |
| 87 # Debugging | |
| 88 | |
| 89 When you receive a report from the user, the first thing you'll generally want | |
| 90 to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_* | |
| 91 code or the exact URL of the resource that won't load, you can just search for | |
| 92 it. If it's an upload, you can search for "post", or if it's a redirect issue, | |
| 93 you can search for "redirect". However, you often won't have much information | |
| 94 about the actual problem. There are two filters in net-internals that can help | |
| 95 in a lot of cases: | |
| 96 | |
| 97 * "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with | |
| 98 an error of some sort (red background). Cache errors are often non-fatal, so | |
| 99 you should generally ignore those, and look for a more interesting one. | |
| 100 | |
| 101 * "type:URL_REQUEST sort:duration" will show the lonest-lived requests (as of | |
| 102 when about:net-internals was opened) first. This is often useful in finding | |
| 103 hung or slow requests. | |
| 104 | |
| 105 For a list of other filter commands, you can mouse over the question mark on | |
| 106 about:net-internals. | |
| 107 | |
| 108 Once you locate the problematic request, the next is to figure out where the | |
| 109 problem is - it's often one of the last events, though it could also be related | |
| 110 to response or request headers. You can use "source_dependency" links to drill | |
| 111 down (or up) into other related sources. | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
This sorta suggests you can drill up from URL_REQU
mmenke
2015/12/14 18:22:51
Done.
| |
| 112 | |
| 113 You can use the name of an event to search for the code responsible for that | |
| 114 event, and try to deduce what went wrong before/after a particular event. | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:26
I've found this tricky because the default emacs g
mmenke
2015/12/14 18:22:51
I use codesearch and MSVC, neither of which does t
| |
| 115 | |
| 116 Some things to look for while debugging: | |
| 117 | |
| 118 * CANCELLED events almost always come from outside the network stack. | |
| 119 | |
| 120 * Changing networks and entering / exiting suspend mode can have all sorts of | |
| 121 fun and exciting effects on underway network activity. | |
|
Randy Smith (Not in Mondays)
2015/12/14 02:34:25
Suggestion: Include guidance on how you can tell f
mmenke
2015/12/14 18:22:51
Added comment.
| |
| 122 | |
| 123 * URL_REQUEST_DELEGATE / DELEGATE_INFO events mean we're blocked in a | |
| 124 URLRequest::Delegate or the NetworkDelegate, which are implemented outside the | |
| 125 network stack. We'll often be CANCELED here for reasons known only to the | |
| 126 delegate. Or the delegate may cause a hang. In general, what we want to do in | |
| 127 this situation is identify the object causing the problem, which may be a | |
| 128 ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or | |
| 129 ChromeNetworkDelegate. | |
| 130 | |
| 131 * Sockets are often reused between requests. If a request is on a stale | |
| 132 (reused) socket, what was the previous request that used the socket, how long | |
| 133 ago was it made? | |
| 134 | |
| 135 * SSL negotation is a process fraught with peril, particularly with broken | |
| 136 proxies. These will generally stall or fail in the SSL_CONNECT phase at the | |
| 137 SOCKET layer. | |
| 138 | |
| 139 * Range requests have magic to handle them at the cache layer, and are often | |
| 140 issued by the media and PDF code. | |
| 141 | |
| 142 * Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until | |
| 143 a CONNECT_JOB actually connects. This is so the highest priority pending job | |
| 144 gets the first available socket (Which may be a new socket, or an old one that's | |
| 145 freed up). For this reason, it can be a little tricky to relate hung | |
| 146 HTTP_STREAM_JOBs to CONNECT_JOBs. | |
| 147 | |
| 148 * Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If | |
| 149 all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are | |
| 150 stalled waiting on an available socket, we probably have 6 sockets to that group | |
| 151 that are hung - either hung connection, or used by stalled requests. | |
| 152 | |
| 153 * There's a limit on number of DNS resolutions that can be started at once. If | |
| 154 everything is stalled while resolving DNS addresses, you've probably hit this | |
| 155 limit, and the DNS lookups are also misbehaving in some fashion. | |
| 156 | |
| 157 # Miscellany | |
| 158 | |
| 159 These are just miscellaneous things you may notice when looking through the | |
| 160 logs. | |
| 161 | |
| 162 * URLRequests that look to start twice for no obvious reason. These are | |
| 163 typically main frame requests, and the first request is AppCache. Can just | |
| 164 ignore it and move on with your life. | |
| 165 | |
| 166 * Some HTTP requests are not handled by URLRequestHttpJobs. These include | |
| 167 things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker, | |
| 168 etc. These generally don't log as much information, so it can be tricky to | |
| 169 figure out what's going on with these. | |
| 170 | |
| 171 * Non-HTTP requests also appear in the log, and also generally don't log much | |
| 172 (blob URLs, chrome URLs, etc). | |
| 173 | |
| 174 * Preconnects create a "HTTP_STREAM_JOB" event that may create multiple | |
| 175 CONNECT_JOBs (or none) and is then destroyed. These can be identified by the | |
| 176 "SOCKET_POOL_CONNECTING_N_SOCKETS" events. | |
| OLD | NEW |