Chromium Code Reviews| OLD | NEW |
|---|---|
| 1 # A Crash Course in Debugging with about:net-internals | 1 # A Crash Course in Debugging with about:net-internals |
| 2 | 2 |
| 3 This document is intended to help get people started debugging network errors | 3 This document is intended to help get people started debugging network errors |
| 4 with about:net-internals, with some commonly useful tips and tricks. This | 4 with about:net-internals, with some commonly useful tips and tricks. This |
| 5 document is aimed more at how to get started using some of its features to | 5 document is aimed more at how to get started using some of its features to |
| 6 investigate bug reports, rather than as a feature overview. | 6 investigate bug reports, rather than as a feature overview. |
| 7 | 7 |
| 8 It would probably be useful to read | 8 It would probably be useful to read |
| 9 [life-of-a-url-request.md](life-of-a-url-request.md) before this document. | 9 [life-of-a-url-request.md](life-of-a-url-request.md) before this document. |
| 10 | 10 |
| 11 # What Data Net-Internals Contains | 11 # What Data Net-Internals Contains |
| 12 | 12 |
| 13 about:net-internals provides a view of browser activity from net/'s perspective. | 13 about:net-internals provides a view of browser activity from net/'s perspective. |
| 14 For this reason, it lacks knowledge of tabs, navigation, frames, resource types, | 14 For this reason, it lacks knowledge of tabs, navigation, frames, resource types, |
| 15 etc. | 15 etc. |
| 16 | 16 |
| 17 The top level network stack object is the URLRequestContext. The Events View | 17 The top level network stack object is the URLRequestContext. The Events View |
| 18 has information for all Chrome URLRequestContexts that are hooked up to the | 18 has information for all Chrome URLRequestContexts that are hooked up to the |
| 19 single, global, ChromeNetLog object. This includes both incognito and non- | 19 single, global, ChromeNetLog object. This includes both incognito and |
| 20 incognito profiles, among other things. The Events view only shows events for | 20 non-incognito profiles, among other things. The Events view only shows events |
| 21 the period that net-internals was open and running, and is incrementally updated | 21 for the period that net-internals was open and running, and is incrementally |
| 22 as events occur. The code attempts to add a top level event for URLRequests | 22 updated as events occur. The code attempts to add a top level event for |
| 23 that were active when the tab was opened, to help debug hung requests, but | 23 URLRequests that were active when the about:net-internals tab was opened, to |
| 24 that's best-effort only, and only includes requests for the current profile and | 24 help debug hung requests, but that's best-effort only, and only includes |
| 25 the system URLRequestContext. | 25 requests for the current profile and the system URLRequestContext. |
| 26 | 26 |
| 27 The other views are all snapshots of the current state of the main | 27 The other views are all snapshots of the current state of the main |
| 28 URLRequestContext's components, and are updated on a 5 second timer. These will | 28 URLRequestContext's components, and are updated on a 5 second timer. These will |
| 29 show objects that were created before about:net-internals was opened. Most | 29 show objects that were created before about:net-internals was opened. Most |
| 30 debugging is done with the Events view (which will be all this document | 30 debugging is done with the Events view (which will be all this document covers), |
| 31 covers), but it's good to be aware of this distinction. | 31 but it's good to be aware of this distinction. |
| 32 | 32 |
| 33 # Events vs Sources | 33 # Events vs Sources |
| 34 | 34 |
| 35 The Event View shows events logged by the NetLog. The NetLog model is that | 35 The Event View shows events logged by the NetLog. The NetLog model is that |
| 36 long-lived network stack objects, called sources, emit events over their | 36 long-lived network stack objects, called sources, emit events over their |
| 37 lifetime. When looking at the code, a "NetLogWithSource" object contains a sour ce | 37 lifetime. When looking at the code, a NetLogWithSource object contains a source |
| 38 ID, and a pointer to the NetLog the source emits events to. Some events have a | 38 ID, a NetLogSourceType, and a pointer to the NetLog the source emits events to. |
| 39 beginning and end point (during which other subevents may occur), and some only | 39 |
| 40 occur at a single point in time. Generally only one event can be occuring for a | 40 Sources show up in the left column of the Event View. Sources that include an |
| 41 source at a time. If there can be multiple events doing completely independent | 41 event with a net_error parameter with negative value (that is, some kind of |
| 42 thing, the code often uses new sources to represent the parallelism. | 42 ERR_) are shown with red background. Sources whose opening event has not ended |
| 43 yet are shown with white background. Other events have green background. The | |
| 44 search queries corresponding to the first two kinds are `is:error` and | |
| 45 `is:active`. The legend for these and other search queries appears if one | |
| 46 hovers over the (?) next to the filter box on top of the event list. | |
| 47 | |
| 48 Events corresponding to selected sources show up in the right column, organized | |
| 49 by source, and by time within each source. There are two time values: t is | |
| 50 measured from some reference point common to all sources, and st is measured | |
| 51 from the first event for each source. Time is displayed in milliseconds. | |
| 52 | |
| 53 Since the network stack is asynchronous, events from different sources will | |
| 54 often be interlaced in time, but Event View does not feature showing events from | |
| 55 different sources ordered by time. Large time gaps in the event list of a | |
| 56 single source do not necessarily mean that all that time is spend on that | |
| 57 source. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:42
Suggestion (if you agree): I might make a stronger
Bence
2016/11/16 14:12:51
Done.
| |
| 58 | |
| 59 Some events come in pair: a beginning and end event, between which other events | |
| 60 may occur. They are shown with + and - prefixes, respectively. The begin event | |
| 61 has a dt value which shows the duration, the time difference between the begin | |
| 62 and end. If there are no other events in between the begin and end, and the end | |
| 63 event has no parameters, then they are collapsed in a single line. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:42
Is the dt not present if the end event hasn't yet
Bence
2016/11/16 14:12:51
I am not sure there is any added value in spelling
Randy Smith (Not in Mondays)
2016/11/16 14:17:25
Certainly a fair point (whether or not the extra d
Bence
2016/11/16 15:04:33
I agree. Not mentioning what happens in live view
| |
| 64 | |
| 65 Some other events only occur at a single point in time. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:41
Suggestion: "... and will not have a sign prefix."
Bence
2016/11/16 14:12:51
Done.
| |
| 66 | |
| 67 Generally only one event can be occuring for a source at a time. If there can | |
| 68 be multiple events doing completely independent things, the code often uses new | |
| 69 sources to represent the parallelism. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:41
FYI (not actionable): I've repeatedly wished for (
Bence
2016/11/16 14:12:51
Acknowledged.
| |
| 70 | |
| 71 Most, but not all events correspond to a source. Exceptions are global events, | |
| 72 which have no source, and show up as individual entries in the source list. | |
| 73 Examples include NETWORK_CHANGED, DNS_CONFIG_CHANGED, and PROXY_CONFIG_CHANGED. | |
| 74 | |
| 75 # Common source types | |
| 43 | 76 |
| 44 "Sources" correspond to certain net objects, however, multiple layers of net/ | 77 "Sources" correspond to certain net objects, however, multiple layers of net/ |
| 45 will often log to a single source. Here are the main source types and what they | 78 will often log to a single source. Here are the main source types and what they |
| 46 include (Excluding HTTP2 [SPDY]/QUIC): | 79 include (excluding HTTP2 [SPDY]/QUIC): |
| 47 | 80 |
| 48 * URL_REQUEST: This corresponds to the URLRequest object. It includes events | 81 * URL_REQUEST: This corresponds to the URLRequest object. It includes events |
| 49 from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions, | 82 from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions, |
| 50 HttpStreamFactoryImpl::Requests, HttpStream implementations, and | 83 HttpStreamFactoryImpl::Requests, HttpStream implementations, and |
| 51 HttpStreamParsers used to service a response. If the URL_REQUEST follows HTTP | 84 HttpStreamParsers used to service a response. If the URL_REQUEST follows HTTP |
| 52 redirects, it will include each redirect. This is a lot of stuff, but generally | 85 redirects, it will include each redirect. This is a lot of stuff, but generally |
| 53 only object is doing work at a time. This event source includes the full URL | 86 only one object is doing work at a time. This event source includes the full |
| 54 and generally includes the request / response headers (Except when the cache | 87 URL and generally includes the request / response headers (except when the cache |
| 55 handles the response). | 88 handles the response). |
| 56 | 89 |
| 57 * HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that | 90 * HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (note that |
| 58 one Request can have multiple Jobs). It also includes its proxy and DNS | 91 one Request can have multiple Jobs). It also includes its proxy and DNS |
| 59 lookups. HTTP_STREAM_JOB log events are separate from URL_REQUEST because | 92 lookups. HTTP_STREAM_JOB log events are separate from URL_REQUEST because two |
| 60 two stream jobs may be created and races against each other, in some cases - | 93 stream jobs may be created and races against each other, in some cases -- one |
| 61 one for one for QUIC, and one for HTTP. One of the final events of this source | 94 for one for QUIC, and one for HTTP. One of the final events of this source, |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:41
duplicate "one for"
Bence
2016/11/16 14:12:51
Done.
| |
| 62 indicates how an HttpStream was created (Reusing an existing SOCKET / | 95 before the HTTP_STREAM_JOB_BOUND_TO_REQUEST event, indicates how an HttpStream |
| 63 HTTP2_SESSION / QUIC_SESSION, or creating a new one). | 96 was created: A SOCKET_POOL_BOUND_TO_CONNECT_JOB event means that a new TCP |
| 97 socket was created, whereas a SOCKET_POOL_REUSED_AN_EXISTING_SOCKET event | |
| 98 indicates that an existing TCP socket was reused for a non-HTTP/2 request. An | |
| 99 HTTP2_SESSION_POOL_IMPORTED_SESSION_FROM_SOCKET event indicates that a new | |
| 100 HTTP/2 session was opened by this Job. A | |
| 101 HTTP2_SESSION_POOL_FOUND_EXISTING_SESSION event indicates that the request was | |
| 102 served on a preexisting HTTP/2 session. A | |
| 103 HTTP2_SESSION_POOL_FOUND_EXISTING_SESSION_FROM_IP_POOL event means that the | |
| 104 request was pooled to a preexisting HTTP/2 session which had a different | |
| 105 SpdySessionKey, but DNS resolution resulted in the same IP, and the certificate | |
| 106 matches. There are currently no events logged for opening new QUIC sessions or | |
| 107 reusing existing ones. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:41
I'd suggest reformatting this paragraph as a bulle
Bence
2016/11/16 14:12:51
Done.
| |
| 64 | 108 |
| 65 * CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket | 109 * CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket |
| 66 pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a | 110 pool uses. A successful CONNECT_JOB returns a SOCKET. The events here vary a |
| 67 lot by job type. Their main event is generally either to create a socket, or | 111 lot by job type. Their main event is generally either to create a socket, or |
| 68 request a socket from another socket pool (Which creates another CONNECT_JOB) | 112 request a socket from another socket pool (which creates another CONNECT_JOB) |
| 69 and then do some extra work on top of that - like establish an SSL connection on | 113 and then do some extra work on top of that -- like establish an SSL connection o n |
| 70 top of a TCP connection. | 114 top of a TCP connection. |
| 71 | 115 |
| 72 * SOCKET: These correspond to TCPSockets, but may also have other classes | 116 * SOCKET: These correspond to TCPSockets, but may also have other classes |
| 73 layered on top of them (Like an SSLClientSocket). This is a bit different from | 117 layered on top of them (like an SSLClientSocket). This is a bit different from |
| 74 the other classes, where the name corresponds to the topmost class, instead of | 118 the other classes, where the name corresponds to the topmost class, instead of |
| 75 the bottommost one. This is largely an artifact of the fact the socket is | 119 the bottommost one. This is largely an artifact of the fact the socket is |
| 76 created first, and then SSL (Or a proxy connection) is layered on top of it. | 120 created first, and then SSL (or a proxy connection) is layered on top of it. |
| 77 SOCKETs may be reused between multiple requests, and a request may end up | 121 SOCKETs may be reused between multiple requests, and a request may end up |
| 78 getting a socket created for another request. | 122 getting a socket created for another request. |
| 79 | 123 |
| 80 * HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The | 124 * HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The |
| 81 include information about how long the lookup was queued, each DNS request that | 125 include information about how long the lookup was queued, each DNS request that |
| 82 was attempted (With the platform or built-in resolver) and all the other sources | 126 was attempted (with the platform or built-in resolver) and all the other sources |
| 83 that are waiting on the job. | 127 that are waiting on the job. |
| 84 | 128 |
| 85 When one source depends on another, the code generally logs an event with | 129 When one source depends on another, the code generally logs an event with |
| 86 "source_dependency" value to both sources, which lets you jump between the two | 130 `source_dependency` value to both sources, which lets you jump between the two |
| 87 related events. | 131 related events. |
| 88 | 132 |
| 89 # Debugging | 133 # Debugging |
| 90 | 134 |
| 91 When you receive a report from the user, the first thing you'll generally want | 135 When you receive a report from the user, the first thing you'll generally want |
| 92 to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_* | 136 to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_* |
| 93 code or the exact URL of the resource that won't load, you can just search for | 137 code or the exact URL of the resource that won't load, you can just search for |
| 94 it. If it's an upload, you can search for "post", or if it's a redirect issue, | 138 it. If it's an upload, you can search for "post", or if it's a redirect issue, |
| 95 you can search for "redirect". However, you often won't have much information | 139 you can search for "redirect". However, you often won't have much information |
| 96 about the actual problem. There are two filters in net-internals that can help | 140 about the actual problem. There are two filters in net-internals that can help |
| 97 in a lot of cases: | 141 in a lot of cases: |
| 98 | 142 |
| 99 * "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with | 143 * "type:URL_REQUEST is:error" will restrict the source list to URL_REQUEST |
| 100 an error of some sort (red background). Cache errors are often non-fatal, so | 144 objects with an error of some sort. Cache errors are often non-fatal, so you |
| 101 you should generally ignore those, and look for a more interesting one. | 145 should generally ignore those, and look for a more interesting one. |
| 102 | 146 |
| 103 * "type:URL_REQUEST sort:duration" will show the longest-lived requests first. | 147 * "type:URL_REQUEST sort:duration" will show the longest-lived requests first. |
| 104 This is often useful in finding hung or slow requests. | 148 This is often useful in finding hung or slow requests. |
| 105 | 149 |
| 106 For a list of other filter commands, you can mouse over the question mark on | |
| 107 about:net-internals. | |
|
Randy Smith (Not in Mondays)
2016/11/15 23:18:41
FWIW, I find this a more natural place to mention
Bence
2016/11/16 14:12:51
Done.
| |
| 108 | |
| 109 Once you locate the problematic request, the next is to figure out where the | 150 Once you locate the problematic request, the next is to figure out where the |
| 110 problem is - it's often one of the last events, though it could also be related | 151 problem is -- it's often one of the last events, though it could also be related |
| 111 to response or request headers. You can use "source_dependency" links to drill | 152 to response or request headers. You can use "source_dependency" links to drill |
| 112 down into other related sources, or up from layers below URL_REQUEST. | 153 down into other related sources, or up from layers below URL_REQUEST. |
| 113 | 154 |
| 114 You can use the name of an event to search for the code responsible for that | 155 You can use the name of an event to search for the code responsible for that |
| 115 event, and try to deduce what went wrong before/after a particular event. Note | 156 event, and try to deduce what went wrong before/after a particular event. Note |
| 116 that the event names used in net-internals are not the entire string names, so | 157 that the event names used in net-internals are not the entire string names, so |
| 117 you should not do an entire string match. | 158 you should not do an entire string match. |
| 118 | 159 |
| 119 Some things to look for while debugging: | 160 Some things to look for while debugging: |
| 120 | 161 |
| 121 * CANCELLED events almost always come from outside the network stack. | 162 * CANCELLED events almost always come from outside the network stack. |
| 122 | 163 |
| 123 * Changing networks and entering / exiting suspend mode can have all sorts of | 164 * Changing networks and entering / exiting suspend mode can have all sorts of |
| 124 fun and exciting effects on underway network activity. Network changes log a | 165 fun and exciting effects on underway network activity. Network changes log a |
| 125 top level NETWORK_CHANGED event with no source - the event itself is treated as | 166 top level NETWORK_CHANGED event. Suspend events are currently not logged. |
| 126 its own source. Suspend events are currently not logged. | |
| 127 | 167 |
| 128 * URL_REQUEST_DELEGATE / DELEGATE_INFO events mean a URL_REQUEST is blocked on a | 168 * URL_REQUEST_DELEGATE / DELEGATE_INFO events mean a URL_REQUEST is blocked on a |
| 129 URLRequest::Delegate or the NetworkDelegate, which are implemented outside the | 169 URLRequest::Delegate or the NetworkDelegate, which are implemented outside the |
| 130 network stack. A request will sometimes be CANCELED here for reasons known only | 170 network stack. A request will sometimes be CANCELED here for reasons known only |
| 131 to the delegate. Or the delegate may cause a hang. In general, to debug issues | 171 to the delegate. Or the delegate may cause a hang. In general, to debug issues |
| 132 related to delegates, one needs to figure out which method of which object is | 172 related to delegates, one needs to figure out which method of which object is |
| 133 causing the problem. The object may be the a NetworkDelegate, a | 173 causing the problem. The object may be the a NetworkDelegate, a |
| 134 ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or the | 174 ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or the |
| 135 ResourceDispatcherHost. | 175 ResourceDispatcherHost. |
| 136 | 176 |
| 137 * Sockets are often reused between requests. If a request is on a stale | 177 * Sockets are often reused between requests. If a request is on a stale |
| 138 (reused) socket, what was the previous request that used the socket, how long | 178 (reused) socket, what was the previous request that used the socket, how long |
| 139 ago was it made? | 179 ago was it made? |
| 140 | 180 |
| 141 * SSL negotation is a process fraught with peril, particularly with broken | 181 * SSL negotation is a process fraught with peril, particularly with broken |
| 142 proxies. These will generally stall or fail in the SSL_CONNECT phase at the | 182 proxies. These will generally stall or fail in the SSL_CONNECT phase at the |
| 143 SOCKET layer. | 183 SOCKET layer. |
| 144 | 184 |
| 145 * Range requests have magic to handle them at the cache layer, and are often | 185 * Range requests have magic to handle them at the cache layer, and are often |
| 146 issued by the media and PDF code. | 186 issued by the media and PDF code. |
| 147 | 187 |
| 148 * Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until | 188 * Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until |
| 149 a CONNECT_JOB actually connects. This is so the highest priority pending job | 189 a CONNECT_JOB actually connects. This is so the highest priority pending job |
| 150 gets the first available socket (Which may be a new socket, or an old one that's | 190 gets the first available socket (which may be a new socket, or an old one that's |
| 151 freed up). For this reason, it can be a little tricky to relate hung | 191 freed up). For this reason, it can be a little tricky to relate hung |
| 152 HTTP_STREAM_JOBs to CONNECT_JOBs. | 192 HTTP_STREAM_JOBs to CONNECT_JOBs. |
| 153 | 193 |
| 154 * Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If | 194 * Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If |
| 155 all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are | 195 all CONNECT_JOBs belonging to a group (the CONNECT_JOB's description field) are |
| 156 stalled waiting on an available socket, the group probably has 6 sockets that | 196 stalled waiting on an available socket, the group probably has 6 sockets that |
| 157 that are hung - either hung trying to connect, or used by stalled requests and | 197 that are hung -- either hung trying to connect, or used by stalled requests and |
| 158 thus outside the socket pool's control. | 198 thus outside the socket pool's control. |
| 159 | 199 |
| 160 * There's a limit on number of DNS resolutions that can be started at once. If | 200 * There's a limit on number of DNS resolutions that can be started at once. If |
| 161 everything is stalled while resolving DNS addresses, you've probably hit this | 201 everything is stalled while resolving DNS addresses, you've probably hit this |
| 162 limit, and the DNS lookups are also misbehaving in some fashion. | 202 limit, and the DNS lookups are also misbehaving in some fashion. |
| 163 | 203 |
| 164 # Miscellany | 204 # Miscellany |
| 165 | 205 |
| 166 These are just miscellaneous things you may notice when looking through the | 206 These are just miscellaneous things you may notice when looking through the |
| 167 logs. | 207 logs. |
| 168 | 208 |
| 169 * URLRequests that look to start twice for no obvious reason. These are | 209 * URLRequests that look to start twice for no obvious reason. These are |
| 170 typically main frame requests, and the first request is AppCache. Can just | 210 typically main frame requests, and the first request is AppCache. Can just |
| 171 ignore it and move on with your life. | 211 ignore it and move on with your life. |
| 172 | 212 |
| 173 * Some HTTP requests are not handled by URLRequestHttpJobs. These include | 213 * Some HTTP requests are not handled by URLRequestHttpJobs. These include |
| 174 things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker, | 214 things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker, |
| 175 etc. These generally don't log as much information, so it can be tricky to | 215 etc. These generally don't log as much information, so it can be tricky to |
| 176 figure out what's going on with these. | 216 figure out what's going on with these. |
| 177 | 217 |
| 178 * Non-HTTP requests also appear in the log, and also generally don't log much | 218 * Non-HTTP requests also appear in the log, and also generally don't log much |
| 179 (blob URLs, chrome URLs, etc). | 219 (blob URLs, chrome URLs, etc). |
| 180 | 220 |
| 181 * Preconnects create a "HTTP_STREAM_JOB" event that may create multiple | 221 * Preconnects create a "HTTP_STREAM_JOB" event that may create multiple |
| 182 CONNECT_JOBs (or none) and is then destroyed. These can be identified by the | 222 CONNECT_JOBs (or none) and is then destroyed. These can be identified by the |
| 183 "SOCKET_POOL_CONNECTING_N_SOCKETS" events. | 223 "SOCKET_POOL_CONNECTING_N_SOCKETS" events. |
| OLD | NEW |