Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(53)

Side by Side Diff: net/docs/crash-course-in-net-internals.md

Issue 1515183003: Add a crash course to debugging using about:net-internals. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Created 5 years ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
« no previous file with comments | « no previous file | net/net.gypi » ('j') | no next file with comments »
Toggle Intra-line Diffs ('i') | Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
OLDNEW
(Empty)
1 # A Crash Course in Debugging with about:net-internals
2
3 This document is intended to help get people started debugging network errors
4 with about:net-internals, with some commonly useful tips and tricks. This
5 document is aimed more at how to get started using some of its features to
6 investigate bug reports, rather than as a feature overview.
7
8 It would probably be useful to read "Life of a URLRequest" before this document.
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Probably useful to include a link.
mmenke 2015/12/14 18:22:51 Done.
9
10 # What Data Net-Internals Contains
11
12 about:net-internals provides a view of browser activity from net/'s perspective.
13 For this reason, it lacks knowledge of tabs, navigation, frames, resource types,
14 etc.
15
16 The top level network stack object is the URLRequestContext. The Events View
17 has information for all Chrome URLRequestContexts that are hooked up to the
18 single, global, ChromeNetLog object. This includes both incognito and non-
19 incognito profiles, among other things. The Events View only shows events for
20 the period that net-internals was open and running, and is incrementally updated
21 as events occur. We do attempt to add a top level event for URLRequests that
22 were active when the tab was openned, to help debug hung requests, but that's
23 best-effort only, and only includes requests for the current profile.
24
25 The other views are all snapshots of the current state of a particular
26 URLRequestContext's components, and are updated on a 5 second timer. These will
27 show objects that were created before about:net-internals was openned. Since
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit: opened.
mmenke 2015/12/14 18:22:51 Done.
28 these tabs display information about a particular components of a particular
29 URLRequestContext, they only have information about the currently active
30 profile's main URLRequestContext. Most debugging is done with the Events tab
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Suggestion: Collapse the distinction between "a pa
mmenke 2015/12/14 18:22:51 Done, removed second sentence.
31 (Which will be all this document covers), but it's good to be aware of this
32 distinction.
33
34 # Events vs Sources
35
36 The Event View shows events logged by the NetLog. The NetLog model is that
37 long-lived network stack objects, called sources, emit events over their
38 lifetime. Some events have a beginning and end point (During which other
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 nit, possible style conflict: What I remember bein
mmenke 2015/12/14 18:22:51 You're right about the rule. I use a lot of paren
39 subevents may occur), and some only occur at a single point in time. Generally
40 only one event can be occuring for a source at a time. If there can be multiple
41 events doing completely independent thing, we often introduce new sources to
42 represent the parallelism.
43
44 "Sources" correspond to certain net objects, however, multiple layers of net/
45 will often log to a single source. Here are the main source types (Excluding
46 HTTP2 [SPDY]/QUIC) and what they include:
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit, suggestion: move parenthetical to just before
mmenke 2015/12/14 18:22:51 Done.
47
48 * URL_REQUEST: This corresponds to the URLRequest object. It includes events
49 from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions,
50 HttpStreamFactoryImpl::Requests, HttpStream implementations, and
51 HttpStreamParsers used to service a response. If we follow HTTP redirects, it
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 nit, suggestion: Should we (:-}) be using "we" in
mmenke 2015/12/14 18:22:51 Done.
52 includes each redirect. This is a lot of stuff, but generally only one of these
53 classes is doing work at a time. This event includes the full URL and generally
54 includes the request / response headers (Except when the cache handles the
55 response).
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 tangent: This is actually a frustration to me--I'v
mmenke 2015/12/14 18:22:51 Patches welcome! I agree that we should improve t
56
57 * HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Why isn't this part of the URL_REQUEST source? It
mmenke 2015/12/14 18:22:51 Because we may create two of them (One for QUIC).
58 one Request can have multiple Jobs). It also includes its proxy and DNS
59 lookups. One of the final event of these indicate how they created an
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit: events. I also find "of these" to be awkward
mmenke 2015/12/14 18:22:51 Done. Also fixed SpdySession / QuicSession to use
60 HttpStream (Reusing an existing SOCKET / SpdySession / QuicSession, or creating
61 a new one).
62
63 * CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket
64 pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a
65 lot by job type. Their main event is generally either to create a socket, or
66 request a socket from another socket pool (Which creates another CONNECT_JOB)
67 and then do some extra work on top of that - like establish an SSL connection on
68 top of a TCP connection.
69
70 * SOCKET: These correspond to TCPSockets, but may also have other classes
71 layered on top of them (Like an SSLClientSocket). This is a bit different from
72 the other classes, where the name corresponds to the topmost class, instead of
73 the bottommost one. This is largely an artifact of the fact the socket is
74 created first, and then SSL (Or a proxy connection) is layered on top of it.
75 SOCKETs may be reused between multiple requests, and a request may end up
76 getting a socket created for another request.
77
78 * HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The
79 include information about how long the lookup was queued, each DNS request that
80 was attempted (With the platform or built-in resolver) and all the other sources
81 that are waiting on the job.
82
83 When one source depends on another, we generally log an event with
84 "source_dependency" value to both sources, which lets you jump between the two
85 related events.
86
87 # Debugging
88
89 When you receive a report from the user, the first thing you'll generally want
90 to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_*
91 code or the exact URL of the resource that won't load, you can just search for
92 it. If it's an upload, you can search for "post", or if it's a redirect issue,
93 you can search for "redirect". However, you often won't have much information
94 about the actual problem. There are two filters in net-internals that can help
95 in a lot of cases:
96
97 * "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with
98 an error of some sort (red background). Cache errors are often non-fatal, so
99 you should generally ignore those, and look for a more interesting one.
100
101 * "type:URL_REQUEST sort:duration" will show the lonest-lived requests (as of
102 when about:net-internals was opened) first. This is often useful in finding
103 hung or slow requests.
104
105 For a list of other filter commands, you can mouse over the question mark on
106 about:net-internals.
107
108 Once you locate the problematic request, the next is to figure out where the
109 problem is - it's often one of the last events, though it could also be related
110 to response or request headers. You can use "source_dependency" links to drill
111 down (or up) into other related sources.
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 This sorta suggests you can drill up from URL_REQU
mmenke 2015/12/14 18:22:51 Done.
112
113 You can use the name of an event to search for the code responsible for that
114 event, and try to deduce what went wrong before/after a particular event.
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 I've found this tricky because the default emacs g
mmenke 2015/12/14 18:22:51 I use codesearch and MSVC, neither of which does t
115
116 Some things to look for while debugging:
117
118 * CANCELLED events almost always come from outside the network stack.
119
120 * Changing networks and entering / exiting suspend mode can have all sorts of
121 fun and exciting effects on underway network activity.
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Suggestion: Include guidance on how you can tell f
mmenke 2015/12/14 18:22:51 Added comment.
122
123 * URL_REQUEST_DELEGATE / DELEGATE_INFO events mean we're blocked in a
124 URLRequest::Delegate or the NetworkDelegate, which are implemented outside the
125 network stack. We'll often be CANCELED here for reasons known only to the
126 delegate. Or the delegate may cause a hang. In general, what we want to do in
127 this situation is identify the object causing the problem, which may be a
128 ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or
129 ChromeNetworkDelegate.
130
131 * Sockets are often reused between requests. If a request is on a stale
132 (reused) socket, what was the previous request that used the socket, how long
133 ago was it made?
134
135 * SSL negotation is a process fraught with peril, particularly with broken
136 proxies. These will generally stall or fail in the SSL_CONNECT phase at the
137 SOCKET layer.
138
139 * Range requests have magic to handle them at the cache layer, and are often
140 issued by the media and PDF code.
141
142 * Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until
143 a CONNECT_JOB actually connects. This is so the highest priority pending job
144 gets the first available socket (Which may be a new socket, or an old one that's
145 freed up). For this reason, it can be a little tricky to relate hung
146 HTTP_STREAM_JOBs to CONNECT_JOBs.
147
148 * Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If
149 all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are
150 stalled waiting on an available socket, we probably have 6 sockets to that group
151 that are hung - either hung connection, or used by stalled requests.
152
153 * There's a limit on number of DNS resolutions that can be started at once. If
154 everything is stalled while resolving DNS addresses, you've probably hit this
155 limit, and the DNS lookups are also misbehaving in some fashion.
156
157 # Miscellany
158
159 These are just miscellaneous things you may notice when looking through the
160 logs.
161
162 * URLRequests that look to start twice for no obvious reason. These are
163 typically main frame requests, and the first request is AppCache. Can just
164 ignore it and move on with your life.
165
166 * Some HTTP requests are not handled by URLRequestHttpJobs. These include
167 things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker,
168 etc. These generally don't log as much information, so it can be tricky to
169 figure out what's going on with these.
170
171 * Non-HTTP requests also appear in the log, and also generally don't log much
172 (blob URLs, chrome URLs, etc).
173
174 * Preconnects create a "HTTP_STREAM_JOB" event that may create multiple
175 CONNECT_JOBs (or none) and is then destroyed. These can be identified by the
176 "SOCKET_POOL_CONNECTING_N_SOCKETS" events.
OLDNEW
« no previous file with comments | « no previous file | net/net.gypi » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698