Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(182)

Side by Side Diff: net/docs/crash-course-in-net-internals.md

Issue 1515183003: Add a crash course to debugging using about:net-internals. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Response to comments Created 5 years ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View unified diff | Download patch
« no previous file with comments | « net/docs/bug-triage.md ('k') | net/net.gypi » ('j') | no next file with comments »
Toggle Intra-line Diffs ('i') | Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
OLDNEW
(Empty)
1 # A Crash Course in Debugging with about:net-internals
2
3 This document is intended to help get people started debugging network errors
4 with about:net-internals, with some commonly useful tips and tricks. This
5 document is aimed more at how to get started using some of its features to
6 investigate bug reports, rather than as a feature overview.
7
8 It would probably be useful to read [life-of-a-url-request.md](
9 life-of-a-url-request.md) before this document.
10
11 # What Data Net-Internals Contains
12
13 about:net-internals provides a view of browser activity from net/'s perspective.
14 For this reason, it lacks knowledge of tabs, navigation, frames, resource types,
15 etc.
16
17 The top level network stack object is the URLRequestContext. The Events View
18 has information for all Chrome URLRequestContexts that are hooked up to the
19 single, global, ChromeNetLog object. This includes both incognito and non-
20 incognito profiles, among other things. The Events view only shows events for
21 the period that net-internals was open and running, and is incrementally updated
22 as events occur. The code attempts to add a top level event for URLRequests
23 that were active when the tab was opened, to help debug hung requests, but
24 that's best-effort only, and only includes requests for the current profile and
25 the system URLRequestContext.
26
27 The other views are all snapshots of the current state of the main
28 URLRequestContext's components, and are updated on a 5 second timer. These will
29 show objects that were created before about:net-internals was opened. Most
30 debugging is done with the Events view (which will be all this document
31 covers), but it's good to be aware of this distinction.
32
33 # Events vs Sources
34
35 The Event View shows events logged by the NetLog. The NetLog model is that
36 long-lived network stack objects, called sources, emit events over their
37 lifetime. Some events have a beginning and end point (during which other
38 subevents may occur), and some only occur at a single point in time. Generally
39 only one event can be occuring for a source at a time. If there can be multiple
40 events doing completely independent thing, the code often uses new sources to
41 represent the parallelism.
42
43 "Sources" correspond to certain net objects, however, multiple layers of net/
44 will often log to a single source. Here are the main source types and what they
45 include (Excluding HTTP2 [SPDY]/QUIC):
46
47 * URL_REQUEST: This corresponds to the URLRequest object. It includes events
48 from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions,
49 HttpStreamFactoryImpl::Requests, HttpStream implementations, and
50 HttpStreamParsers used to service a response. If the URL_REQUEST follows HTTP
51 redirects, it will include each redirect. This is a lot of stuff, but generally
52 only object is doing work at a time. This event source includes the full URL
53 and generally includes the request / response headers (Except when the cache
54 handles the response).
55
56 * HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that
57 one Request can have multiple Jobs). It also includes its proxy and DNS
58 lookups. HTTP_STREAM_JOB log events are separate from URL_REQUEST because
59 two stream jobs may be created and races against each other, in some cases -
60 one for one for QUIC, and one for HTTP. One of the final events of this source
61 indicates how an HttpStream was created (Reusing an existing SOCKET /
62 HTTP2_SESSION / QUIC_SESSION, or creating a new one).
63
64 * CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket
65 pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a
66 lot by job type. Their main event is generally either to create a socket, or
67 request a socket from another socket pool (Which creates another CONNECT_JOB)
68 and then do some extra work on top of that - like establish an SSL connection on
69 top of a TCP connection.
70
71 * SOCKET: These correspond to TCPSockets, but may also have other classes
72 layered on top of them (Like an SSLClientSocket). This is a bit different from
73 the other classes, where the name corresponds to the topmost class, instead of
74 the bottommost one. This is largely an artifact of the fact the socket is
75 created first, and then SSL (Or a proxy connection) is layered on top of it.
76 SOCKETs may be reused between multiple requests, and a request may end up
77 getting a socket created for another request.
78
79 * HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The
80 include information about how long the lookup was queued, each DNS request that
81 was attempted (With the platform or built-in resolver) and all the other sources
82 that are waiting on the job.
83
84 When one source depends on another, the code generally logs an event with
85 "source_dependency" value to both sources, which lets you jump between the two
86 related events.
87
88 # Debugging
89
90 When you receive a report from the user, the first thing you'll generally want
91 to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_*
92 code or the exact URL of the resource that won't load, you can just search for
93 it. If it's an upload, you can search for "post", or if it's a redirect issue,
94 you can search for "redirect". However, you often won't have much information
95 about the actual problem. There are two filters in net-internals that can help
96 in a lot of cases:
97
98 * "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with
99 an error of some sort (red background). Cache errors are often non-fatal, so
100 you should generally ignore those, and look for a more interesting one.
101
102 * "type:URL_REQUEST sort:duration" will show the lonest-lived requests (as of
103 when about:net-internals was opened) first. This is often useful in finding
104 hung or slow requests.
105
106 For a list of other filter commands, you can mouse over the question mark on
107 about:net-internals.
108
109 Once you locate the problematic request, the next is to figure out where the
110 problem is - it's often one of the last events, though it could also be related
111 to response or request headers. You can use "source_dependency" links to drill
112 down into other related sources, or up from layers below URL_REQUEST.
113
114 You can use the name of an event to search for the code responsible for that
115 event, and try to deduce what went wrong before/after a particular event. Note
116 that the event names used in net-internals are not the entire string names, so
117 you should not do an entire string match.
118
119 Some things to look for while debugging:
120
121 * CANCELLED events almost always come from outside the network stack.
122
123 * Changing networks and entering / exiting suspend mode can have all sorts of
124 fun and exciting effects on underway network activity. Network changes log a
125 top level NETWORK_CHANGED event with no source - the event itself is treated as
126 its own source. Suspend events are currently not logged.
127
128 * URL_REQUEST_DELEGATE / DELEGATE_INFO events mean a URL_REQUEST is blocked on a
129 URLRequest::Delegate or the NetworkDelegate, which are implemented outside the
130 network stack. A request will sometimes be CANCELED here for reasons known only
131 to the delegate. Or the delegate may cause a hang. In general, to debug issues
132 related to delegates, one needs to figure out which method of which object is
133 causing the problem. The object may be the a NetworkDelegate, a
134 ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or the
135 ResourceDispatcherHost.
136
137 * Sockets are often reused between requests. If a request is on a stale
138 (reused) socket, what was the previous request that used the socket, how long
139 ago was it made?
140
141 * SSL negotation is a process fraught with peril, particularly with broken
142 proxies. These will generally stall or fail in the SSL_CONNECT phase at the
143 SOCKET layer.
144
145 * Range requests have magic to handle them at the cache layer, and are often
146 issued by the media and PDF code.
147
148 * Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until
149 a CONNECT_JOB actually connects. This is so the highest priority pending job
150 gets the first available socket (Which may be a new socket, or an old one that's
151 freed up). For this reason, it can be a little tricky to relate hung
152 HTTP_STREAM_JOBs to CONNECT_JOBs.
153
154 * Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If
155 all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are
156 stalled waiting on an available socket, the group probably has 6 sockets that
157 that are hung - either hung trying to connect, or used by stalled requests and
158 thus outside the socket pool's control.
159
160 * There's a limit on number of DNS resolutions that can be started at once. If
161 everything is stalled while resolving DNS addresses, you've probably hit this
162 limit, and the DNS lookups are also misbehaving in some fashion.
163
164 # Miscellany
165
166 These are just miscellaneous things you may notice when looking through the
167 logs.
168
169 * URLRequests that look to start twice for no obvious reason. These are
170 typically main frame requests, and the first request is AppCache. Can just
171 ignore it and move on with your life.
172
173 * Some HTTP requests are not handled by URLRequestHttpJobs. These include
174 things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker,
175 etc. These generally don't log as much information, so it can be tricky to
176 figure out what's going on with these.
177
178 * Non-HTTP requests also appear in the log, and also generally don't log much
179 (blob URLs, chrome URLs, etc).
180
181 * Preconnects create a "HTTP_STREAM_JOB" event that may create multiple
182 CONNECT_JOBs (or none) and is then destroyed. These can be identified by the
183 "SOCKET_POOL_CONNECTING_N_SOCKETS" events.
OLDNEW
« no previous file with comments | « net/docs/bug-triage.md ('k') | net/net.gypi » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698