Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(301)

Unified Diff: net/docs/crash-course-in-net-internals.md

Issue 1515183003: Add a crash course to debugging using about:net-internals. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Created 5 years ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View side-by-side diff with in-line comments
Download patch
« no previous file with comments | « no previous file | net/net.gypi » ('j') | no next file with comments »
Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
Index: net/docs/crash-course-in-net-internals.md
diff --git a/net/docs/crash-course-in-net-internals.md b/net/docs/crash-course-in-net-internals.md
new file mode 100644
index 0000000000000000000000000000000000000000..d087611d8b033d4ef049268bd2c945801cf8af7c
--- /dev/null
+++ b/net/docs/crash-course-in-net-internals.md
@@ -0,0 +1,176 @@
+# A Crash Course in Debugging with about:net-internals
+
+This document is intended to help get people started debugging network errors
+with about:net-internals, with some commonly useful tips and tricks. This
+document is aimed more at how to get started using some of its features to
+investigate bug reports, rather than as a feature overview.
+
+It would probably be useful to read "Life of a URLRequest" before this document.
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Probably useful to include a link.
mmenke 2015/12/14 18:22:51 Done.
+
+# What Data Net-Internals Contains
+
+about:net-internals provides a view of browser activity from net/'s perspective.
+For this reason, it lacks knowledge of tabs, navigation, frames, resource types,
+etc.
+
+The top level network stack object is the URLRequestContext. The Events View
+has information for all Chrome URLRequestContexts that are hooked up to the
+single, global, ChromeNetLog object. This includes both incognito and non-
+incognito profiles, among other things. The Events View only shows events for
+the period that net-internals was open and running, and is incrementally updated
+as events occur. We do attempt to add a top level event for URLRequests that
+were active when the tab was openned, to help debug hung requests, but that's
+best-effort only, and only includes requests for the current profile.
+
+The other views are all snapshots of the current state of a particular
+URLRequestContext's components, and are updated on a 5 second timer. These will
+show objects that were created before about:net-internals was openned. Since
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit: opened.
mmenke 2015/12/14 18:22:51 Done.
+these tabs display information about a particular components of a particular
+URLRequestContext, they only have information about the currently active
+profile's main URLRequestContext. Most debugging is done with the Events tab
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Suggestion: Collapse the distinction between "a pa
mmenke 2015/12/14 18:22:51 Done, removed second sentence.
+(Which will be all this document covers), but it's good to be aware of this
+distinction.
+
+# Events vs Sources
+
+The Event View shows events logged by the NetLog. The NetLog model is that
+long-lived network stack objects, called sources, emit events over their
+lifetime. Some events have a beginning and end point (During which other
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 nit, possible style conflict: What I remember bein
mmenke 2015/12/14 18:22:51 You're right about the rule. I use a lot of paren
+subevents may occur), and some only occur at a single point in time. Generally
+only one event can be occuring for a source at a time. If there can be multiple
+events doing completely independent thing, we often introduce new sources to
+represent the parallelism.
+
+"Sources" correspond to certain net objects, however, multiple layers of net/
+will often log to a single source. Here are the main source types (Excluding
+HTTP2 [SPDY]/QUIC) and what they include:
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit, suggestion: move parenthetical to just before
mmenke 2015/12/14 18:22:51 Done.
+
+* URL_REQUEST: This corresponds to the URLRequest object. It includes events
+from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions,
+HttpStreamFactoryImpl::Requests, HttpStream implementations, and
+HttpStreamParsers used to service a response. If we follow HTTP redirects, it
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 nit, suggestion: Should we (:-}) be using "we" in
mmenke 2015/12/14 18:22:51 Done.
+includes each redirect. This is a lot of stuff, but generally only one of these
+classes is doing work at a time. This event includes the full URL and generally
+includes the request / response headers (Except when the cache handles the
+response).
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 tangent: This is actually a frustration to me--I'v
mmenke 2015/12/14 18:22:51 Patches welcome! I agree that we should improve t
+
+* HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Why isn't this part of the URL_REQUEST source? It
mmenke 2015/12/14 18:22:51 Because we may create two of them (One for QUIC).
+one Request can have multiple Jobs). It also includes its proxy and DNS
+lookups. One of the final event of these indicate how they created an
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 nit: events. I also find "of these" to be awkward
mmenke 2015/12/14 18:22:51 Done. Also fixed SpdySession / QuicSession to use
+HttpStream (Reusing an existing SOCKET / SpdySession / QuicSession, or creating
+a new one).
+
+* CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket
+pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a
+lot by job type. Their main event is generally either to create a socket, or
+request a socket from another socket pool (Which creates another CONNECT_JOB)
+and then do some extra work on top of that - like establish an SSL connection on
+top of a TCP connection.
+
+* SOCKET: These correspond to TCPSockets, but may also have other classes
+layered on top of them (Like an SSLClientSocket). This is a bit different from
+the other classes, where the name corresponds to the topmost class, instead of
+the bottommost one. This is largely an artifact of the fact the socket is
+created first, and then SSL (Or a proxy connection) is layered on top of it.
+SOCKETs may be reused between multiple requests, and a request may end up
+getting a socket created for another request.
+
+* HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The
+include information about how long the lookup was queued, each DNS request that
+was attempted (With the platform or built-in resolver) and all the other sources
+that are waiting on the job.
+
+When one source depends on another, we generally log an event with
+"source_dependency" value to both sources, which lets you jump between the two
+related events.
+
+# Debugging
+
+When you receive a report from the user, the first thing you'll generally want
+to do find the URL_REQUEST[s] that are misbehaving. If the user gives an ERR_*
+code or the exact URL of the resource that won't load, you can just search for
+it. If it's an upload, you can search for "post", or if it's a redirect issue,
+you can search for "redirect". However, you often won't have much information
+about the actual problem. There are two filters in net-internals that can help
+in a lot of cases:
+
+* "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with
+an error of some sort (red background). Cache errors are often non-fatal, so
+you should generally ignore those, and look for a more interesting one.
+
+* "type:URL_REQUEST sort:duration" will show the lonest-lived requests (as of
+when about:net-internals was opened) first. This is often useful in finding
+hung or slow requests.
+
+For a list of other filter commands, you can mouse over the question mark on
+about:net-internals.
+
+Once you locate the problematic request, the next is to figure out where the
+problem is - it's often one of the last events, though it could also be related
+to response or request headers. You can use "source_dependency" links to drill
+down (or up) into other related sources.
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 This sorta suggests you can drill up from URL_REQU
mmenke 2015/12/14 18:22:51 Done.
+
+You can use the name of an event to search for the code responsible for that
+event, and try to deduce what went wrong before/after a particular event.
Randy Smith (Not in Mondays) 2015/12/14 02:34:26 I've found this tricky because the default emacs g
mmenke 2015/12/14 18:22:51 I use codesearch and MSVC, neither of which does t
+
+Some things to look for while debugging:
+
+* CANCELLED events almost always come from outside the network stack.
+
+* Changing networks and entering / exiting suspend mode can have all sorts of
+fun and exciting effects on underway network activity.
Randy Smith (Not in Mondays) 2015/12/14 02:34:25 Suggestion: Include guidance on how you can tell f
mmenke 2015/12/14 18:22:51 Added comment.
+
+* URL_REQUEST_DELEGATE / DELEGATE_INFO events mean we're blocked in a
+URLRequest::Delegate or the NetworkDelegate, which are implemented outside the
+network stack. We'll often be CANCELED here for reasons known only to the
+delegate. Or the delegate may cause a hang. In general, what we want to do in
+this situation is identify the object causing the problem, which may be a
+ResourceThrottle, a ResourceHandler, the ResourceLoader itself, or
+ChromeNetworkDelegate.
+
+* Sockets are often reused between requests. If a request is on a stale
+(reused) socket, what was the previous request that used the socket, how long
+ago was it made?
+
+* SSL negotation is a process fraught with peril, particularly with broken
+proxies. These will generally stall or fail in the SSL_CONNECT phase at the
+SOCKET layer.
+
+* Range requests have magic to handle them at the cache layer, and are often
+issued by the media and PDF code.
+
+* Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until
+a CONNECT_JOB actually connects. This is so the highest priority pending job
+gets the first available socket (Which may be a new socket, or an old one that's
+freed up). For this reason, it can be a little tricky to relate hung
+HTTP_STREAM_JOBs to CONNECT_JOBs.
+
+* Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If
+all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are
+stalled waiting on an available socket, we probably have 6 sockets to that group
+that are hung - either hung connection, or used by stalled requests.
+
+* There's a limit on number of DNS resolutions that can be started at once. If
+everything is stalled while resolving DNS addresses, you've probably hit this
+limit, and the DNS lookups are also misbehaving in some fashion.
+
+# Miscellany
+
+These are just miscellaneous things you may notice when looking through the
+logs.
+
+* URLRequests that look to start twice for no obvious reason. These are
+typically main frame requests, and the first request is AppCache. Can just
+ignore it and move on with your life.
+
+* Some HTTP requests are not handled by URLRequestHttpJobs. These include
+things like HSTS redirects (URLRequestRedirectJob), AppCache, ServiceWorker,
+etc. These generally don't log as much information, so it can be tricky to
+figure out what's going on with these.
+
+* Non-HTTP requests also appear in the log, and also generally don't log much
+(blob URLs, chrome URLs, etc).
+
+* Preconnects create a "HTTP_STREAM_JOB" event that may create multiple
+CONNECT_JOBs (or none) and is then destroyed. These can be identified by the
+"SOCKET_POOL_CONNECTING_N_SOCKETS" events.
« no previous file with comments | « no previous file | net/net.gypi » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698