Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(294)

Unified Diff: net/docs/crash-course-in-net-internals.md

Issue 2501113002: Improve net-internals crash course. (Closed)
Patch Set: Expand on dt. Created 4 years, 1 month ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View side-by-side diff with in-line comments
Download patch
« no previous file with comments | « no previous file | no next file » | no next file with comments »
Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
Index: net/docs/crash-course-in-net-internals.md
diff --git a/net/docs/crash-course-in-net-internals.md b/net/docs/crash-course-in-net-internals.md
index 529c7d2c2a792a484e184d3bb66934bc918101ab..e2af1cbd15b46c1e15679d27d14bb0a1a8420418 100644
--- a/net/docs/crash-course-in-net-internals.md
+++ b/net/docs/crash-course-in-net-internals.md
@@ -16,74 +16,131 @@ etc.
The top level network stack object is the URLRequestContext. The Events View
lilyhoughton 2016/12/01 18:26:32 Mention what this is (i.e., the views are listed i
Bence 2016/12/05 17:58:01 Done.
has information for all Chrome URLRequestContexts that are hooked up to the
-single, global, ChromeNetLog object. This includes both incognito and non-
-incognito profiles, among other things. The Events view only shows events for
-the period that net-internals was open and running, and is incrementally updated
-as events occur. The code attempts to add a top level event for URLRequests
-that were active when the tab was opened, to help debug hung requests, but
-that's best-effort only, and only includes requests for the current profile and
-the system URLRequestContext.
+single, global, ChromeNetLog object. This includes both incognito and
+non-incognito profiles, among other things. The Events view only shows events
+for the period that net-internals was open and running, and is incrementally
+updated as events occur. The code attempts to add a top level event for
+URLRequests that were active when the about:net-internals tab was opened, to
+help debug hung requests, but that's best-effort only, and only includes
+requests for the current profile and the system URLRequestContext.
The other views are all snapshots of the current state of the main
URLRequestContext's components, and are updated on a 5 second timer. These will
show objects that were created before about:net-internals was opened. Most
-debugging is done with the Events view (which will be all this document
-covers), but it's good to be aware of this distinction.
+debugging is done with the Events view (which will be all this document covers),
+but it's good to be aware of this distinction.
# Events vs Sources
The Event View shows events logged by the NetLog. The NetLog model is that
long-lived network stack objects, called sources, emit events over their
-lifetime. When looking at the code, a "NetLogWithSource" object contains a source
-ID, and a pointer to the NetLog the source emits events to. Some events have a
-beginning and end point (during which other subevents may occur), and some only
-occur at a single point in time. Generally only one event can be occuring for a
-source at a time. If there can be multiple events doing completely independent
-thing, the code often uses new sources to represent the parallelism.
+lifetime. When looking at the code, a NetLogWithSource object contains a source
lilyhoughton 2016/12/01 18:26:32 What is the point of this clause?
Bence 2016/12/05 17:58:01 Nothing. I guess I'm better off removing it.
+ID, a NetLogSourceType, and a pointer to the NetLog the source emits events to.
+
+Sources show up in the left column of the Event View. Sources that include an
lilyhoughton 2016/12/01 18:26:32 Afaict, Before an event is actually selected, this
Bence 2016/12/05 17:58:01 Done.
+event with a net_error parameter with negative value (that is, some kind of
+ERR_) are shown with red background. Sources whose opening event has not ended
+yet are shown with white background. Other events have green background. The
+search queries corresponding to the first two kinds are `is:error` and
+`is:active`.
+
+Events corresponding to selected sources show up in the right column, organized
lilyhoughton 2016/12/01 18:26:32 Maybe emphasize/clarify the need to and process of
+by source, and by time within each source. There are two time values: t is
+measured from some reference point common to all sources, and st is measured
+from the first event for each source. Time is displayed in milliseconds.
+
+Since the network stack is asynchronous, events from different sources will
+often be interlaced in time, but Event View does not feature showing events from
+different sources ordered by time. Large time gaps in the event list of a
lilyhoughton 2016/12/01 18:26:32 It does appear to feature showing events from diff
Bence 2016/12/05 17:58:01 Correct. I replaced the word "organized" with "so
+single source usually mean that time is spent in the context of another source.
+
+Some events come in pairs: a beginning and end event, between which other events
+may occur. They are shown with + and - prefixes, respectively. The begin event
+has a dt value which shows the duration. If the end event was captured, then
+duration is calculated as the time difference between the begin and the end
+events. Otherwise the time elapsed from the begin event until capturing
+was stopped is displayed (a lower bound for actual duration), followed by a +
+sign (for example, "dt=120+").
+
+If there are no other events in between the begin and end, and the end event has
+no parameters, then they are collapsed in a single line without a sign prefix.
+
+Some other events only occur at a single point in time, and will not have a sign
+prefix.
lilyhoughton 2016/12/01 18:26:32 Do these events also have a dt, or does that apply
Bence 2016/12/05 17:58:01 No duration. Added clarification to the text.
+
+Generally only one event can be occuring for a source at a time. If there can
+be multiple events doing completely independent things, the code often uses new
+sources to represent the parallelism.
+
+Most, but not all events correspond to a source. Exceptions are global events,
+which have no source, and show up as individual entries in the source list.
+Examples include NETWORK_CHANGED, DNS_CONFIG_CHANGED, and PROXY_CONFIG_CHANGED.
lilyhoughton 2016/12/01 18:26:32 Of global events, right?
Bence 2016/12/05 17:58:01 Correct. Added clarification to text.
+
+# Common source types
"Sources" correspond to certain net objects, however, multiple layers of net/
will often log to a single source. Here are the main source types and what they
-include (Excluding HTTP2 [SPDY]/QUIC):
+include (excluding HTTP2 [SPDY]/QUIC):
* URL_REQUEST: This corresponds to the URLRequest object. It includes events
from all the URLRequestJobs, HttpCache::Transactions, NetworkTransactions,
HttpStreamFactoryImpl::Requests, HttpStream implementations, and
HttpStreamParsers used to service a response. If the URL_REQUEST follows HTTP
redirects, it will include each redirect. This is a lot of stuff, but generally
-only object is doing work at a time. This event source includes the full URL
-and generally includes the request / response headers (Except when the cache
+only one object is doing work at a time. This event source includes the full
+URL and generally includes the request / response headers (except when the cache
handles the response).
-* HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (Note that
+* HTTP_STREAM_JOB: This corresponds to HttpStreamFactoryImpl::Job (note that
one Request can have multiple Jobs). It also includes its proxy and DNS
-lookups. HTTP_STREAM_JOB log events are separate from URL_REQUEST because
-two stream jobs may be created and races against each other, in some cases -
-one for one for QUIC, and one for HTTP. One of the final events of this source
-indicates how an HttpStream was created (Reusing an existing SOCKET /
-HTTP2_SESSION / QUIC_SESSION, or creating a new one).
+lookups. HTTP_STREAM_JOB log events are separate from URL_REQUEST because two
+stream jobs may be created and races against each other, in some cases -- one
+for QUIC, and one for HTTP.
+
+ One of the final events of this source, before the
+ HTTP_STREAM_JOB_BOUND_TO_REQUEST event, indicates how an HttpStream was
+ created:
+
+ + A SOCKET_POOL_BOUND_TO_CONNECT_JOB event means that a new TCP socket was
+ created, whereas a SOCKET_POOL_REUSED_AN_EXISTING_SOCKET event indicates that
+ an existing TCP socket was reused for a non-HTTP/2 request.
+
+ + An HTTP2_SESSION_POOL_IMPORTED_SESSION_FROM_SOCKET event indicates that a
+ new HTTP/2 session was opened by this Job.
+
+ + An HTTP2_SESSION_POOL_FOUND_EXISTING_SESSION event indicates that the request
+ was served on a preexisting HTTP/2 session.
+
+ + An HTTP2_SESSION_POOL_FOUND_EXISTING_SESSION_FROM_IP_POOL event means that
+ the request was pooled to a preexisting HTTP/2 session which had a different
+ SpdySessionKey, but DNS resolution resulted in the same IP, and the
+ certificate matches.
+
+ + There are currently no events logged for opening new QUIC sessions or
+ reusing existing ones.
* CONNECT_JOB: This corresponds to the ConnectJob subclasses that each socket
-pool uses. A successful CONNECT_JOB return a SOCKET. The events here vary a
+pool uses. A successful CONNECT_JOB returns a SOCKET. The events here vary a
lilyhoughton 2016/12/01 18:26:32 What does it mean for a source to be successful?
Bence 2016/12/05 17:58:01 I guess that it signals OK (and not ERR_*) (via Co
lot by job type. Their main event is generally either to create a socket, or
-request a socket from another socket pool (Which creates another CONNECT_JOB)
-and then do some extra work on top of that - like establish an SSL connection on
+request a socket from another socket pool (which creates another CONNECT_JOB)
+and then do some extra work on top of that -- like establish an SSL connection on
top of a TCP connection.
* SOCKET: These correspond to TCPSockets, but may also have other classes
-layered on top of them (Like an SSLClientSocket). This is a bit different from
+layered on top of them (like an SSLClientSocket). This is a bit different from
the other classes, where the name corresponds to the topmost class, instead of
the bottommost one. This is largely an artifact of the fact the socket is
-created first, and then SSL (Or a proxy connection) is layered on top of it.
+created first, and then SSL (or a proxy connection) is layered on top of it.
SOCKETs may be reused between multiple requests, and a request may end up
getting a socket created for another request.
* HOST_RESOLVER_IMPL_JOB: These correspond to HostResolverImpl::Job. The
lilyhoughton 2016/12/01 18:26:32 The -> They?
Bence 2016/12/05 17:58:01 Done.
include information about how long the lookup was queued, each DNS request that
-was attempted (With the platform or built-in resolver) and all the other sources
+was attempted (with the platform or built-in resolver) and all the other sources
that are waiting on the job.
When one source depends on another, the code generally logs an event with
-"source_dependency" value to both sources, which lets you jump between the two
+`source_dependency` value to both sources, which lets you jump between the two
related events.
lilyhoughton 2016/12/01 18:26:32 Elaborate on this?
Bence 2016/12/05 17:58:01 Done.
# Debugging
@@ -96,9 +153,9 @@ you can search for "redirect". However, you often won't have much information
about the actual problem. There are two filters in net-internals that can help
in a lot of cases:
-* "type:URL_REQUEST is:error" will restrict the list to URL_REQUEST object with
-an error of some sort (red background). Cache errors are often non-fatal, so
-you should generally ignore those, and look for a more interesting one.
+* "type:URL_REQUEST is:error" will restrict the source list to URL_REQUEST
+objects with an error of some sort. Cache errors are often non-fatal, so you
+should generally ignore those, and look for a more interesting one.
* "type:URL_REQUEST sort:duration" will show the longest-lived requests first.
This is often useful in finding hung or slow requests.
@@ -107,7 +164,7 @@ For a list of other filter commands, you can mouse over the question mark on
about:net-internals.
Once you locate the problematic request, the next is to figure out where the
-problem is - it's often one of the last events, though it could also be related
+problem is -- it's often one of the last events, though it could also be related
to response or request headers. You can use "source_dependency" links to drill
down into other related sources, or up from layers below URL_REQUEST.
lilyhoughton 2016/12/01 18:26:32 This phrasing is a little confusing - it's saying
Bence 2016/12/05 17:58:01 Indeed this is confusing. I agree with your inter
@@ -122,8 +179,7 @@ Some things to look for while debugging:
* Changing networks and entering / exiting suspend mode can have all sorts of
fun and exciting effects on underway network activity. Network changes log a
-top level NETWORK_CHANGED event with no source - the event itself is treated as
-its own source. Suspend events are currently not logged.
+top level NETWORK_CHANGED event. Suspend events are currently not logged.
* URL_REQUEST_DELEGATE / DELEGATE_INFO events mean a URL_REQUEST is blocked on a
URLRequest::Delegate or the NetworkDelegate, which are implemented outside the
@@ -147,14 +203,14 @@ issued by the media and PDF code.
* Late binding: HTTP_STREAM_JOBs are not associated with any CONNECT_JOB until
a CONNECT_JOB actually connects. This is so the highest priority pending job
lilyhoughton 2016/12/01 18:26:32 Referring to the HTTP_STREAM_JOB, not the CONNECT_
Bence 2016/12/05 17:58:01 I think this sentence is correct. When ConnectJob
lilyhoughton 2016/12/09 19:56:09 Specifically, it isn't super clear to me what "the
-gets the first available socket (Which may be a new socket, or an old one that's
+gets the first available socket (which may be a new socket, or an old one that's
freed up). For this reason, it can be a little tricky to relate hung
HTTP_STREAM_JOBs to CONNECT_JOBs.
* Each CONNECT_JOB belongs to a "group", which has a limit of 6 connections. If
-all CONNECT_JOBs beling to a group (The CONNECT_JOB's description field) are
+all CONNECT_JOBs belonging to a group (the CONNECT_JOB's description field) are
stalled waiting on an available socket, the group probably has 6 sockets that
-that are hung - either hung trying to connect, or used by stalled requests and
+that are hung -- either hung trying to connect, or used by stalled requests and
thus outside the socket pool's control.
* There's a limit on number of DNS resolutions that can be started at once. If
« no previous file with comments | « no previous file | no next file » | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698