net/docs/life-of-a-url-request.md - Issue 1211003003: net: Add Life of a URLRequest documentation.

Side by Side Diff: net/docs/life-of-a-url-request.md

Issue 1211003003: net: Add Life of a URLRequest documentation. (Closed) Base URL: https://chromium.googlesource.com/chromium/src.git@master

Patch Set: Use single spaces after periods Created 5 years, 5 months ago


OLD	NEW
(Empty)
	1 # Life of a URLRequest
	Randy Smith (Not in Mondays) 2015/07/08 20:36:25 Could you add this doc (and maybe the other docs in this directory, if you are so moved) into net/net.gypi:net_docs_sources so that HTML gets automatically created in people's output directories? Arguably we don't need to do this (because source view auto-translates) but it does make reviewing easier. I also believe in consistency--i.e. we should do it for all docs or none.
	mmenke 2015/07/09 19:38:28 Done. I was unaware of that target...and suspect almost everyone else is, too, which may be a problem.
	Randy Smith (Not in Mondays) 2015/07/13 15:39:30 Agreed; not sure what to do about it, though.
	2

	3 This document is intended as an overview of the core layers of the network

	4 stack, their basic responsibilities, how they fit together, and where some of

	5 the pain points are, without going into too much detail. Though it touches a

	6 bit on the renderer process and the content/loader stack, the focus is on net/

	7 itself.

	8

	9 It's particularly targeted at people new to the Chrome network stack, but

	10 should also be useful for team members who may be experts at some parts of the

	11 stack, but are largely unfamiliar with other components. It starts by walking

	12 through how a basic request issued by Blink works its way through the network

	13 stack, and then moves on to discuss how various components plug in.

	14

	15

	16 # Anatomy of the Network Stack

	17

	18 The main top-level network stack object is the URLRequestContext. The context

	19 has non-owning pointers to everything needed to create and issue a URLRequest.

	20 The context must outlive all requests that use it. Creating a context is a

	21 rather complicated process, and it's recommended that most embedders use

	22 URLRequestContextBuilder to do this. Chrome itself has several request

	23 contexts that the network stack team owns:

	24

	25 * The proxy URLRequestContext, owned by the IOThread and used to get PAC

	26 scripts while avoiding re-entrancy.

	27 * The system URLRequestContext, also owned by the IOThread, used for requests

	28 that aren't associated with a profile.

	29 * Each profile, including incognito profiles, has a number of URLRequestContexts

	30 that are created as needed:

	31 * The main URLRequestContext is mostly created in ProfileIOData, though it
	Randy Smith (Not in Mondays) 2015/07/08 20:36:25 These sub-items aren't showing up as indented/second level list in the HTML.
	mmenke 2015/07/09 19:38:29 Fixed. Weird, they're fine when I paste them into snippets (See earlier comment about not knowing about net_docs target). Can't we just standardize on one version of markdown for all of Google? :( Looks like this version of markdown needs 4 space indent (I meant to use two, but when I switched to 1 space between sentences, accidentally reduced these to 1 space, too).
	32 has a couple components that are passed in from content's StoragePartition

	33 code. Several other components are shared with the system URLRequestContext,

	34 like the HostResolver.

	35 * Each non-incognito profile also has a media request context, which uses a

	36 different on-disk cache than the main request context. This prevents a single

	37 huge media file from evicting everything else in the cache.

	38 * On desktop platforms, each profile has a request context for extensions.

	39 * Each profile has two contexts for each isolated app (One for media, one for

	40 everything else).

	41

	42 The "HttpNetworkSession" is another major network stack object. It has

	43 pointers to the network stack objects that more directly deal with sockets, and

	44 their dependencies. Its main objects are the HttpStreamFactory, the socket

	45 pools, and the SPDY/QUIC session pools.

	46

	47 This document does not mention either of these objects much, but at layers

	48 above the HttpStreamFactory, objects often grab their dependencies from the

	49 URLRequestContext, while the HttpStreamFactory and layers below it generally

	50 get their dependencies from the HttpNetworkSession.

	51

	52

	53 # How many "Delegates"?

	54

	55 The network stack informs the embedder of important events for a request using

	56 two main interfaces: The URLRequest::Delegate interface and the NetworkDelegate

	57 interface.

	58

	59 The URLRequest::Delegate interface consists of a small set of callbacks needed to let

	60 the embedder drive a request forward. URLRequest::Delegates generally own the

	61 URLRequest.

	62

	63 The NetworkDelegate is generally a single object shared by all requests, and

	64 includes callbacks corresponding to most of the URLRequest::Delegate's

	65 callbacks, as well as an assortment of other methods. The NetworkDelegate is

	66 optional; the URLRequest::Delegate is not.
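
As a rough illustration of the split described above, the sketch below models a required per-request delegate next to an optional delegate shared by all requests. Every name in it (RequestDelegate, SharedNetworkDelegate, ToyRequest, and the two recording helpers) is a hypothetical stand-in, not one of the real net/ classes or their signatures, and the order in which the two delegates are notified is an arbitrary choice for this sketch.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-request delegate (required).
class RequestDelegate {
 public:
  virtual ~RequestDelegate() = default;
  virtual void OnResponseStarted(const std::string& url) = 0;
};

// Hypothetical stand-in for a delegate shared by all requests (optional).
class SharedNetworkDelegate {
 public:
  virtual ~SharedNetworkDelegate() = default;
  virtual void OnResponseStarted(const std::string& url) = 0;
};

class ToyRequest {
 public:
  // The per-request delegate is a reference because it is mandatory; the
  // shared delegate is a pointer because it may be null.
  ToyRequest(std::string url, RequestDelegate& delegate,
             SharedNetworkDelegate* network_delegate)
      : url_(std::move(url)),
        delegate_(delegate),
        network_delegate_(network_delegate) {}

  // Notification order is arbitrary in this sketch.
  void NotifyResponseStarted() {
    if (network_delegate_) network_delegate_->OnResponseStarted(url_);
    delegate_.OnResponseStarted(url_);
  }

 private:
  std::string url_;
  RequestDelegate& delegate_;
  SharedNetworkDelegate* network_delegate_;
};

// Records every URL it is told about.
struct RecordingDelegate : RequestDelegate {
  std::vector<std::string> seen;
  void OnResponseStarted(const std::string& url) override {
    seen.push_back(url);
  }
};

// Counts notifications across all requests that share it.
struct CountingNetworkDelegate : SharedNetworkDelegate {
  int count = 0;
  void OnResponseStarted(const std::string&) override { ++count; }
};
```

A request built with a null shared delegate still works, which mirrors the "optional" wording above; a request cannot be built without its own delegate.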

	67

	68

	69 # Life of a "Simple" URLRequest

	70

	71 Consider a simple request issued by the renderer process. Suppose it's an HTTP

	72 request with no request body (i.e. it is not an upload), the response is

	73 uncompressed, and there is no matching entry in the cache.

	74

	75

	76 ## Overview
	Randy Smith (Not in Mondays) 2015/07/08 20:36:25 I'd go for a (short!) paragraph before you dive into the bulleted list giving an even more abstract description for a request. Maybe something like: "A request for data is normally dispatched from the renderer to the browser process. There a URLRequest is created to represent it. A job specific to the protocol (e.g. HTTP) is attached to the request. That job first checks the cache then establishes a network connection object to actually fetch the data. That connection object interacts with network socket pools to potentially re-use sockets; the socket pools create and connect a socket if there is no appropriate existing socket. Once that socket exists, the HTTP request is dispatched, the response read and parsed, and the result returned back up the stack and sent over to the renderer. Of course, it's never quite that simple :-}." And then dive into the details. That'll give the reader a bit more understanding of the motivation for the different levels, and some sense of not being completely lost as they wander through those levels.
	77

	78 ### Request Starts

	79

	80 * ResourceDispatcher creates an IPCResourceLoaderBridge.

	81 * The IPCResourceLoaderBridge asks ResourceDispatcher to start the request.

	82 * ResourceDispatcher sends an IPC to the ResourceDispatcherHost in the browser
	Randy Smith (Not in Mondays) 2015/07/08 20:36:24 I'd add a line with some visual distinction after this one to call out the transition to the browser process.
	mmenke 2015/07/09 19:38:28 I split up the section. Let me know if you think I should do more.
	83 process.

	84 * ResourceDispatcherHost uses the URLRequestContext to create the URLRequest.

	85 * ResourceDispatcherHost creates a ResourceLoader and ResourceHandler chain to
	Randy Smith (Not in Mondays) 2015/07/08 20:36:24 In this line and the next one, I'd refer to the "URLRequest" rather than the "Request" to make clear the top-level concept versus lower level object distinction.
	mmenke 2015/07/09 19:38:28 Done.
	86 manage the request.

	87 * ResourceLoader starts the request.

	88

	89 ### Request is Issued
	Randy Smith (Not in Mondays) 2015/07/08 20:36:24 This section is intimidating in the number of bullet items it has just hammering at the reader. (At least, it's intimidating to me, and I've repeatedly created this same bulleted list for my own education.) I'd break this into sub-sections, maybe URLRequest->HttpNetworkTransaction, HTTPNetworkTransaction->TransportClientSocketPool, TransportClientSocketPool->HttpStreamFactoryImpl::Job/HttpBasicStream. This might be done with nesting, to make unwinding easier, but if there's a clear way to have it in separate sections, that's somewhat better.
	Randy Smith (Not in Mondays) 2015/07/08 20:36:25 I'd put some verbiage here indicating that what's below is the common case, but there may be variations at each level, usually at the "asks the X to create a Y, which is a Z" that sometimes it's not a Z but some other subclass of Y.
	mmenke 2015/07/09 19:38:29 I don't want to hammer everywhere that this is a "simple" URLRequest. I've rearranged things slightly, tell me what you think.
	90

	91 * The URLRequest asks the URLRequestJobFactory to create a URLRequest[Http]Job.
	Randy Smith (Not in Mondays) 2015/07/08 20:36:24 Especially if you're targeting this document at folks new to the network stack, I think "URLRequest[Http]Job" is too abbreviated a way of saying "A URLRequestJob. Usually, this will be the URLRequestJob subclass, URLRequestHttpJob", so I'd spell that out.
	mmenke 2015/07/09 19:38:28 Done.
	92 * The URLRequestHttpJob asks the HttpCache to create an HttpTransaction (always

	93 an HttpCache::Transaction).

	94 * The HttpCache::Transaction sees there's no cache entry for the request, and

	95 creates an HttpNetworkTransaction.

	96 * The HttpNetworkTransaction calls into the HttpStreamFactory to request an

	97 HttpStream.

	98 * HttpStreamFactory creates an HttpStreamFactoryImpl::Job.

	99 * HttpStreamFactoryImpl::Job calls into the TransportClientSocketPool to

	100 populate a ClientSocketHandle.

	101 * TransportClientSocketPool has no idle sockets, so it creates a

	102 TransportConnectJob and starts it.

	103 * TransportConnectJob creates a StreamSocket and establishes a connection.

	104 * TransportClientSocketPool puts the StreamSocket in the ClientSocketHandle,

	105 and calls into HttpStreamFactoryImpl::Job.

	106 * HttpStreamFactoryImpl::Job creates an HttpBasicStream, which takes ownership

	107 of the ClientSocketHandle.

	108 * It returns the HttpBasicStream to the HttpNetworkTransaction.

	109 * HttpNetworkTransaction gives the request headers to the HttpBasicStream, and

	110 tells it to start the request.

	111 * HttpBasicStream sends the request, and waits for the response.

	112 * The HttpBasicStream sends the response headers back to the

	113 HttpNetworkTransaction.

	114 * Headers are sent up to the URLRequest, to the ResourceLoader, through the

	115 ResourceHandler stack.

	116 * They're then sent by the AsyncResourceHandler to the ResourceDispatcher.

	117

	118 ### Response body is read

	119

	120 * AsyncResourceHandler allocates a 512KB ring buffer of shared memory to read

	121 the body of the response.

	122 * AsyncResourceHandler tells the ResourceLoader to read the response body to

	123 the buffer, 32KB at a time.

	124 * AsyncResourceHandler informs the ResourceDispatcher of each read.

	125 * ResourceDispatcher tells the AsyncResourceHandler when it's done with the

	126 data from each read, so it knows when parts of the buffer can be reused.

	127

	128 ### URLRequest is Destroyed

	129

	130 * When complete, the RDH deletes the ResourceLoader, which deletes the

	131 URLRequest.

	132 * During destruction, the HttpNetworkTransaction determines if the socket is

	133 reusable, and if so, tells the HttpBasicStream to return it to the socket pool.

	134

	135 ## Details
	Randy Smith (Not in Mondays) 2015/07/08 20:36:24 I'd suggest breaking this section out into subsections; I think it'll be a lot more readable. Maybe "Process Architecture" for RD/RDH, "URLRequest Creation and Plumbing" for URLRequestContext->ResourceLoader/Handler, "Http Transactions" for the cache and network transactions, and maybe "Socket pools", "Http Parsing and Data Processing" for the rest. It would be really useful if this section was broken out in the same way as the above section (you don't need the labels on the above section, but if the reader has already bucketed the concepts in a certain way, they'll read this section more easily).
	mmenke 2015/07/09 19:38:28 I've split this into the same sections, with the same names, as the first part, which I think is a big improvement. Open to ideas for how to better split things.
	136

	137 Each child process has at most one ResourceDispatcher, which is responsible for

	138 all URL request-related communication with the browser process. When something

	139 in the renderer needs to issue a resource request, it calls into the

	140 ResourceDispatcher, which returns an IPCResourceLoaderBridge to the caller.

	141 The caller uses the bridge to start a request. When started, the

	142 ResourceDispatcher assigns the request a per-renderer ID, and then sends the

	143 ID, along with all information needed to issue the request, to the

	144 ResourceDispatcherHost in the browser process.

	145

	146 The ResourceDispatcherHost (RDH), along with most of the network stack, lives

	147 on the browser process's IO thread. The browser process only has one RDH,

	148 which is responsible for handling all network requests initiated by

	149 ResourceDispatchers in all child processes, not just renderer processes.

	150 Browser-initiated requests don't go through the RDH, with some exceptions.

	151

	152 When the RDH sees the request, it calls into a URLRequestContext to create the

	153 URLRequest. The URLRequestContext has pointers to all the network stack

	154 objects needed to issue the request over the network, such as the cache, cookie

	155 store, and host resolver. The RDH then creates a chain of ResourceHandlers

	156 each of which can monitor/modify/delay/cancel the URLRequest and the

	157 information it returns. The only one of these I'll talk about here is the

	158 AsyncResourceHandler, which is the last ResourceHandler in the chain. The RDH

	159 then creates a ResourceLoader (Which is the URLRequest::Delegate), passes

	160 ownership of the URLRequest and the ResourceHandler chain to it, and then starts

	161 the ResourceLoader.

	162

	163 The ResourceLoader checks that none of the ResourceHandlers want to cancel,

	164 modify, or delay the request, and then finally starts the URLRequest. The

	165 URLRequest then calls into the URLRequestJobFactory to create a URLRequestJob

	166 and then starts it. In the case of an HTTP or HTTPS request, this will be a

	167 URLRequestHttpJob.

	168

	169 The URLRequestHttpJob calls into the HttpCache to create an

	170 HttpCache::Transaction. If there's no matching entry in the cache, the

	171 HttpCache::Transaction will just call into the HttpNetworkLayer to create an

	172 HttpNetworkTransaction and transparently wrap it.
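
The wrapping described above can be pictured as a decorator over a shared interface. The following is a minimal sketch under that assumption: ToyTransaction, ToyNetworkTransaction, ToyCache, and ToyCacheTransaction are hypothetical names, the "network" just returns a canned string, and real cache behavior (validation, partial entries, locking) is left out entirely.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical common interface, standing in for the transaction interface.
class ToyTransaction {
 public:
  virtual ~ToyTransaction() = default;
  virtual std::string Start(const std::string& url) = 0;
};

// Pretends to fetch from the network.
class ToyNetworkTransaction : public ToyTransaction {
 public:
  std::string Start(const std::string& url) override {
    return "network body for " + url;
  }
};

class ToyCache {
 public:
  std::unique_ptr<ToyTransaction> CreateTransaction();
  int network_fetches() const { return network_fetches_; }

 private:
  friend class ToyCacheTransaction;
  std::map<std::string, std::string> entries_;
  int network_fetches_ = 0;
};

// Serves from the cache when it can; otherwise creates and wraps a network
// transaction, so the caller only ever sees the ToyTransaction interface.
class ToyCacheTransaction : public ToyTransaction {
 public:
  explicit ToyCacheTransaction(ToyCache* cache) : cache_(cache) {}

  std::string Start(const std::string& url) override {
    auto it = cache_->entries_.find(url);
    if (it != cache_->entries_.end()) return it->second;  // cache hit

    network_ = std::make_unique<ToyNetworkTransaction>();  // cache miss
    ++cache_->network_fetches_;
    std::string body = network_->Start(url);
    cache_->entries_[url] = body;
    return body;
  }

 private:
  ToyCache* cache_;
  std::unique_ptr<ToyTransaction> network_;
};

inline std::unique_ptr<ToyTransaction> ToyCache::CreateTransaction() {
  return std::make_unique<ToyCacheTransaction>(this);
}
```

The caller always asks the cache for a transaction and never constructs the network one directly, which is the sense in which the wrapping is transparent.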

	173

	174 The HttpNetworkTransaction calls into the HttpStreamFactory to request an

	175 HttpStream to the server. The HttpStreamFactoryImpl::Job creates a

	176 ClientSocketHandle to hold a socket, once connected, and passes it in to the

	177 ClientSocketPoolManager. The ClientSocketPoolManager assembles the

	178 TransportSocketParams needed to establish the connection and creates a group

	179 name ("host:port") used to identify sockets that can be used interchangeably.

	180

	181 The ClientSocketPoolManager directs the request to the

	182 TransportClientSocketPool, since there's no proxy and it will be using

	183 HTTP/1.x. The pool sends it on to the

	184 ClientSocketPoolBase<TransportSocketParams> it wraps, which sends it on to its

	185 ClientSocketPoolBaseHelper, which actually manages the socket pool. If there

	186 isn't already an idle connection, and there are available socket slots, the

	187 ClientSocketPoolBaseHelper will create a new TransportConnectJob using the

	188 aforementioned params object. The Job will do the actual DNS lookup by calling

	189 into the HostResolverImpl, if needed, and then finally establish a connection.

	190

	191 When the socket is connected, ownership of the socket is passed to the

	192 ClientSocketHandle. The HttpStreamFactoryImpl::Job is informed the

	193 connection attempt succeeded, and it then creates an HttpBasicStream, which

	194 takes ownership of the ClientSocketHandle. It then passes ownership of the

	195 HttpBasicStream back to the HttpNetworkTransaction. The Transaction passes

	196 the request headers to the HttpBasicStream, which uses an HttpStreamParser to

	197 (finally) format the request headers and send them to the server. The

	198 HttpStreamParser waits to receive the response and then parses the HTTP/1.x

	199 response headers, and then passes them up through both Transaction classes

	200 to the URLRequestHttpJob, which passes them up to the URLRequest and on to

	201 the ResourceLoader.

	202

	203 The ResourceLoader passes them through the chain of ResourceHandlers, and then

	204 they make their way to the AsyncResourceHandler. The AsyncResourceHandler uses

	205 the renderer process ID ("child ID") to figure out which process the request

	206 was associated with, and then sends the headers along with the request ID to

	207 that process's ResourceDispatcher. The ResourceDispatcher uses the ID to

	208 figure out which IPCResourceLoaderBridge the headers should be sent to, which

	209 sends them on to whatever created the IPCResourceLoaderBridge in the first

	210 place.

	211

	212 Without waiting to hear back from the ResourceDispatcher, the ResourceLoader

	213 tells its ResourceHandler chain to allocate memory to receive the response

	214 body. The AsyncResourceHandler creates a 512KB ring buffer of shared memory,

	215 and then passes the first 32KB of it to the ResourceLoader for the first read.

	216 The ResourceLoader then passes a 32KB body read request down through the

	217 URLRequest all the way down to the HttpStreamParser. Once some data is read,

	218 possibly less than 32KB, the number of bytes read makes its way back to the

	219 AsyncResourceHandler, which passes the shared memory buffer and the offset and

	220 amount of data read to the renderer process.

	221

	222 The AsyncResourceHandler relies on ACKs from the renderer to prevent it from

	223 overwriting data that the renderer has yet to consume. This process repeats

	224 until the response body is completely read. When the URLRequest informs the

	225 ResourceLoader it's complete, the ResourceLoader tells the ResourceHandlers,

	226 and the AsyncResourceHandler tells the ResourceDispatcher the request is

	227 complete. The RDH then deletes the ResourceLoader, which deletes the URLRequest.
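
The ACK-based flow control above can be sketched as simple slot accounting. This is a simplified model, not the real AsyncResourceHandler logic: ToyRingBuffer is a hypothetical class, it assumes every read reserves a full fixed-size slot (real reads may return less than 32KB), and it assumes ACKs arrive in the order the reads were issued.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical model of a shared ring buffer split into fixed-size slots.
class ToyRingBuffer {
 public:
  ToyRingBuffer(std::size_t capacity, std::size_t slot_size)
      : slot_size_(slot_size), slot_count_(capacity / slot_size) {}

  // Reserves the next slot for a read and reports its byte offset. Returns
  // false when every slot still holds data the consumer has not ACKed, in
  // which case the producer must wait.
  bool BeginRead(std::size_t* offset) {
    if (unacked_.size() == slot_count_) return false;
    *offset = next_slot_ * slot_size_;
    unacked_.push_back(next_slot_);
    next_slot_ = (next_slot_ + 1) % slot_count_;
    return true;
  }

  // The consumer is done with the oldest outstanding slot.
  void Ack() {
    if (!unacked_.empty()) unacked_.pop_front();
  }

  std::size_t unacked() const { return unacked_.size(); }

 private:
  std::size_t slot_size_;
  std::size_t slot_count_;
  std::size_t next_slot_ = 0;
  std::deque<std::size_t> unacked_;  // slots handed out, oldest first
};
```

With a 512KB buffer and 32KB slots there are 16 slots, so a seventeenth read has to wait until the consumer acknowledges the first.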

	228

	229 When the HttpNetworkTransaction is being torn down, it figures out if the

	230 socket is reusable. If not, it tells the HttpBasicStream to close the socket.

	231 Either way, the ClientSocketHandle then returns the socket to the

	232 socket pool, either for reuse or so the socket pool knows it has another free

	233 socket slot.

	234

	235

	236 # Additional Topics

	237

	238 ## HTTP Cache

	239

	240 The HttpCache::Transaction sits between the URLRequestHttpJob and the

	241 HttpNetworkTransaction, and implements the HttpTransaction interface, just like

	242 the HttpNetworkTransaction. The HttpCache::Transaction checks if a request can

	243 be served out of the cache. If a request needs to be revalidated, it handles

	244 sending a conditional revalidation request over the network. It may also break a range

	245 request into multiple cached and non-cached contiguous chunks, and may issue

	246 multiple network requests for a single range URLRequest.

	247

	248 One important detail is that it has a read/write lock for each URL. The lock

	249 technically allows multiple reads at once, but since an HttpCache::Transaction

	250 always grabs the lock for writing and reading before downgrading it to a read

	251 only lock, all requests for the same URL are effectively done serially. Blink

	252 merges requests for the same URL in many cases, which mitigates this problem to

	253 some extent.

	254

	255 The HttpCache::Transaction uses one of three disk_cache::Backends to actually

	256 store the cache's index and files: The in memory backend, the blockfile cache

	257 backend, and the simple cache backend. The first is used in incognito. The

	258 latter two are both stored on disk, and are used on different platforms.

	259

	260 ## Cancellation

	261

	262 A request can be cancelled by Blink in the renderer process or by any of the

	263 ResourceHandlers through the ResourceLoader. When the cancellation message

	264 reaches the URLRequest, it passes on the fact it's been cancelled back to the

	265 ResourceLoader, which then sends the message down the ResourceHandler chain.

	266

	267 When an HttpNetworkTransaction for a cancelled request is being torn down, it

	268 figures out if the socket the HttpStream owns can potentially be reused, based

	269 on the protocol (HTTP / SPDY / QUIC) and any received headers. If the socket

	270 potentially can be reused, an HttpResponseBodyDrainer is created to try and

	271 read any remaining body bytes of the HttpStream, if any, before returning the

	272 socket to the SocketPool. If this takes too long, or there's an error, the

	273 socket is closed instead. Since this all happens at the layer below the cache,

	274 any drained bytes are not written to the cache, and as far as the cache layer is

	275 concerned, it only has a partial response.

	276

	277 ## Redirects

	278

	279 The URLRequestHttpJob checks if headers indicate a redirect when it receives

	280 them from the next layer down (Typically the HttpCache::Transaction). If they

	281 indicate a redirect, it tells the cache the response is complete, ignoring the

	282 body, so the cache only has the headers. The cache then treats it as a complete

	283 entry, even if the headers indicated there will be a body.

	284

	285 The URLRequestHttpJob then checks with the URLRequest if the redirect should be

	286 followed. First it checks the scheme. Then it informs the ResourceLoader

	287 about the redirect, to give it a chance to cancel the request. The information

	288 makes its way down through the AsyncResourceHandler to the ResourceDispatcher

	289 and on into Blink, which checks if the redirect should be followed.

	290

	291 The ResourceDispatcher then asynchronously sends a message back to either

	292 follow the redirect or cancel the request. In either case, the old

	293 HttpTransaction is destroyed, and the HttpNetworkTransaction attempts to drain

	294 the socket for reuse, just as in the cancellation case. If the redirect is

	295 followed, the URLRequest calls into the URLRequestJobFactory to create a new

	296 URLRequestJob, and then starts it.

	297

	298 ## Filters (gzip, SDCH, etc)

	299

	300 When the URLRequestHttpJob receives headers, it sends a list of all Content-

	301 Encoding values to Filter::Factory, which creates a (possibly empty) chain of

	302 filters. As body bytes are received, they're passed through the filters at the

	303 URLRequestJob layer and the decoded bytes are passed back to the embedder.

	304

	305 Since this is done above the cache layer, the cache stores the responses prior

	306 to decompression. As a result, if files aren't compressed over the wire, they

	307 aren't compressed in the cache, either. This behavior can also create problems

	308 when responses are SDCH compressed, as a dictionary may be evicted from the

	309 cache independently of the response that was compressed with it.
	Randy Smith (Not in Mondays) 2015/07/08 20:36:25 nit: I'd leave out "from the cache" since SDCH implementation is, at least currently, partially memory based around SDCH-specific storage.
	mmenke 2015/07/09 19:38:29 Reworded it a bit.
	310

	311 TODO(mmenke): Discuss filter creation.

	312

	313 ## Socket Pools

	314

	315 The ClientSocketPoolManager is responsible for assembling the parameters needed

	316 to connect a socket, and then sending the request to the right socket pool.

	317 Each socket request sent to a socket pool comes with a socket params object, a

	318 ClientSocketHandle, and a "group name". The params object contains all the

	319 information a ConnectJob needs to create a connection of a given type, and

	320 different types of socket pools take different params types. The

	321 ClientSocketHandle will take temporary ownership of the socket, once

	322 connected, and return it to the socket pool when done. All connections with the

	323 same group name in the same pool can be used to service the same connection

	324 request, so it consists of host, port, protocol, and whether "privacy mode" is

	325 used for requests using the socket or not.

	326

	327 All socket pool classes derive from the ClientSocketPoolBase<SocketParamType>.

	328 The ClientSocketPoolBase handles managing sockets - which requests to create

	329 sockets for, which requests get connected sockets first, which sockets belong

	330 to which groups, connection limits per group, keeping track of and closing idle

	331 sockets, etc. Each ClientSocketPoolBase subclass has its own ConnectJob type,

	332 which establishes a connection using the socket params, before the pool hands

	333 out the connected socket.
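
The bookkeeping described above (sockets grouped by name, idle sockets reused within a group, a per-group connection limit) can be sketched as follows. ToySocketPool is a hypothetical class, sockets are represented by plain integer ids, the group strings are opaque to the pool, and a request that hits the limit simply fails here, whereas a real pool would presumably queue it.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a socket pool keyed by group name.
class ToySocketPool {
 public:
  explicit ToySocketPool(std::size_t max_per_group)
      : max_per_group_(max_per_group) {}

  // Returns a socket id (> 0), reusing an idle socket from the same group
  // when one exists. Returns 0 if the group is at its connection limit.
  int Request(const std::string& group) {
    Group& g = groups_[group];
    if (!g.idle.empty()) {
      int id = g.idle.back();
      g.idle.pop_back();
      ++g.active;
      return id;
    }
    if (g.active >= max_per_group_) return 0;
    ++g.active;
    return next_id_++;  // stands in for running a connect job
  }

  // Hands a socket back. Reusable sockets become idle; others are dropped,
  // which frees a slot in the group.
  void Release(const std::string& group, int id, bool reusable) {
    Group& g = groups_[group];
    if (g.active == 0) return;  // nothing outstanding for this group
    --g.active;
    if (reusable) g.idle.push_back(id);
  }

 private:
  struct Group {
    std::size_t active = 0;
    std::vector<int> idle;
  };

  std::size_t max_per_group_;
  int next_id_ = 1;
  std::map<std::string, Group> groups_;
};
```

Because lookups only ever consult the requesting group's idle list, two requests share a socket only if their group names match exactly, which is why the group name has to encode everything that makes connections interchangeable.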

	334

	335 ## Socket Pool Layering

	336

	337 Some socket pools are layered on top of other socket pools. This is done when a

	338 "socket" in a higher layer needs to establish a connection in a lower level

	339 pool and then take ownership of it as part of its connection process. See later

	340 sections for examples. There are a couple additional complexities here.

	341

	342 From the perspective of the lower layer pool, all of its sockets that a higher

	343 layer pool owns are actively in use, even when the higher layer pool considers

	344 them idle. As a result, when a lower layer pool is at its connection limit and

	345 needs to make a new connection, it will ask any higher layer pools to

	346 close an idle connection if they have one, so it can make a new connection.

	347

	348 Sockets in the higher layer pool must have their own distinct group name in the

	349 lower layer pool as well. This is needed so the lower layer pool won't, for

	350 example, group SSL and HTTP connections to the same port together.

	351

	352 ## SSL

	353

	354 When an SSL connection is needed, the ClientSocketPoolManager assembles the

	355 parameters needed both to connect the TCP socket and establish an SSL

	356 connection. It then passes them to the SSLClientSocketPool, which creates

	357 an SSLConnectJob using them. The SSLConnectJob's first step is to call into the

	358 TransportSocketPool to establish a TCP connection.

	359

	360 Once a connection is established by the lower layered pool, the SSLConnectJob

	361 then starts SSL negotiation. Once that's done, the SSL socket is passed back to

	362 the HttpStreamFactoryImpl::Job that initiated the request, and things proceed

	363 just as with HTTP. When complete, the socket is returned to the

	364 SSLClientSocketPool.

	365

	366 ## Proxy discovery

	367

	368 The first step the HttpStreamFactoryImpl::Job performs, just before calling

	369 into the ClientSocketPoolManager to create a socket, is to check with the

	370 ProxyService to see if a proxy is needed for the URL it's been given. The

	371 ClientSocketPoolManager then uses this information to find the correct proxy

	372 socket pool to send the request to.

	373

	374 TODO(mmenke): Discuss proxy configurations, WPAD, tracing proxy resolver.

	375

	376 ## Proxy Socket Pools

	377

	378 Each SOCKS or HTTP proxy has its own completely independent set of socket

	379 pools. They have their own exclusive TransportSocketPool, their own protocol-

	380 specific pool above it, and their own SSLSocketPool above that. HTTPS proxies

	381 also have a second SSLSocketPool between the HttpProxyClientSocketPool and

	382 the TransportSocketPool, since they can talk SSL to both the proxy and the

	383 destination server, layered on top of each other.

	384

	385 ## SPDY

	386

	387 Once an SSL connection is established, the HttpStreamFactoryImpl::Job checks if

	388 SPDY was negotiated over the socket. If so, it creates a SpdySession using the

	389 socket, and a SpdyHttpStream. The SpdyHttpStream will be passed to the

	390 HttpNetworkTransaction, which drives the stream as usual.

	391

	392 The SpdySession will be shared with other Jobs connecting to the same server,

	393 and future Jobs will find the SpdySession before they try to create a

	394 connection. HttpServerProperties also tracks which servers supported SPDY when

	395 we last talked to them. We only try to establish a single connection to servers

	396 we think speak SPDY when multiple HttpStreamFactoryImpl::Jobs are trying to

	397 connect to them, to avoid wasting resources.
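
The session sharing described above can be sketched as a lookup keyed by server. `ToySessionPool` and its fields are hypothetical names for illustration, not the real classes' API.

```python
class ToySessionPool:
    """Toy stand-in for session sharing: one session per (host, port)."""

    def __init__(self):
        self._sessions = {}
        self.connections_made = 0

    def get_or_create(self, host: str, port: int) -> dict:
        key = (host, port)
        session = self._sessions.get(key)
        if session is None:
            # Only the first Job pays for a new connection.
            self.connections_made += 1
            session = {"server": key, "streams": 0}
            self._sessions[key] = session
        # Every Job gets its own stream on the shared session.
        session["streams"] += 1
        return session


pool = ToySessionPool()
first = pool.get_or_create("example.com", 443)
second = pool.get_or_create("example.com", 443)
```

Both calls return the same session object, so only one connection is made for the two streams.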

	398

	399 ## QUIC

	400

	401 HttpServerProperties also tracks which servers have advertised QUIC support in

	402 the past. If a server has advertised QUIC support, a second

	403 HttpStreamFactoryImpl::Job will be created for QUIC, and will be raced against

	404 the one for the HTTP/HTTPS connection. Whichever connects first will be used.

	405 Existing QUIC sessions will be reused if available.
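
A rough sketch of racing two connection attempts and keeping whichever finishes first. The delays stand in for real connection latency, and cancelling the loser is an assumption of this sketch rather than something stated above.

```python
import asyncio


async def connect(label: str, delay: float) -> str:
    """Pretend to connect; the delay stands in for handshake latency."""
    await asyncio.sleep(delay)
    return label


async def race(jobs: dict) -> str:
    tasks = [asyncio.create_task(connect(label, delay)) for label, delay in jobs.items()]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # Assumption: the losing attempt is simply dropped.
    return done.pop().result()


winner = asyncio.run(race({"quic": 0.01, "https": 0.2}))
```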

	406

	407 TODO(mmenke): Discuss SPDY/QUIC proxies?

	408

	409 ## Uploads

	410

	411 Upload data is passed to a URLRequest using the UploadDataStream class. Since

	412 the over-the-wire format of uploads is determined by the HttpStream type, the

	413 upload body is read from the stream and prepared to be sent over the wire by

	414 the HttpStream classes (HttpBasicStream, SpdyHttpStream, QuicHttpStream).

	415 UploadDataStreams have to be replayable, since redirects and retries may need

	416 to re-upload data.

	417

	418 UploadDataStreams either have a length known in advance, or are "chunked".

	419 The main implementation for the non-chunked case is ElementsUploadDataStream,

	420 which consists of one or more UploadElementReaders, each of which contains a

	421 fixed-size chunk of data, either in memory or in a file.

	422

	423 ChunkedUploadDataStream is the main implementation for the chunked case.

	424 Chunked uploads are only used by Chrome internally, since many servers don't

	425 support them, and the length is always known in advance for web-initiated

	426 uploads. Chrome adds data bit by bit, and the HttpStream implementation

	427 sends data as long as more data is needed and it has more data to send.

	428 Because of the replayable requirement mentioned above, the entire content of

	429 these chunked requests must be buffered into memory.
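
To illustrate why replayability forces buffering, here is a minimal sketch of a chunked upload that keeps every appended chunk so it can be rewound. `BufferedChunkedUpload` and its methods are hypothetical names, not the interface of ChunkedUploadDataStream.

```python
class BufferedChunkedUpload:
    """Toy chunked upload that keeps all chunks so it can be replayed."""

    def __init__(self):
        self._chunks = []
        self._read_index = 0

    def append(self, data: bytes) -> None:
        # Chunks are never discarded after being read, which is the memory cost.
        self._chunks.append(data)

    def read(self):
        """Return the next chunk, or None if no more data is available yet."""
        if self._read_index < len(self._chunks):
            chunk = self._chunks[self._read_index]
            self._read_index += 1
            return chunk
        return None

    def rewind(self) -> None:
        # A redirect or retry restarts the upload from the first chunk.
        self._read_index = 0


upload = BufferedChunkedUpload()
upload.append(b"part-1")
upload.append(b"part-2")
```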

	430

	431 One weirdness is that reads from UploadDataStreams currently aren't allowed to

	432 fail. If a read from a file fails, then the contents of the file are replaced

	433 by 0's. Apparently this matched Firefox's behavior at the time of

	434 implementation.
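
The zero-fill behavior might look roughly like the sketch below. `read_element` is a hypothetical helper, and padding a short read with zeros is an assumption of this sketch. The text above only describes what happens when a read fails.

```python
def read_element(path: str, expected_length: int) -> bytes:
    """Read expected_length bytes, substituting zeros if the read fails."""
    try:
        with open(path, "rb") as f:
            data = f.read(expected_length)
    except OSError:
        data = b""  # The failure is swallowed instead of being reported.
    # Assumption: any missing bytes are also filled with zeros.
    return data.ljust(expected_length, b"\0")
```

The caller always receives exactly `expected_length` bytes, so the upload's advertised length stays correct even when the file cannot be read.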

	435

	436 ## Cookies

	437

	438 Cookies are added to a request by the URLRequestHttpJob, and saved at that layer

	439 as well, once the response headers have been received. The CookieStore (the

	440 implementation of which is called "CookieMonster") handles storage of cookies,

	441 and can be used either as an in-memory store or with an on-disk store, backed by

	442 a SQLitePersistentCookieStore.

	443

	444 The CookieStore is currently reference counted, and outlives the rest of the

	445 network stack, which has led to some lifetime issues.

	446

	447 ## Prioritization

	448

	449 URLRequests are assigned a priority on creation. It only comes into play in

	450 a couple of places:

	451

	452 * DNS lookups are initiated based on the highest priority request for a lookup.

	453 * Socket pools hand out and create sockets based on priority. However, idle

	454 sockets will be assigned to low priority requests in preference to creating new

	455 sockets for higher priority requests.

	456 * SPDY and QUIC both support sending priorities over-the-wire.

	457

	458 At the socket pool layer, sockets are only assigned to socket requests once the

	459 socket is connected and SSL is negotiated, if needed. This is done so that if

	460 a higher priority request for a group reaches the socket pool before a

	461 connection is established, the first usable connection goes to the highest

	462 priority socket request.
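
The late binding described above can be sketched with a priority queue of pending requests. `SocketGroup` and its methods are hypothetical names, and "larger number means higher priority" is an assumption of this sketch.

```python
import heapq
import itertools


class SocketGroup:
    """Toy group that binds a socket to a request only once it is connected."""

    def __init__(self):
        self._pending = []
        self._order = itertools.count()  # Tie-breaker: earlier requests first.

    def request_socket(self, name: str, priority: int) -> None:
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._pending, (-priority, next(self._order), name))

    def on_socket_connected(self, socket: str):
        if not self._pending:
            return None  # Nobody is waiting; the socket would become idle.
        _, _, name = heapq.heappop(self._pending)
        return (name, socket)


group = SocketGroup()
group.request_socket("image", priority=1)  # Arrives first, starts a connect.
group.request_socket("html", priority=5)   # Arrives before the connect finishes.
```

Calling `group.on_socket_connected("socket-1")` hands the first usable socket to the `html` request, even though the `image` request triggered the connection.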

	463

	464 ## ResourceScheduler

	465

	466 In addition to net's use of priorities, requests issued by other processes go

	467 through the ResourceScheduler. The ResourceScheduler restricts the number of

	468 low priority URLRequests for a given page that can be started at once, based on

	469 the presence of higher priority requests. The idea is to reduce bandwidth

	470 contention, and to reduce the chance of low priority resources, like images,

	471 delaying high priority HTML, CSS, and blocking scripts, so the page is

	472 displayable and interactive sooner, even if it's missing images and the like.

	473

	474 ## Non-HTTP schemes

	475

	476 The URLRequestJobFactory has a ProtocolHandler for each supported scheme.

	477 Non-HTTP URLRequests have their own ProtocolHandlers. Some are implemented in

	478 net/ (like FTP, file, and data, though Blink handles some data URLs

	479 internally), and others are implemented in content/ or chrome/ (like blob,

	480 chrome, and chrome-extension).
