Chromium Code Reviews

Side by Side Diff: storage/browser/blob/README.md

Issue 2637023003: [BlobStorage] Adding explainer for blob storage system. (Closed)
Patch Set: added more information Created 3 years, 11 months ago
1 # Chrome's Blob Storage System Design
2
3 Elaboration of the blob storage system in Chrome.
4
5 # What are blobs?
6
7 Please see the [FileAPI Spec](https://www.w3.org/TR/FileAPI/) for the full
8 specification for Blobs, or [Mozilla's Blob documentation](
9 https://developer.mozilla.org/en-US/docs/Web/API/Blob) for a description of how
10 Blobs are used in the Web Platform in general. For the purposes of this
11 document, the important aspects of blobs are:
12
13 1. Blobs are immutable.
14 2. Blobs can be made using one or more of: bytes, files, or other blobs.
15 3. Blobs can be 'sliced', which creates a blob that is a subsection of another
pwnall 2017/01/20 02:10:54 How about having the word "sliced" link to https:/
dmurph 2017/01/20 20:23:16 Done.
16 blob.
17 4. Reading blobs is asynchronous.
pwnall 2017/01/20 02:10:55 Is it worth noting that obtaining blob metadata (e
dmurph 2017/01/20 20:23:15 Done.
18 5. Blobs can be passed to other browsing contexts, such as Javascript workers
19 or other tabs.
20
21 In Chrome, after blob creation the actual blob 'data' gets transported to and
22 lives in the browser process. The renderer just holds a reference -
23 specifically a string UUID - to the blob, which it can use to read the blob or
24 pass it to other processes.
25
26 # Summary & Terminology
27
28 Blobs are created in the renderer process, where their data is temporarily held
pwnall 2017/01/20 02:10:55 in _a_ renderer process?
dmurph 2017/01/20 20:23:16 Done.
29 for the browser (while Javascript execution can continue). When the browser has
30 enough memory quota for the blob, it requests the data from the renderer. Once
31 all data is transported and construction is complete, any pending reads for the
pwnall 2017/01/20 02:10:54 I'd emphasize the word "transported" in some way,
dmurph 2017/01/20 20:23:16 Done.
32 blob are allowed to complete. Blobs can be small (bytes) or huge (GBs), so
pwnall 2017/01/20 02:10:54 "small (bytes)" does not seem to add value here
dmurph 2017/01/20 20:23:15 Done.
33 quota is necessary.
34
35 If the in-memory space for blobs is getting full, or a new blob is too large to
36 be in-memory, then the blob system uses the disk. This can either be paging old
37 blobs to disk, or saving the new too-large blob straight to disk.
38
39 Blob reading goes through the network layer, where the renderer dispatches a
40 network request for the blob and the browser responds with the
41 `BlobURLRequestJob`.
42
43 General Chrome terminology:
pwnall 2017/01/20 02:10:55 I'd like to https://www.chromium.org/developers/de
dmurph 2017/01/20 20:23:16 Done.
44
45 * **Renderer (Process)**: Process where the web contents and JavaScript live.
46 This is basically a tab. There are multiple renderers, and they all have
47 security restrictions.
48 * **Browser (Process)**: There is only one browser process, and it doesn't have
49 security restrictions.
50 * **Shared Memory**: Memory that both the browser and renderer process can read
51 & write. Created only between 2 processes.
52 * **IPC**: A message sent between processes. To avoid crashes and memory issues
53 the blob system tries to limit the maximum size of an IPC message.
54
55 Blob system terminology:
56
57 * **Blob**: This is a blob object, which can consist of bytes or files, as
58 described above.
59 * **BlobItem** or **[DataElement](
60 https://cs.chromium.org/chromium/src/storage/common/data_element.h)**:
61 This is a primitive element that can basically be a File, Bytes, or another
62 Blob. It also stores an offset and size, so this can be a part of a file. (This
63 can also represent "future" file and "future" bytes, which is used to signify a
pwnall 2017/01/20 02:10:54 "future" files?
dmurph 2017/01/20 20:23:17 Done.
64 bytes or file item that has not been transported yet).
65 * **dependent blobs**: These are blobs that our blob depends on to be
pwnall 2017/01/20 02:10:54 "blobs that a blob has data dependencies on"? The
dmurph 2017/01/20 20:23:17 mmmmmmm I think I saw it as I'm 'dependent' on the
pwnall 2017/01/21 02:56:22 Precisely -- this blob is dependent on the other b
66 constructed. As in, we were constructed with a dependency on another blob
pwnall 2017/01/20 02:10:55 I'm not a big fan of "we" and "our" usage here. I'
dmurph 2017/01/20 20:23:15 Done.
67 (maybe we're a slice or just a blob was in our constructor), and we might need
68 to wait for these to complete construction before we can declare ourselves
69 constructed as well.
70 * **transportation strategy**: We can have one of 3 transportation strategies
pwnall 2017/01/20 02:10:55 : a method for sending the data in a BlobItem from
dmurph 2017/01/20 20:23:16 Done.
71 for Blobs: send data over IPC, Shared Memory, or Files.
72
73 # How to use Blobs (Browser-side)
74
75 ### Building
pwnall 2017/01/20 02:10:54 ## instead of ###?
dmurph 2017/01/20 20:23:16 Done.
76 All blob interaction should go through the `BlobStorageContext`. Blobs are
77 built using a `BlobDataBuilder`, and as long as you don't use any
pwnall 2017/01/20 02:10:55 any chance you could move the caveat after the mai
dmurph 2017/01/20 20:23:16 Done.
78 `BlobDataBuilder::AppendFuture*` methods then calling
79 `BlobStorageContext::AddFinishedBlob` or `::BuildBlob` is all you need to do to
80 create a `BlobDataHandle` that is eventually readable.
81
82 If you have known data that is not available yet, you can use the
83 `AppendFuture*` methods no the builder, but you must use
pwnall 2017/01/20 02:10:55 no -> on?
dmurph 2017/01/20 20:23:16 Done.
84 `BlobStorageContext::BuildBlob`, and provide a callback that will notify you
85 when the blob system has enough quota to store the data. At that point you can
86 use the appropriate `BlobDataBuilder::Populate*` methods, and notify the
87 context by calling `BlobStorageContext::NotifyTransportComplete` when done.
88
pwnall 2017/01/20 02:10:55 In general, this sections seems to assume that I'v
dmurph 2017/01/20 20:23:15 Done.
89 ## Accessing / Reading
90
91 All blob information should come from the `BlobDataHandle` returned on
92 construction. This handle is cheap to copy. Once all instances of handles for
93 a blob are destructed, the blob is destroyed.
94
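As a rough analogy for the handle lifetime described above, the sketch below uses `std::shared_ptr` so that copying a handle only copies a pointer, and the record is destroyed when the last copy goes away. `BlobRecord` and `Handle` are hypothetical names for illustration; this is not how `BlobDataHandle` is actually implemented.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the browser-side record of one blob.
class BlobRecord {
 public:
  BlobRecord(std::string uuid, bool* destroyed)
      : uuid_(std::move(uuid)), destroyed_(destroyed) {}
  BlobRecord(const BlobRecord&) = delete;
  BlobRecord& operator=(const BlobRecord&) = delete;
  // Flags destruction so a caller can observe when the record goes away.
  ~BlobRecord() { *destroyed_ = true; }

  const std::string& uuid() const { return uuid_; }

 private:
  std::string uuid_;
  bool* destroyed_;
};

// Assumption: a handle is modeled as a shared pointer. Copying it copies a
// pointer and bumps a reference count; the blob data itself is not copied.
using Handle = std::shared_ptr<BlobRecord>;
```

Creating a handle with `std::make_shared<BlobRecord>("some-uuid", &flag)` and letting every copy go out of scope sets `flag` to true exactly once, when the last handle is destroyed.
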
95 `BlobDataHandle::RunOnConstructionComplete` will notify you when the blob is
96 done or broken (due to not enough space, filesystem error, etc).
pwnall 2017/01/20 02:10:54 done -> constructed? broken (construction failed
dmurph 2017/01/20 20:23:17 Done.
97
98 The `BlobReader` class is for reading blobs, and is accessible off of the
99 `BlobDataHandle` at any time.
100
101 # Blob Creation & Transportation (Renderer)
102
103 **This process is outlined with diagrams and illustrations [here](
104 https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit#slide=id.g75c319281_0_681).**
105
106 This outlines the renderer-side responsibilities of the blob system. The
107 renderer needs to:
108
109 1. Consolidate small bytes items into larger chunks (avoiding a huge array of
110 1 byte items).
111 2. Communicate the blob composition to the browser immediately on
112 construction.
113 3. Populate shared memory or files sent from the browser with the consolidated
114 blob data items.
115 4. Hold the blob data until the browser is finished requesting it.
116
117 The meat of blob construction starts in the [WebBlobRegistryImpl](
118 https://cs.chromium.org/chromium/src/content/child/blob_storage/webblobregistry_impl.h)'s
119 `createBuilder(uuid, content_type)`.
120
121 ## Blob Data Consolidation
122
123 Since blobs are often constructed with arrays with single bytes, we try to
124 consolidate all **adjacent** memory blob items into one. This is done in
125 [BlobConsolidation](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_consolidation.h).
126 The implementation doesn't actually do any copying or allocating of new memory
127 buffers, instead it facilitates the transformation between the 'consolidated'
128 blob items and the underlying bytes items. This way we don't waste any memory.
129
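The consolidation idea above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Item` type and a `Consolidate` helper that are not part of the patch: each consolidated item records which run of adjacent bytes items it covers, so no bytes are copied.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical item type: either in-memory bytes or a file reference.
struct Item {
  enum class Kind { kBytes, kFile };
  Kind kind;
  std::string_view bytes;  // Non-owning view; only meaningful for kBytes.
};

// A consolidated item records which source items it spans, not a copy of them.
struct ConsolidatedItem {
  Item::Kind kind;
  std::size_t first;   // Index of the first source item.
  std::size_t count;   // Number of adjacent source items covered.
  std::size_t length;  // Total byte length (0 for files in this sketch).
};

// Groups each run of adjacent bytes items into one consolidated item.
std::vector<ConsolidatedItem> Consolidate(const std::vector<Item>& items) {
  std::vector<ConsolidatedItem> out;
  for (std::size_t i = 0; i < items.size(); ++i) {
    const Item& item = items[i];
    if (item.kind == Item::Kind::kBytes && !out.empty() &&
        out.back().kind == Item::Kind::kBytes) {
      // Extend the current run; the underlying bytes stay where they are.
      out.back().count += 1;
      out.back().length += item.bytes.size();
    } else {
      out.push_back({item.kind, i, 1,
                     item.kind == Item::Kind::kBytes ? item.bytes.size() : 0});
    }
  }
  return out;
}
```

With this layout, four source items (bytes, bytes, file, bytes) become three consolidated items, and reading a consolidated item walks the original buffers through `first` and `count`.
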
130 ## Blob Transportation, Renderer
pwnall 2017/01/20 02:10:54 I think it'd be more consistent to end the heading
dmurph 2017/01/20 20:23:16 Done.
131
132 After the blob has been 'consolidated', it is given to the
133 [BlobTransportController](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.h).
134 This class:
135
136 1. Immediately communicates the contents of the blob to the Browser. We also
137 [optimistically send](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.cc?l=325)
138 the blob data if the total memory is less than our IPC threshold.
139 2. Stores the blob consolidation for data requests from the browser.
140 3. Answers requests from the browser to populate or send the blob data. The
141 browser can request the renderer:
142 1. Send items and populate the data in IPC ([code](
pwnall 2017/01/20 02:10:54 I think line-level links like this one are quite b
dmurph 2017/01/20 20:23:16 Done.
143 https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.cc?l=238)).
144 2. Populate items in shared memory and notify the browser when population is
145 complete ([code](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.cc?l=249)).
146 3. Populate items in files and notify the browser when population is complete
147 ([code](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.cc?l=292)).
148 4. Destroys the blob consolidation when the browser says it's done.
149
150 The transport controller also tries to keep the renderer alive while we are
151 sending blobs, as if the renderer is closed then we would lose any pending blob
152 data. It does this by using the [incrementing and decrementing the process ref
pwnall 2017/01/20 02:10:54 remove "using the"? also, ref -> reference?
dmurph 2017/01/20 20:23:16 Done.
153 count](https://cs.chromium.org/chromium/src/content/child/blob_storage/blob_transport_controller.cc?l=62),
154 which should prevent fast shutdown.
155
156 # Blob Transportation & Storage (Browser).
pwnall 2017/01/20 02:10:54 the period looks inappropriate here
dmurph 2017/01/20 20:23:16 Done.
157
158 The browser side is a little more complicated. We are thinking about:
159
160 1. Do we have enough space for this blob?
161 2. If so, how do we want to transport it? IPC? Shared Memory? IPC?
pwnall 2017/01/20 02:10:54 Do you mean "File" instead of the last "IPC"? Alt
dmurph 2017/01/20 20:23:16 Done.
162 3. Can I save this in memory right now? Or do I need to wait for older blob
pwnall 2017/01/20 02:10:55 Does this mean "Is there enough free memory to tra
dmurph 2017/01/20 20:23:16 Done.
163 data to be paged to disk?
164 4. Do I need to wait for files to be created?
165 5. Do I need to wait for dependent blobs?
166
167 ## Summary
168
169 We follow this general flow for constructing a blob on the browser side:
170
171 1. Determine whether the blob fits and which transportation strategy to use.
172 2. Create our browser-side representation of the blob data, including any data
pwnall 2017/01/20 02:10:54 our -> the
dmurph 2017/01/20 20:23:15 Done.
173 items from dependent blobs. We try to share data items as much as possible, and
pwnall 2017/01/20 02:10:54 Does the 2nd sentence here mean that data items ar
dmurph 2017/01/20 20:23:16 Done.
174 allow for the dependent blob items to be not populated yet.
175 3. Request memory and/or file quota from the BlobMemoryController, which
176 manages our blob storage limits. Quota can be requested for both transportation
pwnall 2017/01/20 02:10:54 can be requested -> is necessary?
dmurph 2017/01/20 20:23:16 Done.
177 and any copies we have to do from dependent blobs.
178 4. If transportation quota is needed and when it is granted:
179 1. Tell the BlobTransportHost to start asking for blob data given the earlier
180 decision of strategy.
181 * The BlobTransportHost populates the browser-side blob data item.
182 2. When transportation is done we notify the BlobStorageContext.
183 5. When transportation is done, copy quota is granted, and dependent blobs are
184 complete, we finish the blob.
185 1. We perform any pending copies from dependent blobs.
186 2. We notify any listeners that the blob has been completed.
187
188 Note: The transportation sections (steps 1, 2, 3) of this process are described
189 (without thinking about blob dependencies) with diagrams and details in [this
pwnall 2017/01/20 02:10:54 thinking about -> accounting for
dmurph 2017/01/20 20:23:17 Done.
190 presentation](https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit#slide=id.g75d5729ce_0_105).
191
192 ## BlobTransportHost
193
194 The `BlobTransportHost` is in charge of the actual transportation of the data
195 from the renderer to the browser. When the initial description of the blob is
pwnall 2017/01/20 02:10:55 I like "description of the blob" / "initial descri
dmurph 2017/01/20 20:23:16 Done.
196 sent to the browser, the BlobTransportHost asks the BlobMemoryController which
197 'strategy' (IPC, Shared Memory, or File) it should use to transport the file.
pwnall 2017/01/20 02:10:54 I don't think you need quotes here, you introduced
dmurph 2017/01/20 20:23:16 Done.
198 Based on this strategy it can transform the memory items sent from the renderer
pwnall 2017/01/20 02:10:54 transform -> translate?
dmurph 2017/01/20 20:23:17 Done.
199 into a browser representation to facilitate the transportation. See [this](
200 https://docs.google.com/presentation/d/1MOm-8kacXAon1L2tF6VthesNjXgx0fp5AP17L7XDPSM/edit#slide=id.g75d5729ce_0_145)
201 slide, which illustrates how the browser might segment or split up the
202 renderer's memory into transportable chunks.
203
204 Once the transport host decides it's strategy, it will create it's own
pwnall 2017/01/20 02:10:55 it's -> its (twice)
dmurph 2017/01/20 20:23:15 Done.
205 transport state for the blob, including a `BlobDataBuilder` using the
206 transport's data segment representation. Then it will tell the
207 `BlobStorageContext` that it is ready to build the blob.
208
209 When the `BlobStorageContext` tells the transport host that it is ready to
210 transport the blob data, this class's responsibility is to populate the
pwnall 2017/01/20 02:10:54 class' ? Or, better yet, "the transport host popu
dmurph 2017/01/20 20:23:15 Done.
211 `BlobDataBuilder` with all the data from the renderer, then signal the storage
212 context that it is done.
213
214 ## BlobStorageContext
215
216 The `BlobStorageContext` is the hub of the blob storage system. It is
217 responsible for creating & managing all the state of constructing blobs, as
218 well as all blob handle generation and general blob status access.
219
220 When a `BlobDataBuilder` is given to the context, whether from the
221 `BlobTransportHost` or from elsewhere, the context will do the following:
222
223 1. Find all dependent blobs in the new blob (any blob reference in the blob
224 item list), and create a 'slice' of their items for the new blob.
225 2. Create the final blob item list representation, which creates a new blob
226 item list which inserts these 'slice' items into the blob reference spots. This
227 is 'flattening' the blob.
228 3. Ask the `BlobMemoryController` for file or memory quota for the transportation
229 if necessary
230 * When this is approved, it notifies the `BlobTransportHost` that it can
pwnall 2017/01/20 02:10:55 it notifies -> notify (for consistency with the ot
dmurph 2017/01/20 20:23:16 Done.
231 begin transporting the data.
232 4. Ask the `BlobMemoryController` for memory quota for any copies necessary from
pwnall 2017/01/20 02:10:55 necessary for blob slicing?
dmurph 2017/01/20 20:23:17 Done.
233 the blob slicing.
234 5. Adds completion callbacks to any dependent blobs that our blob depends on.
pwnall 2017/01/20 02:10:54 the word "dependent" here seems redundant
dmurph 2017/01/20 20:23:16 Done.
235
236 When all of the following conditions are met:
237
238 1. The `BlobTransportHost` tells us it has transported all the data (or we
239 don't need to transport data),
240 2. The `BlobMemoryController` approves our memory quota for slice copies (or we
241 don't need slice copies), and
242 3. All dependent blobs are completed (or we don't have dependent blobs),
243
244 The blob can finish constructing, where any pending blob slice copies are
245 performed, and we set the status of the blob.
246
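The three completion conditions above amount to a simple predicate. The sketch below is only an illustration; `BuildingState` and its fields are hypothetical names, not the state that `BlobStorageContext` actually keeps.

```cpp
#include <cassert>

// Hypothetical per-blob construction state for this sketch.
struct BuildingState {
  bool transport_needed = false;
  bool transport_done = false;
  bool copy_quota_needed = false;
  bool copy_quota_granted = false;
  int pending_dependent_blobs = 0;

  // True once every condition is met or does not apply to this blob.
  bool CanFinish() const {
    const bool transport_ok = !transport_needed || transport_done;
    const bool quota_ok = !copy_quota_needed || copy_quota_granted;
    return transport_ok && quota_ok && pending_dependent_blobs == 0;
  }
};
```

A blob with no transport, no slice copies, and no dependencies can finish immediately; otherwise each callback updates one field and rechecks `CanFinish()`.
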
247 ### BlobStatus lifecycle
248
249 The BlobStatus outlines this procedure (specifically the transport process),
pwnall 2017/01/20 02:10:54 As a reader, I am unsure what "this procedure" ref
dmurph 2017/01/20 20:23:15 Done.
250 and the copy memory quota and dependent blob process is encompassed in
251 `PENDING_INTERNALS`.
252
253 Once a blob is finished constructing, the status is set to `DONE`, or any of
pwnall 2017/01/20 02:10:54 I think you can say "to `DONE`, or to one of the `
dmurph 2017/01/20 20:23:16 Done.
254 the `ERR_*` values if there was an error.
255
256 ### BlobSlice
257
258 During construction, 'slices' are created for dependent blobs using the given
pwnall 2017/01/20 02:10:54 I don't think slices needs quotes here. It's a con
dmurph 2017/01/20 20:23:16 Done.
259 offset and size of the reference. This slice consists of the relevant blob
260 items, and metadata about possible copies from either end. If blob items can
261 entirely be used by the new blob, then we just share the item between them. But
262 if there is a 'slice' of the first or last item, then our resulting BlobSlice
263 representation will create a new bytes item for the new blob, and store the
264 necessary copy data for later.
265
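To make the slicing rule concrete, here is a minimal sketch that treats every source item as a bytes item of known length. `SlicePiece` and `SliceItems` are hypothetical names; the sketch only computes which items can be shared whole and which need a partial copy at either end.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SlicePiece {
  std::size_t source_index;  // Index of the item in the source blob.
  std::uint64_t offset;      // Offset within that item.
  std::uint64_t length;      // Bytes taken from that item.
  bool shared;               // True when the whole item is reused as-is.
};

// Maps the byte range [slice_offset, slice_offset + slice_size) of a source
// blob onto its items. Only the first and last pieces can be partial.
std::vector<SlicePiece> SliceItems(
    const std::vector<std::uint64_t>& item_lengths,
    std::uint64_t slice_offset, std::uint64_t slice_size) {
  std::vector<SlicePiece> pieces;
  const std::uint64_t slice_end = slice_offset + slice_size;
  std::uint64_t item_start = 0;
  for (std::size_t i = 0;
       i < item_lengths.size() && item_start < slice_end; ++i) {
    const std::uint64_t item_end = item_start + item_lengths[i];
    const std::uint64_t begin = std::max(item_start, slice_offset);
    const std::uint64_t end = std::min(item_end, slice_end);
    if (begin < end) {
      const bool whole = (begin == item_start && end == item_end);
      pieces.push_back({i, begin - item_start, end - begin, whole});
    }
    item_start = item_end;
  }
  return pieces;
}
```

For items of length 10, 20, and 30, slicing at offset 5 with size 40 yields a 5-byte partial piece, the whole middle item shared, and a 15-byte partial piece. The partial pieces are the ones that would need copy quota.
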
266 ### BlobFlattener
267
268 The `BlobFlattener` takes the new blob description (including blob references),
269 creates blob slices for all the referenced blobs, and constructs a 'flat'
270 representation of the new blob, where all blob references are replaced with the
pwnall 2017/01/20 02:10:54 remove "the"?
dmurph 2017/01/20 20:23:15 Done.
271 'BlobSlice' items. It also stores any copy data from the slices.
pwnall 2017/01/20 02:10:55 I think you want backticks instead of single quote
dmurph 2017/01/20 20:23:16 Done.
272
273 ## BlobMemoryController
274
275 The `BlobMemoryController` is responsible for:
276
277 1. Determining storage quota limits for files and memory, including restricting
278 file quota when disk space is low.
279 2. Determining whether a blob can fit and the transportation strategy to use.
280 3. Allocating memory quota.
pwnall 2017/01/20 02:10:55 It seems to me that "tracking" is a slightly bette
dmurph 2017/01/20 20:23:16 Done.
281 4. Allocating file quota and creating files.
282 5. Accumulating and evicting old blob data to files on disk.
283
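One way to picture the "does it fit, and which strategy" decision is the sketch below. The `Limits` fields, the `Strategy` enum, and the ordering of the checks are assumptions made for illustration; the real thresholds and logic live in `BlobMemoryController` and may differ.

```cpp
#include <cassert>
#include <cstdint>

enum class Strategy { kTooLarge, kIpc, kSharedMemory, kFile };

// All limits are hypothetical parameters for this sketch.
struct Limits {
  std::uint64_t max_ipc_size;            // Largest payload sent inline in IPC.
  std::uint64_t available_memory_quota;  // Bytes still allowed in memory.
  std::uint64_t available_file_quota;    // Bytes still allowed on disk.
};

// Prefers memory transport when the blob fits in memory, falls back to
// files, and reports kTooLarge when neither quota can hold the blob.
Strategy ChooseStrategy(std::uint64_t blob_size, const Limits& limits) {
  if (blob_size <= limits.available_memory_quota) {
    return blob_size <= limits.max_ipc_size ? Strategy::kIpc
                                            : Strategy::kSharedMemory;
  }
  if (blob_size <= limits.available_file_quota) {
    return Strategy::kFile;
  }
  return Strategy::kTooLarge;
}
```

Keeping the decision a pure function of the blob size and the current limits makes it easy to test against different quota settings.
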