|
|
Created:
11 years, 1 month ago by vandebo (ex-Chrome) Modified:
9 years, 7 months ago CC:
chromium-reviews_googlegroups.com, darin (slow to review), ukai Base URL:
svn://chrome-svn/chrome/trunk/src/ Visibility:
Public. |
DescriptionHandle out of memory in GrowableIOBuffer more gracefully.
BUG=25826
TEST=none
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=30287
Patch Set 1 #Patch Set 2 : '' #
Total comments: 11
Patch Set 3 : '' #
Total comments: 7
Messages
Total messages: 16 (0 generated)
LGTM. I'm not sure if we need to handle realloc failures because we don't handle 'new' failures. Also, websocket.cc isn't checking the return value of SetCapacity. http://codereview.chromium.org/338049/diff/3001/3004 File net/base/io_buffer.cc (right): http://codereview.chromium.org/338049/diff/3001/3004#newcode24 Line 24: char * cur_buf = real_data_.release(); Nit: remove the space before the '*'. It may be better to do something like this to update real_data_ only when realloc succeeds: char* new_buf = static_cast<char*>(realloc(real_data_.get(), capacity)); if (new_buf == NULL && capacity != 0) return false; real_data_.reset(new_buf); capacity_ = capacity; ... http://codereview.chromium.org/338049/diff/3001/3005 File net/base/io_buffer.h (right): http://codereview.chromium.org/338049/diff/3001/3005#newcode110 Line 110: // realloc memory to the specified capacity. Nit: add a blank line before the comment. Nit: please document the return value and what happens when it returns false. http://codereview.chromium.org/338049/diff/3001/3002 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/3001/3002#newcode211 Line 211: if (read_buf_->RemainingCapacity() == 0) Nit: let's add curly braces around the inner if statement because it has two lines now. http://codereview.chromium.org/338049/diff/3001/3002#newcode304 Line 304: // Shrinking the buffer should always work. This isn't documented, but even if SetCapacity returns false, we know the current buffer will stay intact. http://codereview.chromium.org/338049/diff/3001/3002#newcode395 Line 395: return result; Are you sure it's correct to return 'result' when we can't copy the extra data? (I don't know the answer, so I just wanted to make sure you considered this issue.)
> I'm not sure if we need to handle realloc failures because > we don't handle 'new' failures. Also, websocket.cc isn't > checking the return value of SetCapacity. I didn't think it was important to handle this either, at first. But it seems that users are actually hitting it. So by pushing the error up the call stack, we have a chance of giving the user an error message instead of just crashing. i.e. minimal effort to handle the OOM condition seems reasonable. re websocket.cc, it wasn't handling the OOM condition before either. It's not clear to me what to do, so I've cc'd ukai@ so that it can be addressed, if appropriate. http://codereview.chromium.org/338049/diff/3001/3004 File net/base/io_buffer.cc (right): http://codereview.chromium.org/338049/diff/3001/3004#newcode24 Line 24: char * cur_buf = real_data_.release(); On 2009/10/27 21:28:29, wtc wrote: > Nit: remove the space before the '*'. Done. > It may be better to do something like this to > update real_data_ only when realloc succeeds: > > char* new_buf = static_cast<char*>(realloc(real_data_.get(), capacity)); > if (new_buf == NULL && capacity != 0) > return false; > real_data_.reset(new_buf); > capacity_ = capacity; > ... Sounds good, but after trying it, there's a double free. http://codereview.chromium.org/338049/diff/3001/3005 File net/base/io_buffer.h (right): http://codereview.chromium.org/338049/diff/3001/3005#newcode110 Line 110: // realloc memory to the specified capacity. On 2009/10/27 21:28:29, wtc wrote: > Nit: add a blank line before the comment. > > Nit: please document the return value and what happens when > it returns false. Done. http://codereview.chromium.org/338049/diff/3001/3002 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/3001/3002#newcode211 Line 211: if (read_buf_->RemainingCapacity() == 0) On 2009/10/27 21:28:29, wtc wrote: > Nit: let's add curly braces around the inner if statement > because it has two lines now. Done. http://codereview.chromium.org/338049/diff/3001/3002#newcode304 Line 304: // Shrinking the buffer should always work. On 2009/10/27 21:28:29, wtc wrote: > This isn't documented, but even if SetCapacity returns false, > we know the current buffer will stay intact. Done. http://codereview.chromium.org/338049/diff/3001/3002#newcode395 Line 395: return result; On 2009/10/27 21:28:29, wtc wrote: > Are you sure it's correct to return 'result' when we > can't copy the extra data? (I don't know the answer, so I > just wanted to make sure you considered this issue.) Yes. |result| bytes are in the user buffer. The following code is to copy the data beyond |result| bytes into the read_buffer so they could be used for the next (pipelined) request.
It's strange that users are running out of memory, and among all the memory allocations we do in Chrome, they're hitting realloc failures in GrowableIOBuffer. http://codereview.chromium.org/338049/diff/3001/3004 File net/base/io_buffer.cc (right): http://codereview.chromium.org/338049/diff/3001/3004#newcode24 Line 24: char * cur_buf = real_data_.release(); On 2009/10/27 22:08:01, vandebo wrote: > > Sounds good, but after trying it, there's a double free. Ah, you're right. We need real_data_.release(); // Already freed by realloc. before real_data_.reset(new_buf); You can just use your current code if you prefer. http://codereview.chromium.org/338049/diff/4001/5003#newcode31 Line 31: CHECK(real_data_.get() != NULL || capacity == 0); You can remove this CHECK now because its expression is guaranteed to be true by the if statement you added. http://codereview.chromium.org/338049/diff/4001/5004 File net/base/io_buffer.h (right): http://codereview.chromium.org/338049/diff/4001/5004#newcode110 Line 110: // realloc memory to the specified capacity. On failure, the capacity You need to document that true means success and false means failure. http://codereview.chromium.org/338049/diff/4001/5001 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/4001/5001#newcode305 Line 305: // Ok if this fails, since it only shrinks the buffer. Sorry, I just realized that if this fails, the original capacity will be wrong. So please change the comment back to your orignal comment.
On 2009/10/27 22:42:10, wtc wrote: > http://codereview.chromium.org/338049/diff/4001/5003#newcode31 > Line 31: CHECK(real_data_.get() != NULL || capacity == 0); > You can remove this CHECK now because its expression is > guaranteed to be true by the if statement you added. Done. > http://codereview.chromium.org/338049/diff/4001/5004#newcode110 > Line 110: // realloc memory to the specified capacity. On failure, the capacity > You need to document that true means success and false means > failure. Done. > http://codereview.chromium.org/338049/diff/4001/5001#newcode305 > Line 305: // Ok if this fails, since it only shrinks the buffer. > Sorry, I just realized that if this fails, the original > capacity will be wrong. So please change the comment back > to your orignal comment. I don't know what you mean by "wrong." Both comments are technically correct. What I changed it to is the more pertinent reason.
http://codereview.chromium.org/338049/diff/4001/5001 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/4001/5001#newcode305 Line 305: // Ok if this fails, since it only shrinks the buffer. The memmove call at line 301 above moves extra_bytes of data to the beginning of read_buf_. This suggests to me that the first extra_bytes of data in read_buf_ will be used later. How do we know how many of bytes of extra data are in read_buf_? Do we call read_buf_->capacity()? If so, read_buf_->capacity() needs to return extra_bytes, and a read_buf_->SetCapacity(extra_bytes) failure will be problematic. Or do we compute read_buf_->offset() - read_buf_unused_offset_? If so, then it seems that we should also call read_buf_->set_offset(extra_bytes); here so that read_buf_->offset() - read_buf_unused_offset_ will be extra_bytes.
http://codereview.chromium.org/338049/diff/4001/5001 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/4001/5001#newcode305 Line 305: // Ok if this fails, since it only shrinks the buffer. On 2009/10/28 01:38:44, wtc wrote: > The memmove call at line 301 above moves extra_bytes of data > to the beginning of read_buf_. > > This suggests to me that the first extra_bytes of data in > read_buf_ will be used later. How do we know how many > of bytes of extra data are in read_buf_? > > Do we call read_buf_->capacity()? If so, > read_buf_->capacity() needs to return extra_bytes, > and a read_buf_->SetCapacity(extra_bytes) failure > will be problematic. Ahh, indeed you are correct. I will change it back to the previous comment.
http://codereview.chromium.org/338049/diff/4001/5001 File net/http/http_stream_parser.cc (right): http://codereview.chromium.org/338049/diff/4001/5001#newcode305 Line 305: // Ok if this fails, since it only shrinks the buffer. On 2009/10/28 01:44:54, vandebo wrote: > > Ahh, indeed you are correct. I will change it back to the previous comment. I found that the read_buf_->set_offset(extra_bytes); call that I suggested is taken care of by read_buf_->SetCapacity(extra_bytes). I have to admit I don't know the code very well.
This seems wrong to me. We don't handle OOM in any other part of the chrome code, but rather just crash. I am pretty uneasy about trying to paper-over an out of memory here, especially by doing a no-op that callers may not check for. It seems more likely that something else is going on here. For example, it could be that the caller of set_capacity() is not bounding their buffer size, and trying to do a really big allocation which is going to fail. In that case, the bug is in the caller and not this function...
In the team meeting today we saw that half of render crashes are due to out of memory conditions, so I'm willing to believe that the error reports listed in the bug are genuine OOM conditions. The active callers to SetCapacity bound the argument by either kMaxHeaderBufSize or the read buffer size passed into HttpNetworkTransaction, which hasn't changed. WTC also had reservations. My argument for this change was that it's minimal effort to push this error up the stack a level. I am willing to change this to a CHECK that also logs the request size because as you say the caller could be passing a bogus capacity.
First off, the large number of OOM we see are all in the renderer. There are very few in the browser process, which is where we are hitting this. Moreover, we hook the allocators for new/realloc, so if the underlying allocation actually did fail, we would have crashed at that point and would never have had a chance to even test the return value of malloc (this is done intentionally since callers assume allocs succeed). One thing that jumps out at me, is set_capacity() takes an INT. However realloc takes a size_t. This means we are implicitly casting to an unsigned type, and could end up calling realloc with a mega-large number (causing it to fail). This of course would be an error in the caller. Another idea, is what happens if the existing allocated buffer is NULL? In this case we call realloc(NULL, X).
I missed the possibility that |capacity| could be a bogus (e.g., negative) value when I reviewed this CL. It's not necessary to back out the out-of-memory error handling code, but let's add CHECKs (not DCHECKs) to ensure that the second argument we're passing to realloc is sane. http://codereview.chromium.org/338049/diff/4001/5003 File net/base/io_buffer.cc (right): http://codereview.chromium.org/338049/diff/4001/5003#newcode23 Line 23: DCHECK_GE(capacity, 0); Let's change this DCHECK to a CHECK.
On 2009/11/07 02:37:18, eroman wrote: > First off, the large number of OOM we see are all in the renderer. There are > very few in the browser process, which is where we are hitting this. Oops, right. > Moreover, we hook the allocators for new/realloc, so if the underlying > allocation actually did fail, we would have crashed at that point and would > never have had a chance to even test the return value of malloc (this is done > intentionally since callers assume allocs succeed). Really? Code you point me to this code? A quick search didn't turn it up. If this is the case then developers should be made more aware of it. Initially I asked a few people around me about out of memory conditions and none of the responses suggested an assert/check already exists on allocations succeeding. > One thing that jumps out at me, is set_capacity() takes an INT. > However realloc takes a size_t. > This means we are implicitly casting to an unsigned type, and could end up > calling realloc with a mega-large number (causing it to fail). This of course > would be an error in the caller. Yes, a small negative capacity on a 32bit platform would yield a 2G allocation. I had initially used a size_t for capacity. > Another idea, is what happens if the existing allocated buffer is NULL? > In this case we call realloc(NULL, X). This should be treated the same as malloc(X). On 2009/11/09 22:37:23, wtc wrote: > I missed the possibility that |capacity| could be a > bogus (e.g., negative) value when I reviewed this CL. > It's not necessary to back out the out-of-memory error > handling code, but let's add CHECKs (not DCHECKs) to > ensure that the second argument we're passing to realloc > is sane. I will send a CL shortly.
> Really? Code you point me to this code? > A quick search didn't turn it up. Yeah, it is hard to find unless you know exactly what to look for! (jamesr helped me understand where all the bits are). [assuming Windows:] In chrome_dll_main.cc, see RegisterInvalidParamHandler() which registers OnNoMemory(), which calls __debugbreak(), which crashes release builds. Next up, see third_party/tcmalloc/allocator_shim.cc -- this is where the alternate implementations of new/malloc/free/realloc are given.
On 2009/11/10 00:24:55, eroman wrote: > > Really? Code you point me to this code? > > A quick search didn't turn it up. > > Yeah, it is hard to find unless you know exactly what to look for! (jamesr > helped me understand where all the bits are). > > [assuming Windows:] > > In chrome_dll_main.cc, see RegisterInvalidParamHandler() which registers > OnNoMemory(), which calls __debugbreak(), which crashes release builds. > > Next up, see third_party/tcmalloc/allocator_shim.cc -- this is where the > alternate implementations of new/malloc/free/realloc are given. Does that mean that Linux and Mac don't yet do this? It seems that it's done to prevent security exploits in Webkit, so this should probably be a release-block bug. I see how to do it on Linux and can write a CL if it needs to be done.
> Does that mean that Linux and Mac don't yet do this? It seems that it's done to > prevent security exploits in Webkit, so this should probably be a release-block > bug. I see how to do it on Linux and can write a CL if it needs to be done. Indeed, it doesn't appear that Mac/Linux are doing this! I have filed: http://crbug.com/27222 I agree that this is important. |