|
|
Created:
5 years ago by ccameron Modified:
5 years ago CC:
chromium-reviews, danakj+watch_chromium.org, darin-cc_chromium.org, jam, jbauman+watch_chromium.org, kalyank, piman+watch_chromium.org, sievers+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionMac: Require child AllocateGpuMemoryBuffer to not fail
It is already the case that this will cause a crash in the renderer
process when it dereferences the GpuMemoryBuffer. Move this higher up
the stack.
BUG=561672
Committed: https://crrev.com/d34a0fd2b9080114bd22ad148f8e37da2b3a24f1
Cr-Commit-Position: refs/heads/master@{#363222}
Patch Set 1 #Patch Set 2 : Add GPU proc as well #
Total comments: 7
Patch Set 3 : Just crash ChildGpuMemoryBufferManager::AllocateGpuMemoryBuffer #
Total comments: 2
Messages
Total messages: 20 (6 generated)
ccameron@chromium.org changed reviewers: + reveman@chromium.org
This will take down renderer processes on failure. If we truly believe that GMB allocation should never fail, then this is the right thing to do.
https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... File content/child/child_gpu_memory_buffer_manager.cc (right): https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... content/child/child_gpu_memory_buffer_manager.cc:49: CHECK(!handle.is_null()); It would be useful to know what arguments such as 'size' are when this fails. Fyi, if CHECK(success) fails then we have the same problem in other manager implementations. ChildDiscardableSharedMemoryManager, ChildSharedBitmapManager.. https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... content/child/child_gpu_memory_buffer_manager.cc:54: if (!buffer) { This doesn't make sense if we have a CHECK below. Maybe just return GpuMemoryBufferImpl::CreateFromHandle(handle, size, ...) here and have the caller CHECK the result? https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... File content/child/child_io_surface_manager_mac.h (right): https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... content/child/child_io_surface_manager_mac.h:31: // This will always return a valid IOSurface (failure will crash). We need to be able to use this in the GPU process without crashing on failure to solve the problem where GMBs can be become invalid after a gpu process crash. Is it enough for now to add CHECKs to the user of this interface? If we need CHECKs here too then let's make it clear that they are only to temporarily diagnose some crashes by adding TODOs and keeping as much of the existing code as is. https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... File content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc (right): https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc:58: CHECK(io_surface); I think this needs to be able to fail without crashing as this client interface is also used in the browser process and potentially for importing buffers that are not guaranteed to be valid. Can we move this CHECK up to some renderer side code that should never fail? https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... File content/common/gpu/gpu_memory_buffer_factory_io_surface.cc (right): https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:53: CHECK(io_surface); This makes it possible for a malicious renderer to crash the GPU process by requesting an IOSurface to be allocated that is too large. https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:58: CHECK(register_result); Is it possible that the browser decides to start a new gpu process before this process has been shutdown? This can fail in that case without being a real problem. https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:65: io_surfaces_[key] = io_surface; FYI, storing the IOSurface here is actually a bug as it it allows an IOSurface to become invalid (if the gpu process crashes) after successfully allocating a GMB. I don't think this affects the user experience today but it means that we need to be handle failure to create a GLImage.
I feel that we are going around in circles here. On 2015/12/01 20:49:04, reveman wrote: > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > File content/child/child_gpu_memory_buffer_manager.cc (right): > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > content/child/child_gpu_memory_buffer_manager.cc:49: CHECK(!handle.is_null()); > It would be useful to know what arguments such as 'size' are when this fails. > > Fyi, if CHECK(success) fails then we have the same problem in other manager > implementations. ChildDiscardableSharedMemoryManager, ChildSharedBitmapManager.. Isn't that what you wanted? Why should just the IOSurface implementation be required to never fail? > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > content/child/child_gpu_memory_buffer_manager.cc:54: if (!buffer) { > This doesn't make sense if we have a CHECK below. Maybe just > return GpuMemoryBufferImpl::CreateFromHandle(handle, size, ...) > here and have the caller CHECK the result? True, doing the last IPC before crashing isn't necessary. > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... > File content/child/child_io_surface_manager_mac.h (right): > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... > content/child/child_io_surface_manager_mac.h:31: // This will always return a > valid IOSurface (failure will crash). > We need to be able to use this in the GPU process without crashing on failure to > solve the problem where GMBs can be become invalid after a gpu process crash. > > Is it enough for now to add CHECKs to the user of this interface? If we need > CHECKs here too then let's make it clear that they are only to temporarily > diagnose some crashes by adding TODOs and keeping as much of the existing code > as is. The code is currently written to return nullptr for GMB::Allocate if the GPU process crashes at various points, so I don't understand why we are writing the code to assume that it will not return nullptr. > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... > File content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc (right): > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... > content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc:58: > CHECK(io_surface); > I think this needs to be able to fail without crashing as this client interface > is also used in the browser process and potentially for importing buffers that > are not guaranteed to be valid. Can we move this CHECK up to some renderer side > code that should never fail? > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > File content/common/gpu/gpu_memory_buffer_factory_io_surface.cc (right): > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:53: > CHECK(io_surface); > This makes it possible for a malicious renderer to crash the GPU process by > requesting an IOSurface to be allocated that is too large. > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:58: > CHECK(register_result); > Is it possible that the browser decides to start a new gpu process before this > process has been shutdown? This can fail in that case without being a real > problem. > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:65: io_surfaces_[key] > = io_surface; > FYI, storing the IOSurface here is actually a bug as it it allows an IOSurface > to become invalid (if the gpu process crashes) after successfully allocating a > GMB. I don't think this affects the user experience today but it means that we > need to be handle failure to create a GLImage. This is, in effect, what I was mentioning in back in https://codereview.chromium.org/1438273002/ We need to be robust to the case of the GMB failing to allocate -- there are legitimate reasons for this to happen, including the GPU process crashing at various points. We need to investigate why the GMB allocation failure is occurring, but, we cannot write the renderer-side code assuming that it will never happen. Ultimately, I think we really need to be okay the patch in https://codereview.chromium.org/1438273002/ -- there are legitimate reasons for the allocation to fail. We need to investigate why the allocation failures are happening with such frequency on Mac. Also, to take a step back on the design, why is it necessary to allocate the IOSurfaces in any process other than the renderer process? Is there something about the Mach sequence that prevents this?
On 2015/12/01 at 21:18:34, ccameron wrote: > I feel that we are going around in circles here. > > On 2015/12/01 20:49:04, reveman wrote: > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > > File content/child/child_gpu_memory_buffer_manager.cc (right): > > > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > > content/child/child_gpu_memory_buffer_manager.cc:49: CHECK(!handle.is_null()); > > It would be useful to know what arguments such as 'size' are when this fails. > > > > Fyi, if CHECK(success) fails then we have the same problem in other manager > > implementations. ChildDiscardableSharedMemoryManager, ChildSharedBitmapManager.. > > Isn't that what you wanted? Why should just the IOSurface implementation be required to never fail? Sorry for the confusion. I think it's correct to fail hard here. This was just an FYI that if you see the first CHECK fail here, the "CHECK(success)", then you've discovered an issue that goes beyond GMBs as other managers (non-GMB such as discardable memory manager) currently assume that Send never fails and doesn't have this CHECK. > > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_gpu... > > content/child/child_gpu_memory_buffer_manager.cc:54: if (!buffer) { > > This doesn't make sense if we have a CHECK below. Maybe just > > return GpuMemoryBufferImpl::CreateFromHandle(handle, size, ...) > > here and have the caller CHECK the result? > > True, doing the last IPC before crashing isn't necessary. > > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... > > File content/child/child_io_surface_manager_mac.h (right): > > > > https://codereview.chromium.org/1486873002/diff/20001/content/child/child_io_... > > content/child/child_io_surface_manager_mac.h:31: // This will always return a > > valid IOSurface (failure will crash). > > We need to be able to use this in the GPU process without crashing on failure to > > solve the problem where GMBs can be become invalid after a gpu process crash. > > > > Is it enough for now to add CHECKs to the user of this interface? If we need > > CHECKs here too then let's make it clear that they are only to temporarily > > diagnose some crashes by adding TODOs and keeping as much of the existing code > > as is. > > The code is currently written to return nullptr for GMB::Allocate if the GPU process crashes at various points, so I don't understand why we are writing the code to assume that it will not return nullptr. It used to be the case that a GPU process crash could result in GMB allocation failure but that made it impossible to reliably allocate GMBs and know that they are associated with the current instance of the GPU process. GMB allocation will now re-launch the gpu process as needed and a GMB is not supposed to be tied to a specific instance of the gpu process. The alternative is to add some form of GMB lost notification mechanism similar to how we have a context lost notifications but I'd like to avoid that if possible. If there are cases where GMB allocation fails then we need to understand why. It might be some gpu process race condition that we haven't thought of yet. > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... > > File content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc (right): > > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/clie... > > content/common/gpu/client/gpu_memory_buffer_impl_io_surface.cc:58: > > CHECK(io_surface); > > I think this needs to be able to fail without crashing as this client interface > > is also used in the browser process and potentially for importing buffers that > > are not guaranteed to be valid. Can we move this CHECK up to some renderer side > > code that should never fail? > > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > > File content/common/gpu/gpu_memory_buffer_factory_io_surface.cc (right): > > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:53: > > CHECK(io_surface); > > This makes it possible for a malicious renderer to crash the GPU process by > > requesting an IOSurface to be allocated that is too large. > > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:58: > > CHECK(register_result); > > Is it possible that the browser decides to start a new gpu process before this > > process has been shutdown? This can fail in that case without being a real > > problem. > > > > https://codereview.chromium.org/1486873002/diff/20001/content/common/gpu/gpu_... > > content/common/gpu/gpu_memory_buffer_factory_io_surface.cc:65: io_surfaces_[key] > > = io_surface; > > FYI, storing the IOSurface here is actually a bug as it it allows an IOSurface > > to become invalid (if the gpu process crashes) after successfully allocating a > > GMB. I don't think this affects the user experience today but it means that we > > need to be handle failure to create a GLImage. > > This is, in effect, what I was mentioning in back in > https://codereview.chromium.org/1438273002/ > We need to be robust to the case of the GMB failing to allocate -- there are legitimate reasons for this to happen, including the GPU process crashing at various points. We need to investigate why the GMB allocation failure is occurring, but, we cannot write the renderer-side code assuming that it will never happen. > > Ultimately, I think we really need to be okay the patch in https://codereview.chromium.org/1438273002/ -- there are legitimate reasons for the allocation to fail. We need to investigate why the allocation failures are happening with such frequency on Mac. If we allow allocations to fail then it becomes harder to ensure that it only fails for legitimate reasons and I think the code becomes more complicated as we have to deal with the error conditions in an appropriate way. If there are legitimate reasons for failure, e.g. a limit to IOSurface dimensions then I'd prefer if we knew about this in the renderer and never tried to allocate GMBs that violate these limits. > > Also, to take a step back on the design, why is it necessary to allocate the IOSurfaces in any process other than the renderer process? Is there something about the Mach sequence that prevents this? It might be possible to allocate an IOSurface in the renderer. I don't remember if the sandbox allows this or not. Ideally it would not be allowed as that probably means that the renderer can also access global IOSurfaces used in other apps on your mac. The current mechanism for IOSurface backed GMBs was created to match other platforms as close as possible to make it easier to ensure that different type of buffers behave the same way. We still need something like the current IOSurfaceManager system to provide sharing of IOsurface in a secure way between processes but the work currently done in the GPU process could be moved to the browser process and maybe even the renderer if the sandbox allows it.
Drive-by from the peanut gallery. On 2015/12/02 00:18:49, reveman wrote: > > Also, to take a step back on the design, why is it necessary to allocate the > IOSurfaces in any process other than the renderer process? Is there something > about the Mach sequence that prevents this? > > It might be possible to allocate an IOSurface in the renderer. I don't remember > if the sandbox allows this or not. Ideally it would not be allowed as that > probably means that the renderer can also access global IOSurfaces used in other > apps on your mac. The current mechanism for IOSurface backed GMBs was created to > match other platforms as close as possible to make it easier to ensure that > different type of buffers behave the same way. We still need something like the > current IOSurfaceManager system to provide sharing of IOsurface in a secure way > between processes but the work currently done in the GPU process could be moved > to the browser process and maybe even the renderer if the sandbox allows it. IOSurface allocation requires access to the IOSurfaceRootUserClient, which is not permitted by the renderer sandbox, nor do we want this to permitted. Privileged resources need to be brokered into the renderer process. Also, on principal this change seems like it's at the wrong layer. Why not just CHECK the result of IOSurfaceManager::AcquireIOSurface where necessary, rather than making the API crashy? The latter also makes it harder to write good tests.
I've updated the patch to just CHECK inside ChildGpuMemoryBufferManager::AllocateGpuMemoryBuffer on failure. PTAL. I suspect that we'll see a ton of crashes come in, because the other main caller of this, ZeroCopyTileTaskWorkerPool, does not assume that the GMB pointer will be (see https://code.google.com/p/chromium/codesearch#chromium/src/cc/raster/zero_cop...). This will at least distinguish between "failed to allocate the IOSurface" and "failed to open the IOSurface in the renderer".
lgtm with nit On 2015/12/04 at 01:12:21, ccameron wrote: > I've updated the patch to just CHECK inside ChildGpuMemoryBufferManager::AllocateGpuMemoryBuffer on failure. PTAL. > > I suspect that we'll see a ton of crashes come in, because the other main caller of this, ZeroCopyTileTaskWorkerPool, does not assume that the GMB pointer will be (see https://code.google.com/p/chromium/codesearch#chromium/src/cc/raster/zero_cop...). That's another leftover from the time when GMB allocation could fail if the gpu process crashed but should be changed to a DCHECK now. > > This will at least distinguish between "failed to allocate the IOSurface" and "failed to open the IOSurface in the renderer". Ok, sgtm. https://codereview.chromium.org/1486873002/diff/40001/content/child/child_gpu... File content/child/child_gpu_memory_buffer_manager.cc (right): https://codereview.chromium.org/1486873002/diff/40001/content/child/child_gpu... content/child/child_gpu_memory_buffer_manager.cc:54: CHECK(buffer); nit: I don't think this should be able to fail unless there's a bug in our code and we shouldn't be using CHECKs for that in the general case. If you think this check is useful for temporarily diagnosing GMB allocation failure then add a TODO here about removing it later. fyi, the CHECK(!handle.is_null()) above is different as that can fail as a result of running out of os resources (e.g. system memory) which is not something we can fully control and it's best to catch that as early as possible.
Description was changed from ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 ========== to ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 ==========
ccameron@chromium.org changed reviewers: + avi@chromium.org
Thanks! Adding avi@ for content/ OWNERship. https://codereview.chromium.org/1486873002/diff/40001/content/child/child_gpu... File content/child/child_gpu_memory_buffer_manager.cc (right): https://codereview.chromium.org/1486873002/diff/40001/content/child/child_gpu... content/child/child_gpu_memory_buffer_manager.cc:54: CHECK(buffer); On 2015/12/04 01:47:04, reveman wrote: > nit: I don't think this should be able to fail unless there's a bug in our code > and we shouldn't be using CHECKs for that in the general case. If you think this > check is useful for temporarily diagnosing GMB allocation failure then add a > TODO here about removing it later. > > fyi, the CHECK(!handle.is_null()) above is different as that can fail as a > result of running out of os resources (e.g. system memory) which is not > something we can fully control and it's best to catch that as early as possible. I could picture it failing if, for instance, IOSurfaceLookupFromMachPort (from ChildIOSurfaceManager::AcquireIOSurface, from GpuMemoryBufferImplIOSurface::CreateFromHandle) requires allocating some sorts of resources. I think that failure at creation time is more likely, but I'm still curious if this will be hit.
lgtm ok
Thanks!
The CQ bit was checked by ccameron@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1486873002/40001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1486873002/40001
Message was sent while issue was closed.
Description was changed from ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 ========== to ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 ==========
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
Description was changed from ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 ========== to ========== Mac: Require child AllocateGpuMemoryBuffer to not fail It is already the case that this will cause a crash in the renderer process when it dereferences the GpuMemoryBuffer. Move this higher up the stack. BUG=561672 Committed: https://crrev.com/d34a0fd2b9080114bd22ad148f8e37da2b3a24f1 Cr-Commit-Position: refs/heads/master@{#363222} ==========
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/d34a0fd2b9080114bd22ad148f8e37da2b3a24f1 Cr-Commit-Position: refs/heads/master@{#363222}
Message was sent while issue was closed.
A revert of this CL (patchset #3 id:40001) has been created in https://codereview.chromium.org/1507893004/ by ccameron@chromium.org. The reason for reverting is: Causing lots of crashes.. |