|
|
Chromium Code Reviews|
Created:
4 years, 4 months ago by liamjm (20p) Modified:
4 years, 2 months ago CC:
chromium-reviews, tracing+reviews_chromium.org, wfh+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Descriptionwinheap_dump: handle errors gracefully
In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc
The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not.
It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them.
The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully.
It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done.
[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng
BUG=464430
Committed: https://crrev.com/c56e1ffa7c82070f974f065c1a5b0bc99165d32e
Cr-Commit-Position: refs/heads/master@{#425523}
Patch Set 1 #Patch Set 2 : winheap_dump: handle errors gracefully #Patch Set 3 : Merge branch 'master' of https://chromium.googlesource.com/chromium/src into winheap_dump #
Total comments: 12
Patch Set 4 : changes from review #Patch Set 5 : rebase and move changes to new file as old one was deleted #Patch Set 6 : adjust line breaks on comment #
Total comments: 2
Patch Set 7 : switch params to check, and match signedness #Messages
Total messages: 26 (10 generated)
Description was changed from ========== winheap_dump: handle errors gracefully fix quoting in gfx.gyp BUG= ========== to ========== In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 BUG=464430 ==========
liamjm@chromium.org changed reviewers: + wfh@chromium.org
Description was changed from ========== In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 BUG=464430 ========== to ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 BUG=464430 ==========
liamjm@chromium.org changed reviewers: + danakj@chromium.org
danakj@chromium.org: Please review changes in
(background) This l-g-t-m as the approach we think is best to unblock the CSRSS lockdown work, but passing to Dana to decide as OWNER. Thanks!
> As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. Why not ensure that at most one heap was not dumped successfully (the CSRSS one)? https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... File base/trace_event/winheap_dump_provider_win.cc (right): https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:96: CHECK(all_heap_info.allocated_size != 0); CHECK_NE will give better failure output https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:105: // NOTE: bug 464430 nit: crbug.com/464430 https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:106: // As a part of the CSRSS lockdown in the referenced bug, it will invalidate Please explain/write out what CSRSS is here, acronyms are hard. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:114: if (::IsBadReadPtr(heap_info->heap_id, 0x100)) { How confident are you that 0x100 is a good size? Why not just 1?
https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... File base/trace_event/winheap_dump_provider_win.cc (right): https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:114: if (::IsBadReadPtr(heap_info->heap_id, 0x100)) { On 2016/08/15 23:42:03, danakj wrote: > How confident are you that 0x100 is a good size? Why not just 1? Also it's not clear to me after reading your patch description and the comment here why this call is needed at all. The description says that HeapLock will fail if the heap is the deleted CSRSS one. And this *is* racey, a heap could become bad after this returned true, right? https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:117: // This is implicitly checks certain aspects of the HEAP structure, such as nit: This implicitly checks
(Context: liamjm is only 20% on chrome security so replies to this CL might come in a few weeks)
On Tue, Aug 16, 2016 at 9:08 AM, <wfh@chromium.org> wrote: > (Context: liamjm is only 20% on chrome security so replies to this CL > might come > in a few weeks) > ok thx :) > > https://codereview.chromium.org/2242953002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Hi all, I'm heading out on vacation for the month of September. I really wanted to get back to this CL, but my 80% had lots of things to deal with. Thanks for your patience.
https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... File base/trace_event/winheap_dump_provider_win.cc (right): https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:96: CHECK(all_heap_info.allocated_size != 0); On 2016/08/15 23:42:03, danakj wrote: > CHECK_NE will give better failure output Done. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:105: // NOTE: bug 464430 On 2016/08/15 23:42:03, danakj wrote: > nit: crbug.com/464430 Done. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:106: // As a part of the CSRSS lockdown in the referenced bug, it will invalidate On 2016/08/15 23:42:03, danakj wrote: > Please explain/write out what CSRSS is here, acronyms are hard. Done. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:114: if (::IsBadReadPtr(heap_info->heap_id, 0x100)) { On 2016/08/15 23:43:51, danakj wrote: > On 2016/08/15 23:42:03, danakj wrote: > > How confident are you that 0x100 is a good size? Why not just 1? > > Also it's not clear to me after reading your patch description and the comment > here why this call is needed at all. The description says that HeapLock will > fail if the heap is the deleted CSRSS one. > > And this *is* racey, a heap could become bad after this returned true, right? Yes this using this function is racey. The idea was that it is not racey under the circumstances present at this point. However, it doesn't really offer any protection that HeapLock doesn't provide, so it can be removed. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:114: if (::IsBadReadPtr(heap_info->heap_id, 0x100)) { On 2016/08/15 23:42:03, danakj wrote: > How confident are you that 0x100 is a good size? Why not just 1? Obsolete as this code has been removed. https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea... base/trace_event/winheap_dump_provider_win.cc:117: // This is implicitly checks certain aspects of the HEAP structure, such as On 2016/08/15 23:43:51, danakj wrote: > nit: This implicitly checks Done.
primiano@chromium.org changed reviewers: + primiano@chromium.org
ehm I think this file doesn't exist anymore and has been merged in the malloc provider. at very least this cl requires a rebase
On 2016/10/11 19:52:34, Primiano Tucci wrote: > ehm I think this file doesn't exist anymore and has been merged in the malloc > provider. > at very least this cl requires a rebase Yes, doing a rebase now and found this....
On 2016/10/11 20:58:48, liamjm (20p) wrote: > On 2016/10/11 19:52:34, Primiano Tucci wrote: > > ehm I think this file doesn't exist anymore and has been merged in the malloc > > provider. > > at very least this cl requires a rebase > > Yes, doing a rebase now and found this.... OK, found the functions in malloc_dump_provider.cc and made the changes in this file. PTAL.
LGTM with 1 comment https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/mallo... File base/trace_event/malloc_dump_provider.cc (right): https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/mallo... base/trace_event/malloc_dump_provider.cc:188: CHECK_EQ(heap_info_errors, 1); you really want here: CHECK_EQ(1u, heap_info_errors); CHECK_EQ/NE yoda comparison are (as opposite to _GT, _LT) also, they are pretti sensible on signed vs unsigned, so you need 1u.
Thanks! https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/mallo... File base/trace_event/malloc_dump_provider.cc (right): https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/mallo... base/trace_event/malloc_dump_provider.cc:188: CHECK_EQ(heap_info_errors, 1); On 2016/10/12 16:39:25, Primiano Tucci wrote: > you really want here: > CHECK_EQ(1u, heap_info_errors); > CHECK_EQ/NE yoda comparison are (as opposite to _GT, _LT) > also, they are pretti sensible on signed vs unsigned, so you need 1u. Done.
Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 BUG=464430 ========== to ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng BUG=464430 ==========
The CQ bit was checked by liamjm@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from primiano@chromium.org Link to the patchset: https://codereview.chromium.org/2242953002/#ps120001 (title: "switch params to check, and match signedness")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Message was sent while issue was closed.
Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng BUG=464430 ========== to ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng BUG=464430 ==========
Message was sent while issue was closed.
Committed patchset #7 (id:120001)
Message was sent while issue was closed.
Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng BUG=464430 ========== to ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming to lockdown CSRSS. However this causes a problem with the dumping of the heap, which has necessitated the change to base/trace_event/winheap_dump_provider_win.cc The dumping of the heaps get a list of process heaps (for the renderer) and dumps each of these. One of these heaps is the one used by CSRSS for its communication between the client (this process) and the CSRSS server (csrss.exe). For more background see [0] and [1]. When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the server destroys this heap. However as this is only meant to happen as part of process termination, there is no cleanup inside the client process. So the client process is left thinking this process heap exists, when it does not. It is possible to destroy this heap right before the ALPC port is closed, however the 2 options I've experimented with both require use of undocumented features. This is not desired in general, and from my observations, the internal heap structures change a lot from version to version, so we really can't rely on them. The solution implemented here is to handle the invalid heaps when they are being dumped, by gracefully supporting failures, rather than CHECK()ing. In particular, the call to HeapLock() is performing the heap validation for us, if this fails we assume that we can't read the heap, which is now considered ok. As an extra check, we add a CHECK() to ensure that at least one heap was dumped successfully. It would be an option, perhaps, to plumb through this lockdown state in to the heap dumping code, however that would require a very large amount of changes so it has not been done. [0]: http://j00ru.vexillium.org/?p=527 [1]: https://github.com/dynamorio/drmemory/issues/1221 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng BUG=464430 Committed: https://crrev.com/c56e1ffa7c82070f974f065c1a5b0bc99165d32e Cr-Commit-Position: refs/heads/master@{#425523} ==========
Message was sent while issue was closed.
Patchset 7 (id:??) landed as https://crrev.com/c56e1ffa7c82070f974f065c1a5b0bc99165d32e Cr-Commit-Position: refs/heads/master@{#425523} |
