Issue 2242953002: winheap_dump: handle errors gracefully

liamjm (20p)

Description was changed from ========== winheap_dump: handle errors gracefully fix quoting in gfx.gyp BUG= ========== ...

4 years, 4 months ago (2016-08-12 18:48:53 UTC) #1

liamjm (20p)

liamjm@chromium.org changed reviewers: + wfh@chromium.org

4 years, 4 months ago (2016-08-12 18:48:53 UTC) #2

liamjm (20p)

Description was changed from ========== In bug 464430 we are aiming to lockdown CSRSS. However ...

4 years, 4 months ago (2016-08-12 21:38:18 UTC) #3

Description was changed from

==========
In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221



BUG=464430
==========

to

==========
winheap_dump: handle errors gracefully

In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221



BUG=464430
==========
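
For readers following the description above, here is a minimal, portable sketch of the control flow it describes: try to lock each heap, skip the ones that fail, and insist only that at least one heap was readable. It is not the code from this CL. `FakeHeap`, `TryLockHeap`, `DumpResult` and `DumpAllHeaps` are hypothetical names, and the fake lock stands in for the Win32 `::HeapLock()` call so the logic can be compiled anywhere.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a process heap. On Windows the real code would
// work with the handles returned by ::GetProcessHeaps().
struct FakeHeap {
  bool lockable;          // false models the heap the CSRSS server destroyed
  size_t allocated_size;  // bytes that walking the heap would report
};

struct DumpResult {
  size_t allocated_size = 0;
  size_t heaps_dumped = 0;
  size_t heap_errors = 0;
};

// Models ::HeapLock(): fails for a heap that no longer exists.
bool TryLockHeap(const FakeHeap& heap) { return heap.lockable; }

DumpResult DumpAllHeaps(const std::vector<FakeHeap>& heaps) {
  DumpResult result;
  for (const FakeHeap& heap : heaps) {
    if (!TryLockHeap(heap)) {
      // A lock failure is tolerated: count it and skip this heap.
      ++result.heap_errors;
      continue;
    }
    result.allocated_size += heap.allocated_size;
    ++result.heaps_dumped;
    // The real code would call ::HeapUnlock() here.
  }
  // The extra check from the description: at least one heap must be readable.
  if (result.heaps_dumped == 0) {
    std::fprintf(stderr, "no heap could be dumped\n");
    std::abort();
  }
  return result;
}
```

Under these assumptions, calling `DumpAllHeaps` on three heaps where the middle one is not lockable reports two heaps dumped and one error, and the size of the unreadable heap is not included in the total.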

liamjm (20p)

liamjm@chromium.org changed reviewers: + danakj@chromium.org

4 years, 4 months ago (2016-08-12 21:41:48 UTC) #4

Will Harris

(background) This l-g-t-m as the approach we think is best to unblock the CSRSS lockdown ...

4 years, 4 months ago (2016-08-12 21:45:36 UTC) #6

danakj

> As an extra check, we add a CHECK() to ensure that at least one ...

4 years, 4 months ago (2016-08-15 23:42:03 UTC) #7

danakj

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winheap_dump_provider_win.cc File base/trace_event/winheap_dump_provider_win.cc (right): https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winheap_dump_provider_win.cc#newcode114 base/trace_event/winheap_dump_provider_win.cc:114: if (::IsBadReadPtr(heap_info->heap_id, 0x100)) { On 2016/08/15 23:42:03, danakj wrote: ...

4 years, 4 months ago (2016-08-15 23:43:51 UTC) #8

Will Harris

(Context: liamjm is only 20% on chrome security so replies to this CL might come ...

4 years, 4 months ago (2016-08-16 16:08:01 UTC) #9

danakj

On Tue, Aug 16, 2016 at 9:08 AM, <wfh@chromium.org> wrote: > (Context: liamjm is only ...

4 years, 4 months ago (2016-08-16 17:35:36 UTC) #10

liamjm (20p)

Hi all, I'm heading out on vacation for the month of September. I really wanted ...

4 years, 3 months ago (2016-09-01 23:19:06 UTC) #11

liamjm (20p)

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winheap_dump_provider_win.cc File base/trace_event/winheap_dump_provider_win.cc (right): https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winheap_dump_provider_win.cc#newcode96 base/trace_event/winheap_dump_provider_win.cc:96: CHECK(all_heap_info.allocated_size != 0); On 2016/08/15 23:42:03, danakj wrote: > ...

4 years, 2 months ago (2016-10-11 18:57:31 UTC) #12

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
File base/trace_event/winheap_dump_provider_win.cc (right):

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:96:
CHECK(all_heap_info.allocated_size != 0);
On 2016/08/15 23:42:03, danakj wrote:
> CHECK_NE will give better failure output

Done.
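
To illustrate the reviewer's point: a one-argument check only has the expression text to report, while a two-argument comparison check also sees both operands and can print their values. The sketch below is not Chromium's macro implementation. `DescribeCheck` and `DescribeCheckNe` are hypothetical helpers that return the failure message as a string instead of crashing.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: what a one-argument check can say on failure.
// It only knows the stringified expression.
std::string DescribeCheck(bool condition, const char* expr_text) {
  if (condition) return "";
  return std::string("Check failed: ") + expr_text;
}

// Hypothetical helper: a two-argument check receives both operands, so the
// failure message can include their actual values.
template <typename A, typename B>
std::string DescribeCheckNe(const A& a, const B& b,
                            const char* a_text, const char* b_text) {
  if (a != b) return "";
  std::ostringstream out;
  out << "Check failed: " << a_text << " != " << b_text
      << " (" << a << " vs. " << b << ")";
  return out.str();
}
```

The second form costs nothing extra at the call site, and the operand values are often the first thing needed when reading a crash report.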

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:105: // NOTE: bug 464430
On 2016/08/15 23:42:03, danakj wrote:
> nit: crbug.com/464430

Done.

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:106: // As a part of the CSRSS
lockdown in the referenced bug, it will invalidate
On 2016/08/15 23:42:03, danakj wrote:
> Please explain/write out what CSRSS is here, acronyms are hard.

Done.

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:114: if
(::IsBadReadPtr(heap_info->heap_id, 0x100)) {
On 2016/08/15 23:43:51, danakj wrote:
> On 2016/08/15 23:42:03, danakj wrote:
> > How confident are you that 0x100 is a good size? Why not just 1?
> 
> Also it's not clear to me after reading your patch description and the comment
> here why this call is needed at all. The description says that HeapLock will
> fail if the heap is the deleted CSRSS one.
> 
> And this *is* racey, a heap could become bad after this returned true, right?

Yes, using this function is racy. The idea was that it is not racy under
the circumstances present at this point.

However, it doesn't really offer any protection that HeapLock doesn't provide,
so it can be removed.
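
The race discussed in this exchange is the usual check-then-use gap. The sketch below models it deterministically, under stated assumptions: `FakeRegion` is a hypothetical stand-in for the heap memory, `LooksReadable` models a probe such as `::IsBadReadPtr()`, `TryLock` models `::HeapLock()`, and the `between` callback stands in for whatever might invalidate the heap after the probe returns.

```cpp
#include <functional>

// Hypothetical stand-in for heap memory whose validity can change.
struct FakeRegion {
  bool valid = true;
};

using Mutator = std::function<void(FakeRegion&)>;

// Models a readability probe: the answer can be stale as soon as it returns.
bool LooksReadable(const FakeRegion& region) { return region.valid; }

// Models a lock that validates the heap as part of the operation itself.
bool TryLock(const FakeRegion& region) { return region.valid; }

// Probe only: the caller goes on to read based on a possibly stale answer.
bool ProbeOnly(FakeRegion& region, const Mutator& between) {
  bool readable = LooksReadable(region);
  between(region);  // the region may be invalidated here
  return readable;
}

// Probe, then lock: the lock result is what actually decides.
bool ProbeThenLock(FakeRegion& region, const Mutator& between) {
  if (!LooksReadable(region)) return false;
  between(region);
  return TryLock(region);
}

// Lock only: same outcome as ProbeThenLock, without the extra probe.
bool LockOnly(FakeRegion& region, const Mutator& between) {
  between(region);
  return TryLock(region);
}
```

In this model, when `between` invalidates the region, `ProbeOnly` still reports it as readable, while `ProbeThenLock` and `LockOnly` both refuse. That matches the conclusion reached here: the probe adds no protection beyond the lock.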

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:114: if
(::IsBadReadPtr(heap_info->heap_id, 0x100)) {
On 2016/08/15 23:42:03, danakj wrote:
> How confident are you that 0x100 is a good size? Why not just 1?

Obsolete as this code has been removed.

https://codereview.chromium.org/2242953002/diff/40001/base/trace_event/winhea...
base/trace_event/winheap_dump_provider_win.cc:117: // This is implicitly checks
certain aspects of the HEAP structure, such as
On 2016/08/15 23:43:51, danakj wrote:
> nit: This implicitly checks

Done.

Primiano Tucci (use gerrit)

primiano@chromium.org changed reviewers: + primiano@chromium.org

4 years, 2 months ago (2016-10-11 19:52:33 UTC) #13

Primiano Tucci (use gerrit)

ehm I think this file doesn't exist anymore and has been merged in the malloc ...

4 years, 2 months ago (2016-10-11 19:52:34 UTC) #14

liamjm (20p)

On 2016/10/11 19:52:34, Primiano Tucci wrote: > ehm I think this file doesn't exist anymore ...

4 years, 2 months ago (2016-10-11 20:58:48 UTC) #15

liamjm (20p)

On 2016/10/11 20:58:48, liamjm (20p) wrote: > On 2016/10/11 19:52:34, Primiano Tucci wrote: > > ...

4 years, 2 months ago (2016-10-11 21:37:11 UTC) #16

Primiano Tucci (use gerrit)

LGTM with 1 comment https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/malloc_dump_provider.cc File base/trace_event/malloc_dump_provider.cc (right): https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/malloc_dump_provider.cc#newcode188 base/trace_event/malloc_dump_provider.cc:188: CHECK_EQ(heap_info_errors, 1); you really want ...

4 years, 2 months ago (2016-10-12 16:39:25 UTC) #17

liamjm (20p)

Thanks! https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/malloc_dump_provider.cc File base/trace_event/malloc_dump_provider.cc (right): https://codereview.chromium.org/2242953002/diff/100001/base/trace_event/malloc_dump_provider.cc#newcode188 base/trace_event/malloc_dump_provider.cc:188: CHECK_EQ(heap_info_errors, 1); On 2016/10/12 16:39:25, Primiano Tucci wrote: ...

4 years, 2 months ago (2016-10-14 16:47:08 UTC) #18

Will Harris

Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming ...

4 years, 2 months ago (2016-10-14 22:41:51 UTC) #19

Description was changed from

==========
winheap_dump: handle errors gracefully

In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221



BUG=464430
==========

to

==========
winheap_dump: handle errors gracefully

In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng
BUG=464430
==========

liamjm (20p)

The patchset sent to the CQ was uploaded after l-g-t-m from primiano@chromium.org Link to the ...

4 years, 2 months ago (2016-10-14 23:43:11 UTC) #21

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2242953002/120001

4 years, 2 months ago (2016-10-14 23:43:22 UTC) #22

commit-bot: I haz the power

Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming ...

4 years, 2 months ago (2016-10-15 01:05:08 UTC) #23

Message was sent while issue was closed.


commit-bot: I haz the power

Description was changed from ========== winheap_dump: handle errors gracefully In bug 464430 we are aiming ...

4 years, 2 months ago (2016-10-15 01:07:57 UTC) #25

Message was sent while issue was closed.

Description was changed from

==========
winheap_dump: handle errors gracefully

In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng
BUG=464430
==========

to

==========
winheap_dump: handle errors gracefully

In bug 464430 we are aiming to lockdown CSRSS.
However this causes a problem with the dumping of the heap, which has
necessitated the change to base/trace_event/winheap_dump_provider_win.cc

The dumping of the heaps get a list of process heaps (for the renderer) and
dumps each of these. One of these heaps is the one used by CSRSS for its
communication between the client (this process) and the CSRSS server
(csrss.exe). For more background see [0] and [1].
When the handle to ALPC port to CSRSS is closed in the sandbox lockdown, the
server destroys this heap. However as this is only meant to happen as part of
process termination, there is no cleanup inside the client process. So the
client process is left thinking this process heap exists, when it does not.

It is possible to destroy this heap right before the ALPC port is closed,
however the 2 options I've experimented with both require use of undocumented
features. This is not desired in general, and from my observations, the internal
heap structures change a lot from version to version, so we really can't rely on
them.

The solution implemented here is to handle the invalid heaps when they are being
dumped, by gracefully supporting failures, rather than CHECK()ing. In
particular, the call to HeapLock() is performing the heap validation for us, if
this fails we assume that we can't read the heap, which is now considered ok. As
an extra check, we add a CHECK() to ensure that at least one heap was dumped
successfully.

It would be an option, perhaps, to plumb through this lockdown state in to the
heap dumping code, however that would require a very large amount of changes so
it has not been done.

[0]: http://j00ru.vexillium.org/?p=527
[1]: https://github.com/dynamorio/drmemory/issues/1221

CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.win:win10_chromium_x64_rel_ng
BUG=464430

Committed: https://crrev.com/c56e1ffa7c82070f974f065c1a5b0bc99165d32e
Cr-Commit-Position: refs/heads/master@{#425523}
==========

commit-bot: I haz the power

4 years, 2 months ago (2016-10-15 01:07:57 UTC) #26

Message was sent while issue was closed.

Patchset 7 (id:??) landed as
https://crrev.com/c56e1ffa7c82070f974f065c1a5b0bc99165d32e
Cr-Commit-Position: refs/heads/master@{#425523}

Issue 2242953002: winheap_dump: handle errors gracefully (Closed)

Patch Set 1

Patch Set 2 : winheap_dump: handle errors gracefully

Patch Set 3 : Merge branch 'master' of https://chromium.googlesource.com/chromium/src into winheap_dump

Patch Set 4 : changes from review

Patch Set 5 : rebase and move changes to new file as old one was deleted

Patch Set 6 : adjust line breaks on comment

Patch Set 7 : switch params to check, and match signedness