|
|
DescriptionMake CHECK int 3 on Windows, rather than crash
Not being able to distinguish intentional crashes (CHECK) from unintentional ones (other access violations) in terms of exit codes reduces the amount of signal we have in stability metrics.
R=wfh@chromium.org
BUG=664209, 687326
Review-Url: https://codereview.chromium.org/2676483002
Cr-Commit-Position: refs/heads/master@{#450809}
Committed: https://chromium.googlesource.com/chromium/src/+/a17c8db528cadca9eef98ce03b0910700735722e
Patch Set 1 #
Total comments: 8
Patch Set 2 : . #
Total comments: 8
Patch Set 3 : fixes #
Total comments: 2
Patch Set 4 : no need to special case #if for clang #Messages
Total messages: 69 (23 generated)
Description was changed from ========== Make CHECK int 3 on Windows, rather than crash BUG=664209,687326 ========== to ========== Make CHECK int 3 on Windows, rather than crash R=wfh@chromium.org BUG=664209,687326 ==========
scottmg@chromium.org changed reviewers: + wfh@chromium.org
The CQ bit was checked by scottmg@chromium.org to run a CQ dry run
scottmg@chromium.org changed reviewers: + primiano@chromium.org
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
what does LOG_STREAM(FATAL) do? is this eventually calling the same code? https://codereview.chromium.org/2676483002/diff/1/base/logging.h File base/logging.h (right): https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode465 base/logging.h:465: #define IMMEDIATE_CRASH() __builtin_trap() I'm assuming clang doesn't support __debugbreak() intrinsic? https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode469 base/logging.h:469: #define IMMEDIATE_CRASH() ((void)(*(volatile char*)0 = 0)) when does this ever happen, do we support a compiler that isn't clang, gcc or msvc? https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc File base/logging_unittest.cc (right): https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... base/logging_unittest.cc:200: __declspec(noinline) void CheckContainingFunc(uint64_t x, uint64_t y, int z) { not sure here (base owner might know) but I think convention here is to use NOINLINE rather than __declspec(noinline) https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... base/logging_unittest.cc:216: uint64_t x = base::RandUint64(); this can return 0, so the test can fail with a very very low likelyhood...
On 2017/02/02 01:21:06, Will Harris wrote: > what does LOG_STREAM(FATAL) do? is this eventually calling the same code? LOG(FATAL) is insane, and eventually lands in BreakDebugger() here https://cs.chromium.org/chromium/src/base/logging.cc?rcl=f25b3265878e57ef40f5... . I think. Maybe. > > https://codereview.chromium.org/2676483002/diff/1/base/logging.h > File base/logging.h (right): > > https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode465 > base/logging.h:465: #define IMMEDIATE_CRASH() __builtin_trap() > I'm assuming clang doesn't support __debugbreak() intrinsic? > > https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode469 > base/logging.h:469: #define IMMEDIATE_CRASH() ((void)(*(volatile char*)0 = 0)) > when does this ever happen, do we support a compiler that isn't clang, gcc or > msvc? > > https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc > File base/logging_unittest.cc (right): > > https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... > base/logging_unittest.cc:200: __declspec(noinline) void > CheckContainingFunc(uint64_t x, uint64_t y, int z) { > not sure here (base owner might know) but I think convention here is to use > NOINLINE rather than __declspec(noinline) > > https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... > base/logging_unittest.cc:216: uint64_t x = base::RandUint64(); > this can return 0, so the test can fail with a very very low likelyhood...
The CQ bit was checked by scottmg@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2676483002/diff/1/base/logging.h File base/logging.h (right): https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode465 base/logging.h:465: #define IMMEDIATE_CRASH() __builtin_trap() On 2017/02/02 01:21:06, Will Harris wrote: > I'm assuming clang doesn't support __debugbreak() intrinsic? You mean clang-win? I guess it probably does. Done. https://codereview.chromium.org/2676483002/diff/1/base/logging.h#newcode469 base/logging.h:469: #define IMMEDIATE_CRASH() ((void)(*(volatile char*)0 = 0)) On 2017/02/02 01:21:06, Will Harris wrote: > when does this ever happen, do we support a compiler that isn't clang, gcc or > msvc? Shrug, OK, removed. https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc File base/logging_unittest.cc (right): https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... base/logging_unittest.cc:200: __declspec(noinline) void CheckContainingFunc(uint64_t x, uint64_t y, int z) { On 2017/02/02 01:21:06, Will Harris wrote: > not sure here (base owner might know) but I think convention here is to use > NOINLINE rather than __declspec(noinline) Done. https://codereview.chromium.org/2676483002/diff/1/base/logging_unittest.cc#ne... base/logging_unittest.cc:216: uint64_t x = base::RandUint64(); On 2017/02/02 01:21:06, Will Harris wrote: > this can return 0, so the test can fail with a very very low likelyhood... Meh. Done.
lgtm but sounds like you need a base owner, I recommend thakis :) https://codereview.chromium.org/2676483002/diff/20001/base/logging.h File base/logging.h (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging.h#newcode467 base/logging.h:467: #define IMMEDIATE_CRASH() __debugbreak() might be nice to have an #else and an #error here to point people at this code if they do try and compile with something else, but no strong feelings, happy to go with what OWNERS think.
never gonna give you up https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.cc File base/logging_unittest.cc (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.c... base/logging_unittest.cc:204: CHECK(!y); One final thought - does this test need to be changed to put the CHECKs right next to each other (rather than split by a likely JNE) to replicate the exact issue in https://crbug.com/664209 ?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromeos_amd64-generic_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_amd64-...) chromeos_daisy_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_daisy_...) linux_chromium_chromeos_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_tsan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
non-owner LGTM. maybe you can expand a bit the commit message though (copy/paste and/or point to will's thread on chromium-dev). https://codereview.chromium.org/2676483002/diff/20001/base/logging.h File base/logging.h (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging.h#newcode467 base/logging.h:467: #define IMMEDIATE_CRASH() __debugbreak() On 2017/02/02 03:12:03, Will Harris wrote: > might be nice to have an #else and an #error here to point people at this code > if they do try and compile with something else, but no strong feelings, happy to > go with what OWNERS think. not an owner, but +1 to the #error pointing to crbug.com/664209 or anything else that has an explanation of this esoteric piece of code https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.cc File base/logging_unittest.cc (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.c... base/logging_unittest.cc:14: #include <windows.h> as per Nico's recent PSA I think there should be a blank line here to prevent windows.h to be reordered after excpt.h https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.c... base/logging_unittest.cc:204: CHECK(!y); On 2017/02/02 03:18:14, Will Harris wrote: > One final thought - does this test need to be changed to put the CHECKs right > next to each other (rather than split by a likely JNE) to replicate the exact > issue in https://crbug.com/664209 ? +1, I'd go for something similar to https://codereview.chromium.org/2502953003/diff/20001/base/logging_unittest.cc lines 340-344
Thanks. https://codereview.chromium.org/2676483002/diff/20001/base/logging.h File base/logging.h (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging.h#newcode467 base/logging.h:467: #define IMMEDIATE_CRASH() __debugbreak() On 2017/02/02 03:12:03, Will Harris wrote: > might be nice to have an #else and an #error here to point people at this code > if they do try and compile with something else, but no strong feelings, happy to > go with what OWNERS think. That's why I left the deref null in the #else block. You prefer #error though? Done. https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.cc File base/logging_unittest.cc (right): https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.c... base/logging_unittest.cc:14: #include <windows.h> On 2017/02/03 03:15:05, Primiano Tucci wrote: > as per Nico's recent PSA I think there should be a blank line here to prevent > windows.h to be reordered after excpt.h This one didn't actually have to be backwards, I just do that by habit. Changed to sort normally. https://codereview.chromium.org/2676483002/diff/20001/base/logging_unittest.c... base/logging_unittest.cc:204: CHECK(!y); On 2017/02/02 03:18:14, Will Harris wrote: > One final thought - does this test need to be changed to put the CHECKs right > next to each other (rather than split by a likely JNE) to replicate the exact > issue in https://crbug.com/664209 ? Done. (I had previously been trying to outwit the optimizer with the rand and CommandLine fiddling, but the NOINLINE added later is sufficient to make the test useful, so I removed that junk.)
scottmg@chromium.org changed reviewers: + thakis@chromium.org
Nico, could you review as base OWNERS? Thanks.
the test is gated on OFFICIAL_BUILD so won't run on the trybots, so I presume you've verified this with a local build?
int 3 seems wrong to me in that it's not noreturn, you can just step through it in a debugger. i don't understand the second linked bug, why is that bad? Anyhoo, I'd prefer if we do the same thing on all platforms, so if https://codereview.chromium.org/1982123002 is causing trouble maybe we should just revert that?
On 2017/02/06 19:59:12, Nico wrote: > int 3 seems wrong to me in that it's not noreturn well I think it doesn't have to if it's a straight opcode (see scott's disasm in https://bugs.chromium.org/p/chromium/issues/detail?id=664209#c14) which, to me, it seems ideal for crash reporting (the various int 3 are disjoint) and seems reasonable perf-wise (the code just has to jump over an instruction) But very likely I am missing something else about why this one should be a noreturn instruction However, just realizing now that in https://codereview.chromium.org/2502953003/ (I promise, I'll get to that tomorrow) on Linux/Android I'm switching that to be an "undefined instruction" opcode. Maybe we should make the behavior uniform? .
On 2017/02/06 19:59:12, Nico wrote: > int 3 seems wrong to me in that it's not noreturn, you can just step through it > in a debugger. i don't understand the second linked bug, why is that bad? You can step through *null = 0 in a debugger too so it seems about the same. The second linked bug was concerned with the fact that unintentional crashes had been merged into the intentional crashes bucket. It seems somewhat useful to distinguish them I think, if we can manage it. > > Anyhoo, I'd prefer if we do the same thing on all platforms, so if > https://codereview.chromium.org/1982123002 is causing trouble maybe we should > just revert that? I think that was for perf for Blink right? So Blink people will be upset if ASSERT and CHECK aren't the same speed or whatever it was. (?)
I actually think that abort() is the right way to do this on non-Windows, and I’m happy to send out a change to switch to that for our POSIXy platforms. abort()’s weird on Windows, so it’s probably not right for this one, but I thought I’d mention it on the off chance that y’all think it’s NOT weird. Do int3 and ud2 behave differently on Windows with respect to the JIT debugger? I generally like noreturn forms for this kind of thing because the compiler can do nice things when it recognizes that control flow can’t go in a certain direction. int3 can’t legitimately be noreturn, but ud2 can. This can produce smaller code and allow stuff like this to be accepted: unsigned int F(int v) { if (v >= 0) return static_cast<unsigned int>(v); NOTREACHED(); } Assuming NOTREACHED() is noreturn, nobody will insist that you put a never-reached return statement at the end of that snippet.
On 2017/02/10 16:10:20, Mark Mentovai wrote: > I actually think that abort() is the right way to do this on non-Windows, and > I’m happy to send out a change to switch to that for our POSIXy platforms. I am addressing non-posix in https://codereview.chromium.org/2502953003 I just freshly baked a new patch-set right now. Feel free to add comment there. there is a quite exhaustive background in the CL description which tries to explain the entanglement of issues ;-)
I wonder if thakis@ had any further thoughts on this CL?
I didn't know that I can stop across 0 pointer derefs -- and I can't repro that. I made c:\src\chrome\src>type test.cc int main() { __debugbreak(); ((void)(*(volatile char*)0 = 0)); } and when I run it, the MSVC debugger opens, and I can hit continue to go past the first one but not past the second one. https://codereview.chromium.org/2676483002/diff/40001/base/logging.h File base/logging.h (left): https://codereview.chromium.org/2676483002/diff/40001/base/logging.h#oldcode464 base/logging.h:464: #if defined(COMPILER_GCC) || defined(__clang__) the || __clang__ was here to use __builtin_trap on both win and non-win with clang. since you want __debugbreak() on win, you can just make this #if defined(COMPILER_GCC) #else defined(COMPILER_MSVC) and clang will mask as the appropriate thing on each platform
On 2017/02/10 22:45:30, Nico wrote: > I didn't know that I can stop across 0 pointer derefs -- and I can't repro that. > I made > > c:\src\chrome\src>type test.cc > int main() { > __debugbreak(); > ((void)(*(volatile char*)0 = 0)); > } > > and when I run it, the MSVC debugger opens, and I can hit continue to go past > the first one but not past the second one. Ah, I see what you mean. I always use Ctrl-Shift-F10 to step past these type of statements. Do you think there's a compelling case for one of int3, ud2, or *null=0, or something else in terms of different code gen? I've always thought of int3 as the end of the line when you're not running in a debugger (i.e. normal shipping use). I'll check some more generated code with other variants I guess. But your concern is whether the debugger will re-raise an exception if you attempt to step over the given line a second time? > > https://codereview.chromium.org/2676483002/diff/40001/base/logging.h > File base/logging.h (left): > > https://codereview.chromium.org/2676483002/diff/40001/base/logging.h#oldcode464 > base/logging.h:464: #if defined(COMPILER_GCC) || defined(__clang__) > the || __clang__ was here to use __builtin_trap on both win and non-win with > clang. since you want __debugbreak() on win, you can just make this > > #if defined(COMPILER_GCC) > #else defined(COMPILER_MSVC) > > and clang will mask as the appropriate thing on each platform
On Fri, Feb 10, 2017 at 5:59 PM, <scottmg@chromium.org> wrote: > On 2017/02/10 22:45:30, Nico wrote: > > I didn't know that I can stop across 0 pointer derefs -- and I can't > repro > that. > > I made > > > > c:\src\chrome\src>type test.cc > > int main() { > > __debugbreak(); > > ((void)(*(volatile char*)0 = 0)); > > } > > > > and when I run it, the MSVC debugger opens, and I can hit continue to go > past > > the first one but not past the second one. > > Ah, I see what you mean. I always use Ctrl-Shift-F10 to step past these > type of > statements. > > Do you think there's a compelling case for one of int3, ud2, or *null=0, or > something else in terms of different code gen? I've always thought of int3 > as > the end of the line when you're not running in a debugger (i.e. normal > shipping > use). I'll check some more generated code with other variants I guess. > > But your concern is whether the debugger will re-raise an exception if you > attempt to step over the given line a second time? > Yes. int3 feels like "stop, or don't" while the null deref is just "stop" without the "or don't" part. > > > > > > https://codereview.chromium.org/2676483002/diff/40001/base/logging.h > > File base/logging.h (left): > > > > > https://codereview.chromium.org/2676483002/diff/40001/ > base/logging.h#oldcode464 > > base/logging.h:464: #if defined(COMPILER_GCC) || defined(__clang__) > > the || __clang__ was here to use __builtin_trap on both win and non-win > with > > clang. since you want __debugbreak() on win, you can just make this > > > > #if defined(COMPILER_GCC) > > #else defined(COMPILER_MSVC) > > > > and clang will mask as the appropriate thing on each platform > > > > https://codereview.chromium.org/2676483002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
On 2017/02/10 23:02:06, Nico wrote: > On Fri, Feb 10, 2017 at 5:59 PM, <mailto:scottmg@chromium.org> wrote: > > > On 2017/02/10 22:45:30, Nico wrote: > > > I didn't know that I can stop across 0 pointer derefs -- and I can't > > repro > > that. > > > I made > > > > > > c:\src\chrome\src>type test.cc > > > int main() { > > > __debugbreak(); > > > ((void)(*(volatile char*)0 = 0)); > > > } > > > > > > and when I run it, the MSVC debugger opens, and I can hit continue to go > > past > > > the first one but not past the second one. > > > > Ah, I see what you mean. I always use Ctrl-Shift-F10 to step past these > > type of > > statements. > > > > Do you think there's a compelling case for one of int3, ud2, or *null=0, or > > something else in terms of different code gen? I've always thought of int3 > > as > > the end of the line when you're not running in a debugger (i.e. normal > > shipping > > use). I'll check some more generated code with other variants I guess. > > > > But your concern is whether the debugger will re-raise an exception if you > > attempt to step over the given line a second time? > > > > Yes. int3 feels like "stop, or don't" while the null deref is just "stop" > without the "or don't" part. > leaving this as a null deref doesn't fix the bug, which is that CHECK was not resulting in a STATUS_BREAKPOINT and was thus indistinguishable from a STATUS_ACCESS_VIOLATION crash. I don't mind what exception gets raised here as long as it's unique for CHECK...
Wait, I just realized we used to do BreakDebugger() which is implemented as __debugbreak(). So now I'm not sure why we're trying to avoid that exactly.
Description was changed from ========== Make CHECK int 3 on Windows, rather than crash R=wfh@chromium.org BUG=664209,687326 ========== to ========== Make CHECK int 3 on Windows, rather than crash Not being able to distinguish intentional crashes (CHECK) from unintentional ones (other access violations) in terms of exit codes reduces the amount of signal we have in stability metrics. R=wfh@chromium.org BUG=664209,687326 ==========
mark@chromium.org changed reviewers: + mark@chromium.org
DebugBreak() is in kernel32, and __debugbreak() gives you your own int3?
mark@chromium.org changed reviewers: - mark@chromium.org
On 2017/02/11 00:15:23, Mark Mentovai wrote: > DebugBreak() is in kernel32, and __debugbreak() gives you your own int3? BreakDebugger() wasn't a typo; this regressed in https://codereview.chromium.org/1982123002/diff/20001/base/logging.h where we're using base::debug::BreakDebugger() https://cs.chromium.org/chromium/src/base/debug/debugger_win.cc?rcl=2c9e09708... .
Oh, but you said BreakDebugger()… Yeah…so that’s still a call (which is a few bytes), and everyone gets funneled through the same point, with (useless?) BreakDebugger() at the top of the stack instead of pinpointing your own int3 as the crash context. Maybe both of those are undesirable. That’s part of my read of Primiano’s non-Windows proposal.
On 2017/02/11 00:17:16, scottmg wrote: > On 2017/02/11 00:15:23, Mark Mentovai wrote: > > DebugBreak() is in kernel32, and __debugbreak() gives you your own int3? > > BreakDebugger() wasn't a typo; this regressed in > https://codereview.chromium.org/1982123002/diff/20001/base/logging.h where we're > using base::debug::BreakDebugger() > https://cs.chromium.org/chromium/src/base/debug/debugger_win.cc?rcl=2c9e09708... > . Or did you mean we should use kernel32!DebugBreak()? I don't even know what I'm doing any more. :)
On 2017/02/11 00:18:50, Mark Mentovai wrote: > Oh, but you said BreakDebugger()… > > Yeah…so that’s still a call (which is a few bytes), and everyone gets funneled > through the same point, with (useless?) BreakDebugger() at the top of the stack > instead of pinpointing your own int3 as the crash context. Maybe both of those > are undesirable. That’s part of my read of Primiano’s non-Windows proposal. That's part of it, yeah. The other problem is that when the code to call a common function becomes too large, compilers tend to coalesce the function calls via something like if (!cond1) jmp handle_bad_case ... if (!cond2) jmp handle_bad_case ... if (!cond3) jmp handle_bad_case ... handle_bad_case: call SomeFunction() # where this is more bytes than an inline 0xcc rather than inlining the CHECK expansion, which in turn makes it unclear which specific CHECK has failed based on crash line number information. (That's what the added unittest here is confirming.)
I definitely think we should stick with __debugbreak() not kernel32!DebugBreak() I'm not sure exactly what the objections to this CL are...? This is restoring the behavior to what it was before https://codereview.chromium.org/1982123002 landed, but without regressing the blink performance... in addition, scott is adding some additional test coverage that was never there before, so that's even better. Right now we are losing data fidelity because we cannot distinguish CHECK from other types of crashes, so I'd like either this CL (or something similar to it that passes muster) or a revert to 1982123002 (which will cause perf regressions in blink) to land ASAP and be merged to M57. Thanks.
The CQ bit was checked by scottmg@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2676483002/diff/40001/base/logging.h File base/logging.h (left): https://codereview.chromium.org/2676483002/diff/40001/base/logging.h#oldcode464 base/logging.h:464: #if defined(COMPILER_GCC) || defined(__clang__) On 2017/02/10 22:45:30, Nico wrote: > the || __clang__ was here to use __builtin_trap on both win and non-win with > clang. since you want __debugbreak() on win, you can just make this > > #if defined(COMPILER_GCC) > #else defined(COMPILER_MSVC) > > and clang will mask as the appropriate thing on each platform Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
how bad would it be for codesize if we did `int 3, mov 0 [0]`?
mark@chromium.org changed reviewers: + mark@chromium.org
int3 ud2 would be better.
mark@chromium.org changed reviewers: - mark@chromium.org
Better than current patch set or better than what I said? (It's what my suggestion is supposed to mean -- looks like compilers convert the the current thing to 0 write followed by ud2 tho) On Feb 12, 2017 3:00 PM, <mark@chromium.org> wrote: > int3 > ud2 > > would be better. > > https://codereview.chromium.org/2676483002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Nico wrote: > Better than current patch set or better than what I said? (It's what my > suggestion is supposed to mean -- looks like compilers convert the the > current thing to 0 write followed by ud2 tho) Better than what you suggested for code size at least. mov byte ptr [0], 0 (Intel syntax for what I think you meant) is 8 bytes (plus another 2 for ud2, if that’s what you’re saying?) If you can afford to trash a register, you can get this down to 6 with xor rax, rax; mov byte [rax], 0. ud2 is 2 bytes.
Hi I wonder if there was any progress on reviewing this CL? We are still suffering a regression of fidelity of exit codes for CHECK() until this CL lands. Is it easier to just revert https://codereview.chromium.org/1982123002? See https://crbug.com/687326
Is this currently blocked on the question in #47? are you suggesting: #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} I'm not sure how to get msvc to place a ud2 in code... scott?
On 2017/02/14 00:21:59, Will Harris wrote: > Is this currently blocked on the question in #47? That's the pending review question, yes. > are you suggesting: > > #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} > > I'm not sure how to get msvc to place a ud2 in code... scott?
FWIW if this is time-sensitive for some reason (wfh seems a bit agitated :-) ), I'm fine with landing this (lgtm) and then discussing comments in a follow-up
On 2017/02/14 18:54:22, Nico wrote: > On 2017/02/14 00:21:59, Will Harris wrote: > > Is this currently blocked on the question in #47? > > That's the pending review question, yes. > > > are you suggesting: > > > > #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} > > > > I'm not sure how to get msvc to place a ud2 in code... scott? Here's binary size data for just int3 vs. int3;ud2. [check-3]C:\src\cr\src>type out\reloff\args.gn is_debug = false is_chrome_branded = true is_component_build = false symbol_level = 2 is_official_build = true is_syzyasan = false target_cpu = "x64" full_wpo_on_official = true win_console_app = true [check-3]C:\src\cr\src>dir out\reloff\just_debugbreak\chrome_child.dll Volume in drive C has no label. Volume Serial Number is 88D1-EBAA Directory of C:\src\cr\src\out\reloff\just_debugbreak 2017-02-14 02:39 PM 68,448,768 chrome_child.dll 1 File(s) 68,448,768 bytes 0 Dir(s) 139,948,883,968 bytes free [check-3]C:\src\cr\src>dir out\reloff\debugbreak_ud2\chrome_child.dll Volume in drive C has no label. Volume Serial Number is 88D1-EBAA Directory of C:\src\cr\src\out\reloff\debugbreak_ud2 2017-02-14 01:18 PM 68,553,216 chrome_child.dll 1 File(s) 68,553,216 bytes 0 Dir(s) 139,948,851,200 bytes free So, adding ud2 after debugbreak costs 104,448 bytes for that DLL in Official. I didn't build chrome.dll too because it takes forever, but the ratio of sizes between chrome.dll and chrome_child.dll predicts another 84,400 bytes for chrome.dll. I was honestly a little baffled (and I think Will was too) at the request to add something after the int3 as that's always what BreakDebugger() has been on Windows (that is, only __debugbreak()). But having now ventured into the _posix file, I see the source of disagreement, because on posix BreakDebugger() does one of bkpt 0, int 3, or abort() depending on platform, and but *then* follows that by _exit(1). So, I guess the question comes down whether ~190K of binary size is worth the belt-and-suspenders of crashing two different ways. To me, the only way that a ud2 after int3 is reachable is in the debugger, and if you're in the debugger all instructions are reachable, including the one after a ud2. The secondary question is how important consistency is across platforms. On non-Windows, we've switched to __builtin_trap() which I gather is effectively the same as __ud2() (for Intel at least). Is there any reason *not* to homogenize which of [ {int3}, {ud2}, {int3; ud2} ] we do across all OSs? If we have a defensible reason, then presumably it applies everywhere.
On 2017/02/14 23:12:35, scottmg wrote: > On 2017/02/14 18:54:22, Nico wrote: > > On 2017/02/14 00:21:59, Will Harris wrote: > > > Is this currently blocked on the question in #47? > > > > That's the pending review question, yes. > > > > > are you suggesting: > > > > > > #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} > > > > > > I'm not sure how to get msvc to place a ud2 in code... scott? > > Here's binary size data for just int3 vs. int3;ud2. > > [check-3]C:\src\cr\src>type out\reloff\args.gn > is_debug = false > is_chrome_branded = true > is_component_build = false > symbol_level = 2 > is_official_build = true > is_syzyasan = false > target_cpu = "x64" > full_wpo_on_official = true > win_console_app = true > > [check-3]C:\src\cr\src>dir out\reloff\just_debugbreak\chrome_child.dll > Volume in drive C has no label. > Volume Serial Number is 88D1-EBAA > > Directory of C:\src\cr\src\out\reloff\just_debugbreak > > 2017-02-14 02:39 PM 68,448,768 chrome_child.dll > 1 File(s) 68,448,768 bytes > 0 Dir(s) 139,948,883,968 bytes free > > [check-3]C:\src\cr\src>dir out\reloff\debugbreak_ud2\chrome_child.dll > Volume in drive C has no label. > Volume Serial Number is 88D1-EBAA > > Directory of C:\src\cr\src\out\reloff\debugbreak_ud2 > > 2017-02-14 01:18 PM 68,553,216 chrome_child.dll > 1 File(s) 68,553,216 bytes > 0 Dir(s) 139,948,851,200 bytes free > > > So, adding ud2 after debugbreak costs 104,448 bytes for that DLL in Official. I > didn't build chrome.dll too because it takes forever, but the ratio of sizes > between chrome.dll and chrome_child.dll predicts another 84,400 bytes for > chrome.dll. > > I was honestly a little baffled (and I think Will was too) at the request to add > something after the int3 as that's always what BreakDebugger() has been on > Windows (that is, only __debugbreak()). > > But having now ventured into the _posix file, I see the source of disagreement, > because on posix BreakDebugger() does one of bkpt 0, int 3, or abort() depending > on platform, and but *then* follows that by _exit(1). > > So, I guess the question comes down whether ~190K of binary size is worth the > belt-and-suspenders of crashing two different ways. To me, the only way that a > ud2 after int3 is reachable is in the debugger, and if you're in the debugger > all instructions are reachable, including the one after a ud2. > > The secondary question is how important consistency is across platforms. On > non-Windows, we've switched to __builtin_trap() which I gather is effectively > the same as __ud2() (for Intel at least). > > Is there any reason *not* to homogenize which of [ {int3}, {ud2}, {int3; ud2} ] > we do across all OSs? If we have a defensible reason, then presumably it > applies everywhere. Lacking consensus at the moment, I'm going to land this as-is, in service of continuity of renderer UMA stats, for a merge to 57. I'm happy to discuss/test/write followups, in particular to make behaviour the same-as-possible across platforms.
On 2017/02/14 23:12:35, scottmg > The secondary question is how important consistency is across platforms. On > non-Windows, we've switched to __builtin_trap() which I gather is effectively > the same as __ud2() (for Intel at least). > Is there any reason *not* to homogenize which of [ {int3}, {ud2}, {int3; ud2} ] > we do across all OSs? If we have a defensible reason, then presumably it > applies everywhere. I'm about to change that in https://codereview.chromium.org/2502953003/ and Mark convinced me that we should use int3 on x86 also. one of the tricky thing there, which is the reason why I'm not finished yet, is that GCC is getting annoyingly smart and when building with -Os (which is the default on Android) and takes the libery of folding together different asm volatile("int3") instructions, which defeats the multiples-CHECK-in-same-function goal (I hate when compilers try to outsmart my code). So, on Linux/Android at least, we will need something more than int 3. Some along the lines of : int 3 + *nullptr = __COUNTER__. I just have to find the right alignment of planets that fools consistently 4 compilers (android/gcc on x86, arm, arm64 and clang for desktop), doesn't make the binary too large, and produces a meaningful trap for breakpad.
On 2017/02/14 23:12:35, scottmg wrote: > On 2017/02/14 18:54:22, Nico wrote: > > On 2017/02/14 00:21:59, Will Harris wrote: > > > Is this currently blocked on the question in #47? > > > > That's the pending review question, yes. > > > > > are you suggesting: > > > > > > #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} > > > > > > I'm not sure how to get msvc to place a ud2 in code... scott? > > Here's binary size data for just int3 vs. int3;ud2. > > [check-3]C:\src\cr\src>type out\reloff\args.gn > is_debug = false > is_chrome_branded = true > is_component_build = false > symbol_level = 2 > is_official_build = true > is_syzyasan = false > target_cpu = "x64" > full_wpo_on_official = true > win_console_app = true > > [check-3]C:\src\cr\src>dir out\reloff\just_debugbreak\chrome_child.dll > Volume in drive C has no label. > Volume Serial Number is 88D1-EBAA > > Directory of C:\src\cr\src\out\reloff\just_debugbreak > > 2017-02-14 02:39 PM 68,448,768 chrome_child.dll > 1 File(s) 68,448,768 bytes > 0 Dir(s) 139,948,883,968 bytes free > > [check-3]C:\src\cr\src>dir out\reloff\debugbreak_ud2\chrome_child.dll > Volume in drive C has no label. > Volume Serial Number is 88D1-EBAA > > Directory of C:\src\cr\src\out\reloff\debugbreak_ud2 > > 2017-02-14 01:18 PM 68,553,216 chrome_child.dll > 1 File(s) 68,553,216 bytes > 0 Dir(s) 139,948,851,200 bytes free > > > So, adding ud2 after debugbreak costs 104,448 bytes for that DLL in Official. I > didn't build chrome.dll too because it takes forever, but the ratio of sizes > between chrome.dll and chrome_child.dll predicts another 84,400 bytes for > chrome.dll. It's still smaller than the current 0 write which is 5 bytes per CHECK, right? > I was honestly a little baffled (and I think Will was too) at the request to add > something after the int3 as that's always what BreakDebugger() has been on > Windows (that is, only __debugbreak()). > > But having now ventured into the _posix file, I see the source of disagreement, > because on posix BreakDebugger() does one of bkpt 0, int 3, or abort() depending > on platform, and but *then* follows that by _exit(1). > > So, I guess the question comes down whether ~190K of binary size is worth the > belt-and-suspenders of crashing two different ways. To me, the only way that a > ud2 after int3 is reachable is in the debugger, and if you're in the debugger > all instructions are reachable, including the one after a ud2. How are they reachable? By writing to rip? As mentioned above, int3 can just be f10'd over, and ud2 can't. > The secondary question is how important consistency is across platforms. On > non-Windows, we've switched to __builtin_trap() which I gather is effectively > the same as __ud2() (for Intel at least). > > Is there any reason *not* to homogenize which of [ {int3}, {ud2}, {int3; ud2} ] > we do across all OSs? If we have a defensible reason, then presumably it > applies everywhere. It depends a bit on how things map to the various SIGs on posix, but yes, I think it'd be good to have a consistent behavior on all platforms (see primiano's reply)
(but as said, landing as-is for now sounds fine to me, as does reverting my original change as mentioned all the way back in https://codereview.chromium.org/2676483002/#msg22)
The CQ bit was checked by scottmg@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from primiano@chromium.org, wfh@chromium.org Link to the patchset: https://codereview.chromium.org/2676483002/#ps60001 (title: "no need to special case #if for clang")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/02/15 17:42:04, Nico wrote: > On 2017/02/14 23:12:35, scottmg wrote: > > On 2017/02/14 18:54:22, Nico wrote: > > > On 2017/02/14 00:21:59, Will Harris wrote: > > > > Is this currently blocked on the question in #47? > > > > > > That's the pending review question, yes. > > > > > > > are you suggesting: > > > > > > > > #define IMMEDIATE_CRASH() {__debugbreak(); __asm ud2;} > > > > > > > > I'm not sure how to get msvc to place a ud2 in code... scott? > > > > Here's binary size data for just int3 vs. int3;ud2. > > > > [check-3]C:\src\cr\src>type out\reloff\args.gn > > is_debug = false > > is_chrome_branded = true > > is_component_build = false > > symbol_level = 2 > > is_official_build = true > > is_syzyasan = false > > target_cpu = "x64" > > full_wpo_on_official = true > > win_console_app = true > > > > [check-3]C:\src\cr\src>dir out\reloff\just_debugbreak\chrome_child.dll > > Volume in drive C has no label. > > Volume Serial Number is 88D1-EBAA > > > > Directory of C:\src\cr\src\out\reloff\just_debugbreak > > > > 2017-02-14 02:39 PM 68,448,768 chrome_child.dll > > 1 File(s) 68,448,768 bytes > > 0 Dir(s) 139,948,883,968 bytes free > > > > [check-3]C:\src\cr\src>dir out\reloff\debugbreak_ud2\chrome_child.dll > > Volume in drive C has no label. > > Volume Serial Number is 88D1-EBAA > > > > Directory of C:\src\cr\src\out\reloff\debugbreak_ud2 > > > > 2017-02-14 01:18 PM 68,553,216 chrome_child.dll > > 1 File(s) 68,553,216 bytes > > 0 Dir(s) 139,948,851,200 bytes free > > > > > > So, adding ud2 after debugbreak costs 104,448 bytes for that DLL in Official. > I > > didn't build chrome.dll too because it takes forever, but the ratio of sizes > > between chrome.dll and chrome_child.dll predicts another 84,400 bytes for > > chrome.dll. > > It's still smaller than the current 0 write which is 5 bytes per CHECK, right? > > > I was honestly a little baffled (and I think Will was too) at the request to > add > > something after the int3 as that's always what BreakDebugger() has been on > > Windows (that is, only __debugbreak()). > > > > But having now ventured into the _posix file, I see the source of > disagreement, > > because on posix BreakDebugger() does one of bkpt 0, int 3, or abort() > depending > > on platform, and but *then* follows that by _exit(1). > > > > So, I guess the question comes down whether ~190K of binary size is worth the > > belt-and-suspenders of crashing two different ways. To me, the only way that a > > ud2 after int3 is reachable is in the debugger, and if you're in the debugger > > all instructions are reachable, including the one after a ud2. > > How are they reachable? By writing to rip? As mentioned above, int3 can just be > f10'd over, and ud2 can't. Right, by using Set Next Instruction. I agree that it requires different actions to step over, but in practice both are only available while running in a debugger, so it doesn't matter to the end user. (Sometimes it's actually nice to be able to just run past a CHECK while debugging locally too.) > > > The secondary question is how important consistency is across platforms. On > > non-Windows, we've switched to __builtin_trap() which I gather is effectively > > the same as __ud2() (for Intel at least). > > > > Is there any reason *not* to homogenize which of [ {int3}, {ud2}, {int3; ud2} > ] > > we do across all OSs? If we have a defensible reason, then presumably it > > applies everywhere. > > It depends a bit on how things map to the various SIGs on posix, but yes, I > think it'd be good to have a consistent behavior on all platforms (see > primiano's reply) Ah, right, I saw the SIG mapping table debugger_posix that discusses that a bit, but I wasn't sure how Android fit in there.
On 2017/02/15 18:01:46, scottmg wrote: > > It depends a bit on how things map to the various SIGs on posix, but yes, I > > think it'd be good to have a consistent behavior on all platforms (see > > primiano's reply) > > Ah, right, I saw the SIG mapping table debugger_posix that discusses that a bit, > but I wasn't sure how Android fit in there. In one word: inconsistently. the same opcode (bkpt) on the same arm32 binary will generate a SIGBUS if running on a armv7 kernel and a SIGTRAP when running (in 32-bit mode) on aarch64 kernel, because the kernel has completely different paths for handling the fault for the 2 archs and they are obviously inconsistent. There doesn't seem to be any reasonable way get a SIGTRAP on arm32 kernels. In the rest of the platforms (x86, aarch64, arm32 on aarch64) we should be able to get consistently a SIGTRAP, which is the closest posixy signal for a CHECK.
CQ is committing da patch. Bot data: {"patchset_id": 60001, "attempt_start_ts": 1487181462292860, "parent_rev": "64b01bc4eef3d913f7d48565a141d1334613fd9e", "commit_rev": "a17c8db528cadca9eef98ce03b0910700735722e"}
Message was sent while issue was closed.
Description was changed from ========== Make CHECK int 3 on Windows, rather than crash Not being able to distinguish intentional crashes (CHECK) from unintentional ones (other access violations) in terms of exit codes reduces the amount of signal we have in stability metrics. R=wfh@chromium.org BUG=664209,687326 ========== to ========== Make CHECK int 3 on Windows, rather than crash Not being able to distinguish intentional crashes (CHECK) from unintentional ones (other access violations) in terms of exit codes reduces the amount of signal we have in stability metrics. R=wfh@chromium.org BUG=664209,687326 Review-Url: https://codereview.chromium.org/2676483002 Cr-Commit-Position: refs/heads/master@{#450809} Committed: https://chromium.googlesource.com/chromium/src/+/a17c8db528cadca9eef98ce03b09... ==========
Message was sent while issue was closed.
Committed patchset #4 (id:60001) as https://chromium.googlesource.com/chromium/src/+/a17c8db528cadca9eef98ce03b09... |