|
|
Created:
8 years, 2 months ago by willchan no longer on Chromium Modified:
7 years, 11 months ago CC:
v8-dev, ojan, jln (very slow on Chromium) Visibility:
Public. |
DescriptionPass the SIGPROF signal on to previously registered signal handler.
This enables the google-perftools SIGPROF signal handler to continue to work properly.
BUG=none
Committed: https://code.google.com/p/v8/source/detail?r=12889
Patch Set 1 #
Total comments: 1
Patch Set 2 : Address comment. #Patch Set 3 : Compiles now. #
Total comments: 7
Patch Set 4 : fixes as per markus #Patch Set 5 : Handle SIG_DFL & SIG_IGN #
Total comments: 6
Patch Set 6 : Address latest comments. #Patch Set 7 : Add newlines. #Messages
Total messages: 20 (0 generated)
This is my first time sending a patch to V8. I'm probably doing something silly, so please let me know what I'm doing wrong :)
https://codereview.chromium.org/11195045/diff/1/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/1/src/platform-linux.cc#newcode80 src/platform-linux.cc:80: static void (*g_old_signal_handler)(int, siginfo_t *, void *) = NULL; I don't think you need to store this separately. Add a "CallOldSignalHandler" method to SignalSender that checks signal_handler_installed_ and then calls old_signal_handler_.sa_sigaction. Call that from ProfilerSignalHandler.
Sorry for the delay, I think I've address the comment now.
lgtm, I'll land this for you.
Hold on, although the patch applies cleanly, the call of SignalSender::CallOldSignalHandler isn't declared before it's use, so compilation fails. Can you please address this (make sure that "make check -j15" runs on your V8 directory) and resubmit the patch?
OK, I think this should work for real this time.
lgtm, I'll land this
Two Webkit tests started crashing with this change on both (32 bit and 64 bit) Webkit Linux integration builders. The name strongly suggests that it is actually related to this change and not just a flake. Regressions: Unexpected crashes (2) inspector/profiler/cpu-profiler-profiling-without-inspector.html inspector/profiler/cpu-profiler-profiling.html http://build.chromium.org/p/client.v8/builders/Webkit%20Linux/builds/4988 http://build.chromium.org/p/client.v8/builders/Webkit%20Linux%2064/builds/4975
Message was sent while issue was closed.
https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1073: if (signal_handler_installed_ && old_signal_handler_.sa_sigaction) old_signal_handler_.sa_sigaction is 0x1 in the crash. Apparently this is the magic value SIG_IGN or something. I'm adding markus@ to the code review so he can educate me on what I'm actually supposed to be doing here to make V8's SIGPROF handler and TCMalloc's SIGPROF handler play nice. Then I'll fix this. danno/v8-dev: Should I reopen this issue and upload new patches here? Or should I open a new issue?
Message was sent while issue was closed.
https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1074: old_signal_handler_.sa_sigaction(signal, info, context); If you want to do this right, things get a lot more complicated. In practice, you can probably skip some of the more complicated corner-cases and you'll be OK. You ideally should check the signal mask first. If the signal was originally masked, then no matter what you do, it would never be raised. Of course, inside of the signal handler, this is all too late. The signal was obviously unmasked, because otherwise you would never be inside your own signal handler. It might in fact be masked right now, though, as that happens unless you set the SA_NODEFER flag. Once you know that signal delivery was in fact expected, you now have to determine what the signal disposition is. For that, you should check whether SA_SIGINFO was set in sa_flags. If so, then your code is correct. You can just call the function in sa_sigaction. If it wasn't set though, you are looking at a signal that behaves as if it had been set with the old-school signal(2) function. This means a) you have to look at sa_handler rather than at sa_sigaction, and b) you have to also be prepared to handle SIG_IGN and SIG_DFL. These are magic constants and you can just compare them directly against the value of sa_handler. If it is SIG_IGN, things are easy. You don't need to do anything. If it is SIG_DFL, things are a little more complicated. You now need to check what the default signal disposition is; the man page for signal(7) tells you. In the case of SIGPROF, it terminates the program. You could just call _exit() (please note the underscore!). But the exit code won't be quite right. If you need the correct exit code, make sure that your SIGPROF signal is masked (as it would be unless you set SA_NODEFER) and then raise SIGPROF again. This will result in the expect exit code. Only, there is another complication. I don't believe that SIGPROF is a synchronous signal, and signal masks are a per-thread property. So, SIGPROF is most likely unmasked in another thread and the kernel could very well decide to deliver it there. Rather than calling "raise(SIGPROF)", you might want to do "tgkill(-1, gettid(), SIGPROF)". I don't know whether we currently expose "gettid()"; if we don't, then use "syscall(__NR_gettid)". Now that you know how to handle SIG_IGN and SIG_DFL, the remaining question is what to do if sa_handler points to an actual function. This is the most difficult problem to solve, as the two different types of signal handlers expect completely different stack frames (at least on some architectures). You can (often) convert one into the other; if I recall correctly, a sigaction-style stack frame is technically a superset of a sighandler-style stack frame. This is all highly specific to the target platform though. And it is of course Linux specific; POSIX doesn't allow for any of what you are doing here. But I suspect we are running into some diminishing returns. As you mentioned earlier, it is hopefully safe to assume that both handlers (if present at all) are sigaction-style handlers. I'll leave it up to you to decide what to do if this last assumption doesn't hold. You could fail with a CHECK() or at least a DCHECK(). Or you could simply decide to do what we did before and ignore the old handler. You could also do crazy things with repointing the signal handler and then rethrowing the signal; POSIX would be happier if you did that. But again, that's probably more trouble than what it's worth. And I don't even want to know what the associated unittest is going to look like (you do plan on testing this, don't you?). Talking about CHECK() and DCHECK(). Neither one of them is technically safe to be called at this time and could quite conceivably result in a dead lock or worse. Hopefully, it's a rare case and we are OK with very rare inexplicable bugs; but you might still want to document this. And just for completeness sake, there is one last bit of information. This is probably not something that you'll run into though. Chaining of signal handlers interposes an additional stack frame. This break any call to sigreturn(), and it could also break other low-level code. Fortunately, you are unlikely to see that in any signal handler that is written in a high-level language. But if you wanted to be really precise, you would pop your own stack frame prior to chaining. https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1212: if (signal != SIGPROF) return; This seems somewhat bogus. That really should never happen. And if it does, then why do decide to skip calling the old signal handler. Isn't that just papering over bugs. Maybe a CHECK() or at least a DCHECK() would be more appropriate. https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1263: sampler->Tick(sample); I don't see any code here that preserves "errno". That is broken. You will almost certainly clobber this value and cause very subtle bugs. All of that is ignoring the fact that SIGPROF is an asynchronous signal and I suspect at least some of what you are calling is not async signal safe ... but I haven't bothered going over that code with a fine-toothed comb. Maybe, somebody else already did and it is in fact OK. In general though, writing signal handlers in C++ is almost impossible to get right.
Message was sent while issue was closed.
https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1073: if (signal_handler_installed_ && old_signal_handler_.sa_sigaction) Go ahead and re-open this one, since I'd like to make sure that all of the knowledge how to fix this including Markus' comments are in one place. After what he wrote, however, I feel completely unqualified to review this patch. :-) On 2012/12/12 22:28:37, willchan wrote: > old_signal_handler_.sa_sigaction is 0x1 in the crash. Apparently this is the > magic value SIG_IGN or something. I'm adding markus@ to the code review so he > can educate me on what I'm actually supposed to be doing here to make V8's > SIGPROF handler and TCMalloc's SIGPROF handler play nice. Then I'll fix this. > > danno/v8-dev: Should I reopen this issue and upload new patches here? Or should > I open a new issue?
Reopened. I'll hopefully upload a patch later today. https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1073: if (signal_handler_installed_ && old_signal_handler_.sa_sigaction) On 2012/12/13 00:00:19, danno wrote: > Go ahead and re-open this one, since I'd like to make sure that all of the > knowledge how to fix this including Markus' comments are in one place. After > what he wrote, however, I feel completely unqualified to review this patch. :-) Don't worry, that's exactly why I added Markus to the code review so he could tell me everything I'm doing wrong :) Just need you to doublecheck we're not doing anything stupid with regards to V8, since we're both foreigners here. > > On 2012/12/12 22:28:37, willchan wrote: > > old_signal_handler_.sa_sigaction is 0x1 in the crash. Apparently this is the > > magic value SIG_IGN or something. I'm adding markus@ to the code review so he > > can educate me on what I'm actually supposed to be doing here to make V8's > > SIGPROF handler and TCMalloc's SIGPROF handler play nice. Then I'll fix this. > > > > danno/v8-dev: Should I reopen this issue and upload new patches here? Or > should > > I open a new issue? > https://codereview.chromium.org/11195045/diff/9001/src/platform-linux.cc#newc... src/platform-linux.cc:1074: old_signal_handler_.sa_sigaction(signal, info, context); On 2012/12/12 22:58:17, Markus (顧孟勤) wrote: > If you want to do this right, things get a lot more complicated. In practice, > you can probably skip some of the more complicated corner-cases and you'll be > OK. Yeah, I'm going to skip almost all the corner cases. Make things fail loudly when people start doing unexpected things, but otherwise I'll just try to make sure Crankshaft and google-perftools play nice. > > You ideally should check the signal mask first. If the signal was originally > masked, then no matter what you do, it would never be raised. > > Of course, inside of the signal handler, this is all too late. The signal was > obviously unmasked, because otherwise you would never be inside your own signal > handler. It might in fact be masked right now, though, as that happens unless > you set the SA_NODEFER flag. > > Once you know that signal delivery was in fact expected, you now have to > determine what the signal disposition is. For that, you should check whether > SA_SIGINFO was set in sa_flags. > > If so, then your code is correct. You can just call the function in > sa_sigaction. > > If it wasn't set though, you are looking at a signal that behaves as if it had > been set with the old-school signal(2) function. This means a) you have to look > at sa_handler rather than at sa_sigaction, and b) you have to also be prepared > to handle SIG_IGN and SIG_DFL. > > These are magic constants and you can just compare them directly against the > value of sa_handler. > > If it is SIG_IGN, things are easy. You don't need to do anything. If it is > SIG_DFL, things are a little more complicated. You now need to check what the > default signal disposition is; the man page for signal(7) tells you. > > In the case of SIGPROF, it terminates the program. You could just call _exit() > (please note the underscore!). But the exit code won't be quite right. If you > need the correct exit code, make sure that your SIGPROF signal is masked (as it > would be unless you set SA_NODEFER) and then raise SIGPROF again. This will > result in the expect exit code. > > Only, there is another complication. I don't believe that SIGPROF is a > synchronous signal, and signal masks are a per-thread property. So, SIGPROF is > most likely unmasked in another thread and the kernel could very well decide to > deliver it there. Rather than calling "raise(SIGPROF)", you might want to do > "tgkill(-1, gettid(), SIGPROF)". I don't know whether we currently expose > "gettid()"; if we don't, then use "syscall(__NR_gettid)". > > Now that you know how to handle SIG_IGN and SIG_DFL, the remaining question is > what to do if sa_handler points to an actual function. > > This is the most difficult problem to solve, as the two different types of > signal handlers expect completely different stack frames (at least on some > architectures). You can (often) convert one into the other; if I recall > correctly, a sigaction-style stack frame is technically a superset of a > sighandler-style stack frame. This is all highly specific to the target platform > though. And it is of course Linux specific; POSIX doesn't allow for any of what > you are doing here. > > But I suspect we are running into some diminishing returns. As you mentioned > earlier, it is hopefully safe to assume that both handlers (if present at all) > are sigaction-style handlers. > > I'll leave it up to you to decide what to do if this last assumption doesn't > hold. You could fail with a CHECK() or at least a DCHECK(). Or you could simply > decide to do what we did before and ignore the old handler. > > You could also do crazy things with repointing the signal handler and then > rethrowing the signal; POSIX would be happier if you did that. But again, that's > probably more trouble than what it's worth. And I don't even want to know what > the associated unittest is going to look like (you do plan on testing this, > don't you?). > > Talking about CHECK() and DCHECK(). Neither one of them is technically safe to > be called at this time and could quite conceivably result in a dead lock or > worse. Hopefully, it's a rare case and we are OK with very rare inexplicable > bugs; but you might still want to document this. > > And just for completeness sake, there is one last bit of information. This is > probably not something that you'll run into though. > > Chaining of signal handlers interposes an additional stack frame. This break any > call to sigreturn(), and it could also break other low-level code. Fortunately, > you are unlikely to see that in any signal handler that is written in a > high-level language. But if you wanted to be really precise, you would pop your > own stack frame prior to chaining.
OK Markus, I think this is ready for a look. It passes that webkit test now.
https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1036: static void ProfilerSignalHandler(int signal, siginfo_t* info, void* context); Our style guide for some reason prefers that we don't mark symbols as static. Instead, we should put them into an anonymous namespace. I don't have particularly strong feelings either way, so use your own best judgement. https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1074: if (sigismember(&old_signal_handler_.sa_mask, SIGPROF)) I don't believe this is the correct mask. See below for an explanation. You should use sigprocmask() to store the old signal mask when you use sigaction() to set up your signal handler. And in fact, you really can't rely on the pre-existing signal mask, so you really need to make sure you unmask the signal when you originally set up the signal handler. I believe this is a pre-existing bug in this code. Please note that this is somewhat failure prone, if you already launched threads. Unlike sigaction(), which makes a global change, sigprocmask() changes a per-thread property. You are kind of out of luck, if there are other threads around that also require their signal mask to be adjusted. I don't know enough about V8 to tell you whether this is a problem or not. Background information follows: "old_signal_handler_.sa_mask" is the mask that should be applied each time a SIGPROF is being handled. For example, you could use this feature to prevent SIGALRM from triggering a signal handler while you are already inside of a SIGPROF handler. Once your SIGPROF signal handler returns, the mask will be restored to the value that it had prior to entry into the signal handler. And any delayed signals will then be handled. So, you are not actually losing the pending SIGALRM -- this only ever works for a single signal though; queuing of signal doesn't (typically) happen. This means, sa_mask gives you control over whether you want to nest signal handlers, and if so, which signal handlers can validly be nested. In most use cases, people clear the mask. This results in all nesting of signal handlers to be allowed. The only nesting that is not allowed is a repeat of the same signal (i.e. you won't try to handle another SIGPROF while you are already handling the first one). If you also want to allow nesting of identical signals, then you would have to set the SA_NODEFER flag. But that's often difficult to get right. So, don't do that unless you really need it. https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1077: // It's safe to ignore the old signal handler if it's one of these values. Your code is arguably the right thing to do ... even though it is technically incorrect. signal(7) tells you that the default disposition for SIGPROF is to terminate the process. In other words, the technically correct solution would require for you to exit if you see that sa_handler used to be SIG_DFL. In practice, this is probably not what you want to do. If nobody ever used profiling, then it is quite likely that sa_handler has been unchanged and is in fact SIG_DFL. And that just didn't cause any problems, because nobody ever raised SIGPROF before we came along. So, yes, ignoring SIG_DFL is the sanest solution. You might just want to add a more verbose comment explaining why it is sane. Maybe, something along the lines: If the old signal disposition was SIG_IGN, then it is safe to ignore the old signal handler. If it is SIG_DFL, then that is pretty strong evidence that nobody other than us is using the profiling infrastructure. It is probably reasonable to assume that in this case we don't need to chain to the old behavior, as that would merely result in the application getting terminated. https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1225: if (signal != SIGPROF) return; You should probably restore "errno" for each of the possible exit paths of this function. Maybe, for once, it makes sense to use "goto" to jump to a common exit handler at the end of the function?
https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1036: static void ProfilerSignalHandler(int signal, siginfo_t* info, void* context); On 2012/12/13 08:49:45, Markus (顧孟勤) wrote: > Our style guide for some reason prefers that we don't mark symbols as static. > Instead, we should put them into an anonymous namespace. [...] The Google C++ style guide allows both possibilities and does not even give a preference to one of them. The Chrome style guide (which we don't follow at all ;-) seems to prefer unnamed namespaces, but I don't understand the reasoning behind it: At least on Linux the end result seems to be the same. But to make things short: Just leave "static" here, we use it all over the place in v8.
https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc File src/platform-linux.cc (right): https://codereview.chromium.org/11195045/diff/18002/src/platform-linux.cc#new... src/platform-linux.cc:1036: static void ProfilerSignalHandler(int signal, siginfo_t* info, void* context); Ah, sanity prevails. And yes, I agree that the preference in the Chrome style guide seems somewhat arbitrary :-) On 2012/12/13 09:34:54, Sven Panne wrote: > On 2012/12/13 08:49:45, Markus (顧孟勤) wrote: > > Our style guide for some reason prefers that we don't mark symbols as static. > > Instead, we should put them into an anonymous namespace. [...] > > The Google C++ style guide allows both possibilities and does not even give a > preference to one of them. The Chrome style guide (which we don't follow at all > ;-) seems to prefer unnamed namespaces, but I don't understand the reasoning > behind it: At least on Linux the end result seems to be the same. > > But to make things short: Just leave "static" here, we use it all over the place > in v8.
OK, I think this is ready for another look!
lgtm I like it. Thank you. And sorry for being so anal. I should probably ask about what we do for unit testing. But I am not sufficiently familiar with either V8 or tcmalloc to make any suggestion on what proper testing would look like.
I don't know what we do for unit testing for this code in v8. I'm happy to do as they advise here.
We have (a lot of) C++ based tests of the V8 API in src/test/cctest/. You should be able to add something src/test/cctest/test-platform-linux.cc that sets up a signal handler, initializes V8 and verifies that the call through works properly. It would be great if you added something like this before we land this CL. |