|
|
Created:
10 years, 6 months ago by tessamac Modified:
9 years, 7 months ago CC:
chromium-reviews, John Grabowski, pam+watch_chromium.org, brettw-cc_chromium.org Base URL:
http://src.chromium.org/svn/trunk/src/ Visibility:
Public. |
DescriptionAdd and alternative GetAppOutput() to process_util that takes a timeout.
Contributed by tessamac@chromium.org
TEST=none
BUG=47356
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=52608
Patch Set 1 #
Total comments: 38
Patch Set 2 : '' #
Total comments: 23
Patch Set 3 : '' #
Total comments: 2
Patch Set 4 : '' #Patch Set 5 : '' #
Total comments: 6
Patch Set 6 : '' #
Total comments: 2
Patch Set 7 : '' #
Messages
Total messages: 23 (0 generated)
Looks like a good start to me. Adding agl (to make sure the POSIX code is solid) and viettrungluu. http://codereview.chromium.org/2810014/diff/1/2 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/1/2#newcode238 base/process_util.h:238: // Just like GetAppOutput() but you can specify a timeout (-1 for no timeout), I'd prefer something more like "Similar to GetAppOutput. Puts the exit code in |exit_code|, and waits no more than |timeout_milliseconds| for the launched process to exit. After the timeout is triggered, the process is killed." Also, we'd need a Windows version of this function. It'd be good to do it now, to make sure we can write an implementation for this interface on all platforms. http://codereview.chromium.org/2810014/diff/1/2#newcode250 base/process_util.h:250: nit: Please remove the empty line. http://codereview.chromium.org/2810014/diff/1/3 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/1/3#newcode740 base/process_util_posix.cc:740: // Executes the application specified by |cl| and wait for it to exit. Stores Let's put the info about timeout here. Like this: "... and wait for it to exit, but at most |timeout_milliseconds| (use kNoTimeout for no timeout)." http://codereview.chromium.org/2810014/diff/1/3#newcode741 base/process_util_posix.cc:741: // the output (stdout) in |output|. If |do_search_path| is set, it searches the "... the output (stdout) in |output|, and the exit code in |exit_code|." http://codereview.chromium.org/2810014/diff/1/3#newcode831 base/process_util_posix.cc:831: // This read is not interruptable, so we may exceed timout. nit: interruptable -> interruptible, timout -> timeout It seems dangerous though. I'm not sure how it's going to behave if the launched process hangs or generally exceeds the timeout. Maybe we should set up some timer to kill the process. http://codereview.chromium.org/2810014/diff/1/3#newcode833 base/process_util_posix.cc:833: std::min(output_buf_left, nit: I think it was fine before (on one line). Could you restore the previous version? http://codereview.chromium.org/2810014/diff/1/3#newcode849 base/process_util_posix.cc:849: *exit_code = -1; // timeout nit: the "timeout" comment adds no value. Maybe we should use some constant, like PROCESS_END_KILLED_BY_USER. http://codereview.chromium.org/2810014/diff/1/3#newcode850 base/process_util_posix.cc:850: KillProcess(pid, *exit_code, true); We should LOG(ERROR) that the process has exceeded the timeout and we're killing it. http://codereview.chromium.org/2810014/diff/1/3#newcode870 base/process_util_posix.cc:870: int unused_exit_code; nit: I'd prefer not to put "unused" in the name. Applies to other places in this file. http://codereview.chromium.org/2810014/diff/1/3#newcode882 base/process_util_posix.cc:882: nit: Remove redundant empty lines. http://codereview.chromium.org/2810014/diff/1/3#newcode885 base/process_util_posix.cc:885: // TODO(viettrungluu): Conceivably, we should have a timeout as well, so we nit: Is this comment still relevant? http://codereview.chromium.org/2810014/diff/1/3#newcode893 base/process_util_posix.cc:893: &unused_exit_code, -1); nit: use kNoTimeout instead of -1. http://codereview.chromium.org/2810014/diff/1/4 File base/process_util_unittest.cc (right): http://codereview.chromium.org/2810014/diff/1/4#newcode438 base/process_util_unittest.cc:438: TEST_F(ProcessUtilTest, GetAppOutput_Success) { Could you make sure these tests run on Windows too? Using a different version of the test for Windows is fine (it probably doesn't have /bin/echo). You'd probably have to use MULTIPROCESS_TEST. On the other hand, it may be easier to just have all these AppOutput tests cross-platform. http://codereview.chromium.org/2810014/diff/1/4#newcode456 base/process_util_unittest.cc:456: std::vector<std::string> argv; nit: Add comment above "Make sure we capture output in case of failure". http://codereview.chromium.org/2810014/diff/1/4#newcode464 base/process_util_unittest.cc:464: TEST_F(ProcessUtilTest, GetAppOutputWithTimeout_SuccessWithinTimout) { nit: Timeout (applies to other places too). http://codereview.chromium.org/2810014/diff/1/4#newcode474 base/process_util_unittest.cc:474: std::numeric_limits<std::size_t>::max())); nit: rather base::kNoTimeout. Also applies to other places in this file. http://codereview.chromium.org/2810014/diff/1/4#newcode498 base/process_util_unittest.cc:498: argv.push_back("sleep 100000"); // long enough to trigger gtest timeout? nit: I don't understand the comment. Could you clarify it or remove it? http://codereview.chromium.org/2810014/diff/1/4#newcode519 base/process_util_unittest.cc:519: EXPECT_EQ(-1, exit_code); // exit timeout nit: I'd prefer some constant instead of "-1".
My problem with this is that it doesn't really work: if the child is streaming output and the read() keeps completing, then it's fine. But otherwise it'll never check the timeout. Is there a specific case that this is intended for? A child process with the ideal behaviour? In that case it may be ok to add this function with sufficient warnings in the header about its limitations. However, it's really not that much code to do the timeout correctly with poll(2).
On Tue, Jun 22, 2010 at 19:21, <agl@chromium.org> wrote: > My problem with this is that it doesn't really work: if the child is > streaming > output and the read() keeps completing, then it's fine. But otherwise it'll > never check the timeout. > > Is there a specific case that this is intended for? A child process with > the > ideal behaviour? > This is going to be used in browser_tests, so the launched process is expected to misbehave and hang. > In that case it may be ok to add this function with sufficient warnings in > the > header about its limitations. However, it's really not that much code to do > the > timeout correctly with poll(2). > > > http://codereview.chromium.org/2810014/show >
On Tue, Jun 22, 2010 at 1:36 PM, Paweł Hajdan, Jr. <phajdan.jr@chromium.org> wrote: > This is going to be used in browser_tests, so the launched process is > expected to misbehave and hang. It needs to be written with poll(2) then. AGL
I've addressed all your comments but am still working on the Windows stuff. Please take another look. http://codereview.chromium.org/2810014/diff/1/2 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/1/2#newcode238 base/process_util.h:238: // Just like GetAppOutput() but you can specify a timeout (-1 for no timeout), On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > I'd prefer something more like "Similar to GetAppOutput. Puts the exit code in > |exit_code|, and waits no more than |timeout_milliseconds| for the launched > process to exit. After the timeout is triggered, the process is killed." > Done. > > Also, we'd need a Windows version of this function. It'd be good to do it now, > to make sure we can write an implementation for this interface on all platforms. I'll look into this. I should warn you that I haven't ever done any window dev (nor have I used windows in the last 6 years, gulp) so this might take me some time... (but I do want to learn so it's fine with me if it's fine with you). Do you want it in this CL or a separate one? http://codereview.chromium.org/2810014/diff/1/2#newcode250 base/process_util.h:250: On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: Please remove the empty line. Done. http://codereview.chromium.org/2810014/diff/1/3 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/1/3#newcode740 base/process_util_posix.cc:740: // Executes the application specified by |cl| and wait for it to exit. Stores On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > Let's put the info about timeout here. Like this: > > "... and wait for it to exit, but at most |timeout_milliseconds| (use kNoTimeout > for no timeout)." Done. http://codereview.chromium.org/2810014/diff/1/3#newcode741 base/process_util_posix.cc:741: // the output (stdout) in |output|. If |do_search_path| is set, it searches the On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > "... the output (stdout) in |output|, and the exit code in |exit_code|." Done. http://codereview.chromium.org/2810014/diff/1/3#newcode831 base/process_util_posix.cc:831: // This read is not interruptable, so we may exceed timout. On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: interruptable -> interruptible, timout -> timeout > > It seems dangerous though. I'm not sure how it's going to behave if the launched > process hangs or generally exceeds the timeout. Maybe we should set up some > timer to kill the process. I updated this to only read if poll() returns ready. (and poll() has a zero timeout so I don't think it will block). I guess we could still miss the timeout if it happened during a read. Do you have any idea how to write a test for that? http://codereview.chromium.org/2810014/diff/1/3#newcode833 base/process_util_posix.cc:833: std::min(output_buf_left, On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: I think it was fine before (on one line). Could you restore the previous > version? Not sure why I changed that. Oops. http://codereview.chromium.org/2810014/diff/1/3#newcode849 base/process_util_posix.cc:849: *exit_code = -1; // timeout On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: the "timeout" comment adds no value. Maybe we should use some constant, > like PROCESS_END_KILLED_BY_USER. Done. http://codereview.chromium.org/2810014/diff/1/3#newcode850 base/process_util_posix.cc:850: KillProcess(pid, *exit_code, true); On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > We should LOG(ERROR) that the process has exceeded the timeout and we're killing > it. Done. http://codereview.chromium.org/2810014/diff/1/3#newcode870 base/process_util_posix.cc:870: int unused_exit_code; On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: I'd prefer not to put "unused" in the name. Applies to other places in this > file. Done. Is this not the convention in chromium? Is there some other way that you mark variable as unused/ignored? http://codereview.chromium.org/2810014/diff/1/3#newcode882 base/process_util_posix.cc:882: On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: Remove redundant empty lines. Done. http://codereview.chromium.org/2810014/diff/1/3#newcode885 base/process_util_posix.cc:885: // TODO(viettrungluu): Conceivably, we should have a timeout as well, so we On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: Is this comment still relevant? Done. http://codereview.chromium.org/2810014/diff/1/3#newcode893 base/process_util_posix.cc:893: &unused_exit_code, -1); On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: use kNoTimeout instead of -1. Done. http://codereview.chromium.org/2810014/diff/1/4 File base/process_util_unittest.cc (right): http://codereview.chromium.org/2810014/diff/1/4#newcode438 base/process_util_unittest.cc:438: TEST_F(ProcessUtilTest, GetAppOutput_Success) { On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > Could you make sure these tests run on Windows too? Using a different version of > the test for Windows is fine (it probably doesn't have /bin/echo). You'd > probably have to use MULTIPROCESS_TEST. On the other hand, it may be easier to > just have all these AppOutput tests cross-platform. I'm working on this, but I don't have a windows machine yet so I'm depending on the trybots (might borrow someones windows machine today). http://codereview.chromium.org/2810014/diff/1/4#newcode456 base/process_util_unittest.cc:456: std::vector<std::string> argv; On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: Add comment above "Make sure we capture output in case of failure". Done. http://codereview.chromium.org/2810014/diff/1/4#newcode464 base/process_util_unittest.cc:464: TEST_F(ProcessUtilTest, GetAppOutputWithTimeout_SuccessWithinTimout) { On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: Timeout (applies to other places too). Done. http://codereview.chromium.org/2810014/diff/1/4#newcode474 base/process_util_unittest.cc:474: std::numeric_limits<std::size_t>::max())); On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: rather base::kNoTimeout. Also applies to other places in this file. But this case is supposed to finish within a timeout, not finish within no timeout. http://codereview.chromium.org/2810014/diff/1/4#newcode498 base/process_util_unittest.cc:498: argv.push_back("sleep 100000"); // long enough to trigger gtest timeout? On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: I don't understand the comment. Could you clarify it or remove it? Done. http://codereview.chromium.org/2810014/diff/1/4#newcode519 base/process_util_unittest.cc:519: EXPECT_EQ(-1, exit_code); // exit timeout On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > nit: I'd prefer some constant instead of "-1". Done.
http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode741 base/process_util_posix.cc:741: // Executes the application specified by |cl| and wait for it to exit, but s/wait/waits/ http://codereview.chromium.org/2810014/diff/24001/10005#newcode831 base/process_util_posix.cc:831: struct pollfd fd = { pipe_fd[0], POLLIN, 0 }; (nit: fd is typically a file descriptor. I'd call this 'pfd'.) http://codereview.chromium.org/2810014/diff/24001/10005#newcode833 base/process_util_posix.cc:833: if (!poll(&fd, 1, 0)) This will busy loop. The third argument specifies a zero time, so poll will return immediately. If the child process doesn't exit, this will busy loop forever. I think you want something like: const TimeDelta remaining = timeout_time - time::Now() const int r = poll(&pfd, 1, remaining.InMilliseconds() + 1) DCHECK(r != -1 || errno == EINTR) Then move the timeout check before the read.
http://codereview.chromium.org/2810014/diff/1/2 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/1/2#newcode238 base/process_util.h:238: // Just like GetAppOutput() but you can specify a timeout (-1 for no timeout), On 2010/07/07 21:49:59, tessamac wrote: > On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > > Also, we'd need a Windows version of this function. It'd be good to do it now, > > to make sure we can write an implementation for this interface on all > platforms. > > I'll look into this. > > I should warn you that I haven't ever done any window dev (nor have I used > windows in the last 6 years, gulp) so this might take me some time... (but I do > want to learn so it's fine with me if it's fine with you). Do you want it in > this CL or a separate one? I'm fine with that. I'm not to familiar with the Windows APIs either, and that's one of the reasons I'd like to see how the code would look like (some changes to the function signature/contract might be needed). In case you have some questions about Windows, Carlos Pizano (cpu) is a great person to ask. So ideally it should be in this CL. In case there are some bigger delays with the Windows development machine, I'm fine with separate CLs though. http://codereview.chromium.org/2810014/diff/24001/10004#newcode92 base/process_util.h:92: ProcessId GetCurrentProcId(); I'm not sure it's the right place to add our custom exit code. Could you find out who wrote this and ask about that? http://codereview.chromium.org/2810014/diff/24001/10004#newcode245 base/process_util.h:245: // A restricted version of |GetAppOutput()| which (a) clears the environment, Could you also explicitly add information when that function returns true? http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode855 base/process_util_posix.cc:855: << timeout_milliseconds << " ms."; nit: Add the "matching" right ")", like this: << timeout_milliseconds << " ms.)"; http://codereview.chromium.org/2810014/diff/24001/10006 File base/process_util_unittest.cc (right): http://codereview.chromium.org/2810014/diff/24001/10006#newcode474 base/process_util_unittest.cc:474: std::numeric_limits<std::size_t>::max())); The timeout_milliseconds isn't a size_t. Could you use a matching type for numeric_limits here? http://codereview.chromium.org/2810014/diff/24001/10006#newcode489 base/process_util_unittest.cc:489: std::numeric_limits<std::size_t>::max())); Same here.
http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode853 base/process_util_posix.cc:853: KillProcess(pid, *exit_code, true); Ugh. KillProcess(..., true) can take up to about a minute to return. On the other hand, KillProcess(..., false) kills very softly with a single SIGTERM. This makes me kind of sad. I suppose you could impose more reasonable timeouts by doing the killing yourself using: kill(pid, SIGTERM); WaitpidWithTimeout(..., &success); if (!success) kill(pid, SIGKILL); (or something like that). Probably we should just have a KillProcessWithTimeout(), and you can leave this code as-is for now, though. http://codereview.chromium.org/2810014/diff/24001/10005#newcode855 base/process_util_posix.cc:855: << timeout_milliseconds << " ms."; On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > nit: Add the "matching" right ")", like this: > > << timeout_milliseconds << " ms.)"; Nit-squared: the ')' should precede the '.'. :)
http://codereview.chromium.org/2810014/diff/1/2 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/1/2#newcode238 base/process_util.h:238: // Just like GetAppOutput() but you can specify a timeout (-1 for no timeout), On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > On 2010/07/07 21:49:59, tessamac wrote: > > On 2010/06/19 08:34:43, Paweł Hajdan Jr. wrote: > > > Also, we'd need a Windows version of this function. It'd be good to do it > now, > > > to make sure we can write an implementation for this interface on all > > platforms. > > > > I'll look into this. > > > > I should warn you that I haven't ever done any window dev (nor have I used > > windows in the last 6 years, gulp) so this might take me some time... (but I > do > > want to learn so it's fine with me if it's fine with you). Do you want it in > > this CL or a separate one? > > I'm fine with that. I'm not to familiar with the Windows APIs either, and that's > one of the reasons I'd like to see how the code would look like (some changes to > the function signature/contract might be needed). In case you have some > questions about Windows, Carlos Pizano (cpu) is a great person to ask. > > So ideally it should be in this CL. In case there are some bigger delays with > the Windows development machine, I'm fine with separate CLs though. Ok, I'll talk to cpu@ about this on Monday. http://codereview.chromium.org/2810014/diff/24001/10004#newcode92 base/process_util.h:92: ProcessId GetCurrentProcId(); On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > I'm not sure it's the right place to add our custom exit code. Could you find > out who wrote this and ask about that? Meeting with cpu@ to discuss this tomorrow. I think the only tricky thing is keeping this in sync with the same enum in chrome/. http://codereview.chromium.org/2810014/diff/24001/10004#newcode245 base/process_util.h:245: // A restricted version of |GetAppOutput()| which (a) clears the environment, On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > Could you also explicitly add information when that function returns true? Done. The return value just matches what is done for GetAppOutput(): Returns true if process launched and exited cleanly, with exit code indicating success. http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode741 base/process_util_posix.cc:741: // Executes the application specified by |cl| and wait for it to exit, but On 2010/07/07 22:02:29, agl wrote: > s/wait/waits/ Done. http://codereview.chromium.org/2810014/diff/24001/10005#newcode831 base/process_util_posix.cc:831: struct pollfd fd = { pipe_fd[0], POLLIN, 0 }; On 2010/07/07 22:02:29, agl wrote: > (nit: fd is typically a file descriptor. I'd call this 'pfd'.) Done. http://codereview.chromium.org/2810014/diff/24001/10005#newcode833 base/process_util_posix.cc:833: if (!poll(&fd, 1, 0)) On 2010/07/07 22:02:29, agl wrote: > This will busy loop. The third argument specifies a zero time, so poll will > return immediately. If the child process doesn't exit, this will busy loop > forever. > > I think you want something like: > const TimeDelta remaining = timeout_time - time::Now() > const int r = poll(&pfd, 1, remaining.InMilliseconds() + 1) > DCHECK(r != -1 || errno == EINTR) > > Then move the timeout check before the read. Aha, yes, I think I misunderstood poll(). Thanks for the help. http://codereview.chromium.org/2810014/diff/24001/10005#newcode853 base/process_util_posix.cc:853: KillProcess(pid, *exit_code, true); On 2010/07/08 13:41:52, viettrungluu wrote: > Ugh. KillProcess(..., true) can take up to about a minute to return. On the > other hand, KillProcess(..., false) kills very softly with a single SIGTERM. > This makes me kind of sad. > > I suppose you could impose more reasonable timeouts by doing the killing > yourself using: > kill(pid, SIGTERM); > WaitpidWithTimeout(..., &success); > if (!success) > kill(pid, SIGKILL); > (or something like that). > > Probably we should just have a KillProcessWithTimeout(), and you can leave this > code as-is for now, though. Should I just use KillProcess(..., false)? I'm not sure it's important to wait. http://codereview.chromium.org/2810014/diff/24001/10005#newcode855 base/process_util_posix.cc:855: << timeout_milliseconds << " ms."; On 2010/07/08 13:41:52, viettrungluu wrote: > On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > > nit: Add the "matching" right ")", like this: > > > > << timeout_milliseconds << " ms.)"; > > Nit-squared: the ')' should precede the '.'. :) Done. http://codereview.chromium.org/2810014/diff/24001/10005#newcode855 base/process_util_posix.cc:855: << timeout_milliseconds << " ms."; On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > nit: Add the "matching" right ")", like this: > > << timeout_milliseconds << " ms.)"; Oops. http://codereview.chromium.org/2810014/diff/24001/10006 File base/process_util_unittest.cc (right): http://codereview.chromium.org/2810014/diff/24001/10006#newcode474 base/process_util_unittest.cc:474: std::numeric_limits<std::size_t>::max())); On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > The timeout_milliseconds isn't a size_t. Could you use a matching type for > numeric_limits here? Done. Good catch! This also showed me that poll() take an int for the timeout instead of an int64 (could have been a nasty bug!) so I changed the timeout argument to int as well. http://codereview.chromium.org/2810014/diff/24001/10006#newcode489 base/process_util_unittest.cc:489: std::numeric_limits<std::size_t>::max())); On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > Same here. Done.
http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode853 base/process_util_posix.cc:853: KillProcess(pid, *exit_code, true); On 2010/07/10 00:43:50, tessamac wrote: > On 2010/07/08 13:41:52, viettrungluu wrote: > > Ugh. KillProcess(..., true) can take up to about a minute to return. On the > > other hand, KillProcess(..., false) kills very softly with a single SIGTERM. > > This makes me kind of sad. > > > > I suppose you could impose more reasonable timeouts by doing the killing > > yourself using: > > kill(pid, SIGTERM); > > WaitpidWithTimeout(..., &success); > > if (!success) > > kill(pid, SIGKILL); > > (or something like that). > > > > Probably we should just have a KillProcessWithTimeout(), and you can leave > this > > code as-is for now, though. > > Should I just use KillProcess(..., false)? I'm not sure it's important to wait. It's not important to wait per se, but there are several problems: 1. KillProcess(..., false) does a soft kill with SIGTERM, which can be ignored (so the process may not end up dead). 2. KillProcess(..., false) doesn't call waitpid(), so you'd have to reap the child (otherwise a zombie would be left behind). Actually, reading the code for KillProcess(), it looks to me like it leaves a zombie process in the case that it resorts to SIGKILL, since it doesn't reap in that case. D'oh.
Only a minor comment. After we know what to do about the exit code constants, I'm fine with the code (agl and trungl may detect some additional issues here though). http://codereview.chromium.org/2810014/diff/29001/30002 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/29001/30002#newcode855 base/process_util_posix.cc:855: CHECK(!timed_out); This CHECK is a bit weird, as it will always be triggered. I suggest to remove it.
On 2010/07/10 06:21:40, viettrungluu wrote: > http://codereview.chromium.org/2810014/diff/24001/10005 > File base/process_util_posix.cc (right): > > http://codereview.chromium.org/2810014/diff/24001/10005#newcode853 > base/process_util_posix.cc:853: KillProcess(pid, *exit_code, true); > On 2010/07/10 00:43:50, tessamac wrote: > > On 2010/07/08 13:41:52, viettrungluu wrote: > > > Ugh. KillProcess(..., true) can take up to about a minute to return. On the > > > other hand, KillProcess(..., false) kills very softly with a single SIGTERM. > > > This makes me kind of sad. > > > > > > I suppose you could impose more reasonable timeouts by doing the killing > > > yourself using: > > > kill(pid, SIGTERM); > > > WaitpidWithTimeout(..., &success); > > > if (!success) > > > kill(pid, SIGKILL); > > > (or something like that). > > > > > > Probably we should just have a KillProcessWithTimeout(), and you can leave > > this > > > code as-is for now, though. > > > > Should I just use KillProcess(..., false)? I'm not sure it's important to > wait. > > It's not important to wait per se, but there are several problems: > 1. KillProcess(..., false) does a soft kill with SIGTERM, which can be ignored > (so the process may not end up dead). > 2. KillProcess(..., false) doesn't call waitpid(), so you'd have to reap the > child (otherwise a zombie would be left behind). > > Actually, reading the code for KillProcess(), it looks to me like it leaves a > zombie process in the case that it resorts to SIGKILL, since it doesn't reap in > that case. D'oh. BTW, I'm okay with leaving the KillProcess(..., true) as-is for now. Just put a note about it somewhere.
http://codereview.chromium.org/2810014/diff/24001/10004 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/24001/10004#newcode92 base/process_util.h:92: PROCESS_END_KILLED_BY_TIMEOUT = 3 On 2010/07/10 00:43:50, tessamac wrote: > On 2010/07/08 07:58:18, Paweł Hajdan Jr. wrote: > > I'm not sure it's the right place to add our custom exit code. Could you find > > out who wrote this and ask about that? > > Meeting with cpu@ to discuss this tomorrow. I think the only tricky thing is > keeping this in sync with the same enum in chrome/. Ok, here is the deal: This enum is duplicated and extended in /chrome/common/result_codes.h. Ick. http://src.chromium.org/svn/trunk/src/chrome/common/result_codes.h Because those values in result_codes.h are depended on (and read by humans) it's not reasonable to insert this one at 4 (it would change all those other values which would be confusing). Options: 1 - Give this new one a very high value (like 100) to avoid collisions. 2 - Change the return value of GetAppOutputWIthTimeout to return true/false based on whether the timeout occurred. And use exit_code to determine success/failure of the process. 3 - Something else? Although #2 means the return value of GetAppOutputWithTimeout() is not consistent with GetAppOutput(), I think it seems less messy. But if we'll ever want to add other exit codes it's probably better to bite the bullet and do #1. If anyone has a strong opinion either was or an alternative solution, I'm all ears. http://codereview.chromium.org/2810014/diff/24001/10005 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/24001/10005#newcode853 base/process_util_posix.cc:853: KillProcess(pid, *exit_code, true); On 2010/07/10 06:21:40, viettrungluu wrote: > On 2010/07/10 00:43:50, tessamac wrote: > > On 2010/07/08 13:41:52, viettrungluu wrote: > > > Ugh. KillProcess(..., true) can take up to about a minute to return. On the > > > other hand, KillProcess(..., false) kills very softly with a single SIGTERM. > > > This makes me kind of sad. > > > > > > I suppose you could impose more reasonable timeouts by doing the killing > > > yourself using: > > > kill(pid, SIGTERM); > > > WaitpidWithTimeout(..., &success); > > > if (!success) > > > kill(pid, SIGKILL); > > > (or something like that). > > > > > > Probably we should just have a KillProcessWithTimeout(), and you can leave > > this > > > code as-is for now, though. > > > > Should I just use KillProcess(..., false)? I'm not sure it's important to > wait. > > It's not important to wait per se, but there are several problems: > 1. KillProcess(..., false) does a soft kill with SIGTERM, which can be ignored > (so the process may not end up dead). > 2. KillProcess(..., false) doesn't call waitpid(), so you'd have to reap the > child (otherwise a zombie would be left behind). > > Actually, reading the code for KillProcess(), it looks to me like it leaves a > zombie process in the case that it resorts to SIGKILL, since it doesn't reap in > that case. D'oh. Ok, I'll leave this code as is with KillProcess(..., true) but I added a comment about the potential 1 minute delay. Should I also add a TODO to KillProcess() to suggest that it should reap the child process in the SIGKILL case? http://codereview.chromium.org/2810014/diff/29001/30002#newcode855 base/process_util_posix.cc:855: << timeout_milliseconds << " ms."; On 2010/07/10 18:53:49, Paweł Hajdan Jr. wrote: > This CHECK is a bit weird, as it will always be triggered. I suggest to remove > it. Oops. This was debugging code. But I guess none of the tests take this path. I'll see if I can think one up.
On 2010/07/13 17:55:49, tessamac wrote: > Because those values in result_codes.h are depended on (and read by humans) it's > not reasonable to insert this one at 4 (it would change all those other values > which would be confusing). > > Options: > > 1 - Give this new one a very high value (like 100) to avoid collisions. > > 2 - Change the return value of GetAppOutputWIthTimeout to return true/false > based on whether the timeout occurred. And use exit_code to determine > success/failure of the process. I'd prefer to make "true" return value mean "everything went fine", and false "something was broken". We can add another output parameter to indicate whether timeout was exceeded. But this makes the interface even more cluttered. > 3 - Something else? We can consider using a class instead of a function. Then we could have accessors instead of output parameters. I'm not sure if it really makes sense, but might be worth experimenting for a while.
On Tue, Jul 13, 2010 at 12:39, <phajdan.jr@chromium.org> wrote: > On 2010/07/13 17:55:49, tessamac wrote: > >> Because those values in result_codes.h are depended on (and read by >> humans) >> > it's > >> not reasonable to insert this one at 4 (it would change all those other >> values >> which would be confusing). >> > > Options: >> > > 1 - Give this new one a very high value (like 100) to avoid collisions. >> > > 2 - Change the return value of GetAppOutputWIthTimeout to return >> true/false >> based on whether the timeout occurred. And use exit_code to determine >> success/failure of the process. >> > > I'd prefer to make "true" return value mean "everything went fine", and > false > "something was broken". We can add another output parameter to indicate > whether > timeout was exceeded. But this makes the interface even more cluttered. > > The only reason I added exit_code was so that we could differentiate timeouts from other failures. But it turns out that was a bad solutions since there isn't a timeout exit code. So I think maybe it would be best just replace int *exit_code with bool* timed_out. It also occurred to me that PROCESS_END_PROCESS_WAS_HUNG may be intended to be used for timeouts. (It doesn't seem to be used anywhere though, so I'm not sure http://www.google.com/search?q=site:src.chromium.org+PROCESS_END_PROCESS_WAS_... ). I just uploaded these changes, let me know what you think. > 3 - Something else? >> > > We can consider using a class instead of a function. Then we could have > accessors instead of output parameters. I'm not sure if it really makes > sense, > but might be worth experimenting for a while. > > > http://codereview.chromium.org/2810014/show >
http://codereview.chromium.org/2810014/diff/43001/44001 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/43001/44001#newcode89 base/process_util.h:89: // so don't want to extend it here or beware collisions! nit: This "so don't ..." part of comment is slightly confusing. I think it can just be removed. Alternatively, we can say something about keeping these two files in sync. http://codereview.chromium.org/2810014/diff/43001/44001#newcode247 base/process_util.h:247: // Will wait upto 60 seconds for process to die so this may overrun nit: This is too implementation-specific, so let's just say that due to how we kill processes, it may take longer than |timeout_milliseconds| to complete. http://codereview.chromium.org/2810014/diff/43001/44002 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/43001/44002#newcode855 base/process_util_posix.cc:855: CHECK(!timed_out); I still don't understand the idea behind this CHECK. :-/
http://codereview.chromium.org/2810014/diff/43001/44001 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/43001/44001#newcode89 base/process_util.h:89: // so don't want to extend it here or beware collisions! On 2010/07/14 01:00:09, Paweł Hajdan Jr. wrote: > nit: This "so don't ..." part of comment is slightly confusing. I think it can > just be removed. Alternatively, we can say something about keeping these two > files in sync. Done. http://codereview.chromium.org/2810014/diff/43001/44001#newcode247 base/process_util.h:247: // Will wait upto 60 seconds for process to die so this may overrun On 2010/07/14 01:00:09, Paweł Hajdan Jr. wrote: > nit: This is too implementation-specific, so let's just say that due to how we > kill processes, it may take longer than |timeout_milliseconds| to complete. Done. http://codereview.chromium.org/2810014/diff/43001/44002 File base/process_util_posix.cc (right): http://codereview.chromium.org/2810014/diff/43001/44002#newcode855 base/process_util_posix.cc:855: CHECK(!timed_out); On 2010/07/14 01:00:09, Paweł Hajdan Jr. wrote: > I still don't understand the idea behind this CHECK. :-/ Done. For real this time.
http://codereview.chromium.org/2810014/diff/53001/54001 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/53001/54001#newcode90 base/process_util.h:90: // numbering didn't change, so if you add values here use high numbers. nit: Please remove the part about using high numbers, it sounds too dangerous to me.
Since there are no code changes (only comment changes) I didn't re-upload. But let me know if you'd like me to. http://codereview.chromium.org/2810014/diff/53001/54001 File base/process_util.h (right): http://codereview.chromium.org/2810014/diff/53001/54001#newcode90 base/process_util.h:90: // numbering didn't change, so if you add values here use high numbers. On 2010/07/14 18:52:44, Paweł Hajdan Jr. wrote: > nit: Please remove the part about using high numbers, it sounds too dangerous to > me. Done. Now says: // Warning: This enum is duplicated and extended in chrome/common/result_codes.h // The two must be kept in sync and users of that enum would prefer the // numbering didn't change.
LGTM
The linux and mac try bots came back green; win failed with "did not complete". I don't think the failure has anything to do with my change. Since I'm not yet a committer, would any of you like to commit this on my behalf? On Wed, Jul 14, 2010 at 12:28, <phajdan.jr@chromium.org> wrote: > LGTM > > > http://codereview.chromium.org/2810014/show >
I'd be happy to commit it for you when I have the chance and when the tree is open, though Paweł may well get to it first. For that, you'll need to upload the latest version of your change though.... On 2010/07/14 22:59:55, tessamac wrote: > The linux and mac try bots came back green; win failed with "did not > complete". I don't think the failure has anything to do with my change. > > Since I'm not yet a committer, would any of you like to commit this on my > behalf? > > On Wed, Jul 14, 2010 at 12:28, <mailto:phajdan.jr@chromium.org> wrote: > > > LGTM > > > > > > http://codereview.chromium.org/2810014/show > > >
On Wed, Jul 14, 2010 at 20:34, <viettrungluu@chromium.org> wrote: > I'd be happy to commit it for you when I have the chance and when the tree > is > open, though Paweł may well get to it first. For that, you'll need to > upload the > latest version of your change though.... Oh yes, thanks for the reminder. Done. > > > On 2010/07/14 22:59:55, tessamac wrote: > >> The linux and mac try bots came back green; win failed with "did not >> complete". I don't think the failure has anything to do with my change. >> > > Since I'm not yet a committer, would any of you like to commit this on my >> behalf? >> > > On Wed, Jul 14, 2010 at 12:28, <mailto:phajdan.jr@chromium.org> wrote: >> > > > LGTM >> > >> > >> > http://codereview.chromium.org/2810014/show >> > >> > > > > > http://codereview.chromium.org/2810014/show > |