|
|
Created:
6 years, 4 months ago by vadimt Modified:
6 years, 3 months ago Reviewers:
Ilya Sherman, brettw, Alexei Svitkine (slow), jar (doing other things), Ken Russell (switch to Gerrit) CC:
chromium-reviews, piman+watch_chromium.org, erikwright+watch_chromium.org, sadrul, jam, darin-cc_chromium.org, brettw, Mark Mentovai Base URL:
svn://svn.chromium.org/chrome/trunk/src Visibility:
Public. |
DescriptionCreating a framework for suppressing pollution of the profiler data and applying for know cases of pollution.
See the bug.
The CL introduces TaskStopwatch that has to be used to measure run time for tasks. It takes care of double-counting
run time in the nested-tasks case by subtracting run time of nested tasks from their parents. TaskStopwatch can be
also used for subtracting other nested intervals, such as the time while a nested message pump runs. ThreadData::TallyADeath
now takes a stopwatch parameter instead of start_time and end_time. This helps avoid mistakes when the interval passed up to
the parent for exclusion is different from the interval for the current task duration.
BUG=401560
Committed: https://crrev.com/12f0f7db8095f1a5e40fb7227b217ddfc1724116
Cr-Commit-Position: refs/heads/master@{#294865}
Patch Set 1 #
Total comments: 14
Patch Set 2 : jar@ comments #Patch Set 3 : Adding 1 forgotten time parameter. #
Total comments: 40
Patch Set 4 : jar@ comments #Patch Set 5 : Clarifying terminology #
Total comments: 10
Patch Set 6 : More jar@ comments #
Total comments: 8
Patch Set 7 : (presumably) last round of jar@'s comments #
Total comments: 4
Patch Set 8 : kbr@ comments #Patch Set 9 : Rebasing #Patch Set 10 : Fixing a unit test. #Patch Set 11 : Fixing unit tests again. #
Messages
Total messages: 50 (14 generated)
Folks, please take a look. Once you are done, I'll make sure to include remaining required owners.
It may be very reasonable to put some of the complexity of this change down in the ThreadData class (as I think you have). I'm not sure... I hadn't envisioned needing that... but it may prove better placed there. ...that said... The critical performance goal is to not significantly change the performance in the "simple" case, where no adjustments are needed (i.e., where there was no nested tasks). There should not be additional calls to the TLS system (such as ThreadData::Get()), and the code should be as clean to step through as possible. There are some delicate existing optimizations in place (e.g., passing around int32 rather than 64 bit quantities) that should probably be maintained. In the end, you should probably single step through the paths in assembly to be sure that you haven't inadvertently de-optimized the code. Most critically, there should be no significant cost to the common/fast path. The current code was optimized so that it would fly well under the radar. It might have included "premature optimization," but it achieved a remarkable goal... and it would be good to see that maintained. If the profiler slows down the product, this whole area of code will be abandoned... and I'd rather not see that happen. https://codereview.chromium.org/445413003/diff/1/base/message_loop/message_lo... File base/message_loop/message_loop.cc (left): https://codereview.chromium.org/445413003/diff/1/base/message_loop/message_lo... base/message_loop/message_loop.cc:455: start_time, tracked_objects::ThreadData::NowForEndOfRun()); It is pretty critical that NowForEndOFRun() is used, as there is a level of indirection used here. I'm wary that you may have dropped that level of indirection. ...but perhaps I've forgotten what this function did. It used to either use time, or allocation counts (as I recall). Perhaps your stop watch also uses these methods. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc File base/tracked_objects.cc (left): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#oldc... base/tracked_objects.cc:547: if (!end_of_run.is_null()) There seems to be a bunch of code dealing with the case where the end_of_run is null, just as there is code worrying about the start_of_run being null. I don't recall the scenarios where this happens (and maybe it doesn't anymore??), but I suspect it was put there carefully (by me). For instance, it may have to do with scenarios where the profiler is not initialized sufficiently, including the setting of the time-source, when the first tasks are run. That could cause some of the durations to be nulled out, and meaningless. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:440: int32 run_duration = stopwatch.TaskDurationMs(); I don't think "run_duration" is always milliseconds. I think it is sometimes time, but sometimes allocation, so the word "duration" is probably better. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:879: nested_stopwatches_duration_ms_ = 0; Given that you said that stopwatches could be started and stopped, I'm surprised that you're clearing out these values at a Start() call, as it would seemingly destroy all history from prior starts and stops. Maybe I was confused when you said they were "reusable," and they are not really reused in common lifetime usage. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:882: current_thread_data_ = ThreadData::Get(); This is a TLS access, and hence plausibly expensive. I don't think it was necessary to access this for each start, and it could have been hidden in the cost of the access that was needed for calling the profiler. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h File base/tracked_objects.h (left): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h#oldco... base/tracked_objects.h:526: void TallyADeath(const Births& birth, int32 queue_duration, int32 duration); Care was taken to use a 32 bit quantity. As I recall, this was done for both atomicity of access, and also to provide for faster implementations (on platforms where 64 bit arithmetic is implemented by super expensive library calls!). https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h#newco... base/tracked_objects.h:743: // Stopwatch of which this one is an immediate child. The words "parent" and "child" seem confusing. I suspect you wanted to say; // Stopwatch which was running on our thread when this stopwatch was started. That preexisting stopwatch must be adjusted to the exclude the duration of existance of this stopwatch. Maybe you can define parent and child, but I suspect it will be even more complicated.
On performance. I added 4 new assignments and 1 check on Start, and 2 new assignments and 2 checks for Stop (I don't count code moved between places). Plus the price of construction and destruction of the object. Does this seem too much? I could slightly optimize this by NOT zeroing some variables in constructor, but this can cause confusion for someone who debugs this code, since before Start, these variables will be in strange states. Let me know if you believe I should do this. Note that with my latest (i.e. this) change, we have same number of TLS reads per task as originally. I.e. 1 per task. https://codereview.chromium.org/445413003/diff/1/base/message_loop/message_lo... File base/message_loop/message_loop.cc (left): https://codereview.chromium.org/445413003/diff/1/base/message_loop/message_lo... base/message_loop/message_loop.cc:455: start_time, tracked_objects::ThreadData::NowForEndOfRun()); No, it was just a stupid mistake. I wanted to use this, but it slipped through fingers. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc File base/tracked_objects.cc (left): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#oldc... base/tracked_objects.cc:547: if (!end_of_run.is_null()) I preserved all nullness checks: (1) You can see them preserved here, for 'queue_duration' calculations. (2) Checks for the run duration moved to TaskStopwatch::Stop(): if (!start_time_.is_null() && !end_time.is_null()) { wallclock_duration_ms_ = (end_time - start_time_).InMilliseconds(); } (wallclock_duration_ms_ os 0 prior to this code) https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:440: int32 run_duration = stopwatch.TaskDurationMs(); Not sure I dotally got it. So, I'm listing some facts, hoping that they somehow addres your concerns. (1) in the old code, run_duration was always milliseconds, because it was always a result of calculation like this: run_duration = (end_of_run - start_of_run).InMilliseconds(); (2) I try including units in identifiers, since this reduce chance of mistakes. (3) I'll rename TaskDurationMs to RunDurationMs for consistency. (4) Are you suggesting renaming run_duration to duration? But in the old code, the parameter was named run_duration, and I keep the name and semantics. Again, probably I din't get it... https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:879: nested_stopwatches_duration_ms_ = 0; Stopwatches indeed can be reused i.e.: Start Stop Tally task 1 Start Stop Tally task 2 ... Each start, we want to totally reset ourselves since measurements for task1 shouldn't affect task2. Resetting also clears the duration of nested stopwatches between Start and Stop. Note, however, that both measurements in the above example will add to 'nested_stopwatches_duration_ms_' of the PARENT task for them, and resetting won't affect THAT value. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.cc#newc... base/tracked_objects.cc:882: current_thread_data_ = ThreadData::Get(); Ah, OK. I assumed that TLS is cheap (like you get a variable pointed by some register, add some offset, and you are done). Probably, I was wrong. I modified methods from the Tally* family to use the thread data stored in the stopwatch. Now, we have 1 TLS access per task, like before (if kTrackParentChildLinks == false). https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h File base/tracked_objects.h (left): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h#oldco... base/tracked_objects.h:526: void TallyADeath(const Births& birth, int32 queue_duration, int32 duration); TaskStopwatch still uses int32, so this is taken care of. https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/1/base/tracked_objects.h#newco... base/tracked_objects.h:743: // Stopwatch of which this one is an immediate child. On 2014/08/08 01:34:48, jar wrote: > The words "parent" and "child" seem confusing. I suspect you wanted to say; > > // Stopwatch which was running on our thread when this stopwatch was started. > That preexisting stopwatch must be adjusted to the exclude the duration of > existance of this stopwatch. > > Maybe you can define parent and child, but I suspect it will be even more > complicated. Done.
Just wanted to double-chek what's the current state. Is everyone happy, so I can extend the review to include file owners, or I should expect more comments from the current reviewers?
On 2014/08/12 19:19:47, vadimt wrote: > Just wanted to double-chek what's the current state. Is everyone happy, so I can > extend the review to include file owners, or I should expect more comments from > the current reviewers? I'm deferring to Jim, as he's far more familiar with this code than I am.
This is looking pretty good... I'm always nervous about perf regressions... so we'll see how it goes. https://codereview.chromium.org/445413003/diff/40001/base/threading/worker_po... File base/threading/worker_pool_win.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/threading/worker_po... base/threading/worker_pool_win.cc:28: tracked_objects::TaskStopwatch stopwatch; nit/Question: Is there ever a case where we don't construct a TaskStopwatch, and then immediately call its Start() method? I was wondering why the constructor doesn't take the argument to the Start() method, and be done with it?? https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:867: DCHECK(state_ != RUNNING); This is adding a lot of conditional test code, which makes reading the real code more difficult. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:882: current_thread_data_ = ThreadData::Get(); This is a nice example of something that probably should have been done in the constructor... no reason to keep asking (if we were to use this again!). IMO, almost all of this should be in the constructor, and you probably won't need a Start() method. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:883: if (current_thread_data_) { nit: Personal preference: Earlier returns make code more readable. Better may be: if (!current_thread_data_) return; ...and no need to indent the rest of the method. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:907: if (current_thread_data_) { nit: early return is probably cleaner. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:911: if (parent_stopwatch_) { nit: early return again. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:918: wallclock_duration_ms_; I'm pretty sure this is the correct way to calculate the sum... as it accumulates all nested stopwatches (via wallclock duration), rather the the header comment's note about only immediate nested stop watches being excluded. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:936: return wallclock_duration_ms_ - nested_stopwatches_duration_ms_; This is the more common (by far) use of wallclock_duration_ms. As a result, it is probably better to calculate it here, when needed, than it is to store it away hoping it will be used again (super rare case). If it is needed, then it can be recalculated. You might have a tiny local function that does this... or a private method... and hope the compiler inlines it. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:710: void Start(const TrackedTime& start_time); nit: I mentioned this elsewhere... Any reason why we don't feed the start_time into the constructor, and avoid having this method? https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:713: void Start() { Start(ThreadData::Now()); } nit: You can't overload... but I suspect that only one of these methods is needed (or maybe just the constructor will do it all by default). https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:715: // Stops stopwatch with the specified |start_time|. nit: typo: start_time --> end_time https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:727: // spent in immediately nested stopwatches. nits: There is no real guarantee this is "wallclock time" so perhaps: Task's durataion is calculated as the wallclock time difference between stopping the stopwatch and starting it, minus time spent in immediately nested stopwatches. Might become: Task's duration is calculated as the duration between starting and stopping this stopwatch, minus the lifetime durations of any other instances constructed and destroyed on this thread during that period. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:731: ThreadData * GetThreadData() const; nit: no space before the "*". https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:738: int32 wallclock_duration_ms_; I don't see why this is stored, rather than being calculated dynamically at the one point where it is requested. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:744: // this one. I expect you'll really need to sum (recursively) all stopwatches on this thread, not just the directly nested ones. I think you do it correctly... but this prose seems confusing. Perhaps get rid of the word "directly". WDYT? https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:745: int32 nested_stopwatches_duration_ms_; nit: better name might be "excluded_duration_" https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:749: // of this stopwatch. IMO, should be: ...exclude the duration the lifetime of this stopwatch (which may be more than the duration of this stopwatch!). https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:763: TaskStopwatch* running_child_; nit: My preference would be to avoid putting test code into the class. If you really want these backwards pointers, you could (in your tests) maintain an external set of pointers to the nested stopwatches, etc. The code will get especially messy when you need to optionally provide constructors to initialize this pointer... but even if you used something fancier, it would make the live code messier. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects_uni... File base/tracked_objects_unittest.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects_uni... base/tracked_objects_unittest.cc:609: return; nit: When the conditional of the if runs across more than one line, folks seem to use curlies around conditioned statement (just as though the statement ran for several lines). I couldn't find justification in the style guide... but I've heard this discussed often... and could have sworn it was there :-/. IMO, the code looks pretty funky as is.... and braces would only help ;-).
https://codereview.chromium.org/445413003/diff/40001/base/threading/worker_po... File base/threading/worker_pool_win.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/threading/worker_po... base/threading/worker_pool_win.cc:28: tracked_objects::TaskStopwatch stopwatch; No, but since there is method Stop, I thought it would be strange to have: tracked_objects::TaskStopwatch stopwatch(time); stopwatch.Stop() You might argue that the stopwatch could be stopped inside TallyRunOnWorkerThreadIfTracking(), but this would make code harder to understand. Basically, if you start the stopwatch in constructor, you should stop it in destructor, which is not the case here. Note that we have asserts that the stopwatch passed to TallyRunOnWorkerThreadIfTracking() was started and stopped. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:867: DCHECK(state_ != RUNNING); I believe it's still worth having this code, because: (1) It actually helps understanding by giving an idea about the object state transitions; (2) Saves you from headaches by detecting issues earlier, just after you changed the code. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:882: current_thread_data_ = ThreadData::Get(); See my first answer about Start and Stop calls. Note that reusability was just a side effect of that decision, not one of primary goals. But since the object is reusable, I can imagine that it could be used and reused by a task runner on worked threads, and each time it would be another worker thread. So, it's dangerous to get thread data in constructor. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:883: if (current_thread_data_) { On 2014/08/26 04:09:09, jar (gone till 9-27-2014) wrote: > nit: Personal preference: Earlier returns make code more readable. Better may > be: > > if (!current_thread_data_) > return; > > ...and no need to indent the rest of the method. Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:907: if (current_thread_data_) { On 2014/08/26 04:09:09, jar (gone till 9-27-2014) wrote: > nit: early return is probably cleaner. Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:911: if (parent_stopwatch_) { On 2014/08/26 04:09:09, jar (gone till 9-27-2014) wrote: > nit: early return again. Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:918: wallclock_duration_ms_; This code calculates as described in the header, i.e. only immediately nested message loops and nested tasks are excluded. parent_stopwatch is our immediate parent, and we don't go to a grandparent. Example: Task A ran 10 ms Task B is nested in A, and ran 8 ms Task C is nested in B, and ran 6 ms Task B is immediately nested in A, and C is nested in A, but not immediately. We subtract only B from A, leaving A with 2 ms. If we subtracted both B and C, we'd end up with negative -4 ms. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.cc#... base/tracked_objects.cc:936: return wallclock_duration_ms_ - nested_stopwatches_duration_ms_; Note that we don't store end time, so by eliminating this member, we'd have to introduce another one. What is the motivation behind this? https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:710: void Start(const TrackedTime& start_time); See my answer in another comment. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:713: void Start() { Start(ThreadData::Now()); } On 2014/08/26 04:09:10, jar (gone till 9-27-2014) wrote: > nit: You can't overload... but I suspect that only one of these methods is > needed (or maybe just the constructor will do it all by default). Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:715: // Stops stopwatch with the specified |start_time|. On 2014/08/26 04:09:09, jar (gone till 9-27-2014) wrote: > nit: typo: start_time --> end_time Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:727: // spent in immediately nested stopwatches. Done, but I added a passage about being directly nested. Another comment explains why I need to care about direct nesting. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:731: ThreadData * GetThreadData() const; On 2014/08/26 04:09:09, jar (gone till 9-27-2014) wrote: > nit: no space before the "*". Done. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:738: int32 wallclock_duration_ms_; Then I'd have to replace it with storing end_time, which has higher overhead. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:744: // this one. I believe summing only first-level nested stopwatches is correct, otherwise, as shown in another comment, we can get strange things like negative task times. The code works as the comment says, i.e. doesn't do it recursively. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:745: int32 nested_stopwatches_duration_ms_; Done, but I added units to the name. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:749: // of this stopwatch. I added "of the lifetime". It's not clear what's difference between duration of the stopwatch and duration of the lifetime of the stopwatch, though. Could you clarify? https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:763: TaskStopwatch* running_child_; This is not a test code. This is a code that helps quickly catching bugs while you are in a debug mode. It will be MUCH-MUCH_MUCH harder to find issues after you've missed the chance to get them diagnosed as an assert in debug mode. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects_uni... File base/tracked_objects_unittest.cc (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects_uni... base/tracked_objects_unittest.cc:609: return; Agree, braces help when 'if' with condition takes several lines.
I suspect we're talking at cross purposes due to names/wording. I think getting mutually agreeable wording in variable names, and comments, will help us reach closure, and make it a lot easier for future readers. https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:749: // of this stopwatch. On 2014/08/26 19:15:05, vadimt wrote: > I added "of the lifetime". > > It's not clear what's difference between duration of the stopwatch and duration > of the lifetime of the stopwatch, though. Could you clarify? I meant "lifetime" to be time from construction to destruction, with no adjustments. I meant "duration of the stopwatch" to be the return of the RunDuration() method, which excludes time(s) in any nested stopwatch. The "lifetime" implicitly sums (includes?) RunDuration() plus all the durations in nested stopwatches (without any need for recursive arithmetic).
https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/40001/base/tracked_objects.h#n... base/tracked_objects.h:749: // of this stopwatch. OK. Since I believe that there are reasons to have Start and Stop methods, I suggest to aboid using "lifetime" word. I suggest to use: (1) "Wallclock duration" for non-adjusted diration; (2) "Duration" for adjusted duration. I'm starting changing names and comments according to these, but feel free to interrupt me with better suggestions.
Made upload #4 that clarifies terminology.
OK... I'm fine will wallclock and duration as wording. Thanks. I don't see usefulness of reusability of a stopwatch instance, and think it has slightly wrong semantics, since "stop" is mixed in with "tally wallclock time, and exclude from parent." They should IMO have different final times. I think you want "Stop" to be the point at which you get the duration (for things like the current task tally). I think you want the wallclock calculation (in the rare case where it is needed) to include all time for the life of the stopwatch. As a result, I think it should be tallied at destruction time for the instance. This will avoid counting profiler time in a sub-task as part of the time in the parent stopwatch. This approach for wallclock time calculation (at destruction) also argues towards starting the wallclock exclusion time at construction (you mentioned this concern when discussing getting rid of the Start() method). If you go this route, then the Stop() method also becomes useless (since it does not have the side-effect on Wallclock calculation), and you can just call Duration(). I think this would lead to a much more compact CL. When writing a profiler, it is always good to try to avoid including "profiler overhead" in any tallies. In this case, the goal is to get the finish time for duration asap. I do like your Stopwatch class, as it simplifies the stack-tallying (as compared with my initial suggestion for using a new slot in the message loop or message pump). https://codereview.chromium.org/445413003/diff/80001/base/profiler/scoped_pro... File base/profiler/scoped_profile.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/profiler/scoped_pro... base/profiler/scoped_profile.cc:24: stopwatch_.Stop(ThreadData::NowForEndOfRun()); Why do you need to stop the stopwatch if it is never tallied? ... i.e., why not move this t oafter line 27?
Reusability is not a primary goal, but since we have Start/Stop methods, we get it practically for free. I can remove it, but I don’t expect any benefits from this. As far as I understand, you argue for this approach: (1) Lifetime starts (2) Start tallying a task (3) Stop tallying a task (4) Lifetime ends; subtract (4)-(1) from the parent. I see the following issue: Suppose the parent task took 12ms. Now consider the sequence for the nested task: (1) 2ms (2) 6ms (3) 2ms (4) We’ll end up with the nested task that took 6 ms, and the outer task that took (as tallied) 12 -(2+6+2) = 2 ms. So, we have a 6ms task and 2ms (8ms total) tasks covering a period of 12ms. I.e., we lost 4 ms. You might argue that these 2ms intervals are “profiler overhead”. But if the overhead happens, it should be counted! At the end, the user won’t feel better if for some reason the profiler overhead would cause the UI to freeze, and we’d explain them that "no worries, it’s just a profiler overhead"! What I do is I count first and last 2 ms as a time spent in the parent task, which is OK, since the nested task haven’t started yet (or already finished). P.S. To create a framework for future arguments, and to avoid going in circles, I’m including the chain of reasoning that led to my design: (A) It’s a good idea to pass the stopwatch to TallyAtDeath methods. As opposed to passing a time delta, this guarantees that we always use stopwatch, which takes care of exclusions. (B) A => [follows that] you have to have a Stop method that you call before TallyAtDeath. This makes the code readable, and allows to pass test time in unit tests without using complexities of using a mock time framework. (C) B => you have to have a Start method, since starting the stopwatch in constructor and ending in a Stop method is ugly. (D) B & C => begin and start intervals for tallying a current task, and for exclusion from a parent task should be same, otherwise we’ll have non-counted periods of time, which will be felt by the user, but hidden from us. So, constructor and destructor shouldn’t deal with time. In addition, retrieving time in 4 points (constr, destr, Start, Stop) is bad, compared to 2 points (Start/Stop) because on Android ARM getting current time is expensive. https://codereview.chromium.org/445413003/diff/80001/base/profiler/scoped_pro... File base/profiler/scoped_profile.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/profiler/scoped_pro... base/profiler/scoped_profile.cc:24: stopwatch_.Stop(ThreadData::NowForEndOfRun()); Because Start and Stop should be strictly paired. Start was called unconditionally. This means that Stop should be called unconditionally.
As I said, I'm convinced that your StopWatch class, tied tightly to the profiler, is a nice approach. You asked for some specific/concrete suggestions for how to simplify the API. Here is some more background on what has been done historically, and how that approach would probably tie into your class (adding simplifications). https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... File base/threading/sequenced_worker_pool.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:760: tracked_objects::ThreadData::NowForStartOfRun(task.birth_tally)); The reason there is an API called NowForStartOfRun(), instead of just calling tracked_objects::Now() (a special flavor of Now()) is to make the API concise (one call, induces side effects, and gets the time value that we need). We could have used two function calls... but that would be less concise, an error prone. That is indeed also why we have NowForEndOfRun() (gets time, and induces side effect (of which there are currently none!)). That also allowed us to sandwich as tightly as possible the acquisition of the times around the Run() event we wanted to time. Given that the StopWatch class is for the exclusive use of the profiler, it can acquire scoped_objects::Now() directly (no need to even pass in timer values). It can always get them as needed from the Now() function. As a result, the above lines should probably read: tracked_objects::ThreadData::PrepareForStartOfRun(task.birth_tally); tracked_objects::TaskStopwatch stopwatch; // Implicitly starts timer. https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:762: stopwatch.Stop(tracked_objects::ThreadData::NowForEndOfRun()); This call can then be simply: stopwatch.Stop(); It may even be that we don't need to call this, and the first line of the next method *could* be the call to Stop(). In general, it is nicest to tightly sandwidth the Run() with the start (via construction?) and the Stop (via explicit call). https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:765: task, stopwatch); If you're willing to embed the Stop() into the above call, we'd instead pass in: &stopwatch. The big question is to look at all call sites, and see if there are other calculations done during the call (as args are passed in). It would be best to do the stop very shortly after the task completes. https://codereview.chromium.org/445413003/diff/80001/base/threading/worker_po... File base/threading/worker_pool_posix.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/threading/worker_po... base/threading/worker_pool_posix.cc:105: pending_task.birth_tally, TrackedTime(pending_task.time_posted), *IF* you pushed the Stop() call into TallyRunOnWorkerThreadIfTracking() then it *might* be better to construct this TrackedTime() before you created/started the stop watch. I don't know how long the conversion takes :-/. It might not be that bad... but on some platforms, 64bit arithmetic calls are done via slow library calls :-(.
https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... File base/threading/sequenced_worker_pool.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:760: tracked_objects::ThreadData::NowForStartOfRun(task.birth_tally)); On 2014/09/02 16:52:16, jar (gone till 9-27-2014) wrote: > The reason there is an API called NowForStartOfRun(), instead of just calling > tracked_objects::Now() (a special flavor of Now()) is to make the API concise > (one call, induces side effects, and gets the time value that we need). We > could have used two function calls... but that would be less concise, an error > prone. > > That is indeed also why we have NowForEndOfRun() (gets time, and induces side > effect (of which there are currently none!)). > > That also allowed us to sandwich as tightly as possible the acquisition of the > times around the Run() event we wanted to time. > > > Given that the StopWatch class is for the exclusive use of the profiler, it can > acquire scoped_objects::Now() directly (no need to even pass in timer values). > It can always get them as needed from the Now() function. > > As a result, the above lines should probably read: > tracked_objects::ThreadData::PrepareForStartOfRun(task.birth_tally); > tracked_objects::TaskStopwatch stopwatch; // Implicitly starts timer. Done. https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:762: stopwatch.Stop(tracked_objects::ThreadData::NowForEndOfRun()); Done. Given that there are 3 methods that can follow stopwatch.Stop (TallyRunOnNamedThreadIfTracking, TallyRunOnWorkerThreadIfTracking and TallyRunInAScopedRegionIfTracking), it's not elegant to call Stop in all 3 of them. So I kept a separate Stop method. https://codereview.chromium.org/445413003/diff/80001/base/threading/sequenced... base/threading/sequenced_worker_pool.cc:765: task, stopwatch); See the above answer. https://codereview.chromium.org/445413003/diff/80001/base/threading/worker_po... File base/threading/worker_pool_posix.cc (right): https://codereview.chromium.org/445413003/diff/80001/base/threading/worker_po... base/threading/worker_pool_posix.cc:105: pending_task.birth_tally, TrackedTime(pending_task.time_posted), IF ;-)
I still really don't like the debug-only members and methods (and would prefer test code be written without using ifdef) ... but I can live with them. Two smaller items below. I can also live without the cleaning of the Get() shown below, as it is not (currently) perf critical... but I'd rather see the code consistently upgraded to handle all the profiling flavors. https://codereview.chromium.org/445413003/diff/100001/base/run_loop.cc File base/run_loop.cc (right): https://codereview.chromium.org/445413003/diff/100001/base/run_loop.cc#newcode50 base/run_loop.cc:50: { nit: You probably don't need curlies, unless the destructor has side-effects. https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc... base/tracked_objects.cc:764: ThreadData* current_thread_data = Get(); I liked the way you replaced several other Get() calls (which use TLS) with calls that harvest the data from a StopWatch (so we don't have to call this twice). Is there any nice way we can share this Get() with the one needed for the StopWatch? Perhaps return the ThreadData* and use it to construct the StopWatch()? Perhaps the StopWatch constructor can be overloaded to optionally take this as a constructor arg? ... or we can always pass it in as a constructor arg? https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc... base/tracked_objects.cc:882: #ifndef NDEBUG nit: We probably don't need the ifdef around DCHECKs https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.h#... base/tracked_objects.h:756: TaskStopwatch* running_child_; nit: Clean up names. This is really a backward link on a doubly linked list. The forward link is currently called: parent_stopwatch_ but this is called running_child_ Note that the child might not be "running" as it may have a running child. The names should probably just be: parent_ child_ ...or should have the same form (style guide suggest not using the type "stopwatch" as a postfix).
https://codereview.chromium.org/445413003/diff/100001/base/run_loop.cc File base/run_loop.cc (right): https://codereview.chromium.org/445413003/diff/100001/base/run_loop.cc#newcode50 base/run_loop.cc:50: { On 2014/09/05 04:36:05, jar (gone till 9-27-2014) wrote: > nit: You probably don't need curlies, unless the destructor has side-effects. Done. https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc... base/tracked_objects.cc:764: ThreadData* current_thread_data = Get(); My understanding is that parent-child tracking is disabled. If you enable it, I understand that it will hurt perf much worse than double-accessing TLS. Given that all options you mentioned lead to complicating the code, I'd prefer to leave everything as is, at least until parent-child tracking is enabled in official builds. https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.cc... base/tracked_objects.cc:882: #ifndef NDEBUG We actually need it, since otherwise we get compiler errors in release mode. This is because DCHECK references its condition even in release mode, to avoid "unused variable" warnings. https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.h File base/tracked_objects.h (right): https://codereview.chromium.org/445413003/diff/100001/base/tracked_objects.h#... base/tracked_objects.h:756: TaskStopwatch* running_child_; On 2014/09/05 04:36:05, jar (gone till 9-27-2014) wrote: > nit: Clean up names. This is really a backward link on a doubly linked list. > The forward link is currently called: > parent_stopwatch_ > but this is called > running_child_ > > Note that the child might not be "running" as it may have a running child. > > The names should probably just be: > parent_ > child_ > > ...or should have the same form (style guide suggest not using the type > "stopwatch" as a postfix). Done.
LGTM You'll probably need a few owner reviews, but they should be rubber stamps
vadimt@chromium.org changed reviewers: + brettw@chromium.org
brettw@chromium.org: Please provide OWNER's approval for all the files. I already got jar@'s LGTM for the changes in task tallying. The changes are related to the jank detection effort. They reduce pollution of profiler data by eliminating double-counting nested tasks and counting long nestes loops as jank. Also, the CL specifically doesn't count as jank the time spend in waiting for DirectX initialization, since this happens before any pixels are shown.
vadimt@chromium.org changed reviewers: + mark@chromium.org - brettw@chromium.org
vadimt@chromium.org changed reviewers: + piman@chromium.org
vadimt@chromium.org changed reviewers: + jbauman@chromium.org - piman@chromium.org
vadimt@chromium.org changed reviewers: + kbr@chromium.org - jbauman@chromium.org
kbr@, please provide OWNER's approval for content/browser/gpu/browser_gpu_channel_host_factory.cc mark@ please provide OWNER's approval for all other files. Please let me know if you are busy and recommending to reassign the review. See the previous and the first comments for the motivation. Thanks!
I’m a little swamped right now, can you find another base OWNER?
content/browser/gpu LGTM, but I have a few comments about the new DCHECKs in TaskStopwatch. https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc... base/tracked_objects.cc:861: #ifndef NDEBUG You should guard these with #if DCHECK_IS_ON instead. https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc... base/tracked_objects.cc:882: #ifndef NDEBUG Remove NDEBUG; not necessary with above if-guard fixes.
vadimt@chromium.org changed reviewers: + thakis@chomium.org - mark@chromium.org
Thanks Ken! thakis@, please provide OWNER's approval for base/* files, or let me know if you are too busy to do this.
On 2014/09/10 20:53:38, vadimt wrote: > Thanks Ken! > thakis@, please provide OWNER's approval for base/* files, or let me know if you > are too busy to do this. @vadimt: I'd request you consider fixing the #if guards in the TaskStopwatch before pursuing OWNERS approvals.
Yeah, this was a race condition between threads of conversation :) 1. I meant “thanks Mark”. 2. kbr@, thanks too! 3. kbr@, your comments are now addressed. 4. To be clear, thakis@, you are still invited to review base/; thanks! https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc File base/tracked_objects.cc (right): https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc... base/tracked_objects.cc:861: #ifndef NDEBUG On 2014/09/10 20:47:50, Ken Russell wrote: > You should guard these with #if DCHECK_IS_ON instead. Done. https://codereview.chromium.org/445413003/diff/120001/base/tracked_objects.cc... base/tracked_objects.cc:882: #ifndef NDEBUG I still need #if because DCHECK, even in 'OFF' mode, always references its condition to avoid 'unreferenced variable' warnings.
Thanks. LGTM again.
brettw@chromium.org changed reviewers: + brettw@chromium.org
LGTM based mostly on jar's review.
brettw@, thanks! thakis@, your review is not required anymore, but, yeah, feel free to comment. Or not :)
The CQ bit was checked by vadimt@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/445413003/140001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/linux_gpu/builds/...) mac_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/mac_gpu/builds/54366) win_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/win_gpu/builds/59296) android_arm64_dbg_recipe on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_arm64_d...) android_chromium_gn_compile_rel on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_chromiu...) android_clang_dbg_recipe on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_clang_d...) android_dbg_tests_recipe on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_dbg_tes...) chromium_presubmit on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...) linux_chromium_chromeos_rel_swarming on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_gn_rel on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_rel_swarming on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) ios_dbg_simulator on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_dbg_simulator...) ios_rel_device on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device/bu...) ios_rel_device_ninja on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device_ni...) mac_chromium_compile_dbg on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_swarming on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...) win8_chromium_rel on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win8_chromium_rel...) win_chromium_compile_dbg on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_comp...) win_chromium_rel_swarming on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_...) win_chromium_x64_rel_swarming on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was checked by vadimt@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/445413003/160001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: linux_chromium_rel_swarming on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by vadimt@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/445413003/180001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_chromium_rel_swarming on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by vadimt@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patchset/445413003/200001
Message was sent while issue was closed.
Committed patchset #11 (id:200001) as 309a5966ac9f7330db59189040215db6e9bb1a55
Message was sent while issue was closed.
Patchset 11 (id:??) landed as https://crrev.com/12f0f7db8095f1a5e40fb7227b217ddfc1724116 Cr-Commit-Position: refs/heads/master@{#294865} |