|
|
Created:
4 years, 8 months ago by shivanisha Modified:
4 years, 6 months ago CC:
chromium-reviews, csharrison+watch_chromium.org, loading-reviews+metrics_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionThis fix uses Optional TimeDeltas since they may be non existent instead
of using zero as a non-existent value.
It uses base::Optional class (https://codereview.chromium.org/1245163002/)
BUG=596367
Apart from the initial fix for this change which involves TimeDelta values in
PageLoadExtraInfo to be converted to use base::Optional, this CL has the following
3 more changes:
1. Converted the foreground event condition from event_time < background_time to
event_time <= background_time. This made all tests that have an event occuring
before calling WasHidden (which fills the background_time as Now()) to be
deterministic in assuming that the event will be logged in a foreground histogram
irrespective of the clock resolution. This is also a better logical choice since
the event has not been subjected to any background-specific functionality if it
occurred at the same time as backgrounding. This fixed the first failure reported
on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent.
The tests that have event occurring after background will be non
deterministic as they cannot be guaranteed to have event time strictly greater
than background time.
So I have added an interface in the test
framework that lets the test retrieve the value of PageLoadExtraInfo for a
committed load. I have not added the same for provisional loads/aborted loads but
could be added in future if need be.
2. For the failure resulting from inter process time tick skew, conditional
clamping is introduced only if the clock is not high resolution. The testing
strategy for that is that the existing unit tests should all pass on a low
resolution clock system. Failures from this should result in a DCHECK getting
hit.
(I have not added specific tests for testing this because
it will be difficult to test all possible code paths in specific tests and mocking
inter process time tick skew would require mocking isHighResolution at the least.
The complexity is not worth it since all code paths will not get covered in a few
tests anyways.)
Committed: https://crrev.com/271f0336503725749c86df213f6710c6423f9b5f
Cr-Commit-Position: refs/heads/master@{#401022}
Patch Set 1 #Patch Set 2 : Need to add Optional changes as well in this issue otherwise bots fail in compilation #
Total comments: 12
Patch Set 3 : Review comments incorporation #
Total comments: 5
Patch Set 4 : Rebased with origin/master and latest changes from 1245163002. #Patch Set 5 : Rebased with origin/master and latest changes from 1245163002. #Patch Set 6 : Condition for foreground event changed to consider equality with background time. #Patch Set 7 : Inter process time ticks skew related error fixed, Rebased #
Total comments: 4
Patch Set 8 : Fixed failing test, deterministic test added with framework changes. #Patch Set 9 : Rebased with latest, fixed a test, added a clock resolution based test with framework changes. #Patch Set 10 : Rebased with latest, fixed a test, added a clock resolution dependent test with framework changes. #Patch Set 11 : Fixed more tests to be deterministic. #Patch Set 12 : Fixed more tests to be deterministic. #Patch Set 13 : Fixed non-deterministic tests and Bryan's review comments. #Patch Set 14 : Rebased, Fixed non-deterministic tests and Bryan's review comments. #
Total comments: 45
Patch Set 15 : Incorporated feedback from Charles. #Patch Set 16 : Rebased #Patch Set 17 : Rebased. #Patch Set 18 : rebased again to make sure win_clang bot failure is not a merging issue #
Total comments: 4
Patch Set 19 : Called RecordInternalError and updated in histograms.xml. #
Total comments: 15
Patch Set 20 : Rebased, incorporated Bryan's feedback and fixed failing test. #Patch Set 21 : Rebased, incorporated Bryan's feedback and fixed failing test. #Patch Set 22 : Fixed a few fromGwS tests #Patch Set 23 : Fixed a couple fromGWS tests #Patch Set 24 : Rebased #
Total comments: 2
Patch Set 25 : added dcheck when isHighResolution is true. #Messages
Total messages: 82 (20 generated)
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 The following files are part of issue https://codereview.chromium.org/1245163002/ and I needed to add them here for bots to compile. Please ignore these files for review: base/base.gyp base/base.gypi base/BUILD.gn base/optional.h base/optional_unittest.cc ==========
shivanisha@chromium.org changed reviewers: + bmcquade@chromium.org, csharrison@chromium.org
https://codereview.chromium.org/1837233002/diff/20001/base/BUILD.gn File base/BUILD.gn (right): https://codereview.chromium.org/1837233002/diff/20001/base/BUILD.gn#newcode567 base/BUILD.gn:567: "observer_list_threadsafe.h", Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/20001/base/base.gyp File base/base.gyp (right): https://codereview.chromium.org/1837233002/diff/20001/base/base.gyp#newcode500 base/base.gyp:500: 'native_library_unittest.cc', Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/20001/base/base.gypi File base/base.gypi (right): https://codereview.chromium.org/1837233002/diff/20001/base/base.gypi#newcode447 base/base.gypi:447: 'observer_list_threadsafe.h', Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/20001/base/optional.h File base/optional.h (right): https://codereview.chromium.org/1837233002/diff/20001/base/optional.h#newcode1 base/optional.h:1: // Copyright 2016 The Chromium Authors. All rights reserved. Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/20001/base/optional_unittest.cc File base/optional_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/20001/base/optional_unittest.... base/optional_unittest.cc:1: // Copyright 2016 The Chromium Authors. All rights reserved. Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/
shivanisha@chromium.org changed reviewers: + rdsmith@chromium.org
I am looking into the Windows Compilation failures.
bmcquade@chromium.org changed reviewers: + jkarlin@chromium.org
Thanks! This looks great. https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/browser/page_load_metrics_observer.h (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/browser/page_load_metrics_observer.h:55: PageLoadExtraInfo(base::Optional<base::TimeDelta> first_background_time, given that base::Optional is bigger than a pointer here (bool+double), should we pass it by const reference rather than by value? https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/browser/page_load_metrics_util.cc (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/browser/page_load_metrics_util.cc:24: return info.started_in_foreground && event && could this become: return event && WasStartedInForegroundEventInForeground(event.value(), info); https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/common/page_load_timing.h (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/common/page_load_timing.h:29: // @shivanisha TODO Issue 596367 shows that it is possible for a existent style nit: TODO(shivanisha): Issue ...
https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/browser/page_load_metrics_observer.h (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/browser/page_load_metrics_observer.h:55: PageLoadExtraInfo(base::Optional<base::TimeDelta> first_background_time, On 2016/03/29 at 18:33:43, Bryan McQuade wrote: > given that base::Optional is bigger than a pointer here (bool+double), should we pass it by const reference rather than by value? Makes sense. will do https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/browser/page_load_metrics_util.cc (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/browser/page_load_metrics_util.cc:24: return info.started_in_foreground && event && On 2016/03/29 at 18:33:43, Bryan McQuade wrote: > could this become: > return event && WasStartedInForegroundEventInForeground(event.value(), info); Since WasStartedInForegroundEventInForeground is returning false if event.is_zero() is true, it will be an anti-pattern to this fix. https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/common/page_load_timing.h (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/common/page_load_timing.h:29: // @shivanisha TODO Issue 596367 shows that it is possible for a existent On 2016/03/29 at 18:33:43, Bryan McQuade wrote: > style nit: > TODO(shivanisha): Issue ... updating the comment
https://codereview.chromium.org/1837233002/diff/40001/base/BUILD.gn File base/BUILD.gn (right): https://codereview.chromium.org/1837233002/diff/40001/base/BUILD.gn#newcode567 base/BUILD.gn:567: "observer_list_threadsafe.h", Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/40001/base/base.gyp File base/base.gyp (right): https://codereview.chromium.org/1837233002/diff/40001/base/base.gyp#newcode502 base/base.gyp:502: 'observer_list_unittest.cc', Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/40001/base/base.gypi File base/base.gypi (right): https://codereview.chromium.org/1837233002/diff/40001/base/base.gypi#newcode447 base/base.gypi:447: 'observer_list_threadsafe.h', Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/40001/base/optional.h File base/optional.h (right): https://codereview.chromium.org/1837233002/diff/40001/base/optional.h#newcode1 base/optional.h:1: // Copyright 2016 The Chromium Authors. All rights reserved. Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/ https://codereview.chromium.org/1837233002/diff/40001/base/optional_unittest.cc File base/optional_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/40001/base/optional_unittest.... base/optional_unittest.cc:1: // Copyright 2016 The Chromium Authors. All rights reserved. Ignore this file for review. Added here for compilation, part of https://codereview.chromium.org/1245163002/
You should be able to change this to be a dependent CL, so that when it is approved you can land it without modification after the Optional change lands. Ping me if you need help setting that up.
lgtm https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... File components/page_load_metrics/browser/page_load_metrics_util.cc (right): https://codereview.chromium.org/1837233002/diff/20001/components/page_load_me... components/page_load_metrics/browser/page_load_metrics_util.cc:24: return info.started_in_foreground && event && On 2016/03/29 at 18:54:25, shivanisha wrote: > On 2016/03/29 at 18:33:43, Bryan McQuade wrote: > > could this become: > > return event && WasStartedInForegroundEventInForeground(event.value(), info); > > Since WasStartedInForegroundEventInForeground is returning false if event.is_zero() is true, it will be an anti-pattern to this fix. Ah, I missed that. Good catch. Thanks!
hi. https://codereview.chromium.org/1245163002/ is almost landed can you rebase this CL, and the base/* parts with the latest version of https://codereview.chromium.org/1245163002/ and run it on the win_chromium_x64_rel_ng bot, then I can pull the isolated input and test it on the GCE bots manually to verify the fix. thanks.
On 2016/04/15 at 20:42:36, wfh wrote: > hi. https://codereview.chromium.org/1245163002/ is almost landed > > can you rebase this CL, and the base/* parts with the latest version of https://codereview.chromium.org/1245163002/ and run it on the win_chromium_x64_rel_ng bot, then I can pull the isolated input and test it on the GCE bots manually to verify the fix. thanks. Sure, I will do that. Thanks.
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 The following files are part of issue https://codereview.chromium.org/1245163002/ and I needed to add them here for bots to compile. Please ignore these files for review: base/base.gyp base/base.gypi base/BUILD.gn base/optional.h base/optional_unittest.cc ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 The following files are part of issue https://codereview.chromium.org/1245163002/ and I needed to add them here for bots to compile. Please ignore these files for review: base/base.gyp base/base.gypi base/BUILD.gn base/optional.h base/optional_unittest.cc ==========
Rebased with origin/master and latest changes from CL 1245163002.
On 2016/04/18 at 15:14:29, shivanisha wrote: > Rebased with origin/master and latest changes from CL 1245163002. wfh@ The windows bots are running fine after rebasing.
On 2016/04/18 16:26:15, shivanisha wrote: > On 2016/04/18 at 15:14:29, shivanisha wrote: > > Rebased with origin/master and latest changes from CL 1245163002. > > wfh@ > > The windows bots are running fine after rebasing. Hi I ran unit_tests on the Win10 GCE bots and still get one failure: https://chromium-swarm-dev.appspot.com/user/task/2e43aa9004c83510 CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents (e:\b\build\slave\win\build\src\chrome\browser\page_load_metrics\observers\core_page_load_metrics_observer_unittest.cc:218) [ RUN ] CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count Actual: 1 Expected: expected_count Which is: 0 Histogram "PageLoad.Timing2.NavigationToCommit.Background" does not have the right total number of samples (0). It has (1). e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count Actual: 1 Expected: expected_count Which is: 0 Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired.Background" does not have the right total number of samples (0). It has (1). e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(79): error: Value of: 0 Expected: count Which is: 1 Histogram "PageLoad.Timing2.NavigationToCommit" does not exist. e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(79): error: Value of: 0 Expected: count Which is: 1 Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired" does not exist. e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(59): error: Expected: (static_cast<base::HistogramBase*>(0)) != (histogram), actual: NULL vs NULL Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired" does not exist. [ FAILED ] CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents (19 ms)
On 2016/04/18 at 19:46:44, wfh wrote: > On 2016/04/18 16:26:15, shivanisha wrote: > > On 2016/04/18 at 15:14:29, shivanisha wrote: > > > Rebased with origin/master and latest changes from CL 1245163002. > > > > wfh@ > > > > The windows bots are running fine after rebasing. > > Hi I ran unit_tests on the Win10 GCE bots and still get one failure: > > https://chromium-swarm-dev.appspot.com/user/task/2e43aa9004c83510 > > CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents (e:\b\build\slave\win\build\src\chrome\browser\page_load_metrics\observers\core_page_load_metrics_observer_unittest.cc:218) > > [ RUN ] CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram "PageLoad.Timing2.NavigationToCommit.Background" does not have the right total number of samples (0). It has (1). > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired.Background" does not have the right total number of samples (0). It has (1). > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(79): error: Value of: 0 > > Expected: count > > Which is: 1 > > Histogram "PageLoad.Timing2.NavigationToCommit" does not exist. > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(79): error: Value of: 0 > > Expected: count > > Which is: 1 > > Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired" does not exist. > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(59): error: Expected: (static_cast<base::HistogramBase*>(0)) != (histogram), actual: NULL vs NULL > > Histogram "PageLoad.Timing2.NavigationToDOMContentLoadedEventFired" does not exist. > > [ FAILED ] CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvents (19 ms) Thanks for the update. It looks like what is happening is this: Due to less clock granularity both time_to_commit and first_background_time are equal. Similarly, both dom_content_loaded_event_start and first_background_time are also equal. Since equality => event occurred in background, we are seeing the background histograms incremented while the foreground histograms are not. The test expects the opposite. Looking into how to make the test more deterministic.
To address the timing dependent failure, this patch makes the condition for an event being foreground also include equality with background time. From a histogram correctness perspective also, it makes sense for such events to be considered foreground because any background specific handling would not yet have been applied to that event.
On 2016/04/20 at 17:33:17, shivanisha wrote: > To address the timing dependent failure, this patch makes the condition for an event being foreground also include equality with background time. > > From a histogram correctness perspective also, it makes sense for such events to be considered foreground because any background specific handling would not yet have been applied to that event. Hi Will, Do you see any more failing test on Win10 GCE bot with the latest patch? Thanks, Shivani
On 2016/04/20 17:55:34, shivanisha wrote: > On 2016/04/20 at 17:33:17, shivanisha wrote: > > To address the timing dependent failure, this patch makes the condition for an > event being foreground also include equality with background time. > > > > From a histogram correctness perspective also, it makes sense for such events > to be considered foreground because any background specific handling would not > yet have been applied to that event. > > Hi Will, > > Do you see any more failing test on Win10 GCE bot with the latest patch? > > Thanks, > Shivani Once the tests have been swarmed to the isolate server, I will be able to run them on the Win10 GCE bots and let you know.
On 2016/04/20 at 18:35:54, wfh wrote: > On 2016/04/20 17:55:34, shivanisha wrote: > > On 2016/04/20 at 17:33:17, shivanisha wrote: > > > To address the timing dependent failure, this patch makes the condition for an > > event being foreground also include equality with background time. > > > > > > From a histogram correctness perspective also, it makes sense for such events > > to be considered foreground because any background specific handling would not > > yet have been applied to that event. > > > > Hi Will, > > > > Do you see any more failing test on Win10 GCE bot with the latest patch? > > > > Thanks, > > Shivani > > Once the tests have been swarmed to the isolate server, I will be able to run them on the Win10 GCE bots and let you know. Will is still seeing a failure on Win 10 GCE Bot (different failure from the one fixed above). https://chromium-swarm-dev.appspot.com/user/task/2e536f0b9d845e10 The failing browser tests (seem random tests to me) are hitting the assertion DCHECK_GE(commit_time_, navigation_start_) in PageLoadTracker::~PageLoadTracker() commit_time_ is filled in PageLoadTracker::Commit() and navigation_start_ is filled in PageLoadTracker::PageLoadTracker() Looking at the code, there doesn't seem a code path that can possibly invert the order of filling navigation_start_ and commit_time_. Any thoughts what could be causing this. The 2 times this assertion failed, the values of commit_time_ and navigation_start_ were: [124:3668:0421/212153:FATAL:metrics_web_contents_observer.cc(389)] Check failed: commit_time_ >= navigation_start_ (160582000 bogo-microseconds vs. 160585000 bogo-microseconds) and [3652:3448:0421/212400:FATAL:metrics_web_contents_observer.cc(389)] Check failed: commit_time_ >= navigation_start_ (287129000 bogo-microseconds vs. 287130000 bogo-microseconds)
*sigh*, you could be seeing interprocess time ticks skew for navigation starts that are coming from the renderer process. I would clamp the commit time to navigation start in OnCommit.
Can we only do that if we don't have a cross process monotonic click on the system otherwise dcheck? On Fri, Apr 22, 2016, 1:09 PM <csharrison@chromium.org> wrote: > *sigh*, you could be seeing interprocess time ticks skew for navigation > starts > that are coming from the renderer process. > > I would clamp the commit time to navigation start in OnCommit. > > https://codereview.chromium.org/1837233002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
I'm surprised the clock could change like this between processes. The clock across all processes should always either stay the same or increment, even on GCE instances (where it might not increment for long(er) periods than a normal system). It should certainly never go backwards. Is that what is being seen here?
On 2016/04/22 at 17:56:08, wfh wrote: > I'm surprised the clock could change like this between processes. The clock across all processes should always either stay the same or increment, even on GCE instances (where it might not increment for long(er) periods than a normal system). It should certainly never go backwards. Is that what is being seen here? Thanks Charles. Yes, this does look like inter process time ticks skew. As per https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... "On Windows, TimeTicks are not consistent between processes."
On 2016/04/22 18:09:59, shivanisha wrote: > On 2016/04/22 at 17:56:08, wfh wrote: > > I'm surprised the clock could change like this between processes. The clock > across all processes should always either stay the same or increment, even on > GCE instances (where it might not increment for long(er) periods than a normal > system). It should certainly never go backwards. Is that what is being seen > here? > > Thanks Charles. Yes, this does look like inter process time ticks skew. > > As per > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... > "On Windows, TimeTicks are not consistent between processes." For more details, see crbug.com/394757. Also see the histogram Navigation.Start.RendererBrowserDifference.Negative which tracks skew after an IPC is sent.
On 2016/04/22 at 18:13:17, csharrison wrote: > On 2016/04/22 18:09:59, shivanisha wrote: > > On 2016/04/22 at 17:56:08, wfh wrote: > > > I'm surprised the clock could change like this between processes. The clock > > across all processes should always either stay the same or increment, even on > > GCE instances (where it might not increment for long(er) periods than a normal > > system). It should certainly never go backwards. Is that what is being seen > > here? > > > > Thanks Charles. Yes, this does look like inter process time ticks skew. > > > > As per > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... > > "On Windows, TimeTicks are not consistent between processes." > > For more details, see crbug.com/394757. Also see the histogram > Navigation.Start.RendererBrowserDifference.Negative which tracks skew after an IPC is sent. This issue can hit not just time_to_commit but also other time deltas that are computed as the delta between a browser populated timestamp and navigation_start (which could be populated either in browser or renderer based on whether the navigation is browser initiated or not). Some of those are: - time to abort - time to first background - time to foreground Looking into either of the 2 solutions: - Clamping the values that are expected to be >= navigation_start to navigation_start if they don't satisfy the assumption. - Looking into the code for InterProcessTimeTicksConverter to see if that could be helpful here.
On 2016/04/22 at 20:31:27, shivanisha wrote: > On 2016/04/22 at 18:13:17, csharrison wrote: > > On 2016/04/22 18:09:59, shivanisha wrote: > > > On 2016/04/22 at 17:56:08, wfh wrote: > > > > I'm surprised the clock could change like this between processes. The clock > > > across all processes should always either stay the same or increment, even on > > > GCE instances (where it might not increment for long(er) periods than a normal > > > system). It should certainly never go backwards. Is that what is being seen > > > here? > > > > > > Thanks Charles. Yes, this does look like inter process time ticks skew. > > > > > > As per > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... > > > "On Windows, TimeTicks are not consistent between processes." > > > > For more details, see crbug.com/394757. Also see the histogram > > Navigation.Start.RendererBrowserDifference.Negative which tracks skew after an IPC is sent. > > This issue can hit not just time_to_commit but also other time deltas that are computed as the delta between a browser populated timestamp and navigation_start (which could be populated either in browser or renderer based on whether the navigation is browser initiated or not). Some of those are: > - time to abort > - time to first background > - time to foreground > > Looking into either of the 2 solutions: > - Clamping the values that are expected to be >= navigation_start to navigation_start if they don't satisfy the assumption. > - Looking into the code for InterProcessTimeTicksConverter to see if that could be helpful here. I'd like to avoid clamping on platforms that have truly monotonic clocks, since on those platforms we should never need to clamp and thus clamping there could hide what would be a real bug possibly due to some other issue. On those platforms I'd like to DCHECK so we find bugs early. Do we have a way to clamp if the clock on the given platform is known to be non-monotonic, otherwise don't clamp? csharrison did some prior work that used the IsHighResolution method which IIRC if it returns true also implies that the clock is system-wide monotonic. Could we use this method? https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.h&l...
On 2016/04/26 14:35:14, Bryan McQuade wrote: > On 2016/04/22 at 20:31:27, shivanisha wrote: > > On 2016/04/22 at 18:13:17, csharrison wrote: > > > On 2016/04/22 18:09:59, shivanisha wrote: > > > > On 2016/04/22 at 17:56:08, wfh wrote: > > > > > I'm surprised the clock could change like this between processes. The > clock > > > > across all processes should always either stay the same or increment, even > on > > > > GCE instances (where it might not increment for long(er) periods than a > normal > > > > system). It should certainly never go backwards. Is that what is being > seen > > > > here? > > > > > > > > Thanks Charles. Yes, this does look like inter process time ticks skew. > > > > > > > > As per > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... > > > > "On Windows, TimeTicks are not consistent between processes." > > > > > > For more details, see crbug.com/394757. Also see the histogram > > > Navigation.Start.RendererBrowserDifference.Negative which tracks skew after > an IPC is sent. > > > > This issue can hit not just time_to_commit but also other time deltas that are > computed as the delta between a browser populated timestamp and navigation_start > (which could be populated either in browser or renderer based on whether the > navigation is browser initiated or not). Some of those are: > > - time to abort > > - time to first background > > - time to foreground > > > > Looking into either of the 2 solutions: > > - Clamping the values that are expected to be >= navigation_start to > navigation_start if they don't satisfy the assumption. > > - Looking into the code for InterProcessTimeTicksConverter to see if that > could be helpful here. > > I'd like to avoid clamping on platforms that have truly monotonic clocks, since > on those platforms we should never need to clamp and thus clamping there could > hide what would be a real bug possibly due to some other issue. On those > platforms I'd like to DCHECK so we find bugs early. > > Do we have a way to clamp if the clock on the given platform is known to be > non-monotonic, otherwise don't clamp? > > csharrison did some prior work that used the IsHighResolution method which IIRC > if it returns true also implies that the clock is system-wide monotonic. Could > we use this method? > https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.h&l... IsHighResolution() might work but it's a coarse measure (not just taking into account monotonicity). I know there was an idea a while back about adding "IsSaneHighResolution", which properly tests some of these things.
On 2016/04/26 at 15:02:43, csharrison wrote: > On 2016/04/26 14:35:14, Bryan McQuade wrote: > > On 2016/04/22 at 20:31:27, shivanisha wrote: > > > On 2016/04/22 at 18:13:17, csharrison wrote: > > > > On 2016/04/22 18:09:59, shivanisha wrote: > > > > > On 2016/04/22 at 17:56:08, wfh wrote: > > > > > > I'm surprised the clock could change like this between processes. The > > clock > > > > > across all processes should always either stay the same or increment, even > > on > > > > > GCE instances (where it might not increment for long(er) periods than a > > normal > > > > > system). It should certainly never go backwards. Is that what is being > > seen > > > > > here? > > > > > > > > > > Thanks Charles. Yes, this does look like inter process time ticks skew. > > > > > > > > > > As per > > > > > > > https://code.google.com/p/chromium/codesearch#chromium/src/content/common/int... > > > > > "On Windows, TimeTicks are not consistent between processes." > > > > > > > > For more details, see crbug.com/394757. Also see the histogram > > > > Navigation.Start.RendererBrowserDifference.Negative which tracks skew after > > an IPC is sent. > > > > > > This issue can hit not just time_to_commit but also other time deltas that are > > computed as the delta between a browser populated timestamp and navigation_start > > (which could be populated either in browser or renderer based on whether the > > navigation is browser initiated or not). Some of those are: > > > - time to abort > > > - time to first background > > > - time to foreground > > > > > > Looking into either of the 2 solutions: > > > - Clamping the values that are expected to be >= navigation_start to > > navigation_start if they don't satisfy the assumption. > > > - Looking into the code for InterProcessTimeTicksConverter to see if that > > could be helpful here. > > > > I'd like to avoid clamping on platforms that have truly monotonic clocks, since > > on those platforms we should never need to clamp and thus clamping there could > > hide what would be a real bug possibly due to some other issue. On those > > platforms I'd like to DCHECK so we find bugs early. > > > > Do we have a way to clamp if the clock on the given platform is known to be > > non-monotonic, otherwise don't clamp? > > > > csharrison did some prior work that used the IsHighResolution method which IIRC > > if it returns true also implies that the clock is system-wide monotonic. Could > > we use this method? > > https://code.google.com/p/chromium/codesearch#chromium/src/base/time/time.h&l... > > IsHighResolution() might work but it's a coarse measure (not just taking into account monotonicity). I know there was an idea a while back about adding "IsSaneHighResolution", which properly tests some of these things. Makes sense. Maybe we can try to use IsHighResolution for now and open a separate bug to add the IsSaneHighResolution method & migrate to that once available? This assumes that IsHighResolution returns fall on the platform where we're hitting test failures currently. In general I'm really hesitant to clamp in all cases to address a bug that only shows up in GCE. The more we clamp/sanitize our data, the more likely that we're masking or hiding bugs in the code that is populating that data.
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 The following files are part of issue https://codereview.chromium.org/1245163002/ and I needed to add them here for bots to compile. Please ignore these files for review: base/base.gyp base/base.gypi base/BUILD.gn base/optional.h base/optional_unittest.cc ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 ==========
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class which will be added via https://codereview.chromium.org/1245163002/ BUG=596367 ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 ==========
Patch 7 has the following changes over earlier patches: - Added a Windows-only function to check if the clock is not possibly system-wide monotonic. If not and if the timetick value is < navigation_start_ then make it equal to navigation_start_. - This function is getting called when a timestamp in PageLoadTracker is assigned the value of Now(). - Rebased with latest. - As part of rebasing, new code in from_gws_page_load_metrics_observer.cc changed to make use of Optional timedeltas in PageLoadExtraInfo. - Fixed a bug in from_gws_page_load_metrics_observer.cc in WasAbortedInForeground. Earlier it did not consider cases where info.startedInForeground was false. - Added asserts for PageLoadTracker's first_foreground_time_ and first_background_time_ to be >= navigation_start_ if they exist. Earlier similar asserts existed only for abort_time_ and commit_time_.
I still get a failure on unit_tests on GCE: [ RUN ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count Actual: 1 Expected: expected_count Which is: 0 Histogram "PageLoad.Timing2.NavigationToFailedProvisionalLoad" does not have the right total number of samples (0). It has (1). [ FAILED ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad (22 ms) https://chromium-swarm-dev.appspot.com/user/task/2e7d5e3cda388510
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 ==========
On 2016/04/30 at 01:02:16, wfh wrote: > I still get a failure on unit_tests on GCE: > > [ RUN ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram "PageLoad.Timing2.NavigationToFailedProvisionalLoad" does not have the right total number of samples (0). It has (1). > > [ FAILED ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad (22 ms) > > https://chromium-swarm-dev.appspot.com/user/task/2e7d5e3cda388510 Thanks Will, for the update. Looking into this failure.
On 2016/05/02 17:46:08, shivanisha wrote: > On 2016/04/30 at 01:02:16, wfh wrote: > > I still get a failure on unit_tests on GCE: > > > > [ RUN ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad > > > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: > Value of: actual_count > > > > Actual: 1 > > > > Expected: expected_count > > > > Which is: 0 > > > > Histogram "PageLoad.Timing2.NavigationToFailedProvisionalLoad" does not have > the right total number of samples (0). It has (1). > > > > [ FAILED ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad > (22 ms) > > > > https://chromium-swarm-dev.appspot.com/user/task/2e7d5e3cda388510 > > Thanks Will, for the update. Looking into this failure. I just reproed this failure locally by patching IsBuggyAthlon to return true as per directions in the bug. How are you doing your local testing?
thanks for the updates! added a couple comments. https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:382: base::TimeTicks& event_time) { style guide wants 'out' params (or inout params) to be pointers rather than references. This is mostly just convention to allow readers of the code familiar with this convention to easily see that the param may be modified by a function as part of reading a call site. More detail here: https://google.github.io/styleguide/cppguide.html#Reference_Arguments https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:397: event_time = navigation_start_; if we still have some kind of uma counter for errors, it might be interesting to also count how often this is happening.
On 2016/04/30 at 01:02:16, wfh wrote: > I still get a failure on unit_tests on GCE: > > [ RUN ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram "PageLoad.Timing2.NavigationToFailedProvisionalLoad" does not have the right total number of samples (0). It has (1). > > [ FAILED ] CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad (22 ms) > > https://chromium-swarm-dev.appspot.com/user/task/2e7d5e3cda388510 The above failure is happening because of the following sequence of events in the test: rfh_tester->SimulateNavigationStart(url); -- starts navigation web_contents()->WasHidden(); -- backgrounds it, calls TimeTicks::Now() rfh_tester->SimulateNavigationError(url, net::ERR_TIMED_OUT); -- fails navigation, calls TimeTicks::Now() which on Win 10 GCE could be same as the above and since the condition for an event to be foreground now includes equality with first_background_time_, the failure event is logged in a foreground histogram while on other systems the two timeticks are not equal. Looking into how to make this and similar tests deterministic on both platforms.
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. ==========
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ==========
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ==========
shivanisha@chromium.org changed reviewers: - rdsmith@chromium.org
This patch has the following updates to the last patch: - Renamed the clamping function from RebaseBrowserTimestampIfInterProcessTimeTickSkew to ClampBrowserTimestampIfInterProcessTimeTickSkew - Made the clamping function an all platform function instead of just for Windows since clamping anyways happens conditionally based on IsHighResolution() and IsHighResolution() is available for all platforms. - As per the failing tests, here is a summary of investigation and changes: Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after backgrounding theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that use backgrounding, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparision with background_time will not happen. All the other tests are deterministic because they either hardcoded the values in PageLoadTiming or started the navigation itself in background thus making the comparision with background time not applicable. There are some timings that cannot be hardcoded as they do not belong in PageLoadTiming like commit_time_ so I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) whf@, I was not able to test using IsBuggyAthlon as I do not have a Windows machine. Please let me know what you see on Win10 GCE bot.
On 2016/05/12 at 19:50:55, shivanisha wrote: > This patch has the following updates to the last patch: > - Renamed the clamping function from RebaseBrowserTimestampIfInterProcessTimeTickSkew to ClampBrowserTimestampIfInterProcessTimeTickSkew > - Made the clamping function an all platform function instead of just for Windows since clamping anyways happens conditionally based on IsHighResolution() and IsHighResolution() is available for all platforms. > - As per the failing tests, here is a summary of investigation and changes: > Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: > 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. > > The tests that have event occurring after backgrounding theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that use backgrounding, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparision with background_time will not happen. All the other tests are deterministic because they either hardcoded the values in PageLoadTiming or started the navigation itself in background thus making the comparision with background time not applicable. > > There are some timings that cannot be hardcoded as they do not belong in PageLoadTiming like commit_time_ so I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. > > 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) > > whf@, I was not able to test using IsBuggyAthlon as I do not have a Windows machine. > Please let me know what you see on Win10 GCE bot. wfh@ I realized that I do not need a Win machine for running the Win10GCE swarming bot and I can reuse the binaries from the win_x64 bot. I am running the tests in the Win10 GCE swarming bot and will update the results. Thanks.
On 2016/05/13 at 17:06:45, shivanisha wrote: > On 2016/05/12 at 19:50:55, shivanisha wrote: > > This patch has the following updates to the last patch: > > - Renamed the clamping function from RebaseBrowserTimestampIfInterProcessTimeTickSkew to ClampBrowserTimestampIfInterProcessTimeTickSkew > > - Made the clamping function an all platform function instead of just for Windows since clamping anyways happens conditionally based on IsHighResolution() and IsHighResolution() is available for all platforms. > > - As per the failing tests, here is a summary of investigation and changes: > > Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: > > 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. > > > > The tests that have event occurring after backgrounding theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that use backgrounding, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparision with background_time will not happen. All the other tests are deterministic because they either hardcoded the values in PageLoadTiming or started the navigation itself in background thus making the comparision with background time not applicable. > > > > There are some timings that cannot be hardcoded as they do not belong in PageLoadTiming like commit_time_ so I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. > > > > 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) > > > > whf@, I was not able to test using IsBuggyAthlon as I do not have a Windows machine. > > Please let me know what you see on Win10 GCE bot. > > wfh@ > I realized that I do not need a Win machine for running the Win10GCE swarming bot and I can reuse the binaries from the win_x64 bot. I am running the tests in the Win10 GCE swarming bot and will update the results. Thanks. Some failures are coming on the Win10 GCE bot. I am looking into them.
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background theoretically will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. On scanning all of the existing tests that background the page, only one was non deterministic which showed up in the failures:CorePageLoadMetricsObserverTest.FailedBackgroundProvisionalLoad. That test is fixed by starting the navigation itself in background and thus the comparison with background_time will not happen. All the other tests are deterministic because they either hard-coded the values in PageLoadTiming or started the navigation itself in background thus making the comparison with background time not applicable. There are some timings that cannot be hard-coded as they do not belong in PageLoadTiming like commit_time_ So I have added a test that starts the navigation in foreground and commits in background and also added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. So I have added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ==========
This patch has the following fixes: - Fixed more non-deterministic tests. Failing tests' details are given later in the message. Win10 GCE bots are happy now. There are no related failures seen over many runs. Here are links to the successful bots: unit_tests: https://chromium-swarm-dev.appspot.com/user/task/2edcef51eb5da010 https://chromium-swarm-dev.appspot.com/user/task/2edcef3edcf2a810 https://chromium-swarm-dev.appspot.com/user/task/2edcef2277bfbb10 https://chromium-swarm-dev.appspot.com/user/task/2edd38941c898d10 https://chromium-swarm-dev.appspot.com/user/task/2edd38a3aaec9b10 browser_tests: https://chromium-swarm-dev.appspot.com/user/task/2edd595839609d10 https://chromium-swarm-dev.appspot.com/user/task/2edd5972588be710 https://chromium-swarm-dev.appspot.com/user/task/2edd5989f218ce10 component_unittests: https://chromium-swarm-dev.appspot.com/user/task/2edd6c1613beac10 https://chromium-swarm-dev.appspot.com/user/task/2edd6c2f5524cb10 https://chromium-swarm-dev.appspot.com/user/task/2edd6c3d290eb310 - Incorporated Bryan's feedback to have a UMA that logs inter-process time tick skew counts and converted an in-out argument from a reference to a pointer. Details of tests that failed: Failing Swarming bot link: https://chromium-swarm-dev.appspot.com/user/task/2ec3db2132d9c010 and https://chromium-swarm-dev.appspot.com/user/task/2ed8cec906051e10 All of the failing test cases had the same root cause: They were dependent on the comparison between timing.first_paint or some other timedelta which was hard-coded as 1 microsecond and another timedelta like time_to_abort_ or first_background_time_ which was assumed to be > 1 micro-sec but in systems with low resolution clocks , it was not, hence failing the tests. Did a scan of all tests that were using FromMicroseconds to hardcode a timedelta and fixed them to be deterministic by using the new test framework that lets tests to retrieve the PageLoadExtraInfo for a committed load. Couple of them could not be fixed using the framework so I have relaxed their success criteria. They have code comments to explain that. https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:382: base::TimeTicks& event_time) { On 2016/05/02 at 18:32:40, Bryan McQuade wrote: > style guide wants 'out' params (or inout params) to be pointers rather than references. This is mostly just convention to allow readers of the code familiar with this convention to easily see that the param may be modified by a function as part of reading a call site. More detail here: https://google.github.io/styleguide/cppguide.html#Reference_Arguments done. https://codereview.chromium.org/1837233002/diff/120001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:397: event_time = navigation_start_; On 2016/05/02 at 18:32:40, Bryan McQuade wrote: > if we still have some kind of uma counter for errors, it might be interesting to also count how often this is happening. done. Added a new enumeration for this case.
First pass, looks good! Mostly nits. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc:181: DCHECK(extra_info.time_to_abort); Just to confirm. Does DCHECKing an Optional check to make sure it has a value? https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc:156: // If the system clock is low resolution Looks like you can fit more words on some of these lines. Also, can we similarly include aborts if they occur <= first paint (like we do for backgrounds)? Seems like similar logic applies. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc:355: const base::TimeDelta& first_background_time = Don't take const references of time values. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc:392: const base::TimeDelta& time_to_commit = info.time_to_commit.value(); Ditto. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:264: // PageLoadTracker's commit_time_ may be = background_time_. Looks like you can fit more words in this comment. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:304: // PageLoadTracker's first_background_time_ may be same as other times ditto. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:406: GURL url(kDefaultTestUrl); Include a comment here why we need to start the navigation in the bg (and why bg == abort time is not a bg abort) https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:171: const base::Optional<base::TimeDelta>& time_to_abort, Is it best practice to take const references of optionals? https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:182: const base::TimeDelta& time_to_abort_val = time_to_abort.value(); no const ref of TimeDelta. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:185: base::TimeDelta bg_abort_delta = time_to_abort_val - time_to_first_bg; Don't use "bg" to abbreviate background. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:351: // If the system clock is low resolution Can fit more words into comment. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:533: // If the system clock is low resolution Ditto. Also can you update the explanation about the abort time being "computed and destroyed"? I think it's okay to just say that we don't consider it an abort if the abort time is > the paint. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/page_load_metrics_observer_test_harness.h (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/page_load_metrics_observer_test_harness.h:45: const PageLoadExtraInfo GetPageLoadExtraInfoCommittedLoad(); Can you explain in a comment what this does? Also the name is a bit awkward. What committed load? https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:324: if (!background_time_.is_null()) { Why not convert fg and bg time to optionals too? Up to you, but maybe it's clearer than a TimeTicks with a null value. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:418: // Windows 10 GCE bot non-deterministically failed because Looks like you can fit more words on this line. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:420: // was lesser than navigation_start_ that was populated in the renderer s/lesser/less https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:426: // which also implies it is system-wide monotonic (on Windows). Are you sure this check is always correct in finding system-wide monotonicity? https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.h (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:90: // A timetick in the browser process has value lesser than navigation_start_. nit: A TimeTicks value in the browser process has a value less than navigation_start_. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:153: bool is_browser_timetick); Let's replace "timetick" with "timestamp". Also include a comment why we need this. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_observer.cc:29: // Using Optional <T> for the fields made clang complaint about not having Suggestion: This comment isn't that useful. You can remove it if you want. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_observer.h (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_observer.h:84: // aborted, |abort_type| will be |ABORT_NONE| Add a period at the end of the sentence. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_util.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_util.cc:21: // TODO (@shivanisha) Since the above function is called for "TODO(shivanisha):" Is the preferred syntax. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_util.cc:22: // TimeDelta coming over IPC (PageLoadTiming) and since Optional is not This sentence is a little awkward. What about: "The above function is only used for TimeDeltas coming over IPC. Merge these two when page_load_metrics only handles Optional values in the browser process.
Thanks! Incorporated the feedback. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer.cc:181: DCHECK(extra_info.time_to_abort); On 2016/05/19 at 18:20:48, csharrison wrote: > Just to confirm. Does DCHECKing an Optional check to make sure it has a value? That's right. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc:156: // If the system clock is low resolution On 2016/05/19 at 18:20:48, csharrison wrote: > Looks like you can fit more words on some of these lines. Also, can we similarly include aborts if they occur <= first paint (like we do for backgrounds)? Seems like similar logic applies. Updated the comment formatting. The current logic does include abort if they are <= paint for logging but this test earlier assumed that abort will be strictly > first_paint which may not be true. Also updated the comment to clarify this. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc:355: const base::TimeDelta& first_background_time = On 2016/05/19 at 18:20:48, csharrison wrote: > Don't take const references of time values. done https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer.cc:392: const base::TimeDelta& time_to_commit = info.time_to_commit.value(); On 2016/05/19 at 18:20:48, csharrison wrote: > Ditto. done https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:264: // PageLoadTracker's commit_time_ may be = background_time_. On 2016/05/19 at 18:20:48, csharrison wrote: > Looks like you can fit more words in this comment. done https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:304: // PageLoadTracker's first_background_time_ may be same as other times On 2016/05/19 at 18:20:48, csharrison wrote: > ditto. done https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/core_page_load_metrics_observer_unittest.cc:406: GURL url(kDefaultTestUrl); On 2016/05/19 at 18:20:48, csharrison wrote: > Include a comment here why we need to start the navigation in the bg (and why bg == abort time is not a bg abort) Since we start the navigation in background there is no comparison between bg and abort time that happens as there is no first_background_time_. Many tests have this pattern of starting the navigation in background. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:171: const base::Optional<base::TimeDelta>& time_to_abort, On 2016/05/19 at 18:20:48, csharrison wrote: > Is it best practice to take const references of optionals? Since base::Optional <base::TimeDelta> is bigger than an integer: contains and integer and a boolean, thus passing by reference. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:182: const base::TimeDelta& time_to_abort_val = time_to_abort.value(); On 2016/05/19 at 18:20:48, csharrison wrote: > no const ref of TimeDelta. done https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:185: base::TimeDelta bg_abort_delta = time_to_abort_val - time_to_first_bg; On 2016/05/19 at 18:20:48, csharrison wrote: > Don't use "bg" to abbreviate background. changed https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:351: // If the system clock is low resolution On 2016/05/19 at 18:20:48, csharrison wrote: > Can fit more words into comment. done. https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:533: // If the system clock is low resolution On 2016/05/19 at 18:20:48, csharrison wrote: > Ditto. Also can you update the explanation about the abort time being "computed and destroyed"? I think it's okay to just say that we don't consider it an abort if the abort time is > the paint. done. Also updated in aborts*unit_test https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/page_load_metrics_observer_test_harness.h (right): https://codereview.chromium.org/1837233002/diff/260001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/page_load_metrics_observer_test_harness.h:45: const PageLoadExtraInfo GetPageLoadExtraInfoCommittedLoad(); On 2016/05/19 at 18:20:48, csharrison wrote: > Can you explain in a comment what this does? Also the name is a bit awkward. What committed load? Since MetricsWebContentsObserver may have many provisional loads or aborted provisional loads and a single committed load at a time and this function retrieves the PageLoadExtraInfo for just the committed load, hence "committed load" in the name. Changed the name to GetPageLoadExtraInfoForCommittedLoad and added a comment. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:324: if (!background_time_.is_null()) { On 2016/05/19 at 18:20:49, csharrison wrote: > Why not convert fg and bg time to optionals too? Up to you, but maybe it's clearer than a TimeTicks with a null value. Either way is fine, Since TimeTicks has the semantics of is_null representing an empty TimeTicks we are good without using Optional, for now. It will be more robust though, to use Optionals to reduce bugs which could crop up if a code doesn't consider 0 valued TimeTicks as empty. Adding this to be considered while working on the related bug: https://bugs.chromium.org/p/chromium/issues/detail?id=590212 https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:418: // Windows 10 GCE bot non-deterministically failed because On 2016/05/19 at 18:20:49, csharrison wrote: > Looks like you can fit more words on this line. done https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:420: // was lesser than navigation_start_ that was populated in the renderer On 2016/05/19 at 18:20:49, csharrison wrote: > s/lesser/less done https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:426: // which also implies it is system-wide monotonic (on Windows). On 2016/05/19 at 18:20:48, csharrison wrote: > Are you sure this check is always correct in finding system-wide monotonicity? That's how it is currently implemented on Windows. It returns true if underlying clock uses QPC which is syetm-wide monotonic as mentioned here (https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx) else it returns false. On other platforms it returns true. https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.h (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:90: // A timetick in the browser process has value lesser than navigation_start_. On 2016/05/19 at 18:20:49, csharrison wrote: > nit: A TimeTicks value in the browser process has a value less than navigation_start_. done https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:153: bool is_browser_timetick); On 2016/05/19 at 18:20:49, csharrison wrote: > Let's replace "timetick" with "timestamp". Also include a comment why we need this. done https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_observer.cc:29: // Using Optional <T> for the fields made clang complaint about not having On 2016/05/19 at 18:20:49, csharrison wrote: > Suggestion: This comment isn't that useful. You can remove it if you want. removed https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_observer.h (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_observer.h:84: // aborted, |abort_type| will be |ABORT_NONE| On 2016/05/19 at 18:20:49, csharrison wrote: > Add a period at the end of the sentence. done https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... File components/page_load_metrics/browser/page_load_metrics_util.cc (right): https://codereview.chromium.org/1837233002/diff/260001/components/page_load_m... components/page_load_metrics/browser/page_load_metrics_util.cc:22: // TimeDelta coming over IPC (PageLoadTiming) and since Optional is not On 2016/05/19 at 18:20:49, csharrison wrote: > This sentence is a little awkward. What about: > > "The above function is only used for TimeDeltas coming over IPC. Merge these two when page_load_metrics only handles Optional values in the browser process. sounds good. done
Thanks, it looks pretty good now. I'll do a final pass when you fix the patch issues.
Merged latest changes from master.
https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:439: UMA_HISTOGRAM_ENUMERATION(internal::kErrorEvents, Use RecordInternalError to avoid inlining a bunch of histogram code. https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.h (right): https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:94: ERR_INTER_PROCESS_TIME_TICK_SKEW, Looks like you need to update histograms.xml with this new enum value.
https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:439: UMA_HISTOGRAM_ENUMERATION(internal::kErrorEvents, On 2016/05/24 at 19:32:01, csharrison wrote: > Use RecordInternalError to avoid inlining a bunch of histogram code. done https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.h (right): https://codereview.chromium.org/1837233002/diff/340001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.h:94: ERR_INTER_PROCESS_TIME_TICK_SKEW, On 2016/05/24 at 19:32:01, csharrison wrote: > Looks like you need to update histograms.xml with this new enum value. Done. Thanks.
Thanks! LGTM
The CQ bit was checked by shivanisha@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from bmcquade@chromium.org Link to the patchset: https://codereview.chromium.org/1837233002/#ps360001 (title: "Called RecordInternalError and updated in histograms.xml.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1837233002/360001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1837233002/360001
The CQ bit was unchecked by shivanisha@chromium.org
shivanisha@chromium.org changed reviewers: + holte@chromium.org
holte@, PTAL for histograms.xml, Thanks!
bmcquade@, the CL has been updated since you last lgtm'ed. Do you want to take another look? Thanks.
still seeing (at least) one failing test on the GCE bots from ps19: [ RUN ] FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count Actual: 1 Expected: expected_count Which is: 0 Histogram "PageLoad.Clients.FromGoogleSearch.PaintTiming.NavigationToFirstTextPaint" does not have the right total number of samples (0). It has (1). [ FAILED ] FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater (32 ms) https://chromium-swarm-dev.appspot.com/user/task/2efd850c9b852910
On 2016/05/24 22:35:28, Will Harris (slow thru 23 May) wrote: > still seeing (at least) one failing test on the GCE bots from ps19: > > [ RUN ] > FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value > of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram > "PageLoad.Clients.FromGoogleSearch.PaintTiming.NavigationToFirstTextPaint" does > not have the right total number of samples (0). It has (1). > > [ FAILED ] > FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater (32 > ms) > > https://chromium-swarm-dev.appspot.com/user/task/2efd850c9b852910 Sorry Shivani. Definitely check with the GCE bots before landing because you've rebased on more page load metrics tests in the old style.
histograms lgtm
On 2016/05/24 at 22:35:28, wfh wrote: > still seeing (at least) one failing test on the GCE bots from ps19: > > [ RUN ] FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater > > e:\b\build\slave\win\build\src\base\test\histogram_tester.cc(170): error: Value of: actual_count > > Actual: 1 > > Expected: expected_count > > Which is: 0 > > Histogram "PageLoad.Clients.FromGoogleSearch.PaintTiming.NavigationToFirstTextPaint" does not have the right total number of samples (0). It has (1). > > [ FAILED ] FromGWSPageLoadMetricsObserverTest.SearchPreviousCommittedUrlBackgroundLater (32 ms) > > https://chromium-swarm-dev.appspot.com/user/task/2efd850c9b852910 Since the metrics logging in FromGWSPageLoadMetricsObserver has changed from being logged in OnComplete() to being logged immediately when the timing update happens, the semantics of the test changed after I re-based leading to this error. Looking into the unit tests for FromGWSPageLoadMetricsObserver.
Thanks for this change! Basically looks good, just a few small things. https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc:167: EXPECT_TRUE(counts_map.empty() || counts_map.begin()->second == 1); this is great - i've wanted something like CountsMap in other tests but I didn't know about it. let's also verify that there's only a single entry in the map. maybe EXPECT_TRUE(counts_map.empty() || (counts_map.size() == 1 && counts_map.begin()->second == 1); https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:409: const base::Optional<base::TimeDelta>& time_to_abort = let's pull this up above the 'bool aborted_in_foreground =' and use it in the WasAbortedInForeground call https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:596: EXPECT_TRUE(counts_map.empty() || counts_map.begin()->second == 1); same - should also check size() == 1 https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:325: if (!background_time_.is_null()) { I think the call to OnHidden for each observer at the bottom of this function expects to be invoked every time WebContentsHidden gets invoked, so we can't early return here. https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:341: if (!foreground_time_.is_null()) { same https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:425: // Windows 10 GCE bot non-deterministically failed because TimeTicks::Now() just to be thorough let's add DCHECK(event_time != nullptr); that way we'll at least blow up early if someone ever passes a null TimeTicks https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:439: RecordInternalError(ERR_INTER_PROCESS_TIME_TICK_SKEW); awesome, glad to add this as an additional error to track. really helpful to have this! https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:693: navigation_handle->NavigationStart(), false); i think the value for NavigationStart() can sometimes be set in the browser process, so technically the false value for is_browser_timestamp may sometimes be incorrect. that's fine since you'll clamp it any time this is false, and it's not harmful to clamp timestamps from the browser process. but it's worth adding a comment here to explain this, in case it confuses someone in the future. you might also consider renaming is_browser_timestamp to is_possibly_non_browser_timestamp which inverts the meaning of the bool but makes the specific case here where NavigationStart() may be from a different process consistent with the API. I don't feel strongly though (and it's worth double checking with charles whether NavigationStart() can in fact we set in the browser process).
This patch took a while to be uploaded as the Win10 GCE swarming bots are not working. I am in touch with hinoka@ and maruel@ for this and it seems they are working on it but it is taking time. As discussed with Charles and Josh, the plan is to land this CL after LGTMs and not be blocked on swarming bots. The reason is that while this CL is in progress , there are a lot of metrics CLs being landed which does not use Optional and thus conflicts with this CL. Landing this will make sure that subsequent changes are using Optional. If there are more failures reported from the Win10 GCE bot when its up, they will be addressed in a follow up CL. In addition to applying Bryan's last feedback, resolving conflicts with master and fixing the test reported by wfh(not able to test it though), this CL also reverts a change done by https://codereview.chromium.org/1984173002/ in WasAbortedInForeground (from_gws_page_load_metrics_observer.cc) where it wanted to differentiate between non-existent and zero values of time_to_abort. https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/aborts_page_load_metrics_observer_unittest.cc:167: EXPECT_TRUE(counts_map.empty() || counts_map.begin()->second == 1); On 2016/05/26 at 00:10:04, Bryan McQuade wrote: > this is great - i've wanted something like CountsMap in other tests but I didn't know about it. > > let's also verify that there's only a single entry in the map. maybe > EXPECT_TRUE(counts_map.empty() || (counts_map.size() == 1 && counts_map.begin()->second == 1); done https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer.cc:409: const base::Optional<base::TimeDelta>& time_to_abort = On 2016/05/26 at 00:10:04, Bryan McQuade wrote: > let's pull this up above the 'bool aborted_in_foreground =' and use it in the WasAbortedInForeground call done https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... File chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc (right): https://codereview.chromium.org/1837233002/diff/360001/chrome/browser/page_lo... chrome/browser/page_load_metrics/observers/from_gws_page_load_metrics_observer_unittest.cc:596: EXPECT_TRUE(counts_map.empty() || counts_map.begin()->second == 1); On 2016/05/26 at 00:10:04, Bryan McQuade wrote: > same - should also check size() == 1 done https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:325: if (!background_time_.is_null()) { On 2016/05/26 at 00:10:05, Bryan McQuade wrote: > I think the call to OnHidden for each observer at the bottom of this function expects to be invoked every time WebContentsHidden gets invoked, so we can't early return here. Good point. done. https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:341: if (!foreground_time_.is_null()) { On 2016/05/26 at 00:10:05, Bryan McQuade wrote: > same done. https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:425: // Windows 10 GCE bot non-deterministically failed because TimeTicks::Now() On 2016/05/26 at 00:10:05, Bryan McQuade wrote: > just to be thorough let's add > DCHECK(event_time != nullptr); > > that way we'll at least blow up early if someone ever passes a null TimeTicks done. https://codereview.chromium.org/1837233002/diff/360001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:693: navigation_handle->NavigationStart(), false); That's right. Also confirmed with Charles that it is possible for navigation_start to be in browser process. The assumption in this patch is that an event time will be clamped if its for sure set in the browser process but if it is NavigationStart(), then do not clamp it. So in case it is NavigationStart() set in the browser it will not be clamped. Changing the variable name to is_certainly_browser_timestamp (as opposed to is_possibly_non_browser_timestamp) to keep it consistent with its usage before calling ClampBrowserTimestampIfInterProcessTimeTickSkew(). Also added a comment here. This also needs to be explained in the ClampBrowserTimestampIfInterProcessTimeTickSkew() function since that also assumes that navigation_start_ will always be set in renderer. Clamping against a browser set navigation_start_ should ideally never happen due to the condition that browser_event_time should be strictly less than browser set navigation_start_ which is not possible. But let's say it happens due to a code error , it should also happen on other platforms(high resolution) where it will be caught with Dchecks. So adding this in the comment in ClampBrowserTimestampIfInterProcessTimeTickSkew.
Fixed a couple of fromGWS tests though their root cause was not this CL but a new CL. We will need to continue fixing tests after this CL if new changes in the code land without getting tested on the Win10 GCE bots. Since this CL is probably a blocker for GCE bots becoming part of the CQ, its important to land this CL and not address any new code related failures here anymore.
On 2016/06/08 at 20:01:35, shivanisha wrote: > Fixed a couple of fromGWS tests though their root cause was not this CL but a new CL. > We will need to continue fixing tests after this CL if new changes in the code land > without getting tested on the Win10 GCE bots. Since this CL is probably a blocker for > GCE bots becoming part of the CQ, its important to land this CL and not address any > new code related failures here anymore. GCE Bots are running fine without any page load metrics related failures: unit_tests: https://chromium-swarm-dev.appspot.com/user/task/2f4e5a06ff9ef410 https://chromium-swarm-dev.appspot.com/user/task/2f4e5a20e53eb910 https://chromium-swarm-dev.appspot.com/user/task/2f4e5a38c9e42d10 https://chromium-swarm-dev.appspot.com/user/task/2f4e5a4189773210 https://chromium-swarm-dev.appspot.com/user/task/2f4e5a5593859610
excellent! do you think this CL can be committed soon?
On 2016/06/10 at 22:52:18, wfh wrote: > excellent! do you think this CL can be committed soon? Yes, it can be committed soon. Bryan, the last patch has your feedback addressed, PTAL, thanks !
Rebased with latest.
Thanks for this fix! LGTM. https://codereview.chromium.org/1837233002/diff/460001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/460001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:445: return; could we also add: DCHECK(event_time->is_null() || *event_time >= navigation_start_); before the return?
https://codereview.chromium.org/1837233002/diff/460001/components/page_load_m... File components/page_load_metrics/browser/metrics_web_contents_observer.cc (right): https://codereview.chromium.org/1837233002/diff/460001/components/page_load_m... components/page_load_metrics/browser/metrics_web_contents_observer.cc:445: return; On 2016/06/21 at 00:59:46, Bryan McQuade wrote: > could we also add: > DCHECK(event_time->is_null() || *event_time >= navigation_start_); > before the return? Good point. done.
The CQ bit was checked by shivanisha@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from holte@chromium.org, csharrison@chromium.org, bmcquade@chromium.org Link to the patchset: https://codereview.chromium.org/1837233002/#ps480001 (title: "added dcheck when isHighResolution is true.")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1837233002/480001
Message was sent while issue was closed.
Committed patchset #25 (id:480001)
Message was sent while issue was closed.
Description was changed from ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. So I have added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) ========== to ========== This fix uses Optional TimeDeltas since they may be non existent instead of using zero as a non-existent value. It uses base::Optional class (https://codereview.chromium.org/1245163002/) BUG=596367 Apart from the initial fix for this change which involves TimeDelta values in PageLoadExtraInfo to be converted to use base::Optional, this CL has the following 3 more changes: 1. Converted the foreground event condition from event_time < background_time to event_time <= background_time. This made all tests that have an event occuring before calling WasHidden (which fills the background_time as Now()) to be deterministic in assuming that the event will be logged in a foreground histogram irrespective of the clock resolution. This is also a better logical choice since the event has not been subjected to any background-specific functionality if it occurred at the same time as backgrounding. This fixed the first failure reported on this CL :CorePageLoadMetricsObserverTest.OnlyBackgroundLaterEvent. The tests that have event occurring after background will be non deterministic as they cannot be guaranteed to have event time strictly greater than background time. So I have added an interface in the test framework that lets the test retrieve the value of PageLoadExtraInfo for a committed load. I have not added the same for provisional loads/aborted loads but could be added in future if need be. 2. For the failure resulting from inter process time tick skew, conditional clamping is introduced only if the clock is not high resolution. The testing strategy for that is that the existing unit tests should all pass on a low resolution clock system. Failures from this should result in a DCHECK getting hit. (I have not added specific tests for testing this because it will be difficult to test all possible code paths in specific tests and mocking inter process time tick skew would require mocking isHighResolution at the least. The complexity is not worth it since all code paths will not get covered in a few tests anyways.) Committed: https://crrev.com/271f0336503725749c86df213f6710c6423f9b5f Cr-Commit-Position: refs/heads/master@{#401022} ==========
Message was sent while issue was closed.
Patchset 25 (id:??) landed as https://crrev.com/271f0336503725749c86df213f6710c6423f9b5f Cr-Commit-Position: refs/heads/master@{#401022} |