Issue 1084943002: Add UMA stats for hard-fault counts for Windows 7 and greater. (Closed)

Created:
5 years, 8 months ago by chrisha

Modified:
5 years, 8 months ago

Reviewers:
erikchen, Alexei Svitkine (slow), jeremy

CC:
chromium-reviews, asvitkine+watch_chromium.org

Base URL:
https://chromium.googlesource.com/chromium/src.git@master

Target Ref:
refs/pending/heads/master

Project:
chromium

Visibility:
Public.

Description

Add UMA stats for hard-fault counts for Windows 7 and greater.

If this stat is strongly bimodal it will be used to distinguish warm from cold for various startup metrics.

BUG=476923
TBR=jeremy@chromium.org

Committed: https://crrev.com/e2f7ec4b272226a67a3c5e66ed3d955e2a33227e
Cr-Commit-Position: refs/heads/master@{#325337}

Patch Set 1 #

Total comments: 3

Patch Set 2 : Fix for non Win32 platforms. #

Total comments: 7

Patch Set 3 : Fix warning on Win64. #

Patch Set 4 : Use static_cast instead of reinterpret_cast. #

Patch Set 5 : Addressed comments from Erik's review. #

Total comments: 9

Patch Set 6 : Addressed Alexei's review comments. #

Patch Set 7 : Correct use of NT_SUCCESS. #

Total comments: 7

Patch Set 8 : Rebased and addressed nits. #

Created: 5 years, 8 months ago

Stats (+181 lines, -0 lines)
	M	components/startup_metric_utils/startup_metric_utils.cc	3 chunks	+151 lines, -0 lines	0 comments
	M	tools/metrics/histograms/histograms.xml	1 chunk	+30 lines, -0 lines	0 comments

Messages

Total messages: 22 (4 generated)

chrisha

Not sure if startup_metric_utils is the right place or not. PTAL?

5 years, 8 months ago (2015-04-14 14:07:08 UTC) #2

chrisha

+Erik Chen, as the likely next OWNER of the startup metrics component.

5 years, 8 months ago (2015-04-14 16:49:58 UTC) #4

Alexei Svitkine (slow)

lgtm

https://codereview.chromium.org/1084943002/diff/1/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/1/components/startup_metric_utils/startup_metric_utils.cc#newcode68
components/startup_metric_utils/startup_metric_utils.cc:68: sizeof(SYSTEM_PROCESS_INFORMATION_EX) == 184,
Is the size the same ...

5 years, 8 months ago (2015-04-14 16:53:51 UTC) #5

chrisha

On 2015/04/14 16:53:51, Alexei Svitkine wrote:
> lgtm
>
> https://codereview.chromium.org/1084943002/diff/1/components/startup_metric_utils/startup_metric_utils.cc
> File components/startup_metric_utils/startup_metric_utils.cc (right):
...

5 years, 8 months ago (2015-04-14 16:56:17 UTC) #7

chrisha

https://codereview.chromium.org/1084943002/diff/1/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/1/components/startup_metric_utils/startup_metric_utils.cc#newcode68
components/startup_metric_utils/startup_metric_utils.cc:68: sizeof(SYSTEM_PROCESS_INFORMATION_EX) == 184,
On 2015/04/14 16:53:51, Alexei Svitkine wrote: ...

5 years, 8 months ago (2015-04-14 17:09:07 UTC) #8

erikchen

I didn't review the Windows-specific code.

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metric_utils/startup_metric_utils.cc#newcode181
components/startup_metric_utils/startup_metric_utils.cc:181: // (Observed to ...

5 years, 8 months ago (2015-04-14 23:01:52 UTC) #9

chrisha

5 years, 8 months ago (2015-04-15 13:04:30 UTC) #10

Another look?

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metr...
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metr...
components/startup_metric_utils/startup_metric_utils.cc:181: // (Observed to
vary from 1000 to 10000 on various test machines and
On 2015/04/14 23:01:52, erikchen wrote:
> If you're observing 10000 on a test machine, it's going to go way higher in the
> wild. Perhaps make the upper limit 200000, and the lower limit 100?

I suppose it could go drastically higher, but it would be nonsensical. In an
absolute worst-case scenario this value is realistically bounded above by about
17000. This assumes that Chrome and every single one of its dependencies
(totaling about 70MB) are entirely faulted in using 4KB hard page faults, which
is quite unlikely. I could see maybe doubling this to ensure that we catch some
of the long tail... thoughts?

As to the lower bound, we can and will see actual values of 0 for a completely
warm start. I'd like to catch those explicitly.
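
As a rough check of the upper bound quoted above, the arithmetic can be sketched in C++. The helper name WorstCaseHardFaults is hypothetical and not part of the patch. The sketch assumes one hard fault per 4KB page with no read-ahead clustering, and it is unclear whether "70MB" means binary or decimal megabytes, so both readings are shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): number of page-sized hard faults
// needed to bring in |total_bytes| if every page is faulted in individually.
// Rounds up so that a partial trailing page still counts as one fault.
inline uint64_t WorstCaseHardFaults(uint64_t total_bytes, uint64_t page_bytes) {
  assert(page_bytes > 0);
  return (total_bytes + page_bytes - 1) / page_bytes;
}

// Assumed inputs taken from the comment above: about 70MB of code, 4KB pages.
const uint64_t kPageBytes = 4096;
const uint64_t kBinary70MB = 70ull * 1024 * 1024;  // 73,400,320 bytes
const uint64_t kDecimal70MB = 70ull * 1000 * 1000; // 70,000,000 bytes

// WorstCaseHardFaults(kBinary70MB, kPageBytes)  == 17920
// WorstCaseHardFaults(kDecimal70MB, kPageBytes) == 17090
```

Either reading lands between 17000 and 18000, which is consistent with the "about 17000" estimate. Doubling it, as floated above, would put the histogram maximum at roughly 35000.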

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metr...
components/startup_metric_utils/startup_metric_utils.cc:190:
"Startup.BrowserMessageLoopStartHardFaultCount",
On 2015/04/14 23:01:52, erikchen wrote:
> The description you give for this metric implies it's always recorded, but here
> you only record it on non-first run. I think it makes more sense to always
> record it, but either way the behavior and description should be consistent.

I was just being consistent with how
Startup.BrowserMessageLoopStartTimeFromMainEntry and
Startup.BrowserMessageLoopStartTimeFromMainEntry.FirstRun work, elsewhere in
this file. I've modified the descriptions to be consistent.

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metr...
components/startup_metric_utils/startup_metric_utils.cc:272:
RecordHardFaultHistogram(is_first_run);
On 2015/04/14 23:01:52, erikchen wrote:
> As a quick sanity check, can you confirm that this method is fast via local
> instrumentation? 

This was benchmarked across various versions of the OS and across both
high-powered and low-powered machines. The benchmarking results are linked from
the attached bug. The call was seen to take from tens of microseconds to about
2 ms in the worst case.
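
For readers who want to repeat this kind of local instrumentation, a minimal sketch follows. It uses only the standard library, and the helper name ElapsedMicroseconds is hypothetical. It is not the benchmark referenced in the bug, and Chromium code would more likely use its own base time utilities.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not from the patch): runs |fn| once and returns the
// elapsed time in microseconds, measured with a monotonic clock.
template <typename Fn>
int64_t ElapsedMicroseconds(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start)
      .count();
}
```

A single sample is noisy, so in practice the call under test would be timed several times and the worst case compared against the roughly 2 ms figure above.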

erikchen

histograms lgtm.

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/20001/components/startup_metric_utils/startup_metric_utils.cc#newcode181
components/startup_metric_utils/startup_metric_utils.cc:181: // (Observed to vary from 1000 to ...

5 years, 8 months ago (2015-04-15 17:27:00 UTC) #11

Alexei Svitkine (slow)

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metric_utils/startup_metric_utils.cc#newcode83
components/startup_metric_utils/startup_metric_utils.cc:83: bool GetHardFaultCountForCurrentProcess(uint32* hard_fault_count,
Nit: uint32_t is best practice for ...

5 years, 8 months ago (2015-04-15 17:36:19 UTC) #12

chrisha

Thanks. Another look?

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metric_utils/startup_metric_utils.cc#newcode83
components/startup_metric_utils/startup_metric_utils.cc:83: bool GetHardFaultCountForCurrentProcess(uint32* hard_fault_count,
On 2015/04/15 17:36:19, ...

5 years, 8 months ago (2015-04-15 17:56:01 UTC) #13

Alexei Svitkine (slow)

5 years, 8 months ago (2015-04-15 18:08:10 UTC) #14

lgtm % comments below

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metr...
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/80001/components/startup_metr...
components/startup_metric_utils/startup_metric_utils.cc:174: success);
On 2015/04/15 17:56:01, chrisha wrote:
> On 2015/04/15 17:36:19, Alexei Svitkine wrote:
> > Since there are different cases available, I suggest making a custom enum
> > histogram outlining the possible ones, which would provide all the different
> > things we might be interested in.
> > 
> > i.e.
> > 
> > NO_OS_SUPPORT,
> > SUCCEEDED,
> > FAILED,
> 
> The "no OS support" case is completely deterministic (depends only on the OS
> version), and doesn't really provide any useful information. I'm not sure what
> use it would have?

I was thinking it would be simpler to reason about what percentage of users fall
into each case, but that has more value when this actually affects behavior.
Since we're only logging the value, the current setup seems fine to me now that
I think about it again. Thanks.

https://codereview.chromium.org/1084943002/diff/120001/components/startup_met...
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/120001/components/startup_met...
components/startup_metric_utils/startup_metric_utils.cc:131: while (index <
buffer.size()) {
Nit: Maybe add:

DCHECK_LE(index + sizeof(SYSTEM_PROCESS_INFORMATION_EX), buffer.size());

https://codereview.chromium.org/1084943002/diff/120001/components/startup_met...
components/startup_metric_utils/startup_metric_utils.cc:138: // The list ends
with NextEntryOffset is zero. This also prevents busy
Nit: "ends with" -> "ends when"

https://codereview.chromium.org/1084943002/diff/120001/components/startup_met...
components/startup_metric_utils/startup_metric_utils.cc:140: if
(proc_info->NextEntryOffset == 0)
Nit: Maybe as a safety check change to <= 0, which would safeguard against
cycles in case that's in the returned data.
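
The three nits above all concern walking a buffer of variable-length records chained by NextEntryOffset. A portable sketch of that pattern follows. The Record struct and FindHardFaultCount are hypothetical stand-ins, not the patch's code, and the real SYSTEM_PROCESS_INFORMATION_EX layout is not reproduced. The offset is modeled as unsigned, as NextEntryOffset is a ULONG in the Windows headers, so it cannot be negative: once zero is excluded the index strictly increases and the walk cannot cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one process record (assumption for illustration).
struct Record {
  uint32_t next_entry_offset;  // Bytes to the next record; 0 marks the last.
  uint32_t process_id;
  uint32_t hard_fault_count;
};

// Walks the records in |buffer| looking for |pid|. On success stores the
// record's hard-fault count and returns true. Returns false if the process
// is not found or the buffer is malformed.
bool FindHardFaultCount(const std::vector<uint8_t>& buffer,
                        uint32_t pid,
                        uint32_t* hard_fault_count) {
  assert(hard_fault_count != nullptr);
  size_t index = 0;
  while (index < buffer.size()) {
    // A whole record must fit in the remaining bytes before it is read.
    if (buffer.size() - index < sizeof(Record))
      return false;
    Record record;
    std::memcpy(&record, buffer.data() + index, sizeof(record));
    if (record.process_id == pid) {
      *hard_fault_count = record.hard_fault_count;
      return true;
    }
    // The list ends when the offset is zero. Stopping here also prevents
    // looping forever on the same record.
    if (record.next_entry_offset == 0)
      return false;
    // Reject offsets that would step past the end of the buffer.
    if (record.next_entry_offset > buffer.size() - index)
      return false;
    index += record.next_entry_offset;
  }
  return false;
}
```

The sketch returns false on a truncated record where the suggested DCHECK_LE would assert instead; which is preferable depends on how much the returned data is trusted.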

https://codereview.chromium.org/1084943002/diff/120001/tools/metrics/histogra...
File tools/metrics/histograms/histograms.xml (right):

https://codereview.chromium.org/1084943002/diff/120001/tools/metrics/histogra...
tools/metrics/histograms/histograms.xml:38057: +    start of the main thread's
message loop, not including first runs of the
Nit: "first runs of the browser" sounds a little confusing. I would change
wording to something like "not including the first run after install, which is
logged separately via <name of other metric>."

chrisha

Thanks Alexei.

jeremy: A final look as OWNER?

https://codereview.chromium.org/1084943002/diff/120001/components/startup_metric_utils/startup_metric_utils.cc
File components/startup_metric_utils/startup_metric_utils.cc (right):

https://codereview.chromium.org/1084943002/diff/120001/components/startup_metric_utils/startup_metric_utils.cc#newcode131
components/startup_metric_utils/startup_metric_utils.cc:131: while ...

5 years, 8 months ago (2015-04-15 19:35:02 UTC) #15

chrisha

erik is to be the new owner, and he's lgtm'd. jeremy is the current owner ...

5 years, 8 months ago (2015-04-15 21:06:12 UTC) #16

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1084943002/140001

5 years, 8 months ago (2015-04-15 21:08:06 UTC) #19

commit-bot: I haz the power

Patchset 8 (id:??) landed as https://crrev.com/e2f7ec4b272226a67a3c5e66ed3d955e2a33227e Cr-Commit-Position: refs/heads/master@{#325337}

5 years, 8 months ago (2015-04-15 23:18:56 UTC) #21

Message was sent while issue was closed.

LGTM
