Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(540)

Issue 25484004: Don't commit prefs::kStabilitySessionEndCompleted change immediately on chromeos (Closed)

Created:
7 years, 2 months ago by oshima
Modified:
7 years, 2 months ago
CC:
chromium-reviews
Visibility:
Public.

Description

Don't commit prefs::kStabilitySessionEndCompleted change immediately on chromeos Chrome on ChromeOS gets killed and generate dump when hangs or didn't respond in time, so this isn't necessary. BUG=302578 Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=229535

Patch Set 1 #

Total comments: 1

Patch Set 2 : #

Patch Set 3 : #

Total comments: 4

Patch Set 4 : #

Patch Set 5 : #

Total comments: 2

Patch Set 6 : don't commit prefs::kStabilitySessionEndCompleted change immediately on chromeos #

Total comments: 2

Patch Set 7 : fix define #

Unified diffs Side-by-side diffs Delta from patch set Stats (+4 lines, -1 line) Patch
M chrome/browser/browser_process_impl.cc View 1 2 3 4 5 6 1 chunk +4 lines, -1 line 0 comments Download

Messages

Total messages: 31 (0 generated)
oshima
7 years, 2 months ago (2013-10-01 20:58:00 UTC) #1
Ilya Sherman
I don't think this change is correct. While MetricsService:: RecordStartOfSessionEnd() does force a commit, it ...
7 years, 2 months ago (2013-10-01 22:40:23 UTC) #2
oshima
On 2013/10/01 22:40:23, Ilya Sherman wrote: > I don't think this change is correct. While ...
7 years, 2 months ago (2013-10-01 22:58:15 UTC) #3
jar (doing other things)
https://codereview.chromium.org/25484004/diff/1/chrome/browser/browser_process_impl.cc File chrome/browser/browser_process_impl.cc (right): https://codereview.chromium.org/25484004/diff/1/chrome/browser/browser_process_impl.cc#newcode365 chrome/browser/browser_process_impl.cc:365: // MetricsService::RecordStartOfSessionEnd writes to prefs immediately. Can you point ...
7 years, 2 months ago (2013-10-02 01:18:30 UTC) #4
oshima
On 2013/10/02 01:18:30, jar wrote: > https://codereview.chromium.org/25484004/diff/1/chrome/browser/browser_process_impl.cc > File chrome/browser/browser_process_impl.cc (right): > > https://codereview.chromium.org/25484004/diff/1/chrome/browser/browser_process_impl.cc#newcode365 > ...
7 years, 2 months ago (2013-10-02 22:59:05 UTC) #5
oshima
ping?
7 years, 2 months ago (2013-10-04 02:58:27 UTC) #6
Ilya Sherman
On 2013/10/02 22:59:05, oshima wrote: > On 2013/10/02 01:18:30, jar wrote: > > > https://codereview.chromium.org/25484004/diff/1/chrome/browser/browser_process_impl.cc ...
7 years, 2 months ago (2013-10-04 03:22:09 UTC) #7
Ilya Sherman
Some comments on the details, though I don't have enough context to comment on the ...
7 years, 2 months ago (2013-10-04 03:26:59 UTC) #8
oshima
https://codereview.chromium.org/25484004/diff/16001/chrome/browser/metrics/metrics_service.cc File chrome/browser/metrics/metrics_service.cc (left): https://codereview.chromium.org/25484004/diff/16001/chrome/browser/metrics/metrics_service.cc#oldcode1612 chrome/browser/metrics/metrics_service.cc:1612: // (and that we don't use some alternate path, ...
7 years, 2 months ago (2013-10-04 06:03:13 UTC) #9
oshima
On 2013/10/04 03:22:09, Ilya Sherman wrote: > On 2013/10/02 22:59:05, oshima wrote: > > On ...
7 years, 2 months ago (2013-10-04 06:09:21 UTC) #10
jar (doing other things)
On 2013/10/04 06:09:21, oshima wrote: > On 2013/10/04 03:22:09, Ilya Sherman wrote: > > On ...
7 years, 2 months ago (2013-10-05 01:07:56 UTC) #11
oshima
On 2013/10/05 01:07:56, jar wrote: > On 2013/10/04 06:09:21, oshima wrote: > > On 2013/10/04 ...
7 years, 2 months ago (2013-10-08 23:02:40 UTC) #12
oshima
On 2013/10/08 23:02:40, oshima wrote: > On 2013/10/05 01:07:56, jar wrote: > > On 2013/10/04 ...
7 years, 2 months ago (2013-10-08 23:33:08 UTC) #13
jar (doing other things)
IMO, we really need to look at how the prefs settings are "harvested" to deduce ...
7 years, 2 months ago (2013-10-11 22:55:43 UTC) #14
oshima
Maybe my problem is that I don't fully understand why the order (or the commit ...
7 years, 2 months ago (2013-10-17 00:21:37 UTC) #15
jar (doing other things)
Easy answer with regard to flags vs crash dumps: We don't always get dumps. We ...
7 years, 2 months ago (2013-10-17 16:24:03 UTC) #16
oshima
ChromeOS sends SIGFPE/SIGABORT when chrome hangs or didn't respond in time hence we do get ...
7 years, 2 months ago (2013-10-17 16:38:15 UTC) #17
oshima
On 2013/10/17 16:38:15, oshima wrote: > ChromeOS sends SIGFPE/SIGABORT when chrome hangs or didn't respond ...
7 years, 2 months ago (2013-10-17 16:52:01 UTC) #18
jar (doing other things)
I'm fine with skipping the second commit if you feel you have adequate coverage. IMO, ...
7 years, 2 months ago (2013-10-18 01:01:53 UTC) #19
jar (doing other things)
lgtm
7 years, 2 months ago (2013-10-18 01:02:50 UTC) #20
oshima
On 2013/10/18 01:02:50, jar wrote: > lgtm thanks!
7 years, 2 months ago (2013-10-18 01:33:17 UTC) #21
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/oshima@chromium.org/25484004/49001
7 years, 2 months ago (2013-10-18 02:07:10 UTC) #22
commit-bot: I haz the power
Retried try job too often on chromium_presubmit for step(s) presubmit http://build.chromium.org/p/tryserver.chromium/buildstatus?builder=chromium_presubmit&number=31163
7 years, 2 months ago (2013-10-18 02:33:10 UTC) #23
oshima
+thestig for OWNERS approval
7 years, 2 months ago (2013-10-18 16:57:47 UTC) #24
Lei Zhang
https://codereview.chromium.org/25484004/diff/49001/chrome/browser/browser_process_impl.cc File chrome/browser/browser_process_impl.cc (right): https://codereview.chromium.org/25484004/diff/49001/chrome/browser/browser_process_impl.cc#newcode376 chrome/browser/browser_process_impl.cc:376: #if !defined(OP_CHROMEOS) OS_CHROMEOS
7 years, 2 months ago (2013-10-18 21:31:02 UTC) #25
oshima
https://codereview.chromium.org/25484004/diff/49001/chrome/browser/browser_process_impl.cc File chrome/browser/browser_process_impl.cc (right): https://codereview.chromium.org/25484004/diff/49001/chrome/browser/browser_process_impl.cc#newcode376 chrome/browser/browser_process_impl.cc:376: #if !defined(OP_CHROMEOS) On 2013/10/18 21:31:02, Lei Zhang wrote: > ...
7 years, 2 months ago (2013-10-18 21:36:05 UTC) #26
Lei Zhang
lgtm
7 years, 2 months ago (2013-10-18 21:36:36 UTC) #27
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/oshima@chromium.org/25484004/162001
7 years, 2 months ago (2013-10-18 22:49:51 UTC) #28
commit-bot: I haz the power
Change committed as 229535
7 years, 2 months ago (2013-10-19 10:24:40 UTC) #29
oshima
On Thu, Oct 17, 2013 at 6:01 PM, Jim Roskind <jar@chromium.org> wrote: > I'm fine ...
7 years, 2 months ago (2013-10-23 16:45:17 UTC) #30
jar (doing other things)
7 years, 2 months ago (2013-10-23 19:48:00 UTC) #31
UMA crash stats (dirty shutdown) are the primary metric for estimating
stability. It is used on a daily and weekly basis to decide go/no-go for
new releases.  Stats are backed by crash dumps, which are uploaded (and
often conflict).

We have code that crashes during hangs, and even code that snapshots the
stack (without crashing) when we hang various interesting threads.


On Wed, Oct 23, 2013 at 9:44 AM, oshima <oshima@chromium.org> wrote:

>
> On Thu, Oct 17, 2013 at 6:01 PM, Jim Roskind <jar@chromium.org> wrote:
>
>> I'm fine with skipping the second commit if you feel you have adequate
>> coverage.
>>
>> IMO, we should *really* document what the heck we're doing, and that will
>> help to preserve the semantics, and not be overly constrained (I'm wary
>> that you may be right... and I'm being too conservative... but we really
>> need a clear list of what we're doing here to nail things down.  I'd also
>> add that the code is often way too cryptic, and difficult to follow (you
>> should see how bad it was in years of old when we first started fixing it
>> up.).  We're always between a rock and a hard place, wanting to clean it
>> up... and not wanting to destroy the current crash signals which are
>> directly compared with "yesterday's" crash signals :-(.
>>
>
> Do you know who is in charge of this stats now (in other words, who is
> monitoring this)? I'd like to know if this is being used, or was useful to
> find issues during shutdown.
> I agree that it's bad if signals are different between today and
> yesterday, but that's true only if it's used in meaningful way. Maybe the
> better way is to use the approach used in cros (generate dump when it
> hangs).
>
>
>>
>> If you land a CrOS only patch... be super careful to see if the stats
>> suddenly change in some way... as it is oh so easy to corrupt critical
>> signals as we change metrics.
>>
>
> I'm pretty sure CrOS isn't using this metrics. I'm actually the person
> who's looking into issues during shutdown and I've never seen or used it,
> as we always get crash dump for hangs during shutdown.
>
>
>
>>
>> Jim
>>
>>
>> On Thu, Oct 17, 2013 at 9:37 AM, oshima <oshima@chromium.org> wrote:
>>
>>>
>>> ChromeOS sends SIGFPE/SIGABORT when chrome hangs or didn't respond in
>>> time
>>> hence we do get crash dump for these cases (and this CL is to address
>>> this type of crash).
>>> Can we skip the 2nd commit on ChromeOS then?
>>>
>>> On Thu, Oct 17, 2013 at 9:24 AM, Jim Roskind <jar@chromium.org> wrote:
>>>
>>>> Easy answer with regard to flags vs crash dumps:
>>>>
>>>> We don't always get dumps.  We often get hangs at various points, and
>>>> hence no crash dump whatever.
>>>>
>>>> We use these flags to get information about where the "dirty shutdown"
>>>> (a.k.a., crash; hang; process-killed; power off) happened.
>>>>
>>>>
>>>> On Wed, Oct 16, 2013 at 5:21 PM, <oshima@chromium.org> wrote:
>>>>
>>>>>
>>>>> Maybe my problem is that I don't fully understand why the order (or
>>>>> the commit
>>>>> before
>>>>> kStabilityExitedCleanly) is important. Here is my current
>>>>> understanding.
>>>>>
>>>>> First of all, ImportantFileWrite *always* post task to
>>>>> SequencedWorkerPool, so
>>>>> you can't deduce the crash location using value in prefs. In addition,
>>>>> RecordCompletedSessionEnd doesn't commit immediately, so the condition
>>>>> kStabilitySessionEndCompleted=**true can't be used to deduce crash
>>>>> location,
>>>>> even if ImportantFileWriter were to commit the change immediately.
>>>>> Lastly crash location is available in crash reports. It tells exactly
>>>>> where it
>>>>> crashed, not in between location, which makes these flags less useful.
>>>>>
>>>>> The committing kStabilityExitedCleanly as soon as possible is
>>>>> important because
>>>>> we don't want to show crash info bar when chrome crashes after this
>>>>> point. This
>>>>> is best effort and may fail (if chrome crashes after
>>>>> CommitPendingWrite, but
>>>>> before
>>>>> ImportantFileWrite writes the file). Also it's quite unlikely that
>>>>> chrome
>>>>> crashes
>>>>> between them without us noticing it (we can tell from crash reports).
>>>>>
>>>>> With this, I thought this isn't useful.
>>>>>
>>>>> Am I missing something? Is this stats really used now to tell where
>>>>> crash is
>>>>> happening?
>>>>>
>>>>>
>>>>> On 2013/10/11 22:55:43, jar wrote:
>>>>>
>>>>>> IMO, we really need to look at how the prefs settings are "harvested"
>>>>>> to
>>>>>>
>>>>> deduce
>>>>>
>>>>>> crash location.  That would make it clearer whether your changes will
>>>>>> impact
>>>>>> crash stats.
>>>>>>
>>>>>
>>>>>  I think the plan needs to be documented.  If it was possible, it
>>>>>> would be very
>>>>>> nice to have a unit test that validated the exact shutdown
>>>>>> sequence... but
>>>>>>
>>>>> there
>>>>>
>>>>>> is not (currently) a clean way to inject dependency on test code.
>>>>>>
>>>>>
>>>>>
>>>>> https://codereview.chromium.**org/25484004/diff/31001/**
>>>>>
chrome/browser/metrics/**metrics_service.cc<https://codereview.chromium.org/25484004/diff/31001/chrome/browser/metrics/metrics_service.cc>
>>>>>
>>>>>> File chrome/browser/metrics/**metrics_service.cc (right):
>>>>>>
>>>>>
>>>>>
>>>>> https://codereview.chromium.**org/25484004/diff/31001/**
>>>>>
chrome/browser/metrics/**metrics_service.cc#newcode1608<https://codereview.chromium.org/25484004/diff/31001/chrome/browser/metrics/metrics_service.cc#newcode1608>
>>>>>
>>>>>> chrome/browser/metrics/**metrics_service.cc:1608:
>>>>>>
>>>>> MarkAppCleanShutdownAndCommit(**);
>>>>>
>>>>>> I'm trying to understand what you are doing here, vs what the code
>>>>>> used to do.
>>>>>>
>>>>>
>>>>>  In the past, we *first* did a LogCleanShutdown() on lines 740 or 745.
>>>>>>  This
>>>>>> function did a MarkAppCleanShutdownAndCommit(**), and *then* set the
>>>>>> pref.
>>>>>>
>>>>>
>>>>>  Now, you first change the prefs, and then call
>>>>>>
>>>>> MarkAppCleanShutdownAndCommit(**).
>>>>>
>>>>>  It appears you have changed the order (setting the state, before the
>>>>>> commit).
>>>>>>
>>>>>
>>>>>
>>>>>  IMO, it would be great to add a file-level comment indicating the
>>>>>> exact
>>>>>>
>>>>> shutdown
>>>>>
>>>>>> sequence, and exactly what state is observable (via prefs) to deduce
>>>>>> where (if
>>>>>> any) the previous run of the browser crashed (stopped updating prefs
>>>>>> files).
>>>>>>
>>>>> We
>>>>>
>>>>>> could use that as the spec, verify that it is indeed what was
>>>>>> currently
>>>>>>
>>>>> done....
>>>>>
>>>>>> and then verify that it is preserved by your change.
>>>>>>
>>>>>
>>>>>  WDYT?
>>>>>>
>>>>>
>>>>>
>>>>> https://codereview.chromium.**org/25484004/diff/31001/**
>>>>>
chrome/browser/metrics/**metrics_service.h<https://codereview.chromium.org/25484004/diff/31001/chrome/browser/metrics/metrics_service.h>
>>>>>
>>>>>> File chrome/browser/metrics/**metrics_service.h (right):
>>>>>>
>>>>>
>>>>>
>>>>> https://codereview.chromium.**org/25484004/diff/31001/**
>>>>>
chrome/browser/metrics/**metrics_service.h#newcode371<https://codereview.chromium.org/25484004/diff/31001/chrome/browser/metrics/metrics_service.h#newcode371>
>>>>>
>>>>>> chrome/browser/metrics/**metrics_service.h:371: //
>>>>>> |**kStatabilitySessionEndComplete**d|.
>>>>>> nit: I think you didn't finish this sentence.
>>>>>>
>>>>>
>>>>>  Also, when you use the "|" around a name, it usually means the
>>>>>> variable by
>>>>>>
>>>>> said
>>>>>
>>>>>> name.  In this case, you're talking about a pref that has a given
>>>>>> name, and
>>>>>> setting its value.  Clearer might be:
>>>>>>
>>>>>
>>>>>  It also sets the boolean preference "**kStatabilitySessionEndComplete
>>>>>> **d" to the
>>>>>> specified |value|.
>>>>>>
>>>>>
>>>>>  It is probably critical to say which was done first, especially if
>>>>>> one of them
>>>>>> commits preference values (writes them to disk).
>>>>>>
>>>>>
>>>>>
>>>>>
https://codereview.chromium.**org/25484004/<https://codereview.chromium.org/2...
>>>>>
>>>>
>>>>
>>>
>>
>

To unsubscribe from this group and stop receiving emails from it, send an email
to chromium-reviews+unsubscribe@chromium.org.

Powered by Google App Engine
This is Rietveld 408576698