|
|
Created:
4 years, 1 month ago by lawrencewu Modified:
4 years ago CC:
chromium-reviews, asvitkine+watch_chromium.org Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAdd debug info for field trial allocation crash
We're somehow adding the same field trial but with different group names
to the allocator. This CL adds a check for when that happens, and aliases
the relevant data so we can see it in the minidump.
BUG=666230
Committed: https://crrev.com/4e835c3b0ff28e8214bb41cb1995d9fb33b7fbe7
Cr-Commit-Position: refs/heads/master@{#433887}
Patch Set 1 #Patch Set 2 : add check when adding a duplicate to allocator #
Total comments: 12
Patch Set 3 : iterate through allocator to check for existing allocator #
Total comments: 4
Patch Set 4 : address nits #Patch Set 5 : check the allocator after we verify it's nullness #
Total comments: 5
Patch Set 6 : address comments #
Total comments: 2
Messages
Total messages: 30 (15 generated)
Description was changed from ========== Add debug info for field trial allocation crash BUG=666230 ========== to ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ==========
lawrencewu@chromium.org changed reviewers: + asvitkine@chromium.org
Description was changed from ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ========== to ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ==========
Verified that the first part (when we're getting the trials from shm in child process) correctly aliases in the debugger: http://i.imgur.com/x3V6Awf.png Curerently building and testing the part where we're adding duplicates in the browser process.
lawrencewu@chromium.org changed reviewers: + brucedawson@chromium.org
https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:863: const size_t buf_size = 100; Nit: constexpr https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:865: char trial_name_c_str[buf_size]; = {0}; On all of these. Actually, you only really need it for existing_group_name_c_str once you move it up, since it may not always be initialized - but suggest doing it for all. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:866: strncpy(trial_name_c_str, trial_name_string.c_str(), buf_size); Use strlcpy() instead. strncpy will not terminate the string if it takes the full size. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:869: char group_name_c_str[buf_size]; Nit: Suggest moving all the buffer declarations to the start of the {} block together and then you can debug::Alias() them all. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:877: char existing_group_name_c_str[buf_size]; This will go out of scope when we hit the crash and therefor may not be valid anymore. Move it up to the outer scope. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:941: global_->PreLockedFind(field_trial->trial_name()); Wouldn't this find the same |field_trial| object as being passed in? In what cases where you thinking this would find a different one? I guess if |field_trial| is not registered in the global_? (If so, maybe that's what cases your crash and the fix would be to only add it to the map if it's registered.) (My originally suggestion for this check was to actually iterate through the allocator objects.)
Address comments. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:863: const size_t buf_size = 100; On 2016/11/21 16:26:33, Alexei Svitkine (slow) wrote: > Nit: constexpr Done. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:865: char trial_name_c_str[buf_size]; On 2016/11/21 16:26:33, Alexei Svitkine (slow) wrote: > = {0}; > > On all of these. > > Actually, you only really need it for existing_group_name_c_str once you move it > up, since it may not always be initialized - but suggest doing it for all. Done. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:866: strncpy(trial_name_c_str, trial_name_string.c_str(), buf_size); On 2016/11/21 16:26:32, Alexei Svitkine (slow) wrote: > Use strlcpy() instead. > > strncpy will not terminate the string if it takes the full size. Done. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:869: char group_name_c_str[buf_size]; On 2016/11/21 16:26:32, Alexei Svitkine (slow) wrote: > Nit: Suggest moving all the buffer declarations to the start of the {} block > together and then you can debug::Alias() them all. Done. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:877: char existing_group_name_c_str[buf_size]; On 2016/11/21 16:26:32, Alexei Svitkine (slow) wrote: > This will go out of scope when we hit the crash and therefor may not be valid > anymore. Move it up to the outer scope. Done. https://codereview.chromium.org/2511413002/diff/20001/base/metrics/field_tria... base/metrics/field_trial.cc:941: global_->PreLockedFind(field_trial->trial_name()); On 2016/11/21 16:26:32, Alexei Svitkine (slow) wrote: > Wouldn't this find the same |field_trial| object as being passed in? > > In what cases where you thinking this would find a different one? I guess if > |field_trial| is not registered in the global_? (If so, maybe that's what cases > your crash and the fix would be to only add it to the map if it's registered.) > > (My originally suggestion for this check was to actually iterate through the > allocator objects.) Looks like that was what was causing the crash after all. (Hopefully). I still wrote a check here going through the allocator, though, in case that doesn't solve our problem.
The CQ bit was checked by lawrencewu@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
lgtm https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... base/metrics/field_trial.cc:190: std::string trial_name, Nit: const& Same for the next param. https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... base/metrics/field_trial.cc:202: &group_name_to_check)) Nit: {}'s
Address nits and fix bug by moving checking the allocator to after checking that the allocator actually exists. https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... base/metrics/field_trial.cc:190: std::string trial_name, On 2016/11/21 20:11:33, Alexei Svitkine (slow) wrote: > Nit: const& > > Same for the next param. Done. https://codereview.chromium.org/2511413002/diff/40001/base/metrics/field_tria... base/metrics/field_trial.cc:202: &group_name_to_check)) On 2016/11/21 20:11:33, Alexei Svitkine (slow) wrote: > Nit: {}'s Done.
https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... base/metrics/field_trial.cc:206: std::string existing_trial_name = trial_name_to_check.as_string(); Why copy these to std::string objects? This requires allocating memory and I'm not clear what it adds - why not strlcpy from the StringPiece objects? https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... base/metrics/field_trial.cc:221: debug::Alias(group_name_c_str); For consistency and to minimize confusion keep the same order in the debug::Alias calls as for the declarations and strlcpy calls (i.e.; move group_name_c_str up one).
The CQ bit was checked by lawrencewu@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Address comments. https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... base/metrics/field_trial.cc:206: std::string existing_trial_name = trial_name_to_check.as_string(); On 2016/11/21 22:07:00, brucedawson wrote: > Why copy these to std::string objects? This requires allocating memory and I'm > not clear what it adds - why not strlcpy from the StringPiece objects? I think it's more readable this way, but yeah, since this method will be called often we should probably focus more on resource usage. Fixed. https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... base/metrics/field_trial.cc:221: debug::Alias(group_name_c_str); On 2016/11/21 22:07:00, brucedawson wrote: > For consistency and to minimize confusion keep the same order in the > debug::Alias calls as for the declarations and strlcpy calls (i.e.; move > group_name_c_str up one). Done.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...)
The CQ bit was checked by lawrencewu@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from asvitkine@chromium.org Link to the patchset: https://codereview.chromium.org/2511413002/#ps100001 (title: "address comments")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 100001, "attempt_start_ts": 1479831992448090, "parent_rev": "a55d10dbe34b4c4ad9290e4681bd8f6f56772390", "commit_rev": "96526bd2e0f299ce0b26a16bb71a4a9491ad7af4"}
Message was sent while issue was closed.
Description was changed from ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ========== to ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ==========
Message was sent while issue was closed.
Committed patchset #6 (id:100001)
Message was sent while issue was closed.
Description was changed from ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 ========== to ========== Add debug info for field trial allocation crash We're somehow adding the same field trial but with different group names to the allocator. This CL adds a check for when that happens, and aliases the relevant data so we can see it in the minidump. BUG=666230 Committed: https://crrev.com/4e835c3b0ff28e8214bb41cb1995d9fb33b7fbe7 Cr-Commit-Position: refs/heads/master@{#433887} ==========
Message was sent while issue was closed.
Patchset 6 (id:??) landed as https://crrev.com/4e835c3b0ff28e8214bb41cb1995d9fb33b7fbe7 Cr-Commit-Position: refs/heads/master@{#433887}
Message was sent while issue was closed.
https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/80001/base/metrics/field_tria... base/metrics/field_trial.cc:206: std::string existing_trial_name = trial_name_to_check.as_string(); On 2016/11/22 15:17:54, lawrencewu wrote: > On 2016/11/21 22:07:00, brucedawson wrote: > > Why copy these to std::string objects? This requires allocating memory and I'm > > not clear what it adds - why not strlcpy from the StringPiece objects? > > I think it's more readable this way, but yeah, since this method will be called > often we should probably focus more on resource usage. Fixed. Thanks. Given the separate thread going on about the importance of reducing allocation churn it seems particularly important to avoid unnecessary allocations. https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... base/metrics/field_trial.cc:219: group_name_to_check.as_string().c_str(), buf_size); Unfortunately, calling as_string() is exactly as expensive as copying the StringPiece to a std::string object. Since it's done twice it ends up being more expensive. If this code will be deleted soon then I guess it doesn't matter, but any of this code which sticks around should be audited to ensure that we aren't creating std::string objects purely for avoidable convenience.
Message was sent while issue was closed.
https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... File base/metrics/field_trial.cc (right): https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... base/metrics/field_trial.cc:219: group_name_to_check.as_string().c_str(), buf_size); On 2016/11/22 17:55:24, brucedawson wrote: > Unfortunately, calling as_string() is exactly as expensive as copying the > StringPiece to a std::string object. Since it's done twice it ends up being more > expensive. > > If this code will be deleted soon then I guess it doesn't matter, but any of > this code which sticks around should be audited to ensure that we aren't > creating std::string objects purely for avoidable convenience. +1 You should be able to strlcpy() from the StringPiece without as_string(). However, I agree that we may be able to remove this debug code if we confirm the crashes have all been fixed.
Message was sent while issue was closed.
I expect this code can be removed by Friday. If we still need it afterwards, then I'll make it strlcpy() directly from the StringPiece. On 2016/11/22 18:49:30, Alexei Svitkine (slow) wrote: > https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... > File base/metrics/field_trial.cc (right): > > https://codereview.chromium.org/2511413002/diff/100001/base/metrics/field_tri... > base/metrics/field_trial.cc:219: group_name_to_check.as_string().c_str(), > buf_size); > On 2016/11/22 17:55:24, brucedawson wrote: > > Unfortunately, calling as_string() is exactly as expensive as copying the > > StringPiece to a std::string object. Since it's done twice it ends up being > more > > expensive. > > > > If this code will be deleted soon then I guess it doesn't matter, but any of > > this code which sticks around should be audited to ensure that we aren't > > creating std::string objects purely for avoidable convenience. > > +1 > > You should be able to strlcpy() from the StringPiece without as_string(). > > However, I agree that we may be able to remove this debug code if we confirm the > crashes have all been fixed.
Message was sent while issue was closed.
On 2016/11/22 22:48:06, lawrencewu wrote: > I expect this code can be removed by Friday. If we still need it afterwards, > then I'll make it strlcpy() directly from the StringPiece. Thanks. I notice that there is a TODO to remove some of the as_string calls that are not in the debugging code, but there are quite a few other as_string calls that are not tagged as such and are not in the debugging code. Some of them (FieldTrialList::GetActiveFieldTrialGroupsFromString) seem to result in making multiple copies of the string. It would be nice to get rid of these impedance mismatches in future versions of the code.
Message was sent while issue was closed.
For GetActiveFieldTrialGroupsFromString() specifically, it's only called once per child process startup and in fact is going away when Lawrence's shared memory work is enabled, so not a big deal. (It's also not copying the same string over.) However, probably the general problem is FieldTrial::State is using StringPiece for its members - originally to minimize copying - but in practice seems to result in these temporaries having to be created in a bunch of places. Instead, I think we should just change its members to just be const std::string* instead. I've filed crbug.com/668538 for this with more details. On Tue, Nov 22, 2016 at 8:00 PM, <brucedawson@chromium.org> wrote: > On 2016/11/22 22:48:06, lawrencewu wrote: > > I expect this code can be removed by Friday. If we still need it > afterwards, > > then I'll make it strlcpy() directly from the StringPiece. > > Thanks. > > I notice that there is a TODO to remove some of the as_string calls that > are not > in the debugging code, but there are quite a few other as_string calls > that are > not tagged as such and are not in the debugging code. Some of them > (FieldTrialList::GetActiveFieldTrialGroupsFromString) seem to result in > making > multiple copies of the string. It would be nice to get rid of these > impedance > mismatches in future versions of the code. > > > https://codereview.chromium.org/2511413002/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org. |