|
|
Created:
9 years, 2 months ago by brettw Modified:
9 years, 1 month ago CC:
chromium-reviews, James Su, brettw-cc_chromium.org Base URL:
svn://svn.chromium.org/chrome/trunk/src/ Visibility:
Public. |
DescriptionAdd a field trial for using lower sqlite cache sizes.
There are a few relevant histograms that I annotated with the field trial name, and I added a new AddPage histogram.
TEST=none
BUG=none
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=111386
Patch Set 1 #Patch Set 2 : '' #
Total comments: 12
Patch Set 3 : '' #Patch Set 4 : '' #
Total comments: 9
Messages
Total messages: 25 (0 generated)
Peter: autocomplete OWNERS Jar: Field trial mechanics (history_field_trial.cc) Scott: Database changes (the other files)
http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:28: // Try to give the user a consistent experience, if possible. Why do we want "consistent" experience? Are you expecting the change to be that dramatic? Note that we get "better" data (better randomization, and hence lower variance I think) if each use is allowed to randomly be in/out of the experiment with each process launch (unless we allow users to self-select out, by restarting, or such). http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... File chrome/browser/history/in_memory_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... chrome/browser/history/in_memory_database.cc:96: "History.InMemoryDBPopulate" + HistoryFieldTrial::GetGroupSuffix(), There is a nicer(?) or at least common patterns for doing this. Typically you leave the existing histogram in place (so that it can be seen, unchanged), and add a test for the existence of the field trial (saving it into a static). IFF the field trial is active, you can then use the MakeName() method to do the concatenation. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... File chrome/browser/history/text_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... chrome/browser/history/text_database.cc:147: db_.set_cache_size(512); Rather than having exactly two choices, have you consider a ew choices in a range, each in a field trial group? The sizes you chose differ by more than an order of magnitude, and I'm guessing you're curious about where the optimum paramaters are. I'm not suggesting 50... but perhaps 3-5??
http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... File chrome/browser/history/history_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... chrome/browser/history/history_database.cc:90: } Yeah, I'm with jar, this range is pretty huge. I suspect that 6000 was not providing much utility (most likely a small part of it was interior nodes, with a ton of randomly-selected other pages), but since pages are atomic, when you start to get so small as 64 you might start to see problems with thrashing. It might make sense to first prove a case for going below 500 (already saving 20M), and worry about going further at that point if benefits haven't been capped. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:28: // Try to give the user a consistent experience, if possible. On 2011/10/24 18:40:55, jar wrote: > Why do we want "consistent" experience? Are you expecting the change to be > that dramatic? > Note that we get "better" data (better randomization, and hence lower variance I > think) if each use is allowed to randomly be in/out of the experiment with each > process launch (unless we allow users to self-select out, by restarting, or > such). I would rather have a poor experience some of the time than a poor experience every time, myself. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... File chrome/browser/history/text_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... chrome/browser/history/text_database.cc:147: db_.set_cache_size(512); Wait - from our chat I thought this was 32, which is why I suggested that it might want to be raised a little. I'm less confident going from 512 to 40.
Rubber stamp LGTM for autocomplete/
Now with a new histogram for AddPage and more conservative values. I'm actually quite optimistic about lowering the cache sizes based on how people's browsers perform with history sizes many times larger than our current cache sizes. But we can try an additional more aggressive trial after this assuming the results are good. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:28: // Try to give the user a consistent experience, if possible. I actually wasn't entirely sure if this was good or bad, I just copied it from other field trials. I removed it. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... File chrome/browser/history/in_memory_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... chrome/browser/history/in_memory_database.cc:96: "History.InMemoryDBPopulate" + HistoryFieldTrial::GetGroupSuffix(), I guess that sounds worse to my thinking. Wouldn't I want the old histogram name to have a consistent meaning (in my case, the old cache sizes), rather than doing a weighted average of people in and out of the trial? http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... File chrome/browser/history/text_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... chrome/browser/history/text_database.cc:147: db_.set_cache_size(512); How about I'll change this to 128. Depending on what those histograms look like, we'll try some lower values in the future.
jar/shess: Ping
LGTM, I can't help nit-picking. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... File chrome/browser/history/in_memory_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... chrome/browser/history/in_memory_database.cc:96: "History.InMemoryDBPopulate" + HistoryFieldTrial::GetGroupSuffix(), On 2011/10/25 01:55:02, brettw wrote: > I guess that sounds worse to my thinking. Wouldn't I want the old histogram name > to have a consistent meaning (in my case, the old cache sizes), rather than > doing a weighted average of people in and out of the trial? Can you have it both ways? Old histogram for non-trial users, new histogram for trial users? Doh. I see what you mean - if not in the trial, you're appending string(). http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... File chrome/browser/history/text_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/text_... chrome/browser/history/text_database.cc:147: db_.set_cache_size(512); On 2011/10/25 01:55:02, brettw wrote: > How about I'll change this to 128. Depending on what those histograms look like, > we'll try some lower values in the future. OK, seems reasonable. One "thing" about having just two values is we won't know the shape of the curve. You will get some threshold information (is it too horrible to keep or not), but you probably won't get any info about how 64 or 256 will work. Of course, it's entirely possible that 128, 256 and 512 would all have the same results, in which case more than two also won't tell you anything! [That's a happy result, actually.] http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:16: static const char kHistoryFieldTrialName[] = "History"; AFAICT this is your namespace for field trials. Unless you're sure you'll only ever want one, it might be reasonable to have it be History_CacheSize or something. http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:26: kHistoryFieldTrialName, 1000, "Inactive", 2012, 2, 1)); Since I don't see this stored anywhere, am I safe in assuming that FieldTrialList keeps a reference and this ref is only here to prevent a race condition before the end of the fn? http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:40: return g_low_mem_trial ? std::string("_LowMem") : std::string(); Seems like it might be reasonable to have this construct the histogram name, rather than having the caller combine this with the histogram name. I mean like: std::string MakeName(const std::string& name_prefix) { if (!g_log_mem_trial_group) return name_prefix; return FieldTrial::MakeName(name_prefix, kLowMemGroupName); } With obvious tweak to AppendGroup() call. I'm not sure if that's what jar was getting at?
Comments below. You should either respond with a delta, or say why not.... and either way I'll give my blessings looks good blessings. http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... File chrome/browser/history/in_memory_database.cc (right): http://codereview.chromium.org/8379009/diff/5001/chrome/browser/history/in_me... chrome/browser/history/in_memory_database.cc:96: "History.InMemoryDBPopulate" + HistoryFieldTrial::GetGroupSuffix(), The usual pattern is to keep the old histogram unchanged.... and then have a set of histograms (generated from a single HISTOGRAM macro), that are for the distinct field trial groups. This way you have you cake (old clean histograms) and you eat it too (you have user runs partitioned among the set of field trial histograms). On 2011/10/26 21:44:59, shess wrote: > On 2011/10/25 01:55:02, brettw wrote: > > I guess that sounds worse to my thinking. Wouldn't I want the old histogram > name > > to have a consistent meaning (in my case, the old cache sizes), rather than > > doing a weighted average of people in and out of the trial? > > Can you have it both ways? Old histogram for non-trial users, new histogram for > trial users? > > Doh. I see what you mean - if not in the trial, you're appending string(). http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:26: kHistoryFieldTrialName, 1000, "Inactive", 2012, 2, 1)); The real field trial is actually ref-counted, and registered globally. This uses a refptr to hold it briefly. As it turns out, I'm pretty sure we deliberately leak these objects (to be sure they are around long into the shutdown actions). On 2011/10/26 21:44:59, shess wrote: > Since I don't see this stored anywhere, am I safe in assuming that > FieldTrialList keeps a reference and this ref is only here to prevent a race > condition before the end of the fn? http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:40: return g_low_mem_trial ? std::string("_LowMem") : std::string(); +1 The MakeName() function was crafted to be nicely used in a HISTOGRAM macro. That is the common pattern to see... and is probably safer/more readably correct. Common use is: UMA_HISTOGRAM_STUFF(MakeName("Group.HistogramName", "FieldTrialName"), sample); On 2011/10/26 21:44:59, shess wrote: > Seems like it might be reasonable to have this construct the histogram name, > rather than having the caller combine this with the histogram name. I mean > like: > > std::string MakeName(const std::string& name_prefix) { > if (!g_log_mem_trial_group) > return name_prefix; > return FieldTrial::MakeName(name_prefix, kLowMemGroupName); > } > > With obvious tweak to AppendGroup() call. I'm not sure if that's what jar was > getting at?
http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:40: return g_low_mem_trial ? std::string("_LowMem") : std::string(); I feel like I'm missing something fundamental that's making us talk past each other. The way this works is that I'm taking 5% of the users and making them use smaller caches. I then plan to look at the histograms to see if there is a significant difference in performance from the regular users. So I've got two histograms, one old with the same meaning and name it always had for users not in the field trial (big cache), and one for users in the field trial with _LowMem appended to the name. You're asking me to keep the old histogram with the old meaning. This is what I feel like I'm doing. MakeName doesn't do what I want because then I won't get the old histogram, since everything will have some suffix on it. If I added 3 more "levels" of cache size in the field trial, I would then have one original histogram, and 3 different suffixes for the field trial. If I were to count the "no change" as part of the field trial and keep the old histogram like you were suggesting, it seems like this would just diplicate the no change case for a particular version of Chrome.
lgtm, because I think it does what you want and at this point the concerns are kinda carping. http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:40: return g_low_mem_trial ? std::string("_LowMem") : std::string(); On 2011/10/28 22:30:40, brettw wrote: > I feel like I'm missing something fundamental that's making us talk past each > other. Probably! At least from my end, my suggestion was because I get uncomfortable seeing ad-hoc solutions in areas like this where there are interesting things done with the results (like caching in a global types of things). I'll defer to jar on it, though, because my level is just discomfort, but I think you're right that for the non-field-trial case the histogram is unchanged, while the field-trial case does change it. So far as having more than one trial case, I guess in the end it's entirely up to whether you want to approach this as multiple rollouts as you go, or try to get more info up front. http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.h (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.h:7: #pragma once ???
hmmm... There are usually more than two groups of people, but perchance you haven't implemented the break-down that way. I suspect that was the point I failed to identify in the review. Sorry about that :-(. a) There are folks that "opt-out" and request some specific treatment. We usually provide such an out just in case we really screw up and one of the random options is intolerable. It is also helpful for testers and engineers that want a consistent behavior (for the parameter that we're modulating), and don't want random. b) Among the folks that don't opt-out (via a command line flag specific to this experiment), we usually have a control group and an experiment group. In your case, it might be that we have a 5% control group, and a 5% experiment group. Both of those groups are labeled with a special histogram. Sorry I've been doing this for so long, I was forgetting of these basic rules for how to try to run these experiments. Note that we don't have a "universal opt-out" flag, as I'm afraid such might get publicity, and then we'd have no population to experiment on :-(. We do have experiment specific flags (most of the time). Jim On Fri, Oct 28, 2011 at 3:30 PM, <brettw@chromium.org> wrote: > > http://codereview.chromium.**org/8379009/diff/6006/chrome/** > browser/history/history_field_**trial.cc<http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/history_field_trial.cc> > File chrome/browser/history/**history_field_trial.cc (right): > > http://codereview.chromium.**org/8379009/diff/6006/chrome/** > browser/history/history_field_**trial.cc#newcode40<http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/history_field_trial.cc#newcode40> > chrome/browser/history/**history_field_trial.cc:40: return g_low_mem_trial > ? std::string("_LowMem") : std::string(); > I feel like I'm missing something fundamental that's making us talk past > each other. > > The way this works is that I'm taking 5% of the users and making them > use smaller caches. I then plan to look at the histograms to see if > there is a significant difference in performance from the regular users. > > So I've got two histograms, one old with the same meaning and name it > always had for users not in the field trial (big cache), and one for > users in the field trial with _LowMem appended to the name. > > You're asking me to keep the old histogram with the old meaning. This is > what I feel like I'm doing. MakeName doesn't do what I want because then > I won't get the old histogram, since everything will have some suffix on > it. > > If I added 3 more "levels" of cache size in the field trial, I would > then have one original histogram, and 3 different suffixes for the field > trial. If I were to count the "no change" as part of the field trial and > keep the old histogram like you were suggesting, it seems like this > would just diplicate the no change case for a particular version of > Chrome. > > http://codereview.chromium.**org/8379009/<http://codereview.chromium.org/8379... >
On Fri, Oct 28, 2011 at 5:16 PM, Jim Roskind <jar@chromium.org> wrote: > hmmm... There are usually more than two groups of people, but perchance you > haven't implemented the break-down that way. I suspect that was the point I > failed to identify in the review. Sorry about that :-(. > a) There are folks that "opt-out" and request some specific treatment. We > usually provide such an out just in case we really screw up and one of the > random options is intolerable. It is also helpful for testers and engineers > that want a consistent behavior (for the parameter that we're modulating), > and don't want random. > b) Among the folks that don't opt-out (via a command line flag specific to > this experiment), we usually have a control group and an experiment group. > In your case, it might be that we have a 5% control group, and a 5% > experiment group. Both of those groups are labeled with a special > histogram. This is where you lose me. I don't understand what the "control group" is an how it's different from "not being in the trial." I was figuring that the 95% of people not in the trial were my control. If I added a control, it would be identical to the non-trial users. Brett
The key is that you didn't let anyone opt out (I think). The opt-out group is a very biased group, and can't be included in any experiment. IMO, you should allow folks to opt-out. When you do that, if you had 95% opting out, then when you lump together "your control group" with the "95% that opt out" you'll find that you don't have any visibility into your "control group." Jim On Fri, Oct 28, 2011 at 5:29 PM, Brett Wilson <brettw@chromium.org> wrote: > On Fri, Oct 28, 2011 at 5:16 PM, Jim Roskind <jar@chromium.org> wrote: > > hmmm... There are usually more than two groups of people, but perchance > you > > haven't implemented the break-down that way. I suspect that was the > point I > > failed to identify in the review. Sorry about that :-(. > > a) There are folks that "opt-out" and request some specific treatment. > We > > usually provide such an out just in case we really screw up and one of > the > > random options is intolerable. It is also helpful for testers and > engineers > > that want a consistent behavior (for the parameter that we're > modulating), > > and don't want random. > > b) Among the folks that don't opt-out (via a command line flag specific > to > > this experiment), we usually have a control group and an experiment > group. > > In your case, it might be that we have a 5% control group, and a 5% > > experiment group. Both of those groups are labeled with a special > > histogram. > > This is where you lose me. I don't understand what the "control group" > is an how it's different from "not being in the trial." I was figuring > that the 95% of people not in the trial were my control. If I added a > control, it would be identical to the non-trial users. > > Brett >
On Fri, Oct 28, 2011 at 5:34 PM, Jim Roskind <jar@chromium.org> wrote: > The key is that you didn't let anyone opt out (I think). The opt-out group > is a very biased group, and can't be included in any experiment. IMO, you > should allow folks to opt-out. When you do that, if you had 95% opting out, > then when you lump together "your control group" with the "95% that opt out" > you'll find that you don't have any visibility into your "control group." Okay, I can see if I add command line flags this will come up. I'll do that. Brett
On Fri, Oct 28, 2011 at 5:34 PM, Jim Roskind <jar@chromium.org> wrote: > The key is that you didn't let anyone opt out (I think). The opt-out group > is a very biased group, and can't be included in any experiment. IMO, you > should allow folks to opt-out. When you do that, if you had 95% opting out, > then when you lump together "your control group" with the "95% that opt out" > you'll find that you don't have any visibility into your "control group." I had a minute and thought about this CL again. Is it really worthwhile to have this switchable via the command line? To me this seems like unnecessary work for this patch. For some things like complicate network behavior, I can see how this could be desirable, but I can't envision anybody switching this either way. It seems like unnecessary busy work to add all that extra control. In the real world, I have the bandwidth to get any remaining bits cleaned up as-is, but I don't feel like I have the capacity right now to do more complex work since I'm getting very little sleep and can old work in 30 minute blocks when I'm lucky. I feel guilty for holding this hostage, but if you think it's critical that we have command line control for this feature, I'm going to table this review for now unless somebody else wants to pick up the patch. Brett
On Mon, Nov 14, 2011 at 4:10 PM, Brett Wilson <brettw@chromium.org> wrote: > On Fri, Oct 28, 2011 at 5:34 PM, Jim Roskind <jar@chromium.org> wrote: >> The key is that you didn't let anyone opt out (I think). The opt-out group >> is a very biased group, and can't be included in any experiment. IMO, you >> should allow folks to opt-out. When you do that, if you had 95% opting out, >> then when you lump together "your control group" with the "95% that opt out" >> you'll find that you don't have any visibility into your "control group." > > I had a minute and thought about this CL again. > > Is it really worthwhile to have this switchable via the command line? > To me this seems like unnecessary work for this patch. For some things > like complicate network behavior, I can see how this could be > desirable, but I can't envision anybody switching this either way. It > seems like unnecessary busy work to add all that extra control. > > In the real world, I have the bandwidth to get any remaining bits > cleaned up as-is, but I don't feel like I have the capacity right now > to do more complex work since I'm getting very little sleep and can > old work in 30 minute blocks when I'm lucky. I feel guilty for holding > this hostage, but if you think it's critical that we have command line > control for this feature, I'm going to table this review for now > unless somebody else wants to pick up the patch. I think this is a case where if your change makes things bad, the right approach is to revert your change, not to give people a command-line flag. But I also think that it's super unlikely that the change will mess up users in such a way that we can definitely say it was your change. Basically, I can't imagine anyone setting the flag for good reasons, but can easily imagine hundreds of users setting it using cargo-cult reasoning. -scott
Obviously... congratulations or the arrival of the new member of your family. The fact that a sleep deprivation experiment has begun signals this remarkable change. I hope it went well for everyone! IMO, (but I'm open to being educated), this change could really hork some users. The fact that we can push a new release "soon" (1 day? 3 days?) doesn't make me feel that much better, as some users will be (potentially) left high and dry. At first I thought this would be "OK" since we'd have canaries to bust... but then it dawned on me that perhaps it would only break in Dev (that we noticed)... or worse significantly break in Beta.... or big trouble ... in Stable. Flags are only a few lines of code. The logic is tiny. The downside of a blow up seems large. What am I missing? Jim On Mon, Nov 14, 2011 at 5:51 PM, Scott Hess <shess@chromium.org> wrote: > On Mon, Nov 14, 2011 at 4:10 PM, Brett Wilson <brettw@chromium.org> wrote: > > On Fri, Oct 28, 2011 at 5:34 PM, Jim Roskind <jar@chromium.org> wrote: > >> The key is that you didn't let anyone opt out (I think). The opt-out > group > >> is a very biased group, and can't be included in any experiment. IMO, > you > >> should allow folks to opt-out. When you do that, if you had 95% opting > out, > >> then when you lump together "your control group" with the "95% that opt > out" > >> you'll find that you don't have any visibility into your "control > group." > > > > I had a minute and thought about this CL again. > > > > Is it really worthwhile to have this switchable via the command line? > > To me this seems like unnecessary work for this patch. For some things > > like complicate network behavior, I can see how this could be > > desirable, but I can't envision anybody switching this either way. It > > seems like unnecessary busy work to add all that extra control. > > > > In the real world, I have the bandwidth to get any remaining bits > > cleaned up as-is, but I don't feel like I have the capacity right now > > to do more complex work since I'm getting very little sleep and can > > old work in 30 minute blocks when I'm lucky. I feel guilty for holding > > this hostage, but if you think it's critical that we have command line > > control for this feature, I'm going to table this review for now > > unless somebody else wants to pick up the patch. > > I think this is a case where if your change makes things bad, the > right approach is to revert your change, not to give people a > command-line flag. But I also think that it's super unlikely that the > change will mess up users in such a way that we can definitely say it > was your change. Basically, I can't imagine anyone setting the flag > for good reasons, but can easily imagine hundreds of users setting it > using cargo-cult reasoning. > > -scott >
On Mon, Nov 14, 2011 at 6:06 PM, Jim Roskind <jar@chromium.org> wrote: > Obviously... congratulations or the arrival of the new member of your > family. The fact that a sleep deprivation experiment has begun signals this > remarkable change. I hope it went well for everyone! > IMO, (but I'm open to being educated), this change could really hork some > users. The fact that we can push a new release "soon" (1 day? 3 days?) > doesn't make me feel that much better, as some users will be (potentially) > left high and dry. > At first I thought this would be "OK" since we'd have canaries to bust... > but then it dawned on me that perhaps it would only break in Dev (that we > noticed)... or worse significantly break in Beta.... or big trouble ... in > Stable. > Flags are only a few lines of code. The logic is tiny. The downside of a > blow up seems large. > What am I missing? What type of problem do you anticipate when you say "really hork," "high and dry", "blow up" and "significantly break". History lookups may be slower and background init may take longer. How does that match your descriptions? I also don't see how if this caused significant problems, having a command line flag for people to disable it will help much, and I agree with Scott. Brett
On Mon, Nov 14, 2011 at 6:45 PM, Brett Wilson <brettw@chromium.org> wrote: > On Mon, Nov 14, 2011 at 6:06 PM, Jim Roskind <jar@chromium.org> wrote: >> Obviously... congratulations or the arrival of the new member of your >> family. The fact that a sleep deprivation experiment has begun signals this >> remarkable change. I hope it went well for everyone! >> IMO, (but I'm open to being educated), this change could really hork some >> users. The fact that we can push a new release "soon" (1 day? 3 days?) >> doesn't make me feel that much better, as some users will be (potentially) >> left high and dry. >> At first I thought this would be "OK" since we'd have canaries to bust... >> but then it dawned on me that perhaps it would only break in Dev (that we >> noticed)... or worse significantly break in Beta.... or big trouble ... in >> Stable. >> Flags are only a few lines of code. The logic is tiny. The downside of a >> blow up seems large. >> What am I missing? > > What type of problem do you anticipate when you say "really hork," > "high and dry", "blow up" and "significantly break". History lookups > may be slower and background init may take longer. How does that match > your descriptions? I also don't see how if this caused significant > problems, having a command line flag for people to disable it will > help much, and I agree with Scott. I think we're having sort of a different of approach. I think you're coming at this from the perspective of using these experiements to tests things like SPDY which are big and complicated and need to be done carefully. I'm coming at this wanting to just optimize some internal cache sizes and feel like this is unnecessary overhead. I think we want people to write more experiments. Do you think that every experiment needs a command line flag? Although simple, it seems like there are already enough roadblocks (updating servers etc.) where it's harder to add experiments than it "should" be. If say the downloads people want to add an experiment to see if doubling the read buffer for the URL loader makes downloads go faster, should I reject their patch if the experiment isn't fully controllable via the command line? Brett
My biggest fear would be intense thrashing that left the browser unusable. Cache sizes can (I think) cause such, and I'm not savvy enough to personally say this can't happen in this application. It should perchance be that the thrashing will happen on a thread that won't "hurt" the user that much... or it could be that the disk will be super busy... and attempts to read/write to the disk for cache will be blocked up behind the sql disk accesses. Having the disk go crazy (super busy) would (on many systems) really leave the product at a "hard to use" level. That is my worst case "high and dry" and "horked" scenario. You guys are MUCH more savvy about what is possible, and certainly probable. If you are psyched that I'm just being overly paranoid, then lets land and iterate. Jim On Mon, Nov 14, 2011 at 6:45 PM, Brett Wilson <brettw@chromium.org> wrote: > On Mon, Nov 14, 2011 at 6:06 PM, Jim Roskind <jar@chromium.org> wrote: > > Obviously... congratulations or the arrival of the new member of your > > family. The fact that a sleep deprivation experiment has begun signals > this > > remarkable change. I hope it went well for everyone! > > IMO, (but I'm open to being educated), this change could really hork some > > users. The fact that we can push a new release "soon" (1 day? 3 days?) > > doesn't make me feel that much better, as some users will be > (potentially) > > left high and dry. > > At first I thought this would be "OK" since we'd have canaries to bust... > > but then it dawned on me that perhaps it would only break in Dev (that we > > noticed)... or worse significantly break in Beta.... or big trouble ... > in > > Stable. > > Flags are only a few lines of code. The logic is tiny. The downside of > a > > blow up seems large. > > What am I missing? > > What type of problem do you anticipate when you say "really hork," > "high and dry", "blow up" and "significantly break". History lookups > may be slower and background init may take longer. How does that match > your descriptions? I also don't see how if this caused significant > problems, having a command line flag for people to disable it will > help much, and I agree with Scott. > > Brett >
Thing is, I think if we look for users thrashing, we will find them, and if we happen to look because we just pushed a change, that doesn't mean the problems were caused by the change (versus being pre-existing). If you're really nervous about it, I think it would be more reasonable to structure the change in more incremental fashion, and setup the histograms so that we get good data as we roll things out more deeply, rather than providing a flag to let a subset of users "fix" themselves. Keep in mind that the OS filesystem caches still back us up. SQLite's internal caching is really geared more towards explicit memory-management, like on an embedded device. I think it's a lot more debatable with code like ours, as long as the cache is big enough that individual operations can complete (like if you can't even map all the interior nodes between roots and leaves of all the b-trees an operation works on, that could get horrible). -scott On Mon, Nov 14, 2011 at 6:56 PM, Jim Roskind <jar@chromium.org> wrote: > My biggest fear would be intense thrashing that left the browser unusable. > Cache sizes can (I think) cause such, and I'm not savvy enough to > personally say this can't happen in this application. > It should perchance be that the thrashing will happen on a thread that won't > "hurt" the user that much... or it could be that the disk will be super > busy... and attempts to read/write to the disk for cache will be blocked up > behind the sql disk accesses. Having the disk go crazy (super busy) would > (on many systems) really leave the product at a "hard to use" level. > That is my worst case "high and dry" and "horked" scenario. > You guys are MUCH more savvy about what is possible, and certainly probable. > > > If you are psyched that I'm just being overly paranoid, then lets land > and iterate. > Jim > > On Mon, Nov 14, 2011 at 6:45 PM, Brett Wilson <brettw@chromium.org> wrote: >> >> On Mon, Nov 14, 2011 at 6:06 PM, Jim Roskind <jar@chromium.org> wrote: >> > Obviously... congratulations or the arrival of the new member of your >> > family. The fact that a sleep deprivation experiment has begun signals >> > this >> > remarkable change. I hope it went well for everyone! >> > IMO, (but I'm open to being educated), this change could really hork >> > some >> > users. The fact that we can push a new release "soon" (1 day? 3 days?) >> > doesn't make me feel that much better, as some users will be >> > (potentially) >> > left high and dry. >> > At first I thought this would be "OK" since we'd have canaries to >> > bust... >> > but then it dawned on me that perhaps it would only break in Dev (that >> > we >> > noticed)... or worse significantly break in Beta.... or big trouble ... >> > in >> > Stable. >> > Flags are only a few lines of code. The logic is tiny. The downside of >> > a >> > blow up seems large. >> > What am I missing? >> >> What type of problem do you anticipate when you say "really hork," >> "high and dry", "blow up" and "significantly break". History lookups >> may be slower and background init may take longer. How does that match >> your descriptions? I also don't see how if this caused significant >> problems, having a command line flag for people to disable it will >> help much, and I agree with Scott. >> >> Brett > >
On Mon, Nov 14, 2011 at 7:03 PM, Scott Hess <shess@chromium.org> wrote: > Thing is, I think if we look for users thrashing, we will find them, > and if we happen to look because we just pushed a change, that doesn't > mean the problems were caused by the change (versus being > pre-existing). > > If you're really nervous about it, I think it would be more reasonable > to structure the change in more incremental fashion, and setup the > histograms so that we get good data as we roll things out more deeply, > rather than providing a flag to let a subset of users "fix" > themselves. > > Keep in mind that the OS filesystem caches still back us up. SQLite's > internal caching is really geared more towards explicit > memory-management, like on an embedded device. I think it's a lot > more debatable with code like ours, as long as the cache is big enough > that individual operations can complete (like if you can't even map > all the interior nodes between roots and leaves of all the b-trees an > operation works on, that could get horrible). I agree. One of the inspirations for this change is that Jeff's history file was 500 MB (due to a bug) and the only thing he noticed *on his laptop* (slow disk seek) was that Chrome seemed to take longer to shut down. Some users definitely thrash now, and this change will surely mean some more of them will. But the overall design of the history system is to hide I/O latency so it should be not particularly noticeable. Brett
On Thu, Nov 17, 2011 at 9:44 PM, Brett Wilson <brettw@chromium.org> wrote: > On Mon, Nov 14, 2011 at 7:03 PM, Scott Hess <shess@chromium.org> wrote: >> Thing is, I think if we look for users thrashing, we will find them, >> and if we happen to look because we just pushed a change, that doesn't >> mean the problems were caused by the change (versus being >> pre-existing). >> >> If you're really nervous about it, I think it would be more reasonable >> to structure the change in more incremental fashion, and setup the >> histograms so that we get good data as we roll things out more deeply, >> rather than providing a flag to let a subset of users "fix" >> themselves. >> >> Keep in mind that the OS filesystem caches still back us up. SQLite's >> internal caching is really geared more towards explicit >> memory-management, like on an embedded device. I think it's a lot >> more debatable with code like ours, as long as the cache is big enough >> that individual operations can complete (like if you can't even map >> all the interior nodes between roots and leaves of all the b-trees an >> operation works on, that could get horrible). > > I agree. One of the inspirations for this change is that Jeff's > history file was 500 MB (due to a bug) and the only thing he noticed > *on his laptop* (slow disk seek) was that Chrome seemed to take longer > to shut down. Some users definitely thrash now, and this change will > surely mean some more of them will. But the overall design of the > history system is to hide I/O latency so it should be not particularly > noticeable. Jar: ping
As per earlier comment... lets land and iterate. LGTM
http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... File chrome/browser/history/history_field_trial.cc (right): http://codereview.chromium.org/8379009/diff/6006/chrome/browser/history/histo... chrome/browser/history/history_field_trial.cc:23: return; // Already initialized. Note that FieldTrial::kDefaultGroupNumber is 0. I think this only fires if you're in the 5% group, otherwise you keep Activating... maybe needs a local global to handle this. I think this is being reverted for breaking browser tests... jar says it should have failed when you created a new FieldTrial, but that may be for another day. |