|
|
Created:
4 years, 8 months ago by bcwhite Modified:
4 years, 4 months ago Reviewers:
Ilya Sherman CC:
chromium-reviews, asvitkine+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionDisplay histograms from subprocesses in chrome://histograms
Traditionally, metrics in subprocesses were imported into
the browser process and so were known to the Statistics-
Recorder and would appear on the chrome://histograms page.
With the move to metrics held in persistent memory segments
from which the data is independently included in the upload,
histograms in subprocesses disappeared from that page.
Allow classes to register themselves with the Statistics-
Recorder using an interface that can write histogram
graphs to an output stream. The SubprocessMetricsProvider
registers itself during construction with the result being
a separate section on the page for histograms from sub-
processes.
The core Write(HTML)Graph methods are modified to take an
optional snapshot. This allows passing a merged snapshot
to the correct code to create the appropriate graph.
BUG=602295
Patch Set 1 #Patch Set 2 : change static set<MetricsDisplay*> into LazyInstance #Patch Set 3 : include recently-exited processes in report #Patch Set 4 : rebased #
Total comments: 14
Patch Set 5 : addressed review comments by Ilya #Patch Set 6 : rebased #
Total comments: 46
Patch Set 7 : addressed review comments by Ilya #Patch Set 8 : rebased #
Messages
Total messages: 28 (13 generated)
Patchset #2 (id:20001) has been deleted
bcwhite@chromium.org changed reviewers: + isherman@chromium.org
https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram.cc File base/metrics/histogram.cc (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram... base/metrics/histogram.cc:469: WriteAsciiImpl(true, "<br>", static_cast<const SampleVector*>(snapshot), Hmm, I'm not super happy about the static_cast here. Why is it needed? That is, could either this method accept a SampleVector, or the WriteAsciiImpl method accept a HistogramSamples param? https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram... base/metrics/histogram.cc:588: // are consistent across our output activities. nit: Mebbe move this comment into the if-stmt? https://codereview.chromium.org/1880803003/diff/100001/base/metrics/sparse_hi... File base/metrics/sparse_histogram.cc (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/sparse_hi... base/metrics/sparse_histogram.cc:218: // Get a local copy of the data so we are consistent. Ditto https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... File base/metrics/statistics_recorder.h (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... base/metrics/statistics_recorder.h:38: class MetricsDisplay { Optional nit: Perhaps MetricsDisplayer? https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... base/metrics/statistics_recorder.h:47: bool html, nit: Could you use an enum rather than a bool here, so that callsites are easier to understand? https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:177: "sub-processes (merged). Data from older processes are not included.", Hmm, why is data from older processes not included? Could it be? I would imagine this could get fairly confusing for someone looking at the data. https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:87: // typically called by a background thread. Hmm, why so? I would have expected it to be called from the main thread. Could the messages from background threads instead be bounced to the main thread?
Patchset #6 (id:120001) has been deleted
Patchset #4 (id:80001) has been deleted
Patchset #5 (id:140001) has been deleted
https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram.cc File base/metrics/histogram.cc (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram... base/metrics/histogram.cc:469: WriteAsciiImpl(true, "<br>", static_cast<const SampleVector*>(snapshot), On 2016/04/13 00:29:32, Ilya Sherman wrote: > Hmm, I'm not super happy about the static_cast here. Why is it needed? That > is, could either this method accept a SampleVector, or the WriteAsciiImpl method > accept a HistogramSamples param? This is a virtual function off of HistogramBase so has to use the HistogramSamples base class there (or it won't work with SparseHistogram). I originally tried passing HistogramSamples all the way but WriteAsciiImpl needs to call methods defined in (and only applicable to) SampleVector. I could pass HistogramSamples to WriteAsciiImpl and do the static_cast in there but that made for slightly more complicated code. It seemed more clear to do the cast in the overridden method. https://codereview.chromium.org/1880803003/diff/100001/base/metrics/histogram... base/metrics/histogram.cc:588: // are consistent across our output activities. On 2016/04/13 00:29:32, Ilya Sherman wrote: > nit: Mebbe move this comment into the if-stmt? Done. https://codereview.chromium.org/1880803003/diff/100001/base/metrics/sparse_hi... File base/metrics/sparse_histogram.cc (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/sparse_hi... base/metrics/sparse_histogram.cc:218: // Get a local copy of the data so we are consistent. On 2016/04/13 00:29:32, Ilya Sherman wrote: > Ditto Done. https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... File base/metrics/statistics_recorder.h (right): https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... base/metrics/statistics_recorder.h:38: class MetricsDisplay { On 2016/04/13 00:29:32, Ilya Sherman wrote: > Optional nit: Perhaps MetricsDisplayer? Done. https://codereview.chromium.org/1880803003/diff/100001/base/metrics/statistic... base/metrics/statistics_recorder.h:47: bool html, On 2016/04/13 00:29:32, Ilya Sherman wrote: > nit: Could you use an enum rather than a bool here, so that callsites are easier > to understand? Done. https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:177: "sub-processes (merged). Data from older processes are not included.", On 2016/04/13 00:29:32, Ilya Sherman wrote: > Hmm, why is data from older processes not included? Could it be? I would > imagine this could get fairly confusing for someone looking at the data. The memory segments containing persistent histograms for an exited subprocess is freed as soon as the final delta is sent to UMA. If not, Chrome would accumulate idle memory segments as it ran and never release them. The original plan was for a single shared memory segment used by all renderers (which would provide what you ask) but such is not desirable for security reasons. Today, the reported histograms are for all renderers ever run but are updated only every 30 minutes and don't include anything between the previous update and the pracess exit. Now, the histograms are always current but don't keep around older data. It's unfortunate but not really worse that what we had previously; just different. But yes, it could be. During the clean-up of old segments, I could merge all the accumulated values into a separate set of inactive histograms that have no purpose but to add to the merged sample. This would be non-trivial, though, so if it's to be done I'd want to do it as a separate CL and to wait until there is an actual request for it. https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/100001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:87: // typically called by a background thread. On 2016/04/13 00:29:32, Ilya Sherman wrote: > Hmm, why so? I would have expected it to be called from the main thread. Could > the messages from background threads instead be bounced to the main thread? The previous code broke because it was called on a different thread. I didn't investigate. I assume that it's merely being collected in the background for sending to the renderer process for display. I considered bouncing but it seemed an overly complex solution. Acquiring the lock is something that happens relatively infrequently. Also, generating all these histogram graphs could take significant time so probably best if it's not on the UI thread. I can reduce the amount of time for which the lock is held, however.
Sorry for the delay here, Brian. I want to think through this more carefully, and haven't yet found a good block of time for that. (Need to remind myself about a lot of this code, since it's been ages since I've actually touched this part of the metrics system.)
Patchset #6 (id:180001) has been deleted
On 2016/04/19 00:58:57, Ilya Sherman wrote: > Sorry for the delay here, Brian. I want to think through this more carefully, > and haven't yet found a good block of time for that. (Need to remind myself > about a lot of this code, since it's been ages since I've actually touched this > part of the metrics system.) Most of these changes are in new code, too, just to add to the fun. ;-)
Hokay, I've now caught myself up on this code, and finally have another round of review comments for you. Again, apologies for the delay! https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... File base/metrics/statistics_recorder.cc (right): https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:34: g_metrics_displayers_ = LAZY_INSTANCE_INITIALIZER; Rather than introducing an additional global object, could the metrics displayers be owned by the StatisticsRecorder? https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:232: std::string* output) { It looks like this method is only used to support a VLOG statement, which I suspect is basically unused. Rather than adding additional complexity to support this use case, I'd prefer to remove the VLOG statement (in a precursor CL, for clarity). WDYT? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:15: #include "base/metrics/statistics_recorder.h" nit: Already included in the header file. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; Hmm. Suppose that multiple subprocesses recorded histograms with the same name (which is a fairly common case). Do the histograms get aggregated together, or duplicated for each renderer? It seems like the latter, which is quite odd. Or are they aggregated together somewhere prior to reaching this code? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:40: histogram.WriteHTMLGraph(&snapshot, &graph); So, why do you need to pass the snapshot here? How does it differ from the internal snapshot that the histogram has? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:48: nit: Please remove this blank line. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:51: nit: Please remove this blank line. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:55: for (auto& name_and_graph : graphs_) { nit: const auto&? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:61: base::StatisticsRecorder::MetricsDisplayer::DisplayType format_; nit: const? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:135: bool absolute, nit: The semantics of |absolute| are pretty hard to follow at the callsite. Would be nice to use an enum instead... or better yet, IMO, to just implement a separate method. There's not a whole lot of shared code, and some of the code needed for the delta case isn't really necessary for writing the HTML versions (and vice-versa). https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:214: for (auto& id_and_allocator : allocators_by_id_) { nit: const auto? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:220: for (auto& allocator : allocators_to_release_) { nit: const auto? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:15: #include "base/synchronization/lock.h" On the subject of locks: I've now researched how the HTML display code is run, and I understand why it's happening on a background thread. I don't think the performance benefit is very important, honestly -- this is code that's only run by developers, and even then only rarely, and I doubt it's actually *super* expensive -- the chrome://histograms page renders within O(100ms) for me. I think the primary reason this all runs on a background thread is just the convenience of architecting the code that way -- we use a //net class that just assigns tasks to worker pool threads. I don't think it's worth introducing a lock to support this functionality. Locks make the code more complicated, even for the much more common case of of actually uploading metrics data. I'd much rather push that complexity down into just the HTML displaying code. Specifically, I'd suggest either moving the work to happen on the main thread, or writing the code to run asynchronously, probably whichever is simpler to implement. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:69: // base::StatisticsRecorder::MetricsDisplay: nit: displayer https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:87: // Lock for accessing objects. This is required because MetricsDisplay is nit: displayer https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:96: std::map<int, std::unique_ptr<base::PersistentHistogramAllocator>>; This change seems unrelated to the rest of the CL. Or am I overlooking the connection?
Patchset #6 (id:200001) has been deleted
Patchset #6 (id:220001) has been deleted
https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... File base/metrics/statistics_recorder.cc (right): https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:34: g_metrics_displayers_ = LAZY_INSTANCE_INITIALIZER; On 2016/04/25 20:53:14, Ilya Sherman wrote: > Rather than introducing an additional global object, could the metrics > displayers be owned by the StatisticsRecorder? Yes. That'll mean including lazy_instance.h in the statistics_recorder.h file. Alexei has generally preferred globals over statics when something is completely internal. https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:232: std::string* output) { On 2016/04/25 20:53:14, Ilya Sherman wrote: > It looks like this method is only used to support a VLOG statement, which I > suspect is basically unused. Rather than adding additional complexity to > support this use case, I'd prefer to remove the VLOG statement (in a precursor > CL, for clarity). WDYT? I can try that. I'd like to do it as a follow-up CL, though. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:15: #include "base/metrics/statistics_recorder.h" On 2016/04/25 20:53:14, Ilya Sherman wrote: > nit: Already included in the header file. Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/04/25 20:53:14, Ilya Sherman wrote: > Hmm. Suppose that multiple subprocesses recorded histograms with the same name > (which is a fairly common case). Do the histograms get aggregated together, or > duplicated for each renderer? It seems like the latter, which is quite odd. Or > are they aggregated together somewhere prior to reaching this code? The HistogramSnapshotManager merges histograms with duplicate names. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:40: histogram.WriteHTMLGraph(&snapshot, &graph); On 2016/04/25 20:53:15, Ilya Sherman wrote: > So, why do you need to pass the snapshot here? How does it differ from the > internal snapshot that the histogram has? The snapshot passed in is merged from all those with the same name. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:48: On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: Please remove this blank line. Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:51: On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: Please remove this blank line. Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:55: for (auto& name_and_graph : graphs_) { On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: const auto&? Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:61: base::StatisticsRecorder::MetricsDisplayer::DisplayType format_; On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: const? Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:135: bool absolute, On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: The semantics of |absolute| are pretty hard to follow at the callsite. > Would be nice to use an enum instead... or better yet, IMO, to just implement a > separate method. There's not a whole lot of shared code, and some of the code > needed for the delta case isn't really necessary for writing the HTML versions > (and vice-versa). I didn't bother with an enum because it's all private code but Done. I'd rather not duplicate the code. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:214: for (auto& id_and_allocator : allocators_by_id_) { On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: const auto? Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:220: for (auto& allocator : allocators_to_release_) { On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: const auto? Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:15: #include "base/synchronization/lock.h" On 2016/04/25 20:53:15, Ilya Sherman wrote: > On the subject of locks: I've now researched how the HTML display code is run, > and I understand why it's happening on a background thread. I don't think the > performance benefit is very important, honestly -- this is code that's only run > by developers, and even then only rarely, and I doubt it's actually *super* > expensive -- the chrome://histograms page renders within O(100ms) for me. I > think the primary reason this all runs on a background thread is just the > convenience of architecting the code that way -- we use a //net class that just > assigns tasks to worker pool threads. > > I don't think it's worth introducing a lock to support this functionality. > Locks make the code more complicated, even for the much more common case of of > actually uploading metrics data. I'd much rather push that complexity down into > just the HTML displaying code. Specifically, I'd suggest either moving the work > to happen on the main thread, or writing the code to run asynchronously, > probably whichever is simpler to implement. I'll look into it. There are two calls, one from Desktop and one from iOS. I don't think I can post a callback to the UI thread from base/ because, correct me if I'm wrong, it doesn't know about threads that way. May need some help on that front. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:69: // base::StatisticsRecorder::MetricsDisplay: On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: displayer Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:87: // Lock for accessing objects. This is required because MetricsDisplay is On 2016/04/25 20:53:15, Ilya Sherman wrote: > nit: displayer Done. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:96: std::map<int, std::unique_ptr<base::PersistentHistogramAllocator>>; On 2016/04/25 20:53:15, Ilya Sherman wrote: > This change seems unrelated to the rest of the CL. Or am I overlooking the > connection? IDmap allows access only from a single thread. I used it originally because maps of std::scoped_ptr weren't supported.
https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... File base/metrics/statistics_recorder.cc (right): https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:34: g_metrics_displayers_ = LAZY_INSTANCE_INITIALIZER; On 2016/04/26 19:51:53, bcwhite wrote: > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > Rather than introducing an additional global object, could the metrics > > displayers be owned by the StatisticsRecorder? > > Yes. That'll mean including lazy_instance.h in the statistics_recorder.h file. > Alexei has generally preferred globals over statics when something is completely > internal. Why would it need to be a lazy instance (or static, for that matter) if it's internal to the class? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/04/26 19:51:54, bcwhite wrote: > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > Hmm. Suppose that multiple subprocesses recorded histograms with the same > name > > (which is a fairly common case). Do the histograms get aggregated together, > or > > duplicated for each renderer? It seems like the latter, which is quite odd. > Or > > are they aggregated together somewhere prior to reaching this code? > > The HistogramSnapshotManager merges histograms with duplicate names. I see. It's kind of wacky that |histogram| mostly describes the histogram, but might have different samples in it than the |snapshot| does. But, I guess this is how the code has worked for ages. Oh well. Thanks for the explanation. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:135: bool absolute, On 2016/04/26 19:51:53, bcwhite wrote: > On 2016/04/25 20:53:15, Ilya Sherman wrote: > > nit: The semantics of |absolute| are pretty hard to follow at the callsite. > > Would be nice to use an enum instead... or better yet, IMO, to just implement > a > > separate method. There's not a whole lot of shared code, and some of the code > > needed for the delta case isn't really necessary for writing the HTML versions > > (and vice-versa). > > I didn't bother with an enum because it's all private code but Done. > > I'd rather not duplicate the code. I think you're overestimating the amount of duplicated code, but I guess I'm okay with either approach. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:15: #include "base/synchronization/lock.h" On 2016/04/26 19:51:54, bcwhite wrote: > On 2016/04/25 20:53:15, Ilya Sherman wrote: > > On the subject of locks: I've now researched how the HTML display code is run, > > and I understand why it's happening on a background thread. I don't think the > > performance benefit is very important, honestly -- this is code that's only > run > > by developers, and even then only rarely, and I doubt it's actually *super* > > expensive -- the chrome://histograms page renders within O(100ms) for me. I > > think the primary reason this all runs on a background thread is just the > > convenience of architecting the code that way -- we use a //net class that > just > > assigns tasks to worker pool threads. > > > > I don't think it's worth introducing a lock to support this functionality. > > Locks make the code more complicated, even for the much more common case of of > > actually uploading metrics data. I'd much rather push that complexity down > into > > just the HTML displaying code. Specifically, I'd suggest either moving the > work > > to happen on the main thread, or writing the code to run asynchronously, > > probably whichever is simpler to implement. > > I'll look into it. There are two calls, one from Desktop and one from iOS. I > don't think I can post a callback to the UI thread from base/ because, correct > me if I'm wrong, it doesn't know about threads that way. > > May need some help on that front. Which base/ code are you thinking of, specifically?
https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... File base/metrics/statistics_recorder.cc (right): https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:34: g_metrics_displayers_ = LAZY_INSTANCE_INITIALIZER; On 2016/04/26 20:21:58, Ilya Sherman wrote: > On 2016/04/26 19:51:53, bcwhite wrote: > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > Rather than introducing an additional global object, could the metrics > > > displayers be owned by the StatisticsRecorder? > > > > Yes. That'll mean including lazy_instance.h in the statistics_recorder.h > file. > > Alexei has generally preferred globals over statics when something is > completely > > internal. > > Why would it need to be a lazy instance (or static, for that matter) if it's > internal to the class? Calls to this method are made as MetricsDisplayer objects are created which could be any time. A lazy-initializer always works regardless of when. But I think it would be safe to ceate the object inside Statistics::Recorder::Initialize(). I doubt any MetricsDisplayer objects would be created before that call. https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:232: std::string* output) { On 2016/04/26 19:51:53, bcwhite wrote: > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > It looks like this method is only used to support a VLOG statement, which I > > suspect is basically unused. Rather than adding additional complexity to > > support this use case, I'd prefer to remove the VLOG statement (in a precursor > > CL, for clarity). WDYT? > > I can try that. I'd like to do it as a follow-up CL, though. I started into that. Removing it from the SR is pretty trivial. Would you also like it removed from HistogramBase and the derived classes? Most of the code will remain since even the HTML displays are just ASCII displays wraped with <pre></pre>. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/04/26 20:21:58, Ilya Sherman wrote: > On 2016/04/26 19:51:54, bcwhite wrote: > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > Hmm. Suppose that multiple subprocesses recorded histograms with the same > > name > > > (which is a fairly common case). Do the histograms get aggregated together, > > or > > > duplicated for each renderer? It seems like the latter, which is quite odd. > > > Or > > > are they aggregated together somewhere prior to reaching this code? > > > > The HistogramSnapshotManager merges histograms with duplicate names. > > I see. It's kind of wacky that |histogram| mostly describes the histogram, but > might have different samples in it than the |snapshot| does. But, I guess this > is how the code has worked for ages. Oh well. Thanks for the explanation. Actually, the merging in the HSM is new. It used to be that everything was merged by the StatisticsRecorder beacuse all remote histograms were imported completely via RPC. With the move to persistent storage, importing them becomes unnecessary since the remote memory is directly accessible. So, no merged into the SR but merge in the HSM. In hindsight, it may have been better to continue the import of everything into the SR and just used the persistent memory as a replacement for the RPC. There are advantages and disadvantages with both methods, I think. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:135: bool absolute, On 2016/04/26 20:21:58, Ilya Sherman wrote: > On 2016/04/26 19:51:53, bcwhite wrote: > > On 2016/04/25 20:53:15, Ilya Sherman wrote: > > > nit: The semantics of |absolute| are pretty hard to follow at the callsite. > > > Would be nice to use an enum instead... or better yet, IMO, to just > implement > > a > > > separate method. There's not a whole lot of shared code, and some of the > code > > > needed for the delta case isn't really necessary for writing the HTML > versions > > > (and vice-versa). > > > > I didn't bother with an enum because it's all private code but Done. > > > > I'd rather not duplicate the code. > > I think you're overestimating the amount of duplicated code, but I guess I'm > okay with either approach. Acknowledged. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.h (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.h:15: #include "base/synchronization/lock.h" On 2016/04/26 20:21:58, Ilya Sherman wrote: > On 2016/04/26 19:51:54, bcwhite wrote: > > On 2016/04/25 20:53:15, Ilya Sherman wrote: > > > On the subject of locks: I've now researched how the HTML display code is > run, > > > and I understand why it's happening on a background thread. I don't think > the > > > performance benefit is very important, honestly -- this is code that's only > > run > > > by developers, and even then only rarely, and I doubt it's actually *super* > > > expensive -- the chrome://histograms page renders within O(100ms) for me. I > > > think the primary reason this all runs on a background thread is just the > > > convenience of architecting the code that way -- we use a //net class that > > just > > > assigns tasks to worker pool threads. > > > > > > I don't think it's worth introducing a lock to support this functionality. > > > Locks make the code more complicated, even for the much more common case of > of > > > actually uploading metrics data. I'd much rather push that complexity down > > into > > > just the HTML displaying code. Specifically, I'd suggest either moving the > > work > > > to happen on the main thread, or writing the code to run asynchronously, > > > probably whichever is simpler to implement. > > > > I'll look into it. There are two calls, one from Desktop and one from iOS. I > > don't think I can post a callback to the UI thread from base/ because, correct > > me if I'm wrong, it doesn't know about threads that way. > > > > May need some help on that front. > > Which base/ code are you thinking of, specifically? base/metrics/statistics_recorder, which is the common entry point. Ideally, it would switch to the UI thread, blocking any background thread until done, so that callers wouldn't need to know about it. With multiple callers, all (okay, both) of the will have to do the thread-jump to call SR::WriteHTMLGraphs(). I've seen a send-and-wait-reply in previous work; I'll investigate.
https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... File base/metrics/statistics_recorder.cc (right): https://codereview.chromium.org/1880803003/diff/240001/base/metrics/statistic... base/metrics/statistics_recorder.cc:232: std::string* output) { On 2016/04/26 21:03:29, bcwhite wrote: > On 2016/04/26 19:51:53, bcwhite wrote: > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > It looks like this method is only used to support a VLOG statement, which I > > > suspect is basically unused. Rather than adding additional complexity to > > > support this use case, I'd prefer to remove the VLOG statement (in a > precursor > > > CL, for clarity). WDYT? > > > > I can try that. I'd like to do it as a follow-up CL, though. > > I started into that. Removing it from the SR is pretty trivial. Would you also > like it removed from HistogramBase and the derived classes? Most of the code > will remain since even the HTML displays are just ASCII displays wraped with > <pre></pre>. Yeah, I think simplifying/removing the code everywhere where it's not needed is useful, even if it's not much code removed from the histogram classes. https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/04/26 21:03:29, bcwhite wrote: > On 2016/04/26 20:21:58, Ilya Sherman wrote: > > On 2016/04/26 19:51:54, bcwhite wrote: > > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > > Hmm. Suppose that multiple subprocesses recorded histograms with the same > > > name > > > > (which is a fairly common case). Do the histograms get aggregated > together, > > > or > > > > duplicated for each renderer? It seems like the latter, which is quite > odd. > > > > > Or > > > > are they aggregated together somewhere prior to reaching this code? > > > > > > The HistogramSnapshotManager merges histograms with duplicate names. > > > > I see. It's kind of wacky that |histogram| mostly describes the histogram, > but > > might have different samples in it than the |snapshot| does. But, I guess > this > > is how the code has worked for ages. Oh well. Thanks for the explanation. > > Actually, the merging in the HSM is new. It used to be that everything was > merged by the StatisticsRecorder beacuse all remote histograms were imported > completely via RPC. With the move to persistent storage, importing them becomes > unnecessary since the remote memory is directly accessible. So, no merged into > the SR but merge in the HSM. > > In hindsight, it may have been better to continue the import of everything into > the SR and just used the persistent memory as a replacement for the RPC. There > are advantages and disadvantages with both methods, I think. Hmm. Might be worth discussing when we meet tomorrow. Assuming we keep the merging approach: Since the merging concept is new, I wonder if it wouldn't make more sense to create a new histogram, which has the correct snapshot within it, rather than tracking the snapshot externally.
https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/04/27 20:17:42, Ilya Sherman wrote: > On 2016/04/26 21:03:29, bcwhite wrote: > > On 2016/04/26 20:21:58, Ilya Sherman wrote: > > > On 2016/04/26 19:51:54, bcwhite wrote: > > > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > > > Hmm. Suppose that multiple subprocesses recorded histograms with the > same > > > > name > > > > > (which is a fairly common case). Do the histograms get aggregated > > together, > > > > or > > > > > duplicated for each renderer? It seems like the latter, which is quite > > odd. > > > > > > > Or > > > > > are they aggregated together somewhere prior to reaching this code? > > > > > > > > The HistogramSnapshotManager merges histograms with duplicate names. > > > > > > I see. It's kind of wacky that |histogram| mostly describes the histogram, > > but > > > might have different samples in it than the |snapshot| does. But, I guess > > this > > > is how the code has worked for ages. Oh well. Thanks for the explanation. > > > > Actually, the merging in the HSM is new. It used to be that everything was > > merged by the StatisticsRecorder beacuse all remote histograms were imported > > completely via RPC. With the move to persistent storage, importing them > becomes > > unnecessary since the remote memory is directly accessible. So, no merged > into > > the SR but merge in the HSM. > > > > In hindsight, it may have been better to continue the import of everything > into > > the SR and just used the persistent memory as a replacement for the RPC. > There > > are advantages and disadvantages with both methods, I think. > > Hmm. Might be worth discussing when we meet tomorrow. > > Assuming we keep the merging approach: Since the merging concept is new, I > wonder if it wouldn't make more sense to create a new histogram, which has the > correct snapshot within it, rather than tracking the snapshot externally. There's no easy way to create a new histogram based on another which is necessary in order to have the bucketing information. You have to "pickle" then "unpickle" it to something new which is quite expensive. This would have to be done once per histogram name during every HSM run (not just those for display).
Continuing discussion of this somewhat tangential point, but FYI I'm mainly waiting for the change to use callbacks rather than locking. Please let me know if you're blocked on that. Otherwise, I'll continue to assume that you're working on that part while we discuss this other bit :) https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/05/02 13:20:34, bcwhite wrote: > On 2016/04/27 20:17:42, Ilya Sherman wrote: > > On 2016/04/26 21:03:29, bcwhite wrote: > > > On 2016/04/26 20:21:58, Ilya Sherman wrote: > > > > On 2016/04/26 19:51:54, bcwhite wrote: > > > > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > > > > Hmm. Suppose that multiple subprocesses recorded histograms with the > > same > > > > > name > > > > > > (which is a fairly common case). Do the histograms get aggregated > > > together, > > > > > or > > > > > > duplicated for each renderer? It seems like the latter, which is > quite > > > odd. > > > > > > > > > Or > > > > > > are they aggregated together somewhere prior to reaching this code? > > > > > > > > > > The HistogramSnapshotManager merges histograms with duplicate names. > > > > > > > > I see. It's kind of wacky that |histogram| mostly describes the > histogram, > > > but > > > > might have different samples in it than the |snapshot| does. But, I guess > > > this > > > > is how the code has worked for ages. Oh well. Thanks for the > explanation. > > > > > > Actually, the merging in the HSM is new. It used to be that everything was > > > merged by the StatisticsRecorder beacuse all remote histograms were imported > > > completely via RPC. With the move to persistent storage, importing them > > becomes > > > unnecessary since the remote memory is directly accessible. So, no merged > > into > > > the SR but merge in the HSM. > > > > > > In hindsight, it may have been better to continue the import of everything > > into > > > the SR and just used the persistent memory as a replacement for the RPC. > > There > > > are advantages and disadvantages with both methods, I think. > > > > Hmm. Might be worth discussing when we meet tomorrow. > > > > Assuming we keep the merging approach: Since the merging concept is new, I > > wonder if it wouldn't make more sense to create a new histogram, which has the > > correct snapshot within it, rather than tracking the snapshot externally. > > There's no easy way to create a new histogram based on another which is > necessary in order to have the bucketing information. > > You have to "pickle" then "unpickle" it to something new which is quite > expensive. This would have to be done once per histogram name during every HSM > run (not just those for display). Sorry, I'm still confused here. Are you saying that it would be hard to write a direct copy constructor? If so, how come? The expensive part of the copy is copying the snapshot... but that's the part that's being copied anyway. (And, we wouldn't need to copy most histograms -- just ones whose names appear multiple times, i.e. those defined in renderer processes.)
I started on removing the lock but hit a snag and moved onto other things for a bit. At histogram_internals_request_job.cc:54 is the call to WriteHTMLGraph() which fills the "data" parameter that is then returned. Within content/, how do I make that call on the UI thread and block until it completes? https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... File chrome/browser/metrics/subprocess_metrics_provider.cc (right): https://codereview.chromium.org/1880803003/diff/240001/chrome/browser/metrics... chrome/browser/metrics/subprocess_metrics_provider.cc:33: std::string& graph = graphs_[histogram.histogram_name()]; On 2016/05/04 06:00:26, Ilya Sherman (Away May 5-15) wrote: > On 2016/05/02 13:20:34, bcwhite wrote: > > On 2016/04/27 20:17:42, Ilya Sherman wrote: > > > On 2016/04/26 21:03:29, bcwhite wrote: > > > > On 2016/04/26 20:21:58, Ilya Sherman wrote: > > > > > On 2016/04/26 19:51:54, bcwhite wrote: > > > > > > On 2016/04/25 20:53:14, Ilya Sherman wrote: > > > > > > > Hmm. Suppose that multiple subprocesses recorded histograms with > the > > > same > > > > > > name > > > > > > > (which is a fairly common case). Do the histograms get aggregated > > > > together, > > > > > > or > > > > > > > duplicated for each renderer? It seems like the latter, which is > > quite > > > > odd. > > > > > > > > > > > Or > > > > > > > are they aggregated together somewhere prior to reaching this code? > > > > > > > > > > > > The HistogramSnapshotManager merges histograms with duplicate names. > > > > > > > > > > I see. It's kind of wacky that |histogram| mostly describes the > > histogram, > > > > but > > > > > might have different samples in it than the |snapshot| does. But, I > guess > > > > this > > > > > is how the code has worked for ages. Oh well. Thanks for the > > explanation. > > > > > > > > Actually, the merging in the HSM is new. It used to be that everything > was > > > > merged by the StatisticsRecorder beacuse all remote histograms were > imported > > > > completely via RPC. With the move to persistent storage, importing them > > > becomes > > > > unnecessary since the remote memory is directly accessible. So, no merged > > > into > > > > the SR but merge in the HSM. > > > > > > > > In hindsight, it may have been better to continue the import of everything > > > into > > > > the SR and just used the persistent memory as a replacement for the RPC. > > > There > > > > are advantages and disadvantages with both methods, I think. > > > > > > Hmm. Might be worth discussing when we meet tomorrow. > > > > > > Assuming we keep the merging approach: Since the merging concept is new, I > > > wonder if it wouldn't make more sense to create a new histogram, which has > the > > > correct snapshot within it, rather than tracking the snapshot externally. > > > > There's no easy way to create a new histogram based on another which is > > necessary in order to have the bucketing information. > > > > You have to "pickle" then "unpickle" it to something new which is quite > > expensive. This would have to be done once per histogram name during every > HSM > > run (not just those for display). > > Sorry, I'm still confused here. Are you saying that it would be hard to write a > direct copy constructor? If so, how come? The expensive part of the copy is > copying the snapshot... but that's the part that's being copied anyway. (And, > we wouldn't need to copy most histograms -- just ones whose names appear > multiple times, i.e. those defined in renderer processes.) It wouldn't be too difficult. A virtual Duplicate() method that returns a new (heap) HistogramBase object would sit beside Snapshot(). However, I think this is a bad idea as it would create multiple histogram objects of the same name when care has been taken to ensure that there is only ever one.
> I started on removing the lock but hit a snag and moved onto other things for a > bit. > > At histogram_internals_request_job.cc:54 is the call to WriteHTMLGraph() which > fills the "data" parameter that is then returned. > > Within content/, how do I make that call on the UI thread and block until it > completes? I've also traced the calls back and don't see any easy way to short-circuit the operation to not treat it as a URLRequestJob. (Granted, I'm unfamiliar with that area of the code.)
On 2016/05/04 15:38:08, bcwhite wrote: > I started on removing the lock but hit a snag and moved onto other things for a > bit. > > At histogram_internals_request_job.cc:54 is the call to WriteHTMLGraph() which > fills the "data" parameter that is then returned. > > Within content/, how do I make that call on the UI thread and block until it > completes? HistogramInternalsRequestJob::GetData() is passed a callback, which is what should be invoked for the asynchronous work. You can use content::BrowserThread APIs to post the task to the UI thread. Nothing should block.
Patchset #8 (id:280001) has been deleted
Patchset #8 (id:300001) has been deleted
Patchset #8 (id:320001) has been deleted
Patchset #8 (id:340001) has been deleted
Description was changed from ========== Display histograms from subprocesses in chrome://histograms Traditionally, metrics in subprocesses were imported into the browser process and so were known to the Statistics- Recorder and would appear on the chrome://histograms page. With the move to metrics held in persistent memory segments from which the data is independently included in the upload, histograms in subprocesses disappeared from that page. Allow classes to register themselves with the Statistics- Recorder using an interface that can write histogram graphs to an output stream. The SubprocessMetricsProvider registers itself during construction with the result being a separate section on the page for histograms from sub- processes. The core Write(HTML)Graph methods are modified to take an optional snapshot. This allows passing a merged snapshot to the correct code to create the appropriate graph. BUG=602295 ========== to ========== Display histograms from subprocesses in chrome://histograms Traditionally, metrics in subprocesses were imported into the browser process and so were known to the Statistics- Recorder and would appear on the chrome://histograms page. With the move to metrics held in persistent memory segments from which the data is independently included in the upload, histograms in subprocesses disappeared from that page. Allow classes to register themselves with the Statistics- Recorder using an interface that can write histogram graphs to an output stream. The SubprocessMetricsProvider registers itself during construction with the result being a separate section on the page for histograms from sub- processes. The core Write(HTML)Graph methods are modified to take an optional snapshot. This allows passing a merged snapshot to the correct code to create the appropriate graph. BUG=602295 ========== |