Created: 3 years, 8 months ago by Primiano Tucci (use gerrit)
Modified: 3 years, 8 months ago
CC: chromium-reviews, Dirk Pranke, tfarina, wfh+watch_chromium.org, vmpstr+watch_chromium.org, agrieve+watch_chromium.org, tracing+reviews_chromium.org
Target Ref: refs/heads/master
Project: chromium
Visibility: Public.
Description: memory-infra: Add peak-detector skeleton.
Adds a skeleton and tests for the MemoryPeakDetector.
This CL does not yet take over the old peak detector in
MemoryDumpScheduler, which will happen in the next CLs.
BUG=607533
TBR=thakis@chromium.org
Review-Url: https://codereview.chromium.org/2786373002
Cr-Commit-Position: refs/heads/master@{#461678}
Committed: https://chromium.googlesource.com/chromium/src/+/f34c3e056c9a2eaa12a41daa2f8e05b579360f28
Patch Set 1 #
Patch Set 2 : . #
Patch Set 3 : . #
Total comments: 35
Patch Set 4 : . #
Patch Set 5 : ssid review #
Patch Set 6 : remove seqchecker #
Total comments: 22
Patch Set 7 : rebase + introduce teardown + reset instance in test #
Total comments: 1
Patch Set 8 : Add generation + test #
Patch Set 9 : minor typos #
Dependent Patchsets:
Messages
Total messages: 45 (31 generated)
The CQ bit was checked by primiano@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
primiano@chromium.org changed reviewers: + hjd@chromium.org, ssid@chromium.org
ssid, hjd WDYT? This should work both in the current model (where we tear down the thread) and in the future model (where we don't tear it down). Am I missing some other use case? (Yeah, I know, I should call SuspendFastMemoryPolling() on the MDP, will do in the next CLs)

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_dump_manager_unittest.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_dump_manager_unittest.cc:214: } // namespace This is because I was ending up with an ODR violation, by defining MockMemoryDumpProvider in two locations.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:16: const uint32_t kInitialPollingIntervalMs = 1; Will sort this out in the next CL when I move the stuff from the MDScheduler.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
On 2017/03/31 16:51:07, Primiano Tucci wrote: > ssid, hjd WDYT? > > this should work both in the current model (where we tear down the thread) and > in the future model (where we don't tear it down). > Am I missing some other use case? > (Yeah, I know, I should call SuspendFastMemoryPolling() on the MDP, will do in > the next CLs) Suspend is called by TraceLogDisable() because if the thread is stopped and the last task posted gets dropped, then suspend will never be called. It is also called when removing dump providers from the polling list. But in this case if post task fails then it will get called at TraceLogDisable(). Sorry, I will look at unittests later today. https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:16: const uint32_t kInitialPollingIntervalMs = 1; On 2017/03/31 16:51:07, Primiano Tucci wrote: > will sort this out in the next CL when I move the stuff from the MDScheduler Acknowledged. https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:49: DCHECK(state_ == NOT_INITIALIZED || state_ == DISABLED); Why initialize multiple times? shouldn't this be state_ == DISABLED? TearDown sets to NOT_INITIALIZED anyway. https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:76: You'd also need to set poll_count = 0; https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:83: // just a minor optimization. Do we need a comment that says "this is not just a minor optimization"? https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... 
base/trace_event/memory_peak_detector.cc:101: task_runner_->PostTask( What if we registered the first dump provider before creating the dump thread? Or are we going to use some existing worker pool here? Where are you planning to do Initialize()? At the MDM constructor? Why not do it in this CL?

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:110: return; // Start will deal with this later. nit: Start()

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:126: // Will cause th next PollMemoryAndDetectPeak() task to early return. the

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:144: mdp_info->dump_provider->PollFastMemoryTotal(&memory_total); uint64_t value = 0; mdpinfo->dump_provider->PollFastMemoryTotal(&value); *memory_total += value;

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:28: // thread, which does not have to be |task_runner| on which the polling tasks How are you planning to ensure this? Which thread is MDM going to use when starting polling? MDM needs to store an extra task runner for this? Or is it going to PostTask all methods of this class to the dump thread? I think the issue here is that dump providers can be registered on any thread and calling NotifyChanged() should be posted every time. What will we do for the case where the dump thread is not yet created? Maybe you are planning to store the thread that created MDM just for posting tasks to this class?!

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:29: // run (see Start()).
Start() does not have any documentation.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:36: enum State { I think "State" is a confusing name here, especially because of the existing struct in the scheduler named PollingState. But I can't think of a better name. I guess it's fine.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:54: const scoped_refptr<SequencedTaskRunner>&, Is there any advantage of using a sequenced task runner here?

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:55: const OnPeakDetectedCallback&); Maybe this callback shouldn't be part of Initialize(). We would anyway need another function to set up peak detection parameters like time interval and modes. So, this can be a parameter to Start() along with the others that you'll add in the future. Essentially we need the flexibility of changing the callback every time we start polling.
The CQ bit was checked by primiano@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was checked by primiano@chromium.org to run a CQ dry run
Thanks!

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:49: DCHECK(state_ == NOT_INITIALIZED || state_ == DISABLED); On 2017/03/31 18:03:35, ssid wrote: > Why initialize multiple times? shouldn't this be state_ == DISABLED? > TearDown sets to NOT_INITIALIZED anyway. TearDown is only for tests. Initializing multiple times is for near-term compatibility, where every time we do OnTraceLogEnabled we'll get a new thread, hence we'll have to re-initialize the task runner.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:76: On 2017/03/31 18:03:35, ssid wrote: > You'd also need to set poll_count = 0; That is for testing only and I only tear it down in the TearDownForTesting.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:83: // just a minor optimization. On 2017/03/31 18:03:35, ssid wrote: > Do we need a comment that says "this is not just a minor optimization"? That was to prevent a future myself from removing it. Although, thinking twice, we have enough tests to make sure that won't happen, so you are right, this comment is superfluous.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:101: task_runner_->PostTask( On 2017/03/31 18:03:35, ssid wrote: > What if we registered the first dump provider before creating the dump thread. Ah, you are right, I have to make this a no-op (early out) if task_runner_ is nullptr. The code should still work, because what will happen there is: - MDP gets registered before Initialize(). This call is a no-op. - Then this gets Initialize()d and Start()ed.
- Start() will call ReloadDumpProvidersAndStartPollingIfNeeded, so we'll fetch the MDP from the list. Added a test to cover this. > Or are we going to use some existing worker pool here? > Where are you planning to do Initialize()? at MDM constructor? Why not do it in > this CL?

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:110: return; // Start will deal with this later. On 2017/03/31 18:03:35, ssid wrote: > nit: Start() Done.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:126: // Will cause th next PollMemoryAndDetectPeak() task to early return. On 2017/03/31 18:03:35, ssid wrote: > the Done.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:144: mdp_info->dump_provider->PollFastMemoryTotal(&memory_total); On 2017/03/31 18:03:35, ssid wrote: > uint64_t value = 0; > mdpinfo->dump_provider->PollFastMemoryTotal(&value); > *memory_total += value; Yup, I would have replaced all this part next, but put this here to avoid doing something silly next :)

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:28: // thread, which does not have to be |task_runner| on which the polling tasks On 2017/03/31 18:03:35, ssid wrote: > How are you planning to ensure this? > Which thread is MDM going to use when starting polling? MDM needs to store an > extra task runner for this? or it is going to postTask all methods of this class > to dump thread? > I think the issue here is that dump providers can be registered on any thread > and calling NotifyChanged() should be posted everytime. > What will we do for the case where dump thread is not yet created?
> Maybe you are planning to store the thread that created MDM just for posting > tasks to this class?! Oh, you are right. Nah, there is no need to post, because all these public methods here just post under the hood. I just removed the thread checker and specified that the caller (MDM) will have to take care of linearization. Essentially this means just calling these from under a lock. I thought of this this morning when I designed all this, but then I must have forgotten at some point later when I added the seqchecker. Too used to being paranoid, I guess :)

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:29: // run (see Start()). On 2017/03/31 18:03:35, ssid wrote: > Start() does not have any documentation. Now it does :)

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:36: enum State { On 2017/03/31 18:03:35, ssid wrote: > I think "State" is a confusing name here. Especially because of the existing > structs in scheduler names PollingState. But, i can't think of better name. I > guess it's fine. It's going to be better once the polling part of the scheduler is no more and we'll have only MemoryPeakDetector and PeriodicMemorySampler (or something like that).

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:54: const scoped_refptr<SequencedTaskRunner>&, On 2017/03/31 18:03:35, ssid wrote: > Is there any advantage of using sequenced task runner here? Nah, just being future-proof in case, in the future, we want to move to a blocking pool (which is sequenced but not single-threaded IIRC).

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:55: const OnPeakDetectedCallback&); On 2017/03/31 18:03:35, ssid wrote: > Maybe this callback shouldn't be part of Initialize().
> We would anyway need another function to set up peak detection parameters like > time interval and modes. So, this can be paramter to Start() along with the > others that you'll add in future. > Essentially we need the flexibility of changing the callback every time we start > polling. Isn't the callback always the same though (either short-circuited to ReqGlobalDump in the near term or sent to the service)? Let's see how this evolves in the future.
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
looks great! Just a few comments:

https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:102: void MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { I think the following sequence or one like it can put us into a state where we have multiple timers firing. Basically, if everything is running as normal and we call NotifyMemoryDumpProvidersChanged() when we have no mdps, then in theory the next time PollMemoryAndDetectPeak is called we will not schedule a new PollMemoryAndDetectPeak; however, if we quickly call NotifyMemoryDumpProvidersChanged() again and this time we have some mdps, we'll never stop the original PollMemoryAndDetectPeak timer and we'll call PollMemoryAndDetectPeak again to start a new timer.

t{1,2} = thread {1,2}

t2: PollMemoryAndDetectPeak()
t2: -> PostDelayedTask(PollMemoryAndDetectPeak)  # Delayed task A

t1: NotifyMemoryDumpProvidersChanged()
t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded)
t2: ReloadDumpProvidersAndStartPollingIfNeeded()
t2: -> get_dump_providers_function_ => []  # No mdps
t2: -> state_ = ENABLED

t1: NotifyMemoryDumpProvidersChanged()
t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded)
t2: ReloadDumpProvidersAndStartPollingIfNeeded()
t2: -> get_dump_providers_function_ => [mdp]  # Some mdps
t2: -> state_ = RUNNING
t2: -> PostTask(PollMemoryAndDetectPeak)
t2: PollMemoryAndDetectPeak()
t2: -> PostDelayedTask(PollMemoryAndDetectPeak)  # Delayed task B

t2: PollMemoryAndDetectPeak()  # Delayed task A fires and schedules a new task
t2: PollMemoryAndDetectPeak()  # Delayed task B fires and schedules a new task

https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor...
File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:45: // thread. Initialize() can be call serveral times, provided that: (1) Stop() nit: typo: serveral -> several https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:45: // thread. Initialize() can be call serveral times, provided that: (1) Stop() nit: call -> called https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector_unittest.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector_unittest.cc:56: peak_detector_ = MemoryPeakDetector::GetInstance(); Maybe it would be better if each test constructed a separate instance of the MemoryPeakDetector? It seems easy to accidentally add new state and forget to clean it up in-between tests. This would avoid the need for TearDownForTesting.
On 2017/04/03 13:25:31, hjd wrote: > looks great! Just a few comments: > > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > File base/trace_event/memory_peak_detector.cc (right): > > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > base/trace_event/memory_peak_detector.cc:102: void > MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { > I think the following sequence or one like it can put us into a state where we > have multiple timers firing. > Basically if everything is running as normal and we call > NotifyMemoryDumpProvidersChanged() when we have no mdps then in theory the next > time PollMemoryAndDetectPeak is called we will not schedule a new > PollMemoryAndDetectPeak however if we quickly call > NotifyMemoryDumpProvidersChanged() again and this time we have some mdps we'll > never stop the original PollMemoryAndDetectPeak timer thing and we'll call > PollMemoryAndDetectPeak again to start a new timer. 
> > t{1,2} = thread {1,2} > > t2: PollMemoryAndDetectPeak() > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task A > > t1: NotifyMemoryDumpProvidersChanged() > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > t2: -> get_dump_providers_function_ => [] # No mdps > t2: -> state_ = ENABLED > > t1: NotifyMemoryDumpProvidersChanged() > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > t2: -> get_dump_providers_function_ => [mdp] # Some mdps > t2: -> state_ = RUNNING > t2: -> PostTask(PollMemoryAndDetectPeak) > t2: PollMemoryAndDetectPeak() > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task B > > t2: PollMemoryAndDetectPeak() # Delayed task A fires and schedules a new task > t2: PollMemoryAndDetectPeak() # Delayed task B fires and schedules a new task Unless I missed something it is impossible by design that two different threads call ReloadDumpProvidersAndStartPollingIfNeeded / PollMemoryAndDetectPeak. All calls are always posted to the same task_runner (and if there is a possibility they are not, I screwed up something, although currently not clear where). https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector_unittest.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector_unittest.cc:56: peak_detector_ = MemoryPeakDetector::GetInstance(); On 2017/04/03 13:25:30, hjd wrote: > Maybe it would be better if each test constructed a separate instance of the > MemoryPeakDetector? Hmm It's a singleton, so in order to construct it here I'd have to make at least the dtor public (which would be required for unique_ptr). > It seems easy to accidentally add new state and forget to > clean it up in-between tests. Hmm how? 
The deal with TearDown is that it is always called after each TEST_F function. > This would avoid the need for TearDownForTesting. But that is what allows clearing up the state between each test.
On 2017/04/03 15:51:17, Primiano Tucci wrote: > On 2017/04/03 13:25:31, hjd wrote: > > looks great! Just a few comments: > > > > > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > > File base/trace_event/memory_peak_detector.cc (right): > > > > > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > > base/trace_event/memory_peak_detector.cc:102: void > > MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { > > I think the following sequence or one like it can put us into a state where we > > have multiple timers firing. > > Basically if everything is running as normal and we call > > NotifyMemoryDumpProvidersChanged() when we have no mdps then in theory the > next > > time PollMemoryAndDetectPeak is called we will not schedule a new > > PollMemoryAndDetectPeak however if we quickly call > > NotifyMemoryDumpProvidersChanged() again and this time we have some mdps we'll > > never stop the original PollMemoryAndDetectPeak timer thing and we'll call > > PollMemoryAndDetectPeak again to start a new timer. 
> > > > t{1,2} = thread {1,2} > > > > t2: PollMemoryAndDetectPeak() > > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task A > > > > t1: NotifyMemoryDumpProvidersChanged() > > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > > t2: -> get_dump_providers_function_ => [] # No mdps > > t2: -> state_ = ENABLED > > > > t1: NotifyMemoryDumpProvidersChanged() > > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > > t2: -> get_dump_providers_function_ => [mdp] # Some mdps > > t2: -> state_ = RUNNING > > t2: -> PostTask(PollMemoryAndDetectPeak) > > t2: PollMemoryAndDetectPeak() > > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task B > > > > t2: PollMemoryAndDetectPeak() # Delayed task A fires and schedules a new task > > t2: PollMemoryAndDetectPeak() # Delayed task B fires and schedules a new task > > Unless I missed something it is impossible by design that two different threads > call ReloadDumpProvidersAndStartPollingIfNeeded / PollMemoryAndDetectPeak. All > calls are always posted to the same task_runner (and if there is a possibility > they are not, I screwed up something, although currently not clear where). Let's talk offline and I'll see if I can explain better :P > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > File base/trace_event/memory_peak_detector_unittest.cc (right): > > https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... > base/trace_event/memory_peak_detector_unittest.cc:56: peak_detector_ = > MemoryPeakDetector::GetInstance(); > On 2017/04/03 13:25:30, hjd wrote: > > Maybe it would be better if each test constructed a separate instance of the > > MemoryPeakDetector? > Hmm It's a singleton, so in order to construct it here I'd have to make at least > the dtor public (which would be required for unique_ptr). Yep. 
It seems better just to make the ctor/dtor visible to the test than to have a hack to reset the state. In general I would argue that it is bad to test a singleton via the GetInstance method. > > It seems easy to accidentally add new state and forget to > > clean it up in-between tests. > Hmm how? The deal with TearDown is that it is always called after each TEST_F > function. I'm not worried that we will get it wrong now. I am worried that later someone (for example me :P) will come and add a field and not reset it in the special test TearDown method. When that happens the tests will suddenly be order dependent where they weren't before; if we construct and tear down the whole object each time, this kind of error can never occur. > > This would avoid the need for TearDownForTesting. > But that is what allows to clear up the state between each test. Yep, but couldn't we just use the dtor for that?
Looks good. I am just not happy about Initialize being called multiple times without tear down.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:49: DCHECK(state_ == NOT_INITIALIZED || state_ == DISABLED); On 2017/03/31 20:01:54, Primiano Tucci wrote: > On 2017/03/31 18:03:35, ssid wrote: > > Why initialize multiple times? shouldn't this be state_ == DISABLED? > > TearDown sets to NOT_INITIALIZED anyway. > > TearDown is only for tests. > initialize multiple times is for near-term compatibility, where every time we do > OnTraceLogEnabled we'll get a new thread, hence will have to re-initialize the > task-runner If we call it multiple times, maybe Setup()? Also, setup without tear down seems error-prone in the future.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:76: On 2017/03/31 20:01:54, Primiano Tucci wrote: > On 2017/03/31 18:03:35, ssid wrote: > > You'd also need to set poll_count = 0; > That is for testing only and I only tear it down in the TearDownForTesting Does it mean this function will not support multiple polling sessions in a single browsing session? 1. start polling() 2. stop polling() 3. start polling() -> now we start with poll_count = X. I don't think there is a need to keep the state and we might run into some bugs. In case of multiple tracing sessions with different dump threads, tear down will be needed because we will then be keeping a reference to a task runner of the dead thread.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory...
base/trace_event/memory_peak_detector.cc:101: task_runner_->PostTask( On 2017/03/31 20:01:54, Primiano Tucci wrote: > On 2017/03/31 18:03:35, ssid wrote: > > What if we registered the first dump provider before creating the dump thread. > Ah you are right, I have to make this a no-op (early out) if task_runner_ is > nullptr. > The code should still work, because what will happen there is: > - MDP gets registered before Initialize(). This call is a noop > - Then this get Initiazed() and Start(). > - Start will call ReloadDumpProvidersAndStartPollingIfNeeded,, so we'll fetch > the MDP from the list. > > added a test to cover this. > > > Or are we going to use some existing worker pool here? > > Where are you planning to do Initialize()? at MDM constructor? Why not do it > in > > this CL? Acknowledged.

https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:55: const OnPeakDetectedCallback&); On 2017/03/31 20:01:55, Primiano Tucci wrote: > On 2017/03/31 18:03:35, ssid wrote: > > Maybe this callback shouldn't be part of Initialize(). > > We would anyway need another function to set up peak detection parameters like > > time interval and modes. So, this can be paramter to Start() along with the > > others that you'll add in future. > > Essentially we need the flexibility of changing the callback every time we > start > > polling. > > Isn't the callback to be always the same though (either short-circuited to > ReqGlobalDump in the near term or sent to the service)?. > Let's see how this evolved in future Hm, the callback can be different. For instance, for slow reports we will set up with BACKGROUND mode, or DETAILED mode in the normal case. Also, supporting multiple Initialize() solves this issue. But I'm just not happy with calling Initialize() twice on a singleton.
Maybe this can be named Setup()?

https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:44: DCHECK(dump_providers_.empty()); DCHECK(task_runner)?

https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:102: void MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { On 2017/04/03 13:25:30, hjd wrote: > I think the following sequence or one like it can put us into a state where we > have multiple timers firing. > Basically if everything is running as normal and we call > NotifyMemoryDumpProvidersChanged() when we have no mdps then in theory the next > time PollMemoryAndDetectPeak is called we will not schedule a new > PollMemoryAndDetectPeak however if we quickly call > NotifyMemoryDumpProvidersChanged() again and this time we have some mdps we'll > never stop the original PollMemoryAndDetectPeak timer thing and we'll call > PollMemoryAndDetectPeak again to start a new timer.
> > t{1,2} = thread {1,2} > > t2: PollMemoryAndDetectPeak() > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task A > > t1: NotifyMemoryDumpProvidersChanged() > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > t2: -> get_dump_providers_function_ => [] # No mdps > t2: -> state_ = ENABLED > > t1: NotifyMemoryDumpProvidersChanged() > t1: -> PostTask(ReloadDumpProvidersAndStartPollingIfNeeded) > t2: ReloadDumpProvidersAndStartPollingIfNeeded() > t2: -> get_dump_providers_function_ => [mdp] # Some mdps > t2: -> state_ = RUNNING > t2: -> PostTask(PollMemoryAndDetectPeak) > t2: PollMemoryAndDetectPeak() > t2: -> PostDelayedTask(PollMemoryAndDetectPeak) # Delayed task B > > t2: PollMemoryAndDetectPeak() # Delayed task A fires and schedules a new task > t2: PollMemoryAndDetectPeak() # Delayed task B fires and schedules a new task We will never have the case where we go from having polling mdp to no polling mdp. There are dchecks to check this. Basically the current process metrics provider never unregisters. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:55: // |task_runner| when a memory peak is detected. nit: |task_runner| is not defined in the declaration. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:99: // must be accessed only by the thread that owns the class instance and NOT by "thread that owns the class" is not true right? https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector_unittest.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... 
base/trace_event/memory_peak_detector_unittest.cc:160: peak_detector_->NotifyMemoryDumpProvidersChanged(); Can you add a EXPECT_EQ(MemoryPeakDetector::NOT_INITIALIZED, GetPeakDetectorState()); https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector_unittest.cc:253: bg_thread_->FlushForTesting(); There are FlushForTesting calls in all the tests. I don't think this simulates the real world accurately. Look at the comment Hector added about the case where polling is enabled twice. I haven't reasoned about it fully, but we will miss such cases if the Flush is called at each check point. Also, we don't really have to support unregistering all mdp to stop polling.
Patchset #7 (id:120001) has been deleted
Patchset #7 (id:140001) has been deleted
The CQ bit was checked by primiano@chromium.org to run a CQ dry run
Thanks a lot folks, very useful comments. see replies below https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:49: DCHECK(state_ == NOT_INITIALIZED || state_ == DISABLED); On 2017/04/03 17:55:27, ssid wrote: > On 2017/03/31 20:01:54, Primiano Tucci wrote: > > On 2017/03/31 18:03:35, ssid wrote: > > > Why initialize multiple times? shouldn't this be state_ == DISABLED? > > > TearDown sets to NOT_INITIALIZED anyway. > > > > TearDown is only for tests. > > initialize multiple times is for near-term compatibility, where every time we > do > > OnTraceLogEnabled we'll get a new thread, hence will have to re-initialize the > > task-runner > > If we call it multiple times, maybe Setup()? +1, done > also setup without tear down seems > error prone in future Alright, the only argument is really avoiding keeping the task runner refcounted, in the case we switch to another task runner model where the task_runner is not just a proxy. Done https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:76: On 2017/04/03 17:55:27, ssid wrote: > On 2017/03/31 20:01:54, Primiano Tucci wrote: > > On 2017/03/31 18:03:35, ssid wrote: > > > You'd also need to set poll_count = 0; > > > > That is for testing only and I only tear it down in the TearDownForTesting > > Does it mean this function will not support multiple polling sessions in single > browsing session? > 1. start polling() > 2. stop polling() > 3. start polling() -> now we start with poll_count = X. I see where the confusion comes from. poll_count_ is really poll_count_for_testing_, i.e. it is not used for production decisions. Renamed var accordingly. 
The use case you mentioned is supported, just I need the absolute number of poll counts for the test, to keep track of spurious poll tasks between stop/start. > I don't think there is a need to keep the state and we might run into some bugs. Is purely for tests. > In case of multiple tracing sessions with different dump threads, tear down will > be needed because we will then be keeping a reference to a task runner of the > dead thread. Is not a problem with the current Thread, as the task runner is just a MessageLoop proxy and will just no-op PostTask, but agree might become a problem in future. https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.h:55: const OnPeakDetectedCallback&); On 2017/04/03 17:55:27, ssid wrote: > On 2017/03/31 20:01:55, Primiano Tucci wrote: > > On 2017/03/31 18:03:35, ssid wrote: > > > Maybe this callback shouldn't be part of Initialize(). > > > We would anyway need another function to set up peak detection parameters > like > > > time interval and modes. So, this can be paramter to Start() along with the > > > others that you'll add in future. > > > Essentially we need the flexibility of changing the callback every time we > > start > > > polling. > > > > Isn't the callback to be always the same though (either short-circuited to > > ReqGlobalDump in the near term or sent to the service)?. > > Let's see how this evolved in future > > hm callback can be different. for instance for slow reports we will setup with > BACKGROUND mode or DETAILED mode in normal case. I think we should deal with that within the callback. > Also supporting multiple Initialize() solves this issue. Yeah but it's overkill really. That complexity can be handled in the callback with one state variable. 
> But, just not happy > with calling initialize() twice on a singleton. Maybe this can be names Setup()? Done https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:44: DCHECK(dump_providers_.empty()); On 2017/04/03 17:55:27, ssid wrote: > DCHECK(task_runner)? Done. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:102: void MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { On 2017/04/03 17:55:27, ssid wrote: > On 2017/04/03 13:25:30, hjd wrote: > > I think the following sequence or one like it can put us into a state where we > > have multiple timers firing. > > Basically if everything is running as normal and we call > > NotifyMemoryDumpProvidersChanged() when we have no mdps then in theory the > next > > time PollMemoryAndDetectPeak is called we will not schedule a new > > PollMemoryAndDetectPeak however if we quickly call > > NotifyMemoryDumpProvidersChanged() again and this time we have some mdps we'll > > never stop the original PollMemoryAndDetectPeak timer thing and we'll call > > PollMemoryAndDetectPeak again to start a new timer. 
> > [hjd's race-sequence example quoted in full above] > > We will never have the case where we go from having polling mdp to no polling > > mdp. There are dchecks to check this. Basically the current process metrics > > provider never unregisters. Even if we do, this code should be robust enough to tolerate it (in which case it will just go back to the ENABLED state, stopping the polling). See tests. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:45: // thread. Initialize() can be call serveral times, provided that: (1) Stop() On 2017/04/03 13:25:30, hjd wrote: > nit: call -> called Done. 
https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:55: // |task_runner| when a memory peak is detected. On 2017/04/03 17:55:28, ssid wrote: > nit: |task_runner| is not defined in the declaration. Done. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:99: // must be accessed only by the thread that owns the class instance and NOT by On 2017/04/03 17:55:27, ssid wrote: > "thread that owns the class" is not true right? Right. reworded. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector_unittest.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector_unittest.cc:56: peak_detector_ = MemoryPeakDetector::GetInstance(); On 2017/04/03 15:51:17, Primiano Tucci wrote: > On 2017/04/03 13:25:30, hjd wrote: > > Maybe it would be better if each test constructed a separate instance of the > > MemoryPeakDetector? > Hmm It's a singleton, so in order to construct it here I'd have to make at least > the dtor public (which would be required for unique_ptr). > > > It seems easy to accidentally add new state and forget to > > clean it up in-between tests. > Hmm how? The deal with TearDown is that it is always called after each TEST_F > function. > > > This would avoid the need for TearDownForTesting. > But that is what allows to clear up the state between each test. Okay you made very good arguments. Using an actual instance in the test. https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... 
base/trace_event/memory_peak_detector_unittest.cc:160: peak_detector_->NotifyMemoryDumpProvidersChanged(); On 2017/04/03 17:55:28, ssid wrote: > Can you add a > EXPECT_EQ(MemoryPeakDetector::NOT_INITIALIZED, GetPeakDetectorState()); Good point, done https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector_unittest.cc:253: bg_thread_->FlushForTesting(); On 2017/04/03 17:55:28, ssid wrote: > There are FlushForTesting calls in all the tests. I don't think this simulates > the real world accurately. I removed all of them but two. Most of them were unneeded because the evt.wait() or the GetPeakDetectorState() calls already have the right sequencing. I need this to simulate time advancing and make sure that the events posted there happen. If there is some case missing we have to simulate the right sequence. > Look at the comment Hector added about the case where > polling is enabled twice. I haven't reasoned about it fully, but we will miss > such cases if the Flush is called at each check point. > Also, we don't really have to support unregistering all mdp to stop polling. Yeah, but I don't want to worry about that (think if one day we get gpu fast polling); it feels too fragile.
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
lgtm, the code is functionally correct. But I still think we should not be supporting unregistration of all mdps in PeakDetector. I did not point out the race earlier because I was thinking we wouldn't support this case. The unittest still passes because of the Flush() call, which is fine. Let's just land this and fix this case in the next CL. https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/40001/base/trace_event/memory... base/trace_event/memory_peak_detector.cc:76: On 2017/04/03 20:28:15, Primiano Tucci wrote: > [poll_count_for_testing_ discussion quoted in full above] Acknowledged. 
https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:102: void MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { On 2017/04/03 20:28:15, Primiano Tucci wrote: > [hjd's race-sequence example and the earlier replies quoted in full above] > Even if we do, this code should be robust enough to tolerate it (in which case > it will just go back to the ENABLED state, stopping the polling). See tests. Hm, no. What Hector said was right. If we have a case where we lose all polling dump providers then it is possible that we start polling twice. This is the reason why I had explicitly disallowed unregistering all polling providers. The case here is: 1. Polling started -> delayed task. 2. All dump providers are gone -> state changed to DISABLED. 3. Dump provider registered again -> state changed to ENABLED and polling started again. 4. The delayed task posted at step one did not see the disable and enable changes and just sees it's enabled again, so it continues polling. Now we have two sets of polling tasks. 
Maybe it's better to just disable polling for the entire tracing session if dump providers go away. That is: if (state_ == RUNNING && dump_providers_.empty()) state_ = DISABLED; Or a DCHECK(!dump_providers_.empty()); https://codereview.chromium.org/2786373002/diff/160001/base/trace_event/memor... File base/trace_event/memory_peak_detector.h (right): https://codereview.chromium.org/2786373002/diff/160001/base/trace_event/memor... base/trace_event/memory_peak_detector.h:36: NOT_INITIALIZED = 0, // Before Setup() It'd be good to change this to NOT_SETUP, but not a big deal.
> But I still think we should not be > supporting unregistration of all mdps in PeakDetector. I like to have building blocks that are robust so I don't have to think about corner cases, if that robustness can be achieved easily. > I did not point out the race earlier > because I was thinking we wouldn't support this case. > The unittest still passes because of the Flush() call, which is fine. No, the unittest did pass because I did not think of this case and I did not cover it. Now that you folks made me realize it, I added a new unittest which did fail with the previous logic, and passes with the generation_ fix :) See MemoryPeakDetectorTest.StartStopQuickly https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... File base/trace_event/memory_peak_detector.cc (right): https://codereview.chromium.org/2786373002/diff/100001/base/trace_event/memor... base/trace_event/memory_peak_detector.cc:102: void MemoryPeakDetector::ReloadDumpProvidersAndStartPollingIfNeeded() { On 2017/04/04 05:08:35, ssid wrote: > [race discussion quoted in full above] > The case here is: > 1. Polling started -> delayed task. > 2. All dump providers are gone -> state changed to DISABLED. > 3. Dump provider registered again -> state changed to ENABLED and polling started > again. > 4. The delayed task posted at step one did not see the disable and enable > changes and just sees it's enabled again, so it continues polling. Okay, you were both right. There is a corner case I missed here, which can be simplified as follows: - Load one MDP in the list. - Start -> Stop -> Start -> Stop very quickly (before the first poll happens). Each Start will post a new task that will keep posting the chain. Thankfully this is easier to test (see the new StartStopQuickly test) and easy to fix (see the generation_ logic). > Now we have two sets of polling tasks. > Maybe it's better to just disable polling for the entire tracing session if dump > providers go away. That is: > if (state_ == RUNNING && dump_providers_.empty()) > state_ = DISABLED; > > Or a DCHECK(!dump_providers_.empty()); Nah, I think that the generation_ fixes the problem and keeps the logic robust. Happy to re-iterate on next patchset.
On 2017/04/04 09:14:31, Primiano Tucci wrote: > [previous message quoted in full above] lgtm :)
Description was changed from ========== memory-infra: Add peak-detector skeleton. Adds a skeleton and tests for the MemoryPeakDetector. This CL does not yet take over the old peak detector in MemoryDumpScheduler, which will happen in the next CLs. BUG=607533 ========== to ========== memory-infra: Add peak-detector skeleton. Adds a skeleton and tests for the MemoryPeakDetector. This CL does not yet take over the old peak detector in MemoryDumpScheduler, which will happen in the next CLs. BUG=607533 TBR=thakis@chromium.org ==========
primiano@chromium.org changed reviewers: + thakis@chromium.org
TBR thakis for the usual adds to base/BUILD.gn
The CQ bit was checked by primiano@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from ssid@chromium.org, hjd@chromium.org Link to the patchset: https://codereview.chromium.org/2786373002/#ps200001 (title: "minor typos")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 200001, "attempt_start_ts": 1491298149692200, "parent_rev": "266599bdc383070164a173779ca123100a769a03", "commit_rev": "f34c3e056c9a2eaa12a41daa2f8e05b579360f28"}
Message was sent while issue was closed.
Description was changed from ========== memory-infra: Add peak-detector skeleton. Adds a skeleton and tests for the MemoryPeakDetector. This CL does not yet take over the old peak detector in MemoryDumpScheduler, which will happen in the next CLs. BUG=607533 TBR=thakis@chromium.org ========== to ========== memory-infra: Add peak-detector skeleton. Adds a skeleton and tests for the MemoryPeakDetector. This CL does not yet take over the old peak detector in MemoryDumpScheduler, which will happen in the next CLs. BUG=607533 TBR=thakis@chromium.org Review-Url: https://codereview.chromium.org/2786373002 Cr-Commit-Position: refs/heads/master@{#461678} Committed: https://chromium.googlesource.com/chromium/src/+/f34c3e056c9a2eaa12a41daa2f8e... ==========
Message was sent while issue was closed.
Committed patchset #9 (id:200001) as https://chromium.googlesource.com/chromium/src/+/f34c3e056c9a2eaa12a41daa2f8e... |