Issue 246553003: Record Mobile Operator ID in UMA report. This is Android only for now.

bolian

Hello Jim, Alex and Ben, Please review this CL that adds operator ID into the ...

6 years, 8 months ago (2014-04-22 18:09:06 UTC) #1

Ilya Sherman

Can you record this data as a histogram rather than including it in the systemprofile ...

6 years, 8 months ago (2014-04-22 20:50:53 UTC) #2

bolian

The purpose is to associate this data with existing histograms, for example, histograms for data ...

6 years, 8 months ago (2014-04-22 20:59:26 UTC) #3

Alexei Svitkine (slow)

Some questions for you below. By the way, before this can land in Chromium, you ...

6 years, 8 months ago (2014-04-23 16:54:11 UTC) #5

bolian

I should have mentioned this in the description. Yes, the privacy team has agreed to ...

6 years, 8 months ago (2014-04-23 17:29:39 UTC) #6

Alexei Svitkine (slow)

At minimum, the server-side changes would be to update the protos. If you want the ...

6 years, 8 months ago (2014-04-23 17:58:49 UTC) #7

bolian

We don't need a dashboard for this. Our cron job will pick up the proto ...

6 years, 8 months ago (2014-04-24 00:59:23 UTC) #8

Ilya Sherman

I still think that it would be more appropriate to emit to a histogram for ...

6 years, 8 months ago (2014-04-24 02:20:46 UTC) #9

bolian

Do you mean log the value each time the network changes in a histogram say, ...

6 years, 8 months ago (2014-04-24 04:39:31 UTC) #10

Ilya Sherman

On 2014/04/24 04:39:31, bolian wrote: > Do you mean log the value each time the ...

6 years, 8 months ago (2014-04-24 05:19:32 UTC) #11

bolian

>> If you happen to see the user switch networks during a single recording >> ...

6 years, 8 months ago (2014-04-24 05:53:24 UTC) #12

Ilya Sherman

On 2014/04/24 05:53:24, bolian wrote: > >> If you happen to see the user switch ...

6 years, 8 months ago (2014-04-24 06:29:47 UTC) #13

bolian

Got it. Thanks for explaining this. However, it is still not clear to me why ...

6 years, 8 months ago (2014-04-24 20:34:47 UTC) #14

Ilya Sherman

On 2014/04/24 20:34:47, bolian wrote: > However, it is still not clear to me why ...

6 years, 8 months ago (2014-04-24 23:01:09 UTC) #15

bolian

Thanks, Ilya. This sounds good. I can work on the change to emit a histogram. ...

6 years, 8 months ago (2014-04-25 17:52:57 UTC) #16

Ilya Sherman

On 2014/04/25 17:52:57, bolian wrote: > Thanks, Ilya. This sounds good. I can work on ...

6 years, 8 months ago (2014-04-25 17:56:00 UTC) #17

bolian

The MCC and MNC list is large (http://en.wikipedia.org/wiki/Mobile_country_code) We can log MCC and MNC separately, ...

6 years, 8 months ago (2014-04-25 18:46:18 UTC) #18

Ilya Sherman

On 2014/04/25 18:46:18, bolian wrote: > The MCC and MNC list is large (http://en.wikipedia.org/wiki/Mobile_country_code) > ...

6 years, 8 months ago (2014-04-25 18:59:05 UTC) #19

bengr

https://codereview.chromium.org/246553003/diff/40001/chrome/browser/metrics/metrics_network_observer.cc File chrome/browser/metrics/metrics_network_observer.cc (right): https://codereview.chromium.org/246553003/diff/40001/chrome/browser/metrics/metrics_network_observer.cc#newcode21 chrome/browser/metrics/metrics_network_observer.cc:21: return device_info.GetNetworkOperator(); Sorry to have not caught this in ...

6 years, 8 months ago (2014-04-25 20:39:40 UTC) #20

bolian

Hi Ilya, Alexei, and Ben, I reverted the proto change and added the ID as ...

6 years, 8 months ago (2014-04-25 23:33:04 UTC) #21

Ilya Sherman

https://codereview.chromium.org/246553003/diff/80001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/80001/net/base/network_change_notifier.cc#newcode263 net/base/network_change_notifier.cc:263: // MCC and MNC codes are each 3 digits. ...

6 years, 8 months ago (2014-04-26 02:29:12 UTC) #22

bolian

https://codereview.chromium.org/246553003/diff/80001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/80001/net/base/network_change_notifier.cc#newcode263 net/base/network_change_notifier.cc:263: // MCC and MNC codes are each 3 digits. ...

6 years, 7 months ago (2014-04-28 18:07:54 UTC) #23

Ilya Sherman

Histograms LGTM with the inline comment addressed. Thanks! https://codereview.chromium.org/246553003/diff/100001/tools/metrics/histograms/histograms.xml File tools/metrics/histograms/histograms.xml (right): https://codereview.chromium.org/246553003/diff/100001/tools/metrics/histograms/histograms.xml#newcode10588 tools/metrics/histograms/histograms.xml:10588: + ...

6 years, 7 months ago (2014-04-28 21:27:17 UTC) #24

bolian

Thanks. I updated the CL to log the UMA on connection type change instead of ...

6 years, 7 months ago (2014-04-28 22:39:17 UTC) #25

bolian

Hi Misha, Could you take a look at the net/ part of the change? Also, ...

6 years, 7 months ago (2014-04-28 22:49:44 UTC) #26

Alexei Svitkine (slow)

https://codereview.chromium.org/246553003/diff/140001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/140001/net/base/network_change_notifier.cc#newcode240 net/base/network_change_notifier.cc:240: int value = atoi(mcc_mnc.c_str()); Nit: Use StringToUint() from base/strings/string_number_conversions.h ...
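One common reason to prefer a checked conversion over atoi() is that atoi() cannot report failure: it returns 0 for non-numeric input and silently stops at the first non-digit. The sketch below illustrates the checked style with standard C++17 (std::from_chars) rather than Chromium's base library; ParseDigits is a hypothetical helper and is not the code in this CL.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper (not the CL's code): parses a string consisting only of
// decimal digits. Returns std::nullopt for empty, malformed, or out-of-range
// input instead of silently yielding 0 the way atoi() does.
std::optional<unsigned> ParseDigits(const std::string& text) {
  unsigned value = 0;
  const char* begin = text.data();
  const char* end = begin + text.size();
  const auto result = std::from_chars(begin, end, value);
  // Require a successful parse that consumed every character.
  if (text.empty() || result.ec != std::errc() || result.ptr != end)
    return std::nullopt;
  return value;
}
```

With this shape the caller can skip logging when the operator string is malformed, rather than recording a misleading 0.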

6 years, 7 months ago (2014-04-29 16:35:44 UTC) #29

bolian

https://codereview.chromium.org/246553003/diff/140001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/140001/net/base/network_change_notifier.cc#newcode240 net/base/network_change_notifier.cc:240: int value = atoi(mcc_mnc.c_str()); On 2014/04/29 16:35:44, Alexei Svitkine ...

6 years, 7 months ago (2014-04-29 17:36:43 UTC) #30

Matt Welsh

As the primary user of this data, I'm not too happy about this being made ...

6 years, 7 months ago (2014-04-29 22:55:12 UTC) #33

Ilya Sherman

On 2014/04/29 22:55:12, Matt Welsh wrote: > As the primary user of this data, I'm ...

6 years, 7 months ago (2014-05-01 01:26:36 UTC) #34

On 2014/04/29 22:55:12, Matt Welsh wrote:
> As the primary user of this data, I'm not too happy about this being made into a
> separate histogram which enumerates only certain carriers. This seems like
> the wrong design to me, for several reasons.
>
> Primarily we need to be able to write queries and slice up UMA metrics (whatever
> they are) by different properties of the device:
>   - Device type
>   - Chrome version
>   - Network connection type
>   - Carrier
>
> We currently have the first three, and are missing the last one. All of our UMA
> processing pipeline code pulls these values out of system_profile and having to
> special case the carrier really doesn't make sense to me. It is going to be a
> lot more work for us to deal with joining against a separate histogram than it
> is to pluck the information we need right out of the proto field where it
> belongs. You guys are not the only ones writing code that processes UMA protos!

I don't quite follow why it's much harder to join against a separate histogram
than to read from the SystemProfile.  Is it just the difference between repeated
and non-repeated fields?  FWIW, plenty of other people have been successful at
joining over multiple histograms.

> Second, I disagree with the assertion in #11 that "most of the time the user
> will not have changed networks during an upload". There are several problems
> here. UMA metrics aren't uploaded on 3G networks at all, so I'm not quite clear
> on how you would expect us to tease apart UMA reports from the same user based
> on which network they were on *at the time the metric was collected* (as opposed
> to when it is uploaded), since by definition the user's network would be "WiFi"
> when they are in the process of uploading. This seems very brittle to me.

This is independent of whether or not the field is included in the
SystemProfile.  UMA metrics are only uploaded on WiFi, but I believe that logs
are still cut and serialized once every 5 minutes.  The SystemProfile or
histogram value would be captured at the time the log is cut, not at the time of
the upload.  Hence, my assertion is really that during most 5-minute windows of
use, users are not going to be changing networks.

> Third, if the concern is that adding this field will somehow break other tools
> that ingest UMA protos (and it shouldn't), I am willing to take on whatever work
> is needed to plumb this information through those tools.

The concern is not that, but rather that the SystemProfile is the wrong tool for
the job.  Many fields were added to it before we had support for sparse
histograms.  Now that sparse histograms are available, it ought to be relatively
rare to need to make changes to the SystemProfile message.

Matt Welsh

I may be unclear on what you have in mind in terms of how I ...

6 years, 7 months ago (2014-05-01 18:27:19 UTC) #35

Alexei Svitkine (slow)

It should be possible to make a histogram that's logged on every UMA upload. That ...

6 years, 7 months ago (2014-05-01 18:32:43 UTC) #36

Matt Welsh

On 2014/05/01 18:32:43, Alexei Svitkine wrote: > It should be possible to make a histogram ...

6 years, 7 months ago (2014-05-01 18:34:46 UTC) #37

Ilya Sherman

On 2014/05/01 18:34:46, Matt Welsh wrote: > On 2014/05/01 18:32:43, Alexei Svitkine wrote: > > ...

6 years, 7 months ago (2014-05-01 18:39:55 UTC) #38

Matt Welsh

On 2014/05/01 18:39:55, Ilya Sherman wrote: > On 2014/05/01 18:34:46, Matt Welsh wrote: > > ...

6 years, 7 months ago (2014-05-01 19:01:10 UTC) #39

Ilya Sherman

On 2014/05/01 19:01:10, Matt Welsh wrote: > On 2014/05/01 18:39:55, Ilya Sherman wrote: > > ...

6 years, 7 months ago (2014-05-01 19:08:42 UTC) #40

On 2014/05/01 19:01:10, Matt Welsh wrote:
> On 2014/05/01 18:39:55, Ilya Sherman wrote:
> > On 2014/05/01 18:34:46, Matt Welsh wrote:
> > > On 2014/05/01 18:32:43, Alexei Svitkine wrote:
> > > > It should be possible to make a histogram that's logged on every UMA upload.
> > > > That way, you won't need to look at any other UMA records to get at that data.
> > > > You'd just need to find that histogram and extract the value from it (similar to
> > > > what you would do from the system profile except with a bit of iteration through
> > > > histograms to find it).
> > > >
> > > > The histogram could still log all the carriers. A sparse histogram will log all
> > > > values that it gets, regardless of whether all the enums are in histograms.xml.
> > > > You just won't get pretty names for the ones that aren't in the XML file on the
> > > > UMA dashboard. (But you can expand the XML list over time as you see popular
> > > > carriers in the data).
> > > >
> > > > Does that make sense?
> > >
> > > OK, it wasn't clear to me that one could assume that a given UMA report
> > > contained all of the histograms from a given upload.
> > > I don't really understand the upload process and how the data can be segmented.
> > > Since we are now talking about a data
> > > dependency *across histograms* in order to interpret things appropriately, is
> > > that a guarantee you're willing to commit to
> > > long term?
> >
> > Yes, each upload/report is a single ChromeUserMetricsExtension message.  This is
> > indeed a guaranteed aspect of the UMA design.  If we ever change this, it'll be
> > a lot of work, and we'll be careful to work with stakeholders to migrate them
> > over... but I really don't anticipate us wanting to change this anytime in the
> > foreseeable future.
>
> Is there a size limit on the individual ChromeUserMetricsExtension message?
> (Would it ever get split up?)

There is a theoretical size limit, yes -- the server will reject unreasonably
large uploads.  In practice, this is not a concern for histograms.

The best way to think about this change is as though we are adding a repeated
histogram field to the SystemProfile message.  If we were to ever split up
uploads, we would duplicate the SystemProfile across all of the fragmented
uploads, and we would do the same for any histograms that are included in each
upload.
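
To make the "bit of iteration through histograms" concrete, here is a minimal sketch of pulling one histogram out of a single upload. The Histogram and Upload structs are simplified, hypothetical stand-ins; the real ChromeUserMetricsExtension proto is laid out differently, and the histogram names used with it here are made up for illustration.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Simplified, hypothetical model of one upload. These are NOT the real
// proto types; they only capture "an upload carries a list of histograms".
struct Histogram {
  std::string name;
  std::map<int, int64_t> counts;  // sample value -> number of occurrences
};

struct Upload {
  std::vector<Histogram> histograms;
};

// Returns the samples recorded under `name` in this upload, or std::nullopt
// if the upload does not contain that histogram.
std::optional<std::map<int, int64_t>> FindSamples(const Upload& upload,
                                                  const std::string& name) {
  for (const Histogram& histogram : upload.histograms) {
    if (histogram.name == name)
      return histogram.counts;
  }
  return std::nullopt;
}
```

Because every histogram in an upload travels in the same message, the carrier samples found this way apply to the other histograms in that same upload.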

Matt Welsh

On 2014/05/01 19:08:42, Ilya Sherman wrote: > On 2014/05/01 19:01:10, Matt Welsh wrote: > > ...

6 years, 7 months ago (2014-05-01 19:09:50 UTC) #41

On 2014/05/01 19:08:42, Ilya Sherman wrote:
> There is a theoretical size limit, yes -- the server will reject unreasonably
> large uploads.  In practice, this is not a concern for histograms.
> 
> The best way to think about this change is as though we are adding a repeated
> histogram field to the SystemProfile message.  If we were to ever split up
> uploads, we would duplicate the SystemProfile across all of the fragmented
> uploads, and we would do the same for any histograms that are included in each
> upload.

I'm still unclear how we will encode the carrier name in this histogram - can
you explain?

Ilya Sherman

On 2014/05/01 19:09:50, Matt Welsh wrote: > On 2014/05/01 19:08:42, Ilya Sherman wrote: > > ...

6 years, 7 months ago (2014-05-01 19:12:00 UTC) #42

On 2014/05/01 19:09:50, Matt Welsh wrote:
> I'm still unclear how we will encode the carrier name in this histogram - can
> you explain?

The way that is already implemented in this CL: leading three digits for MCC,
remaining digits for MNC.
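
A minimal sketch of that encoding, written as standalone C++ rather than taken from the CL. EncodeOperator and DecodeOperator are hypothetical names, and the sketch assumes a three-digit MCC with no leading zero followed by a two- or three-digit MNC.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helpers illustrating the scheme; not the CL's actual code.

// Encodes a concatenated MCC+MNC digit string as one integer sample,
// e.g. "310" + "260" becomes 310260.
std::optional<int> EncodeOperator(const std::string& mcc_mnc) {
  // Assumption: 3-digit MCC plus 2- or 3-digit MNC, so 5 or 6 digits total.
  if (mcc_mnc.size() < 5 || mcc_mnc.size() > 6) return std::nullopt;
  // Sketch restriction: reject a leading zero, which the integer form would drop.
  if (mcc_mnc[0] == '0') return std::nullopt;
  int value = 0;
  for (char c : mcc_mnc) {
    if (c < '0' || c > '9') return std::nullopt;
    value = value * 10 + (c - '0');
  }
  return value;
}

// Splits a sample back into (MCC, MNC): leading three digits, then the rest.
std::optional<std::pair<std::string, std::string>> DecodeOperator(int sample) {
  if (sample < 10000 || sample > 999999) return std::nullopt;
  const std::string digits = std::to_string(sample);
  return std::make_pair(digits.substr(0, 3), digits.substr(3));
}
```

One caveat of any integer form is that leading zeros are lost, which is why the sketch rejects them up front instead of trying to round-trip them.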

chromium-reviews

OK, I thought there was a comment above about only including certain carriers and not ...

6 years, 7 months ago (2014-05-01 19:25:41 UTC) #43

OK, I thought there was a comment above about only including certain
carriers and not others. If we can include all carriers using a sparse
histogram listing MCC/MNC then I don't have any objections to this approach
-- it is a bit more work to process on the MR end but not that bad. Thanks.
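
For readers unfamiliar with why a sparse histogram can cover every carrier, here is a toy model of the idea. It is a sketch only: SparseCounts and LabelFor are hypothetical and are not Chromium's implementation. Buckets exist only for values that were actually observed, and a label table (standing in for entries in histograms.xml) only affects how a value is displayed.

```cpp
#include <map>
#include <string>

// Toy model of a sparse histogram: a bucket is created the first time a
// value is seen, so no fixed list of carriers is needed up front.
class SparseCounts {
 public:
  void Add(int sample) { ++counts_[sample]; }

  int Count(int sample) const {
    const auto it = counts_.find(sample);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<int, int> counts_;
};

// Hypothetical label lookup. Unlisted values are still counted above; they
// are simply shown as raw numbers here.
std::string LabelFor(int sample, const std::map<int, std::string>& known) {
  const auto it = known.find(sample);
  return it == known.end() ? std::to_string(sample) : it->second;
}
```

Adding a label later changes only the display name; the samples already recorded for that value are unaffected.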



bolian

I want to point out that this CL by itself is not complete, it is ...

6 years, 7 months ago (2014-05-01 20:38:42 UTC) #44

bolian

As discussed in the meeting, I now also log the operator code when it is ...

6 years, 7 months ago (2014-05-05 20:58:06 UTC) #45

bengr

https://codereview.chromium.org/246553003/diff/190001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/190001/net/base/network_change_notifier.cc#newcode8 net/base/network_change_notifier.cc:8: #include "base/metrics/sparse_histogram.h" Move these to the #if defined(OS_ANDROID) below. ...

6 years, 7 months ago (2014-05-05 21:57:57 UTC) #47

bolian

https://codereview.chromium.org/246553003/diff/190001/net/base/network_change_notifier.cc File net/base/network_change_notifier.cc (right): https://codereview.chromium.org/246553003/diff/190001/net/base/network_change_notifier.cc#newcode8 net/base/network_change_notifier.cc:8: #include "base/metrics/sparse_histogram.h" On 2014/05/05 21:57:58, bengr1 wrote: > Move ...

6 years, 7 months ago (2014-05-05 22:18:20 UTC) #48

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/bolian@chromium.org/246553003/210001

6 years, 7 months ago (2014-05-06 17:06:59 UTC) #51

commit-bot: I haz the power

FYI, CQ is re-trying this CL (attempt #1). Please consider checking whether the failures are ...

6 years, 7 months ago (2014-05-07 08:09:22 UTC) #52

Message was sent while issue was closed.

Change committed as 268731


Patch Set 1
Patch Set 2: .
Patch Set 3: addressed comments.
Patch Set 4: revert proto change and emit a new histogram
Patch Set 5: .
Patch Set 6: addressed comments.
Patch Set 7: log on connection type change.
Patch Set 8: .
Patch Set 9: addressed comments.
Patch Set 10: .
Patch Set 11: log non-mobile code as well.
Patch Set 12: addressed comments.
