|
|
Created:
4 years, 11 months ago by Polina Bondarenko Modified:
4 years, 10 months ago Reviewers:
Thiemo Nagel, Ilya Sherman, jochen (gone - plz use gerrit), Andrew T Wilson (Slow), battre CC:
asvitkine+watch_chromium.org, chromium-reviews, davemoore+watch_chromium.org, oshima+watch_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionLog anonymization is replaced by AnonymizerTool.
The regex experessions checks of IP, URL, Email, SSID in
SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID).
Removed UMA stats from SystemLogUploader.
BUG=579040
Committed: https://crrev.com/11834980594fb8ac47572e691988610ef9b60422
Cr-Commit-Position: refs/heads/master@{#371525}
Patch Set 1 : #
Total comments: 8
Patch Set 2 : #
Total comments: 6
Patch Set 3 : #
Total comments: 2
Patch Set 4 : #
Total comments: 10
Patch Set 5 : #
Total comments: 2
Patch Set 6 : Added obsolete to removed histogram values. #
Total comments: 4
Patch Set 7 : Fixed indent. #Patch Set 8 : Rebase #
Total comments: 1
Messages
Total messages: 40 (16 generated)
Description was changed from ========== Added anonymizer to SystemLogUploader. BUG=579040 ========== to ========== Added anonymizer to SystemLogUploader. BUG=579040 ==========
pbond@chromium.org changed reviewers: + tnagel@google.com
The CQ bit was checked by pbond@chromium.org
The CQ bit was unchecked by pbond@chromium.org
Patchset #1 (id:1) has been deleted
pbond@chromium.org changed reviewers: + tnagel@chromium.org - tnagel@google.com
Description was changed from ========== Added anonymizer to SystemLogUploader. BUG=579040 ========== to ========== Added anonymizer to SystemLogUploader. AnonymizerTool removes PII from logs: SSID, BSSID, Email, URL, IP. Fixed PIITest unittest to make sure that AnonymizerTool still works as intended. BUG=579040 ==========
Concerning the commit message: "Fixed PIITest unittest to make sure that AnonymizerTool still works as intended." "still" is not correct, because AnonymizerTool wasn't used before. Could you please rephrase? https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:59: system_logs->push_back(std::make_pair(file_path, data)); Why have you moved the anonymizing step to UploadSystemLogs()? In my opinion, it is safer to anonymize as soon as possible. (This reduces the chance of introducing errors in future changes to the code.) https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:84: std::string RemoveSensitiveData(const std::string& data); I think this should be a private method. https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:130: feedback::AnonymizerTool anonymizer_; I'd suggest to let the AnonymizerTool live as a local variable in ReadFiles(). That way, it's internal state is cleared before every upload so that its address re-numbering scheme is predictable. (Otherwise re-numbering would depend on whether the device was rebooted or not.) https://codereview.chromium.org/1610123003/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1610123003/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:10265: -<histogram name="Enterprise.SystemLogPIILeak" enum="SystemLogPIIType"> I don't think it is OK to change the type of an histogram after it was created. What is the purpose of that histogram anyways? Is anybody looking at this data? If we would like to study this, AnonymizerTool would be a better place for the histogram.
Another thing: "Added anonymizer to SystemLogUploader." seems to imply that previously there was no anonymization. Thus I'd rather write that the current anonymization is replaced by AnonymizerTool. Concerning review: Since Drew reviewed the initial CLs, it might make sense to have him review the follow-up as well?
Fixed comments: - Moved anonymization to ReadFiles - Removed PIITest unit test. (I don't see why we need it since there are unittests for AnonymizerTool) - Removed UMA histograms for System logs. (Not sure, could I do that, but it seems like we already know, that there are some types of PII in system logs and we remove this data, why should we log that?) https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:59: system_logs->push_back(std::make_pair(file_path, data)); On 2016/01/21 14:22:29, Thiemo Nagel wrote: > Why have you moved the anonymizing step to UploadSystemLogs()? In my opinion, > it is safer to anonymize as soon as possible. (This reduces the chance of > introducing errors in future changes to the code.) Yes, you're right, I like mostly the idea of moving AnonymizerTool and all anonymization to ReadFiles. https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:84: std::string RemoveSensitiveData(const std::string& data); On 2016/01/21 14:22:29, Thiemo Nagel wrote: > I think this should be a private method. Thinking about this, I decided to remove this method at all and make all anonymization inside ReadFiles (see code in .cc file) https://codereview.chromium.org/1610123003/diff/20001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:130: feedback::AnonymizerTool anonymizer_; On 2016/01/21 14:22:29, Thiemo Nagel wrote: > I'd suggest to let the AnonymizerTool live as a local variable in ReadFiles(). > That way, it's internal state is cleared before every upload so that its address > re-numbering scheme is predictable. (Otherwise re-numbering would depend on > whether the device was rebooted or not.) Done. https://codereview.chromium.org/1610123003/diff/20001/tools/metrics/histogram... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1610123003/diff/20001/tools/metrics/histogram... tools/metrics/histograms/histograms.xml:10265: -<histogram name="Enterprise.SystemLogPIILeak" enum="SystemLogPIIType"> On 2016/01/21 14:22:29, Thiemo Nagel wrote: > I don't think it is OK to change the type of an histogram after it was created. > What is the purpose of that histogram anyways? Is anybody looking at this data? > > If we would like to study this, AnonymizerTool would be a better place for the > histogram. Removed.
Patchset #2 (id:40001) has been deleted
Please also see my previous comments about the description! https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:15: #include "base/metrics/histogram_macros.h" I think this is not needed anymore. https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:27: #include "components/policy/core/common/cloud/enterprise_metrics.h" I think this is not needed anymore. https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (left): https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:290: TEST_F(SystemLogUploaderTest, TestPII) { I'd prefer to keep that test to prove that the algorithm change doesn't introduce a regression.
Fixed comments: moved anonymizer back to SystemLogUploader::RemovePIIData for testing. https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:15: #include "base/metrics/histogram_macros.h" On 2016/01/21 16:54:35, Thiemo Nagel wrote: > I think this is not needed anymore. Done. https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.cc:27: #include "components/policy/core/common/cloud/enterprise_metrics.h" On 2016/01/21 16:54:35, Thiemo Nagel wrote: > I think this is not needed anymore. Done. https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (left): https://codereview.chromium.org/1610123003/diff/60001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:290: TEST_F(SystemLogUploaderTest, TestPII) { On 2016/01/21 16:54:35, Thiemo Nagel wrote: > I'd prefer to keep that test to prove that the algorithm change doesn't > introduce a regression. Done.
LGTM % comment and % fixing the commit message as commented previously. https://codereview.chromium.org/1610123003/diff/80001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/80001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:19: #include "components/feedback/anonymizer_tool.h" Please use forward declaration of AnonymizerTool instead of including the header.
Description was changed from ========== Added anonymizer to SystemLogUploader. AnonymizerTool removes PII from logs: SSID, BSSID, Email, URL, IP. Fixed PIITest unittest to make sure that AnonymizerTool still works as intended. BUG=579040 ========== to ========== Log anonymization is replaced by AnonymizerTool. The regex experessions checks of IP, URL, Email, SSID in SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID). Removed UMA stats from SystemLogUploader. BUG=579040 ==========
Fixed comment. https://codereview.chromium.org/1610123003/diff/80001/chrome/browser/chromeos... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/80001/chrome/browser/chromeos... chrome/browser/chromeos/policy/system_log_uploader.h:19: #include "components/feedback/anonymizer_tool.h" On 2016/01/21 19:11:49, Thiemo Nagel wrote: > Please use forward declaration of AnonymizerTool instead of including the > header. Done.
pbond@chromium.org changed reviewers: + atwilson@chromium.org
Great! LGTM.
LGTM with comments https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.cc:58: policy::SystemLogUploader::RemoveSensitiveData(anonymizer, data))); Pass |anonymizer| as const object, or as ptr. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.cc:218: return anonymizer.Anonymize(data); Is there really value to have this broken-out static routine now given that it's exactly one line? https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.h:87: static std::string RemoveSensitiveData(feedback::AnonymizerTool& anonymizer, If you're passing in an anonymizer by reference, then it needs to be const. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:306: "aaaaaaaa SSID='1'aaaaa\n" Are we OK that apparently we don't anonymize strings of the form SSID=1234 any more (only SSID='1234')?
Fixed comments. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.cc:58: policy::SystemLogUploader::RemoveSensitiveData(anonymizer, data))); On 2016/01/25 11:06:42, Andrew T Wilson (Slow) wrote: > Pass |anonymizer| as const object, or as ptr. Passed as const ptr. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.cc:218: return anonymizer.Anonymize(data); On 2016/01/25 11:06:42, Andrew T Wilson (Slow) wrote: > Is there really value to have this broken-out static routine now given that it's > exactly one line? It exists only for unit testing. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.h:87: static std::string RemoveSensitiveData(feedback::AnonymizerTool& anonymizer, On 2016/01/25 11:06:42, Andrew T Wilson (Slow) wrote: > If you're passing in an anonymizer by reference, then it needs to be const. Fixed to const ptr. The object can't be const, because the anonymizer is changed in process of anonymizing logs. E.g., it lists all IP addresses to count them. https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:306: "aaaaaaaa SSID='1'aaaaa\n" On 2016/01/25 11:06:42, Andrew T Wilson (Slow) wrote: > Are we OK that apparently we don't anonymize strings of the form SSID=1234 any > more (only SSID='1234')? I think, it's OK. AnonymizerTool covers more cases than the previous implementation. Fixed unit test to check [SSID=1234], because it mostly fits the case of /var/log/net.log
pbond@chromium.org changed reviewers: + isherman@chromium.org, jochen@chromium.org
isherman@chromium.org: Please review changes in tools/metrics/histograms/histograms.xml jochen@chromium.org: Please review changes in chrome/browser/chromeos/BUILD.gn
https://codereview.chromium.org/1610123003/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1610123003/diff/120001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:10272: -</histogram> Please mark this as <obsolete> rather than removing the entry.
Fixed histogram comment. https://codereview.chromium.org/1610123003/diff/120001/tools/metrics/histogra... File tools/metrics/histograms/histograms.xml (left): https://codereview.chromium.org/1610123003/diff/120001/tools/metrics/histogra... tools/metrics/histograms/histograms.xml:10272: -</histogram> On 2016/01/25 19:42:23, Ilya Sherman wrote: > Please mark this as <obsolete> rather than removing the entry. Done.
histograms.xml lgtm
lgtm with comments addressed https://codereview.chromium.org/1610123003/diff/140001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/140001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.h:88: feedback::AnonymizerTool* const anonymizer, can this pointer also be const? https://codereview.chromium.org/1610123003/diff/140001/chrome/chrome_browser_... File chrome/chrome_browser_chromeos.gypi (right): https://codereview.chromium.org/1610123003/diff/140001/chrome/chrome_browser_... chrome/chrome_browser_chromeos.gypi:1130: '../components/components.gyp:feedback_component', indenting looks odd?
Fixed indent. https://codereview.chromium.org/1610123003/diff/140001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.h (right): https://codereview.chromium.org/1610123003/diff/140001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.h:88: feedback::AnonymizerTool* const anonymizer, On 2016/01/26 11:47:39, jochen wrote: > can this pointer also be const? No, AnonymizerTool can't be const, it calls non-const method Anonymize() and, for example, modifies a list of already translated PII data inside the object to be able to translate IP addresses to <IP 1>, <IP 2>, etc. https://codereview.chromium.org/1610123003/diff/140001/chrome/chrome_browser_... File chrome/chrome_browser_chromeos.gypi (right): https://codereview.chromium.org/1610123003/diff/140001/chrome/chrome_browser_... chrome/chrome_browser_chromeos.gypi:1130: '../components/components.gyp:feedback_component', On 2016/01/26 11:47:39, jochen wrote: > indenting looks odd? Thanks! Fixed.
https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:306: "aaaaaaaa SSID='1'aaaaa\n" Is the new code not a superset of the old functionality? Could you please include "SSID 'sensitive_wifi'" (real example from CrOS logs, without quotes) in the unit test?
https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader_unittest.cc (right): https://codereview.chromium.org/1610123003/diff/100001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader_unittest.cc:306: "aaaaaaaa SSID='1'aaaaa\n" On 2016/01/26 15:49:24, Thiemo Nagel wrote: > Is the new code not a superset of the old functionality? > > Could you please include "SSID 'sensitive_wifi'" (real example from CrOS logs, > without quotes) in the unit test? > Yes, it is. Thiemo, do you think that we should cover all use cases here? These cases are covered in anonymizer_tool_unittests.cc
> Yes, it is. Thiemo, do you think that we should cover all use cases here? These > cases are covered in anonymizer_tool_unittests.cc Thank you, I just wanted to make sure the case is covered. :) No need to add another test if there's already one in AnonymizerTool.
Thanks for reviewing!
The CQ bit was checked by pbond@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from tnagel@chromium.org, atwilson@chromium.org, isherman@chromium.org, jochen@chromium.org Link to the patchset: https://codereview.chromium.org/1610123003/#ps180001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1610123003/180001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1610123003/180001
Message was sent while issue was closed.
Description was changed from ========== Log anonymization is replaced by AnonymizerTool. The regex experessions checks of IP, URL, Email, SSID in SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID). Removed UMA stats from SystemLogUploader. BUG=579040 ========== to ========== Log anonymization is replaced by AnonymizerTool. The regex experessions checks of IP, URL, Email, SSID in SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID). Removed UMA stats from SystemLogUploader. BUG=579040 ==========
Message was sent while issue was closed.
Committed patchset #8 (id:180001)
Message was sent while issue was closed.
Description was changed from ========== Log anonymization is replaced by AnonymizerTool. The regex experessions checks of IP, URL, Email, SSID in SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID). Removed UMA stats from SystemLogUploader. BUG=579040 ========== to ========== Log anonymization is replaced by AnonymizerTool. The regex experessions checks of IP, URL, Email, SSID in SystemLogUploader are substituted with feedback::AnonymizerTool, that treats the same and more (+BSSID). Removed UMA stats from SystemLogUploader. BUG=579040 Committed: https://crrev.com/11834980594fb8ac47572e691988610ef9b60422 Cr-Commit-Position: refs/heads/master@{#371525} ==========
Message was sent while issue was closed.
Patchset 8 (id:??) landed as https://crrev.com/11834980594fb8ac47572e691988610ef9b60422 Cr-Commit-Position: refs/heads/master@{#371525}
Message was sent while issue was closed.
battre@chromium.org changed reviewers: + battre@chromium.org
Message was sent while issue was closed.
https://codereview.chromium.org/1610123003/diff/180001/chrome/browser/chromeo... File chrome/browser/chromeos/policy/system_log_uploader.cc (left): https://codereview.chromium.org/1610123003/diff/180001/chrome/browser/chromeo... chrome/browser/chromeos/policy/system_log_uploader.cc:73: const char kWebUrl[] = "(http|https|Http|Https|rtsp|Rtsp):\\/\\/"; The AnonymizerTool does not sanitize rtsp URLs. Could you add that by adapting the line #define SCHEME NCG("http|https|ftp|chrome|chrome-extension|android") in components/feedback/anonymizer_tool.cc? Thanks, Dominic |