| 
    
      
  | 
  
 Chromium Code Reviews
        
  Description[Android WebView] Implement copying and uploading of Minidumps.
During WebView start-up after crashing WebView, we will pass previously
created minidumps to CrashReceiverService which copies the minidumps to
the current WebView provider's data directory and starts a job through
JobScheduler. This job will eventually start a MinidumpUploadJobService
which in turn uploads Minidumps to Crash servers.
Both the copying-service and the upload-service run in processes
separate from the crashing app-process.
BUG=652730
Committed: https://crrev.com/19b0d1b784a9581b02a5dbe8e6fa1d2556743e75
Cr-Commit-Position: refs/heads/master@{#440130}
   
  Patch Set 1 #Patch Set 2 : Add test to ensure copying + uploading implementations work together. #
      Total comments: 6
      
     
  
  Patch Set 3 : Add some more tests, move back getCrashType() from CrashFileManager to MDUS. #
      Total comments: 1
      
     
  
  Patch Set 4 : Make minidump-copying serialized, put temporary copy-files in separate directory. #Patch Set 5 : Removed some incorrect/unnecessary minor stuff (comments/logs). #
      Total comments: 85
      
     
  
  Patch Set 6 : Fix most of Ilya's comments (a couple of bugs and some javadocs comments). #
      Total comments: 2
      
     
  
  Patch Set 7 : Add storage/app-throttling + minidump file size restriction. #
      Total comments: 8
      
     
  
  Patch Set 8 : Add command line argument to ensure minidump uploading is turned off by default. #Patch Set 9 : Remove debug logging. #
      Total comments: 12
      
     
  
  Patch Set 10 : Fix (most of) Toby's comments. #
      Total comments: 23
      
     
  
  Patch Set 11 : Fix Ilya's comments + change upload-job back-off timing. #
      Total comments: 4
      
     
  
  Patch Set 12 : Fix Ilya's comments for realz (count maxTried minidumps towards minidump limit). #Patch Set 13 : Fix bug that deleted the newest minidump instead of the oldest when copying a minidump after hittin… #Patch Set 14 : Add minidump-component deps to android_webview. Fix file-io-related trybot issues. #
      Total comments: 2
      
     
  
  Patch Set 15 : Fix trybot failures + close ParcelFDs after use + fix Ilya's nit. #Patch Set 16 : Fix findbugs errors. #Patch Set 17 : Fix findbugs errors for realz! #Messages
    Total messages: 63 (15 generated)
     
  
  
 gsennton@chromium.org changed reviewers: + tobiasjs@chromium.org 
 Just uploading this for visibility right now. 
 gsennton@chromium.org changed reviewers: + isherman@chromium.org 
 Hi Ilya, I'm not asking for this to be reviewed yet (there are too many TODOs in there) but could you please have a look at the couple of comments in PS2 and see if you have any thoughts? :) https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:436: // TODO does it make any sense to put this method here? (except that it is then available to Can we move getCrashType() here? getCrashType() would be used from MinidumpUploaderTest to be able to differentiate between two minidumps - i.e. we create one browser-minidump and one renderer-minidump but only copy one of them, and then ensure that the one that was copied is indeed uploaded correctly (and we can't differentiate between them using file names since the act of copying the files will rename them using some globally unique name). https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:485: * TODO need to ensure we don't accidentally delete these files during clean-up.. maybe only The current behaviour for this method is: 1. Copy minidump from file descriptor to temporary file (Crash Files/foo.tmp). 2. Rename tmp-file as a minidump, i.e. 'Crash Files/foo.tmp' -> 'Crash Files/foo.dmp' If we happen to run CrashFileManager.cleanOutAllNonFreshMinidumpFiles() between steps 1 and 2 above we will remove the current tmp file. Should we be holding a lock on the temporary file somehow - to avoid it being deleted? (or change the implementation of cleanOutAllNonFreshMinidumpFiles() to only remove temporary files that were created a certain time ago? (e.g. at least 2 minutes ago). 
 https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:436: // TODO does it make any sense to put this method here? (except that it is then available to On 2016/11/29 14:21:09, gsennton wrote: > Can we move getCrashType() here? getCrashType() would be used from > MinidumpUploaderTest to be able to differentiate between two minidumps - i.e. we > create one browser-minidump and one renderer-minidump but only copy one of them, > and then ensure that the one that was copied is indeed uploaded correctly (and > we can't differentiate between them using file names since the act of copying > the files will rename them using some globally unique name). Dunno, I guess I don't see any major concern about moving this here. It's not a perfect spot for it, but it's not a particularly poor one, either. https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:485: * TODO need to ensure we don't accidentally delete these files during clean-up.. maybe only On 2016/11/29 14:21:09, gsennton wrote: > The current behaviour for this method is: > 1. Copy minidump from file descriptor to temporary file (Crash Files/foo.tmp). > 2. Rename tmp-file as a minidump, i.e. 'Crash Files/foo.tmp' -> 'Crash > Files/foo.dmp' > > > If we happen to run CrashFileManager.cleanOutAllNonFreshMinidumpFiles() between > steps 1 and 2 above we will remove the current tmp file. Should we be holding a > lock on the temporary file somehow - to avoid it being deleted? (or change the > implementation of cleanOutAllNonFreshMinidumpFiles() to only remove temporary > files that were created a certain time ago? (e.g. at least 2 minutes ago). Adding a safety check to the cleanup code to not delete really recent files seems reasonable to me, assuming this doesn't have some sort of horrible consequence on machines where the system clock is bonkers =) 
 https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:436: // TODO does it make any sense to put this method here? (except that it is then available to On 2016/11/30 01:26:04, Ilya Sherman wrote: > On 2016/11/29 14:21:09, gsennton wrote: > > Can we move getCrashType() here? getCrashType() would be used from > > MinidumpUploaderTest to be able to differentiate between two minidumps - i.e. > we > > create one browser-minidump and one renderer-minidump but only copy one of > them, > > and then ensure that the one that was copied is indeed uploaded correctly (and > > we can't differentiate between them using file names since the act of copying > > the files will rename them using some globally unique name). > > Dunno, I guess I don't see any major concern about moving this here. It's not a > perfect spot for it, but it's not a particularly poor one, either. Actually, I'm just gonna read the entire file and ensure the contents match before and after the copying + uploading (and leave the getCrashType-method in MinidumpUploadService). :) https://codereview.chromium.org/2515353005/diff/20001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:485: * TODO need to ensure we don't accidentally delete these files during clean-up.. maybe only On 2016/11/30 01:26:04, Ilya Sherman wrote: > On 2016/11/29 14:21:09, gsennton wrote: > > The current behaviour for this method is: > > 1. Copy minidump from file descriptor to temporary file (Crash Files/foo.tmp). > > 2. Rename tmp-file as a minidump, i.e. 'Crash Files/foo.tmp' -> 'Crash > > Files/foo.dmp' > > > > > > If we happen to run CrashFileManager.cleanOutAllNonFreshMinidumpFiles() > between > > steps 1 and 2 above we will remove the current tmp file. Should we be holding > a > > lock on the temporary file somehow - to avoid it being deleted? (or change the > > implementation of cleanOutAllNonFreshMinidumpFiles() to only remove temporary > > files that were created a certain time ago? (e.g. at least 2 minutes ago). > > Adding a safety check to the cleanup code to not delete really recent files > seems reasonable to me, assuming this doesn't have some sort of horrible > consequence on machines where the system clock is bonkers =) Urk, are there any such devices? :) Not sure how to ensure this doesn't break anything in the real world :/ 
 > Urk, are there any such devices? :) > Not sure how to ensure this doesn't break anything in the real world :/ It's moderately safe to assume that the system clock is almost always monotonically increasing, because things will get really weird for a lot of apps if that's not the case. But I have no problem with losing a few minidumps because the system clock says they're really old, but reality disagrees. 
 On 2016/11/30 17:57:36, Tobias Sargeant wrote: > > Urk, are there any such devices? :) > > Not sure how to ensure this doesn't break anything in the real world :/ > > It's moderately safe to assume that the system clock is almost always > monotonically increasing, because things will get really weird for a lot of apps > if that's not the case. But I have no problem with losing a few minidumps > because the system clock says they're really old, but reality disagrees. I guess my concern is that some system might have a frozen or backwards-running clock, and we'd end up never deleting old files on that system. I'm not sure how valid this concern is... (FWIW, I think it's fine if we sometimes think that minidump files are old when they are new, since it's rare that this check should matter anyway; I'd expect the intersection of race conditions + seriously overeager system clocks to be rare enough to be willing to lose those crash reports. I'm more worried about this check causing files to stick around forever on systems with a sufficiently busted clock.) 
 On 2016/12/01 00:42:44, Ilya Sherman wrote: > On 2016/11/30 17:57:36, Tobias Sargeant wrote: > > > Urk, are there any such devices? :) > > > Not sure how to ensure this doesn't break anything in the real world :/ > > > > It's moderately safe to assume that the system clock is almost always > > monotonically increasing, because things will get really weird for a lot of > apps > > if that's not the case. But I have no problem with losing a few minidumps > > because the system clock says they're really old, but reality disagrees. > > I guess my concern is that some system might have a frozen or backwards-running > clock, and we'd end up never deleting old files on that system. I'm not sure > how valid this concern is... (FWIW, I think it's fine if we sometimes think > that minidump files are old when they are new, since it's rare that this check > should matter anyway; I'd expect the intersection of race conditions + seriously > overeager system clocks to be rare enough to be willing to lose those crash > reports. I'm more worried about this check causing files to stick around > forever on systems with a sufficiently busted clock.) I think this would be adequately covered by having a maximum number of stored files, and deleting the file that has had most upload tries when that limit is exceeded, breaking ties based upon ctime. 
 On 2016/12/01 10:45:20, Tobias Sargeant wrote: > On 2016/12/01 00:42:44, Ilya Sherman wrote: > > On 2016/11/30 17:57:36, Tobias Sargeant wrote: > > > > Urk, are there any such devices? :) > > > > Not sure how to ensure this doesn't break anything in the real world :/ > > > > > > It's moderately safe to assume that the system clock is almost always > > > monotonically increasing, because things will get really weird for a lot of > > apps > > > if that's not the case. But I have no problem with losing a few minidumps > > > because the system clock says they're really old, but reality disagrees. > > > > I guess my concern is that some system might have a frozen or > backwards-running > > clock, and we'd end up never deleting old files on that system. I'm not sure > > how valid this concern is... (FWIW, I think it's fine if we sometimes think > > that minidump files are old when they are new, since it's rare that this check > > should matter anyway; I'd expect the intersection of race conditions + > seriously > > overeager system clocks to be rare enough to be willing to lose those crash > > reports. I'm more worried about this check causing files to stick around > > forever on systems with a sufficiently busted clock.) > > I think this would be adequately covered by having a maximum number of stored > files, and deleting the file that has had most upload tries when that limit is > exceeded, breaking ties based upon ctime. We are only talking about tmp-files here so they wouldn't have any upload tries (and if the system clock is busted, won't any argument we have about timing just be invalid? (so keeping the first X files won't work since "first" is undefined?) I guess the alternative would be to lock access to the temp-file while creating (and renaming) it, is such a thing possible? 
 On 2016/12/01 13:23:24, gsennton wrote: > We are only talking about tmp-files here so they wouldn't have any upload tries > (and if the system clock is busted, won't any argument we have about timing just > be invalid? (so keeping the first X files won't work since "first" is > undefined?) In that case, can't cleanOutAllNonFreshMinidumpFiles explicitly skip tempfiles, and then have the copying code guarantee that it has either renamed the tempfile correctly, or has deleted it? The only way that could fail is having the service be killed without generating an exception. Also, you could clear out all tempfiles at the beginning of the copy (before you create your own) if that only executed on a single thread (or was synchronized - both of which seem reasonable, given that multiple apps shouldn't be connecting to the service at the same time, as a rule). 
 On 2016/12/01 13:52:20, Tobias Sargeant wrote: > On 2016/12/01 13:23:24, gsennton wrote: > > > We are only talking about tmp-files here so they wouldn't have any upload > tries > > (and if the system clock is busted, won't any argument we have about timing > just > > be invalid? (so keeping the first X files won't work since "first" is > > undefined?) > > In that case, can't cleanOutAllNonFreshMinidumpFiles explicitly skip tempfiles, > and then have the copying code guarantee that it has either renamed the tempfile > correctly, or has deleted it? The only way that could fail is having the service > be killed without generating an exception. I don't think we want to skip deleting temp-files since we are supporting the case of creating .tmp files in the crash directory (but I might be wrong ;)). > Also, you could clear out all tempfiles at the beginning of the copy (before you > create your own) if that only executed on a single thread (or was synchronized - > both of which seem reasonable, given that multiple apps shouldn't be connecting > to the service at the same time, as a rule). Yeah, we could probably just make the entire copying-mechanism single-threaded, though we might still want to clean out crash files from some other thread (e.g. from the upload-service) but I guess the chance of us hitting that case (cleaning out files just when the tmp-file is being renamed) would be small enough to ignore? 
 On 2016/12/01 16:01:05, gsennton wrote: > On 2016/12/01 13:52:20, Tobias Sargeant wrote: > > On 2016/12/01 13:23:24, gsennton wrote: > > > > > We are only talking about tmp-files here so they wouldn't have any upload > > tries > > > (and if the system clock is busted, won't any argument we have about timing > > just > > > be invalid? (so keeping the first X files won't work since "first" is > > > undefined?) > > > > In that case, can't cleanOutAllNonFreshMinidumpFiles explicitly skip > tempfiles, > > and then have the copying code guarantee that it has either renamed the > tempfile > > correctly, or has deleted it? The only way that could fail is having the > service > > be killed without generating an exception. > > > I don't think we want to skip deleting temp-files since we are supporting the > case of creating .tmp files in the crash directory (but I might be wrong ;)). > > > Also, you could clear out all tempfiles at the beginning of the copy (before > you > > create your own) if that only executed on a single thread (or was synchronized > - > > both of which seem reasonable, given that multiple apps shouldn't be > connecting > > to the service at the same time, as a rule). > > Yeah, we could probably just make the entire copying-mechanism single-threaded, > though we might still want to clean out crash files from some other thread (e.g. > from the upload-service) but I guess the chance of us hitting that case > (cleaning out files just when the tmp-file is being renamed) would be small > enough to ignore? To take a step back here, the initial concern was that there's a small race condition: the cleanup code can run between when the tmp files are created and when they're renamed, right? Would it work to only apply this check to the *first* .tmp file that we see? (The OS will return them in *some* order, it doesn't have to be the first chronologically.) I think this dramatically reduces the likelihood of the race causing problems, and also puts a pretty strict cap on the number of files that might stick around on a system with a funny clock. 
 Regarding your suggestion Ilya: I could be wrong but it feels like only applying the is-old-enough check to one tmp-file is very specific/special behaviour that could easily work in an unexpected way? Maybe we should store the copying-tmp files in a way/place where CrashFileManager won't find them (so we don't have to mess with the tmp-removal logic of CrashFileManager since that could break the way we use tmp-files in Chrome) and just have the copy-method take care of deleting the tmp-file (ensuring that it is deleted if we fail to rename it)? Regarding the current CL (PS3): I think it's now reached a reviewable state, there are still some (fairly major) things I need to fix, I'll try to list them all below: 1. App connections to the copying-service (CrashReceiverService) - the copying should be handled synchronously / on a single thread rather than on several binder threads (like now). I believe we still want calls into the service to be synchronous though, so that the service doesn't get shut down without having copied the files (which I believe can happen if apps call into the service asynchronously and then unbind from it before the files are done being copied). 2. Upload throttling a) Could implement app-throttling (only allow an app to copy a certain number of files per upload-batch). b) Could implement storage-throttling (only allow X minidumps to be stored at the same time per device). c) Could implement upload-throttling (add a minimum amount of time between successful batches of uploads + failure-backoff). 3. Cleanups in the MinidumpUploaderJobService when a job is done (e.g. not leaking the worker thread used). Apart from those things I think this CL is reviewable - could you ptal (if you see that some of the TODO questions have obvious solutions please let me know :)). Ilya, I'm aware that you are not a WebView reviewer, but maybe you should take a look at the way I'm using the minidump_uploader APIs, especially in CrashReceiverService, MinidumpUploaderImpl, and MinidumpUploaderTest? (if that makes sense :)). https://codereview.chromium.org/2515353005/diff/40001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/40001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:285: if (ageInSeconds >= MIN_TMPFILE_AGE_IN_SECONDS) { Oops, for now (as long as we haven't resolved the tmp-file issue) I'll just leave this code here. 
 On 2016/12/02 12:48:51, gsennton wrote: > Regarding your suggestion Ilya: > I could be wrong but it feels like only applying the is-old-enough check to one > tmp-file is very specific/special behaviour that could easily work in an > unexpected way? > > Maybe we should store the copying-tmp files in a way/place where > CrashFileManager won't find them (so we don't have to mess with the tmp-removal > logic of CrashFileManager since that could break the way we use tmp-files in > Chrome) and just have the copy-method take care of deleting the tmp-file > (ensuring that it is deleted if we fail to rename it)? Regarding the temporary copy-files we reached the conclusion that we can just put these in a separate directory (not recognized by CrashFileManager) and then rename each file to move it to the directory recognized by CrashFileManager. In this way we can run the cleanup-procedure in the upload-service, while only cleaning up copy-temp-files from the copying-service (and as long as we handle copying minidumps sequentially, we can't hit the case of removing copy-tmp-files while they are not yet renamed). 
 Sorry, I might not have a chance to take a look today. Will try to get to it by tomorrow evening if not tonight (all times PST, alas). 
 On 2016/12/06 04:10:20, Ilya Sherman wrote: > Sorry, I might not have a chance to take a look today. Will try to get to it by > tomorrow evening if not tonight (all times PST, alas). No worries, if you haven't had time to look at it when you see this message it might be better to just wait for the next PS (I'm currently fixing the copying-service so that it copies minidumps on one single thread and I think I've added another test since PS3 and made some other minor changes). I'll try to get a new PS uploaded today. 
 On 2016/12/06 04:10:20, Ilya Sherman wrote: > Sorry, I might not have a chance to take a look today. Will try to get to it by > tomorrow evening if not tonight (all times PST, alas). No worries, if you haven't had time to look at it when you see this message it might be better to just wait for the next PS (I'm currently fixing the copying-service so that it copies minidumps on one single thread and I think I've added another test since PS3 and made some other minor changes). I'll try to get a new PS uploaded today. 
 Alright! In PS5 the copying of minidumps is serialized so we don't run into any threading issues :). The entire CL should be cleaner as well. ptal :) 
 Haven't looked at the API uses yet, will try to get to that later today. In the meantime, here's a smattering of comments and questions for you =) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:259: public boolean ensureCrashDirExists() { Please add some docs. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:263: } Where is the corresponding Chrome code, and should it be refactored to call into this function as well? https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:331: Log.e(TAG, "in listCrashFiles with pattern " + pattern); Please remember to revert these debug statements prior to committing =) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:438: if (!tmpDir.isDirectory() && !tmpDir.mkdir()) { I'm not quite following the reasoning for having the intermediate location. What's preventing a pile of .tmp files from accumulating there? https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { This seems kind of scary, as it would prevent us from reporting new crashes if we have a backlog. I think we'd prefer to drop old crashes than new ones. I think it's okay during the lifetime of the process to exceed MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans them up. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:465: // TODO the following file could potentially already exist... Could it? Does the Chrome case handle such a case? https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:473: @VisibleForTesting nit: Does this actually need to be visible for testing? I don't see it being used outside of this file. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:474: static File createMinidumpTmpFile(File directory, String prefix) throws IOException { Please add docs. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:478: private static String createThreadSafeFilePrefix() { Please add docs. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:479: return UUID.randomUUID().toString(); Are you sure that UUID.randomUUID() is threadsafe? https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:348: public void testCopyingAbortsWhenTooManyMinidumps() throws IOException { Hmm, I'm not sure that I understand the purpose of the functionality that is being tested here. What is the concern, and why not let the clean up code address that concern? 
 I didn't fully read/think through the MinidumpUploaderImpl code yet. I think it might be useful to get a high-level overview of the design, either via a design doc or via a quick video call. If this is covered by the main design doc, could you please remind me what the link is? Thanks! https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:208: } catch (IOException e) { nit: Could this be narrowed to FileNotFoundException, or am I misreading the docs? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:211: minidumpFiles[i].delete(); Hmm, I don't understand this line. Why do we delete the file before we've read it? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? Losing crashes seems bad. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:63: return; Is it worth throwing an exception in this case? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:67: Log.e(TAG, "something went wrong when waiting to copy minidumps, bailing!"); Where is the code that bails? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:84: while (mIsCopying) { I'm probably just being dense, but I don't understand how this code works. Where is mIsCopying ever set to false? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:129: Context context, String appPackageName, ParcelFileDescriptor[] fileDescriptors) { Is this idea that each app that instantiates a WebView generates its own crash reports, and then those are copied over into some central storage location, and then they are all uploaded from the central storage location? What's the advantage of the centralization over each app being responsible for uploading? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); What happens to the crash report in this case? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:33: if (mRunningJob) throw new RuntimeException("Already running a job."); Is this essentially a fancy way to write an assertion, or is the code actually relying on catching the exception if there's a race? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { Is it possible to enforce single-threadedness rather than using synchronization primitives? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:102: return true; Hmm, what happens if the client *was* in the sample when the minidump was generated, but was ejected by the time the upload code runs -- say, because they didn't have a network connection for a while? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:116: return true; It's probably worth documenting this return value. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; Umm, why is this hardcoded to true? Do users not get a say about whether they want to report crashes? 
 Thanks for the review Ilya! There is a design doc here: https://docs.google.com/a/google.com/document/d/1xFkDwIwr3_XzGhohi_BnpPs0G8KI... Most of what you are looking for should be under the "Detailed Design" section. CrashReceiverService is what I refer to as the copying-service in the doc, and MinidumpUploadJobService is the upload-service - with MinidumpUploaderImpl being the class that actually implements the upload-logic and MinidumpUploadJobService being the connection between that and the JobScheduler. Let me know if that description is unclear / if you want a vc :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:208: } catch (IOException e) { On 2016/12/07 00:59:19, Ilya Sherman wrote: > nit: Could this be narrowed to FileNotFoundException, or am I misreading the > docs? Done. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:211: minidumpFiles[i].delete(); On 2016/12/07 00:59:19, Ilya Sherman wrote: > Hmm, I don't understand this line. Why do we delete the file before we've read > it? We have a filedescriptor pointing to the file, so the [I'm not sure about the terminology here] handle to the file will be deleted here, but the actual contents can still be read from the file descriptor. If we would want to be able to retry copying the file at some later point we can't call delete here. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/07 00:59:19, Ilya Sherman wrote: > Losing crashes seems bad. I guess this is a trade-off between storing crash data in the app's data directory (which is where minidumpFiles are stored before we copy them through the transmitCrashes method) and uploading crashes even when we fail to connect to the copying-service at some point. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:63: return; On 2016/12/07 00:59:19, Ilya Sherman wrote: > Is it worth throwing an exception in this case? The point of this code would be to ensure that the appPackageName string is actually the name of the app that is connecting to us (so that an app can't fake its own name). So what we would want to do here is probably just return directly - this API is open to all apps on the device so we wouldn't want to present a way to crash the service like this :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:67: Log.e(TAG, "something went wrong when waiting to copy minidumps, bailing!"); On 2016/12/07 00:59:19, Ilya Sherman wrote: > Where is the code that bails? Haha, good question, adding a return-statement :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:84: while (mIsCopying) { On 2016/12/07 00:59:19, Ilya Sherman wrote: > I'm probably just being dense, but I don't understand how this code works. > Where is mIsCopying ever set to false? Oh wow (I think this was the last code I added before uploading - so I probably didn't test copying several minidumps in a row before the service got desctructed). This is a bug :), mIsCopying should be set to false when we are done copying minidumps. Thanks I fixed this and added a test testCopySeveralMinidumpBatches. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:129: Context context, String appPackageName, ParcelFileDescriptor[] fileDescriptors) { On 2016/12/07 00:59:19, Ilya Sherman wrote: > Is this idea that each app that instantiates a WebView generates its own crash > reports, and then those are copied over into some central storage location, and > then they are all uploaded from the central storage location? What's the > advantage of the centralization over each app being responsible for uploading? There are some reasons listed here: https://docs.google.com/a/google.com/document/d/1xFkDwIwr3_XzGhohi_BnpPs0G8KI... Additionaly I think we should be careful with how much data we are storing in an app's data directory - if we would have the app upload the data itself we might have to store the data there for a long time, while if we just copy the data to WebView's data directory, we will usually be able to clear out the data stored in the app's directory on the next app startup. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:136: File copiedFile = crashFileManager.copyMinidumpFromFD(fd.getFileDescriptor(), We should probably have basic checks to ensure we are copying a minidump rather than a random file? Do you know any good checks we can perform? (minidump format / headers) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); On 2016/12/07 00:59:19, Ilya Sherman wrote: > What happens to the crash report in this case? As is the current implemenetation in AwBrowserProcess, it would just be deleted. We might want different behaviour :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:33: if (mRunningJob) throw new RuntimeException("Already running a job."); On 2016/12/07 00:59:19, Ilya Sherman wrote: > Is this essentially a fancy way to write an assertion, or is the code actually > relying on catching the exception if there's a race? Just an assertion (I'm not entirely sure we want it here since it's a hard fail). https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { On 2016/12/07 00:59:19, Ilya Sherman wrote: > Is it possible to enforce single-threadedness rather than using synchronization > primitives? The reason I'm using synchronization primitives here is that onStopJob is called on the main thread of the service, while any work we perform (uploading) should be performed on a separate thread (mWorkerThread). This is because all JobScheduler callbacks (onStartJob / onStopJob) are called on the main thread - so if we perform the work on the main thread we won't notice whenever the JobScheduler tells us to stop. I want to be able to interrupt the uploading when onStopJob is called - we do that in cancelUploads() in this class through calling setCancelUpload() and mWorkerThread.interrupt() - and, without then waiting for the worker thread to finish the current uploading, perform the correct cleanup operation: renaming the current minidump as a failure, i.e. ".try"+1. We do the renaming on the main thread since that is where onStopJob is called (and we thus need to synchronize the name of the current minidump to reach it from the main thread). I guess it isn't super important to rename the minidump we happen to be handling when the job is aborted through onStopJob() - if not we could just remove the setCurrentMinidump/getCurrentMinidump methods. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:102: return true; On 2016/12/07 00:59:19, Ilya Sherman wrote: > Hmm, what happens if the client *was* in the sample when the minidump was > generated, but was ejected by the time the upload code runs -- say, because they > didn't have a network connection for a while? Currently, we plan to hard-code this value on the WebView-side, so it can't change over time. Long-term we will probably want Finch for this - and then it would make sense to ask for that value here. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:116: return true; On 2016/12/07 00:59:19, Ilya Sherman wrote: > It's probably worth documenting this return value. Actually, right now I'm not sure why isMetricsUploadPermitted is needed at all, isn't this covered by the rest of the methods here? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; On 2016/12/07 00:59:19, Ilya Sherman wrote: > Umm, why is this hardcoded to true? Do users not get a say about whether they > want to report crashes? I'm investigating whether we need to check the permission here or whether we can just check it before generating minidumps (and thus if we never generate a minidump we won't upload anything). https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:259: public boolean ensureCrashDirExists() { On 2016/12/06 22:35:47, Ilya Sherman wrote: > Please add some docs. Done. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:263: } On 2016/12/06 22:35:47, Ilya Sherman wrote: > Where is the corresponding Chrome code, and should it be refactored to call into > this function as well? Chrome seems to create this directory in native code (before generating minidumps I would assume - since this is where we store minidumps ;)). It can be fetched through the PathProvider interface: https://cs.chromium.org/chromium/src/chrome/common/chrome_paths.cc?sq=package... I've added this method in Java because in the WebView-case this directory should exist in the app-directory (created through native code) but not in the WebView-directory (where the copying/upload services operate). https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:331: Log.e(TAG, "in listCrashFiles with pattern " + pattern); On 2016/12/06 22:35:47, Ilya Sherman wrote: > Please remember to revert these debug statements prior to committing =) Acknowledged. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:438: if (!tmpDir.isDirectory() && !tmpDir.mkdir()) { On 2016/12/06 22:35:47, Ilya Sherman wrote: > I'm not quite following the reasoning for having the intermediate location. > What's preventing a pile of .tmp files from accumulating there? The idea would be to ensure that this method doesn't leave any file in that directory (so I think I still need to add a finally-statement in here to ensure that contract is fulfilled). The reason we want the intermediate location is to ensure the upload-service doesn't happen to delete the tmp-file mid-copy (when cleaning out files from that service). https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { On 2016/12/06 22:35:47, Ilya Sherman wrote: > This seems kind of scary, as it would prevent us from reporting new crashes if > we have a backlog. I think we'd prefer to drop old crashes than new ones. I > think it's okay during the lifetime of the process to exceed > MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans them > up. Yeah, that sounds good (dropping old crashes rather than new ones) - I just think we should have some kind of cap to ensure we aren't accumulating INF minidumps (which we could if something goes wrong / some app decides to call into our service to copy stuff over and over again). Though that cap should probably be higher than MAX_CRASH_REPORTS_TO_KEEP. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:465: // TODO the following file could potentially already exist... On 2016/12/06 22:35:47, Ilya Sherman wrote: > Could it? Does the Chrome case handle such a case? Not sure how Chrome handles potential collisions. What I mean with the TODO here is just that we are not checking whether the file already exists - so there is a chance that it does. As long as our filename generation scheme depends on the time at which they were created (and maps to some big space) this should be fine :) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:473: @VisibleForTesting On 2016/12/06 22:35:47, Ilya Sherman wrote: > nit: Does this actually need to be visible for testing? I don't see it being > used outside of this file. It is indeed only called from inside this file. I think I wanted to add some kind of test using this method but I don't see a reason to use this in a test atm. Removing the annotation and changing access modifier. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:474: static File createMinidumpTmpFile(File directory, String prefix) throws IOException { On 2016/12/06 22:35:47, Ilya Sherman wrote: > Please add docs. Done. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:478: private static String createThreadSafeFilePrefix() { On 2016/12/06 22:35:47, Ilya Sherman wrote: > Please add docs. Done. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:479: return UUID.randomUUID().toString(); On 2016/12/06 22:35:47, Ilya Sherman wrote: > Are you sure that UUID.randomUUID() is threadsafe? Not really, I'll have to look into this more. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:348: public void testCopyingAbortsWhenTooManyMinidumps() throws IOException { On 2016/12/06 22:35:47, Ilya Sherman wrote: > Hmm, I'm not sure that I understand the purpose of the functionality that is > being tested here. What is the concern, and why not let the clean up code > address that concern? The concern is that any app could call the copying-service whenever they want which would mean we could be storing lots of minidumps - so I think we should have some cap on this, and not just rely on the upload-service cleaning these up after uploading a bunch of minidumps. 
 Sorry, again not going to have a chance to make a thorough pass today; will try to tomorrow. 
 In the meantime, responding to in-progress comment-conversations =) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:211: minidumpFiles[i].delete(); On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Hmm, I don't understand this line. Why do we delete the file before we've > read > > it? > > We have a filedescriptor pointing to the file, so the [I'm not sure about the > terminology here] handle to the file will be deleted here, but the actual > contents can still be read from the file descriptor. > > If we would want to be able to retry copying the file at some later point we > can't call delete here. (a) It would be helpful to document this behavior, since it's non-obvious, at least to me =) (b) I do think that it's a good idea to think about what happens to the crash reports if the copy fail for some reason. Should there be a retry mechanism? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Losing crashes seems bad. > > I guess this is a trade-off between storing crash data in the app's data > directory (which is where minidumpFiles are stored before we copy them through > the transmitCrashes method) and uploading crashes even when we fail to connect > to the copying-service at some point. Yeah. If it's expected that we'd (almost) never lose crash reports here, then maybe just having a histogram as a sanity-check would be sufficient. But, I wouldn't be surprised if file operations are not the most reliable thing ever, especially across processes. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:136: File copiedFile = crashFileManager.copyMinidumpFromFD(fd.getFileDescriptor(), On 2016/12/07 18:51:43, gsennton wrote: > We should probably have basic checks to ensure we are copying a minidump rather > than a random file? > Do you know any good checks we can perform? (minidump format / headers) No idea. I actually know almost nothing about the contents of the files... <_< >_> <_< https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > What happens to the crash report in this case? > > As is the current implemenetation in AwBrowserProcess, it would just be deleted. > We might want different behaviour :) Yeah, or at least instrumentation as a sanity-check. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:33: if (mRunningJob) throw new RuntimeException("Already running a job."); On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Is this essentially a fancy way to write an assertion, or is the code actually > > relying on catching the exception if there's a race? > > Just an assertion (I'm not entirely sure we want it here since it's a hard > fail). Okay. I think it might be better to write something like "assert(mRunningJob)" (or whatever the Java equivalent is), just for clarity. But, maybe that's just because I'm not used to Java best practices. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Is it possible to enforce single-threadedness rather than using > synchronization > > primitives? > > The reason I'm using synchronization primitives here is that onStopJob is called > on the main thread of the service, while any work we perform (uploading) should > be performed on a separate thread (mWorkerThread). > > This is because all JobScheduler callbacks (onStartJob / onStopJob) are called > on the main thread - so if we perform the work on the main thread we won't > notice whenever the JobScheduler tells us to stop. > I want to be able to interrupt the uploading when onStopJob is called - we do > that in cancelUploads() in this class through calling setCancelUpload() and > mWorkerThread.interrupt() - and, without then waiting for the worker thread to > finish the current uploading, perform the correct cleanup operation: renaming > the current minidump as a failure, i.e. ".try"+1. We do the renaming on the main > thread since that is where onStopJob is called (and we thus need to synchronize > the name of the current minidump to reach it from the main thread). I guess it > isn't super important to rename the minidump we happen to be handling when the > job is aborted through onStopJob() - if not we could just remove the > setCurrentMinidump/getCurrentMinidump methods. Yeah, it's not clear whether it's worth considering a case like this to be a "failed" try -- onStopJob would be called if the network connection shifted out from under us, right? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:116: return true; On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > It's probably worth documenting this return value. > > Actually, right now I'm not sure why isMetricsUploadPermitted is needed at all, > isn't this covered by the rest of the methods here? Yeah, it is. It might be possible to refactor the code to be cleaner; I think it's mostly a weird artifact from how the code used to be structured. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Umm, why is this hardcoded to true? Do users not get a say about whether they > > want to report crashes? > > I'm investigating whether we need to check the permission here or whether we can > just check it before generating minidumps (and thus if we never generate a > minidump we won't upload anything). In Chrome, we honor the user's decision not to upload anything if they opt out even after generating the minidump. I guess Android might have a slightly different policy; but I'd definitely encourage you to verify that, and maybe choose the more conservative policy regardless =) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:438: if (!tmpDir.isDirectory() && !tmpDir.mkdir()) { On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > I'm not quite following the reasoning for having the intermediate location. > > What's preventing a pile of .tmp files from accumulating there? > > The idea would be to ensure that this method doesn't leave any file in that > directory (so I think I still need to add a finally-statement in here to ensure > that contract is fulfilled). > The reason we want the intermediate location is to ensure the upload-service > doesn't happen to delete the tmp-file mid-copy (when cleaning out files from > that service). What if the app crashes while copying files? =) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > This seems kind of scary, as it would prevent us from reporting new crashes if > > we have a backlog. I think we'd prefer to drop old crashes than new ones. I > > think it's okay during the lifetime of the process to exceed > > MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans them > > up. > > Yeah, that sounds good (dropping old crashes rather than new ones) - I just > think we should have some kind of cap to ensure we aren't accumulating INF > minidumps (which we could if something goes wrong / some app decides to call > into our service to copy stuff over and over again). Though that cap should > probably be higher than MAX_CRASH_REPORTS_TO_KEEP. Fair enough, a separate cap probably makes sense. 
 Thanks Ilya! Don't worry about not having had time to do a thorough pass - right now I think I'm more interested in getting the discussions sorted anyway :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:211: minidumpFiles[i].delete(); On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > Hmm, I don't understand this line. Why do we delete the file before we've > > read > > > it? > > > > We have a filedescriptor pointing to the file, so the [I'm not sure about the > > terminology here] handle to the file will be deleted here, but the actual > > contents can still be read from the file descriptor. > > > > If we would want to be able to retry copying the file at some later point we > > can't call delete here. > > (a) It would be helpful to document this behavior, since it's non-obvious, at > least to me =) > (b) I do think that it's a good idea to think about what happens to the crash > reports if the copy fail for some reason. Should there be a retry mechanism? I agree on both points :) Had a chat with Toby, and he pointed out that the most likely explanations for files failing to be copied are such that retrying won't help (such as the disk being full, or file system corruption). This point together with us not caring much if we lose some small amount of crashes means we shouldn't care much about this (but if there is some nice way to check that our assumptions regarding copying-failures not happening very often hold then that would be nice to have I think - we don't currently have UMA for WebView though). Though right now we could be dropping reports if the binder-call into the service fails for whatever reason.. (the RemoteException below). We might want to keep the files if we receive a RemoteException.. (but that could also mean succeeding to copy some files, and then retrying them again later to create duplicates). https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > Losing crashes seems bad. > > > > I guess this is a trade-off between storing crash data in the app's data > > directory (which is where minidumpFiles are stored before we copy them through > > the transmitCrashes method) and uploading crashes even when we fail to connect > > to the copying-service at some point. > > Yeah. If it's expected that we'd (almost) never lose crash reports here, then > maybe just having a histogram as a sanity-check would be sufficient. But, I > wouldn't be surprised if file operations are not the most reliable thing ever, > especially across processes. Histogram as in UMA? (we don't have UMA yet for WebView - we should have it fairly soon, but I'm not sure about the exact time frame). https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:136: File copiedFile = crashFileManager.copyMinidumpFromFD(fd.getFileDescriptor(), On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > We should probably have basic checks to ensure we are copying a minidump > rather > > than a random file? > > Do you know any good checks we can perform? (minidump format / headers) > > No idea. I actually know almost nothing about the contents of the files... > > <_< > >_> > <_< Had a chat with Toby, and we think it's enough to have a minimal check for the Minidump-signature (first 4 bytes == "MDMP" or "PMDM") and also ensure the file we copy is not larger than 1MB. This is just to do a minimal check against mistakes and ensure we aren't storing/uploading any huge files. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > What happens to the crash report in this case? > > > > As is the current implemenetation in AwBrowserProcess, it would just be > deleted. > > We might want different behaviour :) > > Yeah, or at least instrumentation as a sanity-check. What do you mean by instrumentation here; an instrumentation test? I'm not sure we can test the AwBrowserProcess logic here as it is directing to CrashReceiverService through binder, but I might be missing something :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:33: if (mRunningJob) throw new RuntimeException("Already running a job."); On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > Is this essentially a fancy way to write an assertion, or is the code > actually > > > relying on catching the exception if there's a race? > > > > Just an assertion (I'm not entirely sure we want it here since it's a hard > > fail). > > Okay. I think it might be better to write something like "assert(mRunningJob)" > (or whatever the Java equivalent is), just for clarity. But, maybe that's just > because I'm not used to Java best practices. Right, the problem with java asserts is that they are almost never enabled - there is a (recent) ongoing discussion about whether to enable asserts on some clank channels here: https://groups.google.com/a/google.com/forum/#!topic/clank-team/s13MZndZhH0 It looks like they are enabled for debug-builds but not release builds, which sounds like what we want here - I'll change to using an assert here for PS7. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > Is it possible to enforce single-threadedness rather than using > > synchronization > > > primitives? > > > > The reason I'm using synchronization primitives here is that onStopJob is > called > > on the main thread of the service, while any work we perform (uploading) > should > > be performed on a separate thread (mWorkerThread). > > > > This is because all JobScheduler callbacks (onStartJob / onStopJob) are called > > on the main thread - so if we perform the work on the main thread we won't > > notice whenever the JobScheduler tells us to stop. > > I want to be able to interrupt the uploading when onStopJob is called - we do > > that in cancelUploads() in this class through calling setCancelUpload() and > > mWorkerThread.interrupt() - and, without then waiting for the worker thread to > > finish the current uploading, perform the correct cleanup operation: renaming > > the current minidump as a failure, i.e. ".try"+1. We do the renaming on the > main > > thread since that is where onStopJob is called (and we thus need to > synchronize > > the name of the current minidump to reach it from the main thread). I guess it > > isn't super important to rename the minidump we happen to be handling when the > > job is aborted through onStopJob() - if not we could just remove the > > setCurrentMinidump/getCurrentMinidump methods. > > Yeah, it's not clear whether it's worth considering a case like this to be a > "failed" try -- onStopJob would be called if the network connection shifted out > from under us, right? Correct - onStopJob will be called when the restrictions we have put on our JobScheduler job no longer hold (the only restriction we have is for us to be on an unmetered connection). Unless I'm forgetting something I don't think there is a strong reason for marking an interrupt by onStopJob() as a failure - to be honest I don't know what kind of issues we want to protect ourselves against by marking file-uploads as retries in general, do you know? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:116: return true; On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > It's probably worth documenting this return value. > > > > Actually, right now I'm not sure why isMetricsUploadPermitted is needed at > all, > > isn't this covered by the rest of the methods here? > > Yeah, it is. It might be possible to refactor the code to be cleaner; I think > it's mostly a weird artifact from how the code used to be structured. Alright, it seems we are using it as a sanity-check in MinidumpUploadCallable (apart from that it's used for UMA). Adding a comment here in PS7. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > Umm, why is this hardcoded to true? Do users not get a say about whether > they > > > want to report crashes? > > > > I'm investigating whether we need to check the permission here or whether we > can > > just check it before generating minidumps (and thus if we never generate a > > minidump we won't upload anything). > > In Chrome, we honor the user's decision not to upload anything if they opt out > even after generating the minidump. I guess Android might have a slightly > different policy; but I'd definitely encourage you to verify that, and maybe > choose the more conservative policy regardless =) Okidoke, the reason I haven't just put that check here is because right now this code (both the Services added in this CL) doesn't depend on much more than components/minidump_uploader, and base/. While, if we would want to check the user's decision in this Service, we would have to add a dependency on the code that communicates with GmsCore (and also probably add a command-line flag to enable developers to override the Android Checkbox choice for development/testing - the command-line flag reading code lives in native, which we don't depend on either yet). https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:438: if (!tmpDir.isDirectory() && !tmpDir.mkdir()) { On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > > I'm not quite following the reasoning for having the intermediate location. > > > What's preventing a pile of .tmp files from accumulating there? > > > > The idea would be to ensure that this method doesn't leave any file in that > > directory (so I think I still need to add a finally-statement in here to > ensure > > that contract is fulfilled). > > The reason we want the intermediate location is to ensure the upload-service > > doesn't happen to delete the tmp-file mid-copy (when cleaning out files from > > that service). > > What if the app crashes while copying files? =) Right, if the WebView process crashes while copying files then we should end up with a tmp-file we haven't removed. Actually, we are cleaning this directory from the method deleteFilesInWebViewTmpDir(Context context) over in CrashReceiverService :). https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { On 2016/12/08 07:32:30, Ilya Sherman wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > > This seems kind of scary, as it would prevent us from reporting new crashes > if > > > we have a backlog. I think we'd prefer to drop old crashes than new ones. > I > > > think it's okay during the lifetime of the process to exceed > > > MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans > them > > > up. > > > > Yeah, that sounds good (dropping old crashes rather than new ones) - I just > > think we should have some kind of cap to ensure we aren't accumulating INF > > minidumps (which we could if something goes wrong / some app decides to call > > into our service to copy stuff over and over again). Though that cap should > > probably be higher than MAX_CRASH_REPORTS_TO_KEEP. > > Fair enough, a separate cap probably makes sense. Do you think we should drop old crashes when we hit that cap, or just not copy new minidumps? https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:465: // TODO the following file could potentially already exist... On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > Could it? Does the Chrome case handle such a case? > > Not sure how Chrome handles potential collisions. What I mean with the TODO here > is just that we are not checking whether the file already exists - so there is a > chance that it does. > As long as our filename generation scheme depends on the time at which they were > created (and maps to some big space) this should be fine :) Changing this code in PS7 so that it can't collide with existing minidumps. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:479: return UUID.randomUUID().toString(); On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > Are you sure that UUID.randomUUID() is threadsafe? > > Not really, I'll have to look into this more. I'm removing this in PS7 - all we need here is the File.createTempFile utility. 
 https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > On 2016/12/07 18:51:43, gsennton wrote: > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > Losing crashes seems bad. > > > > > > I guess this is a trade-off between storing crash data in the app's data > > > directory (which is where minidumpFiles are stored before we copy them > through > > > the transmitCrashes method) and uploading crashes even when we fail to > connect > > > to the copying-service at some point. > > > > Yeah. If it's expected that we'd (almost) never lose crash reports here, then > > maybe just having a histogram as a sanity-check would be sufficient. But, I > > wouldn't be surprised if file operations are not the most reliable thing ever, > > especially across processes. > > Histogram as in UMA? (we don't have UMA yet for WebView - we should have it > fairly soon, but I'm not sure about the exact time frame). Ah, no UMA is sad times. (You have Finch but not UMA? Odd!) Are there Clearcut counters or any other form of metrics available? Or, are there any other sanity-check mechanisms available? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > On 2016/12/07 18:51:43, gsennton wrote: > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > What happens to the crash report in this case? > > > > > > As is the current implemenetation in AwBrowserProcess, it would just be > > deleted. > > > We might want different behaviour :) > > > > Yeah, or at least instrumentation as a sanity-check. > > What do you mean by instrumentation here; an instrumentation test? I'm not sure > we can test the AwBrowserProcess logic here as it is directing to > CrashReceiverService through binder, but I might be missing something :) I meant an UMA metric =) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > On 2016/12/07 18:51:43, gsennton wrote: > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > Is it possible to enforce single-threadedness rather than using > > > synchronization > > > > primitives? > > > > > > The reason I'm using synchronization primitives here is that onStopJob is > > called > > > on the main thread of the service, while any work we perform (uploading) > > should > > > be performed on a separate thread (mWorkerThread). > > > > > > This is because all JobScheduler callbacks (onStartJob / onStopJob) are > called > > > on the main thread - so if we perform the work on the main thread we won't > > > notice whenever the JobScheduler tells us to stop. > > > I want to be able to interrupt the uploading when onStopJob is called - we > do > > > that in cancelUploads() in this class through calling setCancelUpload() and > > > mWorkerThread.interrupt() - and, without then waiting for the worker thread > to > > > finish the current uploading, perform the correct cleanup operation: > renaming > > > the current minidump as a failure, i.e. ".try"+1. We do the renaming on the > > main > > > thread since that is where onStopJob is called (and we thus need to > > synchronize > > > the name of the current minidump to reach it from the main thread). I guess > it > > > isn't super important to rename the minidump we happen to be handling when > the > > > job is aborted through onStopJob() - if not we could just remove the > > > setCurrentMinidump/getCurrentMinidump methods. > > > > Yeah, it's not clear whether it's worth considering a case like this to be a > > "failed" try -- onStopJob would be called if the network connection shifted > out > > from under us, right? > > Correct - onStopJob will be called when the restrictions we have put on our > JobScheduler job no longer hold (the only restriction we have is for us to be on > an unmetered connection). > Unless I'm forgetting something I don't think there is a strong reason for > marking an interrupt by onStopJob() as a failure - to be honest I don't know > what kind of issues we want to protect ourselves against by marking file-uploads > as retries in general, do you know? I think the idea of retries is that the server might reject some uploads for whatever reason, or the current network environment might be broken. It makes more sense if there's some sort of backoff built in... which currently there isn't :P https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > On 2016/12/07 18:51:43, gsennton wrote: > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > Umm, why is this hardcoded to true? Do users not get a say about whether > > they > > > > want to report crashes? > > > > > > I'm investigating whether we need to check the permission here or whether we > > can > > > just check it before generating minidumps (and thus if we never generate a > > > minidump we won't upload anything). > > > > In Chrome, we honor the user's decision not to upload anything if they opt out > > even after generating the minidump. I guess Android might have a slightly > > different policy; but I'd definitely encourage you to verify that, and maybe > > choose the more conservative policy regardless =) > > Okidoke, the reason I haven't just put that check here is because right now this > code (both the Services added in this CL) doesn't depend on much more than > components/minidump_uploader, and base/. While, if we would want to check the > user's decision in this Service, we would have to add a dependency on the code > that communicates with GmsCore (and also probably add a command-line flag to > enable developers to override the Android Checkbox choice for > development/testing - the command-line flag reading code lives in native, which > we don't depend on either yet). Well, sounds like it's worth investigating more deeply what the requirements here are. Avoiding unnecessary dependencies is good; as long as it's not compromising on desired functionality =) https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > On 2016/12/07 18:51:43, gsennton wrote: > > > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > > > This seems kind of scary, as it would prevent us from reporting new > crashes > > if > > > > we have a backlog. I think we'd prefer to drop old crashes than new ones. > > > I > > > > think it's okay during the lifetime of the process to exceed > > > > MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans > > them > > > > up. > > > > > > Yeah, that sounds good (dropping old crashes rather than new ones) - I just > > > think we should have some kind of cap to ensure we aren't accumulating INF > > > minidumps (which we could if something goes wrong / some app decides to call > > > into our service to copy stuff over and over again). Though that cap should > > > probably be higher than MAX_CRASH_REPORTS_TO_KEEP. > > > > Fair enough, a separate cap probably makes sense. > > Do you think we should drop old crashes when we hit that cap, or just not copy > new minidumps? I guess if the cap is pretty generous, then it shouldn't matter too much, so whichever option is simplest probably makes sense. In general, I think newer reports are more useful than older ones; so if it's easy enough to prioritize those, then it's probably good to do. 
 And a code comment: https://codereview.chromium.org/2515353005/diff/100001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/100001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:85: if (copyMinidumps(context, appPackageName, fileDescriptors) && scheduleUploads) { nit: I was tempted, when reading this line, to suggest reordering the checks (for efficiency) before I realized that the order is intentional. It might be worth splitting the first check off onto a separate line, to clarify that it's work that's intended to happen unconditionally. 
 Woop woop, it feels like we are getting closer :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/09 00:42:07, Ilya Sherman wrote: > On 2016/12/08 17:14:02, gsennton wrote: > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > > Losing crashes seems bad. > > > > > > > > I guess this is a trade-off between storing crash data in the app's data > > > > directory (which is where minidumpFiles are stored before we copy them > > through > > > > the transmitCrashes method) and uploading crashes even when we fail to > > connect > > > > to the copying-service at some point. > > > > > > Yeah. If it's expected that we'd (almost) never lose crash reports here, > then > > > maybe just having a histogram as a sanity-check would be sufficient. But, I > > > wouldn't be surprised if file operations are not the most reliable thing > ever, > > > especially across processes. > > > > Histogram as in UMA? (we don't have UMA yet for WebView - we should have it > > fairly soon, but I'm not sure about the exact time frame). > > Ah, no UMA is sad times. (You have Finch but not UMA? Odd!) Are there > Clearcut counters or any other form of metrics available? Or, are there any > other sanity-check mechanisms available? We don't have Finch either! :/ But it's on our TODO-list. We have a project to implement UMA as well - but it depends on ClearCut which depends on GmsCore. There is a mini-design-doc for WebView-UMA here: https://docs.google.com/a/google.com/document/d/1QeAuYEuYfFMnU2k0NboXbept6Ijd... AFAIK we don't have any other kind of sanity-check mechanisms available :( On the other hand - this project also relies on GmsCore, so ideally UMA will be available very soon after (or even before) this project rolls out to real users. WDYT about leaving this as it is (potentially losing some minidumps - but probably/hopefully not many), and adding a TODO to add a UMA metric for this when UMA is available? https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:63: return; On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Is it worth throwing an exception in this case? > > The point of this code would be to ensure that the appPackageName string is > actually the name of the app that is connecting to us (so that an app can't fake > its own name). So what we would want to do here is probably just return directly > - this API is open to all apps on the device so we wouldn't want to present a > way to crash the service like this :) I've now changed this code to not give a darn about package names :) (only uids). https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:140: Log.w(TAG, "failed to copy minidump from " + fd.toString()); On 2016/12/09 00:42:07, Ilya Sherman wrote: > On 2016/12/08 17:14:02, gsennton wrote: > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > > What happens to the crash report in this case? > > > > > > > > As is the current implemenetation in AwBrowserProcess, it would just be > > > deleted. > > > > We might want different behaviour :) > > > > > > Yeah, or at least instrumentation as a sanity-check. > > > > What do you mean by instrumentation here; an instrumentation test? I'm not > sure > > we can test the AwBrowserProcess logic here as it is directing to > > CrashReceiverService through binder, but I might be missing something :) > > I meant an UMA metric =) Ah, yeah, we don't do sanity checks in WebView :/ (until UMA is available). Good to note here is that to use UMA here we will have to draw in the dependency on GmsCore for this service - so maybe I should just accept that we will use gms in these Services (and prototype/test pulling in the dependency locally). https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:40: synchronized (mCurrentMinidumpLock) { On 2016/12/09 00:42:07, Ilya Sherman wrote: > On 2016/12/08 17:14:02, gsennton wrote: > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > > Is it possible to enforce single-threadedness rather than using > > > > synchronization > > > > > primitives? > > > > > > > > The reason I'm using synchronization primitives here is that onStopJob is > > > called > > > > on the main thread of the service, while any work we perform (uploading) > > > should > > > > be performed on a separate thread (mWorkerThread). > > > > > > > > This is because all JobScheduler callbacks (onStartJob / onStopJob) are > > called > > > > on the main thread - so if we perform the work on the main thread we won't > > > > notice whenever the JobScheduler tells us to stop. > > > > I want to be able to interrupt the uploading when onStopJob is called - we > > do > > > > that in cancelUploads() in this class through calling setCancelUpload() > and > > > > mWorkerThread.interrupt() - and, without then waiting for the worker > thread > > to > > > > finish the current uploading, perform the correct cleanup operation: > > renaming > > > > the current minidump as a failure, i.e. ".try"+1. We do the renaming on > the > > > main > > > > thread since that is where onStopJob is called (and we thus need to > > > synchronize > > > > the name of the current minidump to reach it from the main thread). I > guess > > it > > > > isn't super important to rename the minidump we happen to be handling when > > the > > > > job is aborted through onStopJob() - if not we could just remove the > > > > setCurrentMinidump/getCurrentMinidump methods. > > > > > > Yeah, it's not clear whether it's worth considering a case like this to be a > > > "failed" try -- onStopJob would be called if the network connection shifted > > out > > > from under us, right? > > > > Correct - onStopJob will be called when the restrictions we have put on our > > JobScheduler job no longer hold (the only restriction we have is for us to be > on > > an unmetered connection). > > Unless I'm forgetting something I don't think there is a strong reason for > > marking an interrupt by onStopJob() as a failure - to be honest I don't know > > what kind of issues we want to protect ourselves against by marking > file-uploads > > as retries in general, do you know? > > I think the idea of retries is that the server might reject some uploads for > whatever reason, or the current network environment might be broken. It makes > more sense if there's some sort of backoff built in... which currently there > isn't :P Right, the current back-off would be shared by all minidumps (since we only post one job for the uploading of all minidumps together). So, I think the only reason we would want to count this as a failure is if the network request for some reason stalled for a long time (without timing out) - and it is then aborted by onStopJob because we switched off wifi. If we want to make that case a failure I think we could just add a time-out to the http request so we fail the request rather than interrupt it. Given this discussion, I'm removing the whole getCurrentMinidump() logic here since onStopJob being called isn't necessarily a good reason to mark an upload as failed. https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:102: return true; On 2016/12/07 18:51:43, gsennton wrote: > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > Hmm, what happens if the client *was* in the sample when the minidump was > > generated, but was ejected by the time the upload code runs -- say, because > they > > didn't have a network connection for a while? > > Currently, we plan to hard-code this value on the WebView-side, so it can't > change over time. > Long-term we will probably want Finch for this - and then it would make sense to > ask for that value here. Note that the reason we want Finch for this in the long-term rather than short-term is because we don't have Finch for WebView yet :) https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:121: return true; On 2016/12/09 00:42:08, Ilya Sherman wrote: > On 2016/12/08 17:14:02, gsennton wrote: > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > > Umm, why is this hardcoded to true? Do users not get a say about > whether > > > they > > > > > want to report crashes? > > > > > > > > I'm investigating whether we need to check the permission here or whether > we > > > can > > > > just check it before generating minidumps (and thus if we never generate a > > > > minidump we won't upload anything). > > > > > > In Chrome, we honor the user's decision not to upload anything if they opt > out > > > even after generating the minidump. I guess Android might have a slightly > > > different policy; but I'd definitely encourage you to verify that, and maybe > > > choose the more conservative policy regardless =) > > > > Okidoke, the reason I haven't just put that check here is because right now > this > > code (both the Services added in this CL) doesn't depend on much more than > > components/minidump_uploader, and base/. While, if we would want to check the > > user's decision in this Service, we would have to add a dependency on the code > > that communicates with GmsCore (and also probably add a command-line flag to > > enable developers to override the Android Checkbox choice for > > development/testing - the command-line flag reading code lives in native, > which > > we don't depend on either yet). > > Well, sounds like it's worth investigating more deeply what the requirements > here are. Avoiding unnecessary dependencies is good; as long as it's not > compromising on desired functionality =) We probably want UMA in these services anyway - which would mean we want to pull in the GmsCore dependency. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:446: if (getAllMinidumpFiles(maxTries).length >= MAX_CRASH_REPORTS_TO_KEEP) { On 2016/12/09 00:42:08, Ilya Sherman wrote: > On 2016/12/08 17:14:02, gsennton wrote: > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > > > > This seems kind of scary, as it would prevent us from reporting new > > crashes > > > if > > > > > we have a backlog. I think we'd prefer to drop old crashes than new > ones. > > > > > I > > > > > think it's okay during the lifetime of the process to exceed > > > > > MAX_CRASH_REPORTS_TO_KEEP, as long as the cleanup code eventually cleans > > > them > > > > > up. > > > > > > > > Yeah, that sounds good (dropping old crashes rather than new ones) - I > just > > > > think we should have some kind of cap to ensure we aren't accumulating INF > > > > minidumps (which we could if something goes wrong / some app decides to > call > > > > into our service to copy stuff over and over again). Though that cap > should > > > > probably be higher than MAX_CRASH_REPORTS_TO_KEEP. > > > > > > Fair enough, a separate cap probably makes sense. > > > > Do you think we should drop old crashes when we hit that cap, or just not copy > > new minidumps? > > I guess if the cap is pretty generous, then it shouldn't matter too much, so > whichever option is simplest probably makes sense. In general, I think newer > reports are more useful than older ones; so if it's easy enough to prioritize > those, then it's probably good to do. Yep, newer-minidumps-replace-old-ones is now implemented in PS7. https://codereview.chromium.org/2515353005/diff/80001/components/minidump_upl... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:479: return UUID.randomUUID().toString(); On 2016/12/08 17:14:02, gsennton wrote: > On 2016/12/07 18:51:43, gsennton wrote: > > On 2016/12/06 22:35:47, Ilya Sherman wrote: > > > Are you sure that UUID.randomUUID() is threadsafe? > > > > Not really, I'll have to look into this more. > > I'm removing this in PS7 - all we need here is the File.createTempFile utility. I added back UUID in PS7 - but I don't sell it as being thread safe ;) (we synchronize the file-copying anyway). https://codereview.chromium.org/2515353005/diff/100001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/100001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:85: if (copyMinidumps(context, appPackageName, fileDescriptors) && scheduleUploads) { On 2016/12/09 00:46:31, Ilya Sherman wrote: > nit: I was tempted, when reading this line, to suggest reordering the checks > (for efficiency) before I realized that the order is intentional. It might be > worth splitting the first check off onto a separate line, to clarify that it's > work that's intended to happen unconditionally. Done. 
 Thanks, Gustav! I think I should probably make another thorough pass through the files and focus extra attention on the overall system design, which I didn't quite get to in this pass. But, I do think we're getting close! Mostly just some nits remaining from looking at the patch set diffs: https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... File android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java (right): https://codereview.chromium.org/2515353005/diff/80001/android_webview/java/sr... android_webview/java/src/org/chromium/android_webview/AwBrowserProcess.java:217: // XXX: are we prepared to just lose crashes in this case? On 2016/12/09 23:08:40, gsennton wrote: > On 2016/12/09 00:42:07, Ilya Sherman wrote: > > On 2016/12/08 17:14:02, gsennton wrote: > > > On 2016/12/08 07:32:30, Ilya Sherman wrote: > > > > On 2016/12/07 18:51:43, gsennton wrote: > > > > > On 2016/12/07 00:59:19, Ilya Sherman wrote: > > > > > > Losing crashes seems bad. > > > > > > > > > > I guess this is a trade-off between storing crash data in the app's data > > > > > directory (which is where minidumpFiles are stored before we copy them > > > through > > > > > the transmitCrashes method) and uploading crashes even when we fail to > > > connect > > > > > to the copying-service at some point. > > > > > > > > Yeah. If it's expected that we'd (almost) never lose crash reports here, > > then > > > > maybe just having a histogram as a sanity-check would be sufficient. But, > I > > > > wouldn't be surprised if file operations are not the most reliable thing > > ever, > > > > especially across processes. > > > > > > Histogram as in UMA? (we don't have UMA yet for WebView - we should have it > > > fairly soon, but I'm not sure about the exact time frame). > > > > Ah, no UMA is sad times. (You have Finch but not UMA? Odd!) Are there > > Clearcut counters or any other form of metrics available? Or, are there any > > other sanity-check mechanisms available? > > We don't have Finch either! :/ But it's on our TODO-list. > We have a project to implement UMA as well - but it depends on ClearCut which > depends on GmsCore. > There is a mini-design-doc for WebView-UMA here: > https://docs.google.com/a/google.com/document/d/1QeAuYEuYfFMnU2k0NboXbept6Ijd... > AFAIK we don't have any other kind of sanity-check mechanisms available :( > > On the other hand - this project also relies on GmsCore, so ideally UMA will be > available very soon after (or even before) this project rolls out to real users. > WDYT about leaving this as it is (potentially losing some minidumps - but > probably/hopefully not many), and adding a TODO to add a UMA metric for this > when UMA is available? It's not my favorite idea, but I'll admit that I don't have any better ones either. So, okay, let's go with this for now. https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:71: if (copyMinidumps(context, uid, fileDescriptors)) { Sorry for not being clearer before: Rather than nested if-stmts, I was suggesting something like: boolean copySucceeded = copyMinidumps(context, uid, fileDescriptors); if (copySucceeded && scheduleUploads) { ... } https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:24: // TODO if we crash during a job, without calling JobFinished - will the JobScheduler notice, nit: Please write TODO(username): rather than just TODO =) https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:90: // when we clean out the crash directory - the TO_UPLOAD value is checked everytime we try to nit: s/everytime/every time https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:558: return File.createTempFile("wv_minidump", TMP_SUFFIX, directory); nit: I'd spell out "webview", so that whoever happens to stumble upon these files N years from now has an easier time figuring out what they are =) 
 Cool, yeah looking at the overall design is a good idea :). Let me know if you'd like a VC or any kind of clarification. Note that in PS8 I added SynchronizedWebViewCommandLine.java which is now used to avoid doing any kind of minidump copying/uploading unless the command line flag "--enable-crash-reporter-for-testing" has been added to the file '/data/local/tmp/webview-command-line'. https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:71: if (copyMinidumps(context, uid, fileDescriptors)) { On 2016/12/13 07:24:22, Ilya Sherman wrote: > Sorry for not being clearer before: Rather than nested if-stmts, I was > suggesting something like: > > boolean copySucceeded = copyMinidumps(context, uid, fileDescriptors); > if (copySucceeded && scheduleUploads) { ... } Ah, alright. Done :) https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java (right): https://codereview.chromium.org/2515353005/diff/120001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploadJobService.java:24: // TODO if we crash during a job, without calling JobFinished - will the JobScheduler notice, On 2016/12/13 07:24:22, Ilya Sherman wrote: > nit: Please write TODO(username): rather than just TODO =) I have a bad habit of leaving TODOs that I want done before committing - and I write these as just "TODO" while the ones I intent to leave in the actual codebase are "TODO(gsennton):". So, this TODO is removed in PS8 :). https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:90: // when we clean out the crash directory - the TO_UPLOAD value is checked everytime we try to On 2016/12/13 07:24:22, Ilya Sherman wrote: > nit: s/everytime/every time Done. https://codereview.chromium.org/2515353005/diff/120001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:558: return File.createTempFile("wv_minidump", TMP_SUFFIX, directory); On 2016/12/13 07:24:22, Ilya Sherman wrote: > nit: I'd spell out "webview", so that whoever happens to stumble upon these > files N years from now has an easier time figuring out what they are =) Done. 
 Hey Toby and Ilya, PS9 is functionally equivalent to PS8 - I just removed debug log printing. ptal :) 
 https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:171: // TODO(gsennton): add UMA metric to ensure we aren't losing too many Will we be able to use UMA in this service? https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:145: if (uploadResult == MinidumpUploadCallable.UPLOAD_FAILURE) { It feels kind of weird that handling failure renaming happens here, but (presumably) success renaming happens elsewhere. Can we handle all cases in one location? Maybe resolve the TODO(acleung) in MinidumpUploadCallable.handleExecutionResponse https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:184: mWorkerThread.interrupt(); Torne and I had a look at this, and it looks like the set of operations that might be interrupted by this is approximately none, so you could remove this. https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:297: String uidString = Integer.toString(uid); unused https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:510: final int maxIter = 1024 * 1024 / bufSize; // 1MB maximum size My personal preference would be to not cap the number of iterations, but to keep a total of bytes read, and compare that against the target size. If FileInputStream.read consistently reads less than buf.length bytes, then we'll bail before actually reading more than 1MB. 
 https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:171: // TODO(gsennton): add UMA metric to ensure we aren't losing too many On 2016/12/15 13:24:01, Tobias Sargeant wrote: > Will we be able to use UMA in this service? Good question, I haven't looked into this (and just assumed we would be able to use UMA somehow in these Services - even though it might take some extra effort). What might be the problem here, loading native? (if that's how UMA is implemented) https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:145: if (uploadResult == MinidumpUploadCallable.UPLOAD_FAILURE) { On 2016/12/15 13:24:01, Tobias Sargeant wrote: > It feels kind of weird that handling failure renaming happens here, but > (presumably) success renaming happens elsewhere. Can we handle all cases in one > location? Maybe resolve the TODO(acleung) in > MinidumpUploadCallable.handleExecutionResponse Yeah, I had this thought as well, filed crbug.com/659938, started coding a change in https://codereview.chromium.org/2464823002/ and then realized that the change would need to move lots of tests from MinidumpUploadCallableTest.java (that are relying on minidumps being renamed inside MinidumpUploadCallable to ensure we handle different combinations of crash permissions = enabledForTesting/userPermitted/networkAvailable/etc. correctly) to MinidumpUploadServiceTest (and IIRC doing so wasn't straight-forward). So I focused on getting the componentization + this CL done instead. I can take a look at cleaning this up - I don't think it should be blocking this CL though :) https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:184: mWorkerThread.interrupt(); On 2016/12/15 13:24:01, Tobias Sargeant wrote: > Torne and I had a look at this, and it looks like the set of operations that > might be interrupted by this is approximately none, so you could remove this. Done. https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:297: String uidString = Integer.toString(uid); On 2016/12/15 13:24:01, Tobias Sargeant wrote: > unused Done. https://codereview.chromium.org/2515353005/diff/160001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:510: final int maxIter = 1024 * 1024 / bufSize; // 1MB maximum size On 2016/12/15 13:24:01, Tobias Sargeant wrote: > My personal preference would be to not cap the number of iterations, but to keep > a total of bytes read, and compare that against the target size. If > FileInputStream.read consistently reads less than buf.length bytes, then we'll > bail before actually reading more than 1MB. Nice catch! 
 https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:171: // TODO(gsennton): add UMA metric to ensure we aren't losing too many On 2016/12/15 14:47:21, gsennton wrote: > On 2016/12/15 13:24:01, Tobias Sargeant wrote: > > Will we be able to use UMA in this service? > > Good question, I haven't looked into this (and just assumed we would be able to > use UMA somehow in these Services - even though it might take some extra > effort). > What might be the problem here, loading native? (if that's how UMA is > implemented) I'm not sure either, it just occurs to me that it might not be possible. I believe it's possible to log UMA events without calling into native, but maybe not do everything. There's also the question about the connection with GMSCore, that Paul would be best suited to answer. I guess there's no harm in keeping it a TODO, even if you can't currently DO it. https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/160001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:145: if (uploadResult == MinidumpUploadCallable.UPLOAD_FAILURE) { On 2016/12/15 14:47:21, gsennton wrote: > On 2016/12/15 13:24:01, Tobias Sargeant wrote: > > It feels kind of weird that handling failure renaming happens here, but > > (presumably) success renaming happens elsewhere. Can we handle all cases in > one > > location? Maybe resolve the TODO(acleung) in > > MinidumpUploadCallable.handleExecutionResponse > > Yeah, I had this thought as well, filed crbug.com/659938, started coding a > change in https://codereview.chromium.org/2464823002/ and then realized that the > change would need to move lots of tests from MinidumpUploadCallableTest.java > (that are relying on minidumps being renamed inside MinidumpUploadCallable to > ensure we handle different combinations of crash permissions = > enabledForTesting/userPermitted/networkAvailable/etc. correctly) to > MinidumpUploadServiceTest (and IIRC doing so wasn't straight-forward). So I > focused on getting the componentization + this CL done instead. > > I can take a look at cleaning this up - I don't think it should be blocking this > CL though :) Yep. Definitely don't block this CL on that. Good to know you have it on your radar to clean up. 
 Okay, the high-level structure looks good! Just some nitty gritties =) https://codereview.chromium.org/2515353005/diff/180001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/180001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:39: private static final int JOB_DELAY_IN_MS = 1000 * 60 * 2; nit: It'd be nice to document the motivation for having this constant. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:295: public static File[] filterMinidumpFilesOnUid(File[] minidumpFiles, int uid) { nit: Is there an advantage to returning an array rather than a List, given that there's a conversion involved to do so? https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:456: File[] allMinidumpFiles = getAllMinidumpFiles(maxTries); Hmm, why is maxTries passed in here? I *think* the result is that we keep around minidumps that have exceeded the max number of attempts, and they're cleared out at some future date as part of the general clean up code. Am I understanding the behavior correctly? Is it intentional? https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:469: // Only copy the file if we haven't exceeded our cap already. nit: Please update this comment. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:482: * copying failed. nit: Please indent this line (if I understand Java style correctly, plz ignore if that's incorrect). https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:487: if (!crashDirectory.isDirectory() && !crashDirectory.mkdir()) { What's the reason for the mkdir() call? Is it not guaranteed by previous code that the directory should exist? https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:543: * The 'uniqueness' of the file name lies in it being created from a UUID. It's probably helpful to have a brief note here explaining the difference between "uid" and "uuid" =) https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:374: 12 /* uid */)); nit: Mebbe add some test cases that have multiple numbers, like "1_23_md.dmp"? https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:374: 12 /* uid */)); nit: Mebbe also try some other (non-numeric) delimiters https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:413: File[] initialMinidumps = fileManager.getAllMinidumpFiles(1 /* maxTries */); Hmm, isn't this guaranteed to be an empty array? https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:442: // Now the crash directory is full - so copying should a new minidump should fail. nit: Please update this comment -- it deletes the oldest minidump rather than failing, right? 
 Wow, thanks for looking at this Ilya! I guess all I need now is a couple of l g t m s ;) https://codereview.chromium.org/2515353005/diff/180001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java (right): https://codereview.chromium.org/2515353005/diff/180001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/CrashReceiverService.java:39: private static final int JOB_DELAY_IN_MS = 1000 * 60 * 2; On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: It'd be nice to document the motivation for having this constant. Oooh, that's true. This is an arbitrary number :). I chatted with Toby, and we're removing this for now (i.e. let jobs run immediately if possible). We're changing the back-off time to 30 minutes though, so if we run an upload-job and there are still minidumps to upload after the job is done, then we wait 30 minutes before trying again (with exponential back-off). https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:295: public static File[] filterMinidumpFilesOnUid(File[] minidumpFiles, int uid) { On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: Is there an advantage to returning an array rather than a List, given that > there's a conversion involved to do so? Hm, good question :), I think I did it because the rest of this class returns arrays in similar methods (e.g. in getAllMinidumpFiles). I've changed filterMinidumpFilesOnUid() to return a List now. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:456: File[] allMinidumpFiles = getAllMinidumpFiles(maxTries); On 2016/12/19 03:34:03, Ilya Sherman wrote: > Hmm, why is maxTries passed in here? I *think* the result is that we keep > around minidumps that have exceeded the max number of attempts, and they're > cleared out at some future date as part of the general clean up code. Am I > understanding the behavior correctly? Is it intentional? Well, the result of using maxTries here is that we ignore files that have exceeded the maximum number of attempts when counting minidumps towards the storage-restriction. So these files will indeed be cleared out at some future date, and before then they won't be counted as minidumps. I guess we don't need files that have reached the max number of tries anymore given that we don't have a force-upload alternative for WebView, so maybe we should just delete files as soon as they reach the max-tries limit? I now added a piece of code in MinidumpUploaderImpl#UploadRunnable.run() to delete a file after trying to upload it - if it reached the max number of tries. Also changed MinidumpUploaderTest.testFailUploadingMinidumps() to pass after this (and ensure we do indeed delete a file that reached max-tries). https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:469: // Only copy the file if we haven't exceeded our cap already. On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: Please update this comment. Done. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:482: * copying failed. On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: Please indent this line (if I understand Java style correctly, plz ignore > if that's incorrect). You are correct, https://google.github.io/styleguide/javaguide.html#s7.1.3-javadoc-at-clauses Done. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:487: if (!crashDirectory.isDirectory() && !crashDirectory.mkdir()) { On 2016/12/19 03:34:03, Ilya Sherman wrote: > What's the reason for the mkdir() call? Is it not guaranteed by previous code > that the directory should exist? It is guaranteed that mCacheDir exists, not getCrashDirectory() - which is a sub-directory of mCacheDir. For Webview mCacheDir is 'cache/WebView_Crashes/' while getCrashDirectory() is 'cache/WebView_Crashes/Crash Reports/' https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:543: * The 'uniqueness' of the file name lies in it being created from a UUID. On 2016/12/19 03:34:03, Ilya Sherman wrote: > It's probably helpful to have a brief note here explaining the difference > between "uid" and "uuid" =) Done. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:374: 12 /* uid */)); On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: Mebbe add some test cases that have multiple numbers, like "1_23_md.dmp"? Done. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:413: File[] initialMinidumps = fileManager.getAllMinidumpFiles(1 /* maxTries */); On 2016/12/19 03:34:03, Ilya Sherman wrote: > Hmm, isn't this guaranteed to be an empty array? Haah, it should be an empty array given that we just deleted all the existing minidumps on the line above :), adding an assert to ensure initialMinidumps.length == 0. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:442: // Now the crash directory is full - so copying should a new minidump should fail. On 2016/12/19 03:34:03, Ilya Sherman wrote: > nit: Please update this comment -- it deletes the oldest minidump rather than > failing, right? Done. 
 Thanks! This basically LGTM. I still have a small quibble, but it could easily be addressed in a follow-up CL. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:456: File[] allMinidumpFiles = getAllMinidumpFiles(maxTries); On 2016/12/19 14:28:26, gsennton wrote: > On 2016/12/19 03:34:03, Ilya Sherman wrote: > > Hmm, why is maxTries passed in here? I *think* the result is that we keep > > around minidumps that have exceeded the max number of attempts, and they're > > cleared out at some future date as part of the general clean up code. Am I > > understanding the behavior correctly? Is it intentional? > > Well, the result of using maxTries here is that we ignore files that have > exceeded the maximum number of attempts when counting minidumps towards the > storage-restriction. So these files will indeed be cleared out at some future > date, and before then they won't be counted as minidumps. > I guess we don't need files that have reached the max number of tries anymore > given that we don't have a force-upload alternative for WebView, so maybe we > should just delete files as soon as they reach the max-tries limit? > > I now added a piece of code in MinidumpUploaderImpl#UploadRunnable.run() to > delete a file after trying to upload it - if it reached the max number of tries. > Also changed MinidumpUploaderTest.testFailUploadingMinidumps() to pass after > this (and ensure we do indeed delete a file that reached max-tries). I guess I'm still not clear on why it would be a problem to not check maxTries here at all. That seems simpler than having some code elsewhere to delete the files when they reach maxTries (in addition to the cleanUp...() code, which does the same thing). =) https://codereview.chromium.org/2515353005/diff/200001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/200001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:154: } Hmm, I'm not sure that it's worth having this in addition to the general clean up code. Or, is there some particular behavior you're thinking of where it would help? https://codereview.chromium.org/2515353005/diff/200001/components/minidump_up... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/200001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:437: initialMinidumps.length, fileManager.getAllMinidumpFiles(1 /* maxTries */).length); nit: I'd still prefer to drop the initialMinidumps variable entirely in favor of the assert that you added, but I guess I don't feel strongly about it =P 
 Thanks a lot Ilya! I ended up posting a couple of new patch sets - PS13 fixes a bug where I would delete the first file passed from getAllMinidumps if we hit the limit on the number of minidumps to pass -- and because the files returned from getAllMinidumps are sorted from newest to oldest (not oldest to newest) we would end up deleting the newest minidump instead of the oldest ;). This wasn't caught by the tests because the modification times of the files in the tests were the same (not enough time had passed between the copying of the different files to make a difference in the modification-times of the minidumps). The bug is now fixed and the tests are modified to catch this in the future. https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/180001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:456: File[] allMinidumpFiles = getAllMinidumpFiles(maxTries); On 2016/12/19 21:26:54, Ilya Sherman (Away De.29-Ja.4) wrote: > On 2016/12/19 14:28:26, gsennton wrote: > > On 2016/12/19 03:34:03, Ilya Sherman wrote: > > > Hmm, why is maxTries passed in here? I *think* the result is that we keep > > > around minidumps that have exceeded the max number of attempts, and they're > > > cleared out at some future date as part of the general clean up code. Am I > > > understanding the behavior correctly? Is it intentional? > > > > Well, the result of using maxTries here is that we ignore files that have > > exceeded the maximum number of attempts when counting minidumps towards the > > storage-restriction. So these files will indeed be cleared out at some future > > date, and before then they won't be counted as minidumps. > > I guess we don't need files that have reached the max number of tries anymore > > given that we don't have a force-upload alternative for WebView, so maybe we > > should just delete files as soon as they reach the max-tries limit? > > > > I now added a piece of code in MinidumpUploaderImpl#UploadRunnable.run() to > > delete a file after trying to upload it - if it reached the max number of > tries. > > Also changed MinidumpUploaderTest.testFailUploadingMinidumps() to pass after > > this (and ensure we do indeed delete a file that reached max-tries). > > I guess I'm still not clear on why it would be a problem to not check maxTries > here at all. That seems simpler than having some code elsewhere to delete the > files when they reach maxTries (in addition to the cleanUp...() code, which does > the same thing). =) Oooooooooooooooooooooooooooh, aha, now I get your point - my mind was still in the 'if we have reached our limit, then block future copies' mindset (in which counting minidumps that have been retried 'maxTries' times towards our MAX_CRASH_REPORTS limit is not a good idea), but given that we now just delete the oldest files (or rather the ones that have been left untouched for the longest time) we will end up deleting the maxTries files when they're in the way. Thanks for being persistent here :) Removing the maxTries-check and reverting the change to explicitly delete failed files. https://codereview.chromium.org/2515353005/diff/200001/android_webview/java/s... File android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java (right): https://codereview.chromium.org/2515353005/diff/200001/android_webview/java/s... android_webview/java/src/org/chromium/android_webview/crash/MinidumpUploaderImpl.java:154: } On 2016/12/19 21:26:54, Ilya Sherman (Away De.29-Ja.4) wrote: > Hmm, I'm not sure that it's worth having this in addition to the general clean > up code. Or, is there some particular behavior you're thinking of where it > would help? Well, just to make sure we are on the same page, there are edge-cases where we might end up deleting minidumps that still have some upload-tries left in them in favour of deleting minidumps that have reached the maximum # of tries. Namely, if we the last upload-attempt for the minidump that now reached its maxTries happened after the last upload-attempt of a minidump with upload-tries left in it, then we will delete the minidump with upload-tries left in it, first. But given that this should be a fairly uncommon case (I think), it shouldn't be worth adding extra clean-up code just for that case (let me know if you have a different opinion on this :)). Reverting this clean-up code. https://codereview.chromium.org/2515353005/diff/200001/components/minidump_up... File components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java (right): https://codereview.chromium.org/2515353005/diff/200001/components/minidump_up... components/minidump_uploader/android/javatests/src/org/chromium/components/minidump_uploader/CrashFileManagerTest.java:437: initialMinidumps.length, fileManager.getAllMinidumpFiles(1 /* maxTries */).length); On 2016/12/19 21:26:54, Ilya Sherman (Away De.29-Ja.4) wrote: > nit: I'd still prefer to drop the initialMinidumps variable entirely in favor of > the assert that you added, but I guess I don't feel strongly about it =P Oh, woops, I forgot that initialMinidumps was being used here >.<, there is no use for initialMinidumps given that assert, removing initialMinidumps ;) 
 LGTM. I look forward to witnessing the firepower of this fully armed and operational battle station. 
 Cool, thanks Toby! Given that I have fixed Ilya's comments in PS12, PS13 is a small change (bug-fix), and this whole mechanism is disabled by default I'm committing this now :) (working with a big CL like this is a bit of a pain). Ilya: if you have any concerns regarding PS12 or PS13 please let me know and I'll fix them asap :) 
 The CQ bit was checked by gsennton@chromium.org 
 The patchset sent to the CQ was uploaded after l-g-t-m from isherman@chromium.org Link to the patchset: https://codereview.chromium.org/2515353005/#ps240001 (title: "Fix bug that deleted the newest minidump instead of the oldest when copying a minidump after hittin…") 
 CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... 
 The CQ bit was unchecked by commit-bot@chromium.org 
 Try jobs failed on following builders: win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...) 
 The CQ bit was checked by gsennton@chromium.org to run a CQ dry run 
 The CQ bit was unchecked by gsennton@chromium.org 
 On 2016/12/20 15:37:25, commit-bot: I haz the power wrote: > Try jobs failed on following builders: > win_chromium_x64_rel_ng on master.tryserver.chromium.win (JOB_FAILED, > http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...) lgtm 
 Thanks, Gustav! Still LGTM, with one minor comment that could easily be addressed in a follow-up CL. https://codereview.chromium.org/2515353005/diff/260001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/260001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:571: } Hmm, does this really need to be part of the CrashFileManager file? I bet we could instead tuck this into the test code directly. 
 For the record, Selim's l g t m was for the DEPS changes under android_webview/ Let's hope the trybots pass this time :) https://codereview.chromium.org/2515353005/diff/260001/components/minidump_up... File components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java (right): https://codereview.chromium.org/2515353005/diff/260001/components/minidump_up... components/minidump_uploader/android/java/src/org/chromium/components/minidump_uploader/CrashFileManager.java:571: } On 2016/12/21 03:23:26, Ilya Sherman (Away De.29-Ja.4) wrote: > Hmm, does this really need to be part of the CrashFileManager file? I bet we > could instead tuck this into the test code directly. True :), inlined this in CrashFileManagerTest.testMinidumpStorageRestrictions(). 
 The CQ bit was checked by gsennton@chromium.org 
 The patchset sent to the CQ was uploaded after l-g-t-m from tobiasjs@chromium.org, sgurun@chromium.org, isherman@chromium.org Link to the patchset: https://codereview.chromium.org/2515353005/#ps300001 (title: "Fix findbugs errors.") 
 CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... 
 The CQ bit was unchecked by commit-bot@chromium.org 
 Try jobs failed on following builders: android_clang_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_clan...) 
 The CQ bit was checked by gsennton@chromium.org 
 The patchset sent to the CQ was uploaded after l-g-t-m from sgurun@chromium.org, isherman@chromium.org, tobiasjs@chromium.org Link to the patchset: https://codereview.chromium.org/2515353005/#ps320001 (title: "Fix findbugs errors for realz!") 
 CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or... 
 CQ is committing da patch.
Bot data: {"patchset_id": 320001, "attempt_start_ts": 1482330496614780,
"parent_rev": "0cad41004d1de351ce92b24210096ecad5498da5", "commit_rev":
"a89ee55d125a1da267409100698a86c244ebd0db"}
          
 
            
              
                Message was sent while issue was closed.
              
            
             
          
        Description was changed from ========== [Android WebView] Implement copying and uploading of Minidumps. During WebView start-up after crashing WebView, we will pass previously created minidumps to CrashReceiverService which copies the minidumps to the current WebView provider's data directory and starts a job through JobScheduler. This job will eventually start a MinidumpUploadJobService which in turn uploads Minidumps to Crash servers. Both the copying-service and the upload-service run in processes separate from the crashing app-process. BUG=652730 ========== to ========== [Android WebView] Implement copying and uploading of Minidumps. During WebView start-up after crashing WebView, we will pass previously created minidumps to CrashReceiverService which copies the minidumps to the current WebView provider's data directory and starts a job through JobScheduler. This job will eventually start a MinidumpUploadJobService which in turn uploads Minidumps to Crash servers. Both the copying-service and the upload-service run in processes separate from the crashing app-process. BUG=652730 Review-Url: https://codereview.chromium.org/2515353005 ========== 
 
            
              
                Message was sent while issue was closed.
              
            
             
          
        Committed patchset #17 (id:320001) 
 
            
              
                Message was sent while issue was closed.
              
            
             
          
        Description was changed from ========== [Android WebView] Implement copying and uploading of Minidumps. During WebView start-up after crashing WebView, we will pass previously created minidumps to CrashReceiverService which copies the minidumps to the current WebView provider's data directory and starts a job through JobScheduler. This job will eventually start a MinidumpUploadJobService which in turn uploads Minidumps to Crash servers. Both the copying-service and the upload-service run in processes separate from the crashing app-process. BUG=652730 Review-Url: https://codereview.chromium.org/2515353005 ========== to ========== [Android WebView] Implement copying and uploading of Minidumps. During WebView start-up after crashing WebView, we will pass previously created minidumps to CrashReceiverService which copies the minidumps to the current WebView provider's data directory and starts a job through JobScheduler. This job will eventually start a MinidumpUploadJobService which in turn uploads Minidumps to Crash servers. Both the copying-service and the upload-service run in processes separate from the crashing app-process. BUG=652730 Committed: https://crrev.com/19b0d1b784a9581b02a5dbe8e6fa1d2556743e75 Cr-Commit-Position: refs/heads/master@{#440130} ========== 
 
            
              
                Message was sent while issue was closed.
              
            
             
          
        Patchset 17 (id:??) landed as https://crrev.com/19b0d1b784a9581b02a5dbe8e6fa1d2556743e75 Cr-Commit-Position: refs/heads/master@{#440130}  | 
    
