|
|
Created:
5 years, 8 months ago by Anand Mistry (off Chromium) Modified:
5 years, 8 months ago CC:
chromium-reviews, chrome-apps-syd-reviews_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionFix two race conditions in ImageDecoder.
The two races are:
1. Call Cancel() immediately after calling Start().
2. OnImageDecoded() or OnDecodeImageFailed() can be called after the
request is cancelled, and the ImageRequest object potentially deleted.
BUG=474590
Patch Set 1 #
Total comments: 1
Patch Set 2 : Remove thread restrictions. #Patch Set 3 : Remove unused WeakPtr<>. #
Total comments: 12
Patch Set 4 : Address comments. #
Total comments: 13
Patch Set 5 : Review comments. #Messages
Total messages: 19 (1 generated)
amistry@chromium.org changed reviewers: + thestig@chromium.org, twellington@chromium.org
https://codereview.chromium.org/1068743002/diff/1/chrome/browser/image_decode... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/1/chrome/browser/image_decode... chrome/browser/image_decoder.cc:135: DCHECK(image_request->task_runner()->RunsTasksOnCurrentThread()); The header file makes it sound as though ImageDecoder::Cancel() can run on any thread, but internally, it's being restricted here.
On Wednesday, 8 April 2015, <thestig@chromium.org> wrote: > > https://codereview.chromium.org/1068743002/diff/1/chrome/ > browser/image_decoder.cc > File chrome/browser/image_decoder.cc (right): > > https://codereview.chromium.org/1068743002/diff/1/chrome/ > browser/image_decoder.cc#newcode135 > chrome/browser/image_decoder.cc:135: > DCHECK(image_request->task_runner()->RunsTasksOnCurrentThread()); > The header file makes it sound as though ImageDecoder::Cancel() can run > on any thread, but internally, it's being restricted here. I've thought a bit more about this. I can lift this restriction, but only if I allow cancel to block pending completion of a running success/fail. Otherwise, this restriction needs to exist. Also, I realise that race #2 isn't fixed, although I've fixed the use after free. > > https://codereview.chromium.org/1068743002/ > -- Anand K. Mistry Software Engineer *Google* Australia To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
On 2015/04/07 22:52:06, chromium-reviews wrote: > On Wednesday, 8 April 2015, <mailto:thestig@chromium.org> wrote: > > https://codereview.chromium.org/1068743002/diff/1/chrome/ > > browser/image_decoder.cc#newcode135 > > chrome/browser/image_decoder.cc:135: > > DCHECK(image_request->task_runner()->RunsTasksOnCurrentThread()); > > The header file makes it sound as though ImageDecoder::Cancel() can run > > on any thread, but internally, it's being restricted here. > > > I've thought a bit more about this. I can lift this restriction, but only > if I allow cancel to block pending completion of a running success/fail. > Otherwise, this restriction needs to exist. Also, I realise that race #2 > isn't fixed, although I've fixed the use after free. I'm just saying the code and the header should match. I'm not particularly picky about which way you go. If you do make it more restrictive, then we need to make sure the new restriction works will all the existing callers. BTW, is there a bug file for this? Where are we running into these races?
On 2015/04/07 23:10:31, Lei Zhang wrote: > On 2015/04/07 22:52:06, chromium-reviews wrote: > > On Wednesday, 8 April 2015, <mailto:thestig@chromium.org> wrote: > > > https://codereview.chromium.org/1068743002/diff/1/chrome/ > > > browser/image_decoder.cc#newcode135 > > > chrome/browser/image_decoder.cc:135: > > > DCHECK(image_request->task_runner()->RunsTasksOnCurrentThread()); > > > The header file makes it sound as though ImageDecoder::Cancel() can run > > > on any thread, but internally, it's being restricted here. > > > > > > I've thought a bit more about this. I can lift this restriction, but only > > if I allow cancel to block pending completion of a running success/fail. > > Otherwise, this restriction needs to exist. Also, I realise that race #2 > > isn't fixed, although I've fixed the use after free. > > I'm just saying the code and the header should match. I'm not particularly picky > about which way you go. If you do make it more restrictive, then we need to make > sure the new restriction works will all the existing callers. Ok. Stay tuned. > > BTW, is there a bug file for this? Where are we running into these races? Um, I just noticed these races as I was looking at the code. But, magically overnight, someone filed a crash report. Yay!
PTAL. Fixed the threading restriction, test failures, and really fix race #2.
https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:48: ImageDecoder* decoder_; Isn't there just a single ImageDecoder that will outlive everything else since it's leaky? https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:51: bool is_cancelled_; Just null out |request_| instead? https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:91: decoder_->RemoveJob(this); This and the RemoveJob() call in ImageDecoder::Job::OnFailed() are the only two places that can remove a job from |g_decoder|'s |image_request_id_map_|. If a given job has been cancelled, RemoveJob() never gets to run. Wouldn't the job remain in the RequestMap forever and effectively be leaked? https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:207: // There should only be one. If a caller does Start(), Start(), Cancel(), wouldn't there be 2? https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:304: image_request_id_map_.erase(it++); Why keep going? Isn't there only a single instance of |job| in |image_request_id_map_| ? https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... File chrome/browser/image_decoder.h (right): https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.h:80: // ImageRequest's rask runner. typo
https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:48: ImageDecoder* decoder_; On 2015/04/08 01:30:10, Lei Zhang wrote: > Isn't there just a single ImageDecoder that will outlive everything else since > it's leaky? Good point. Removed and use lazy instance. https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:51: bool is_cancelled_; On 2015/04/08 01:30:10, Lei Zhang wrote: > Just null out |request_| instead? Can't. Doing so would require always accessing |request_| under |lock_|, which may cause a deadlock between ImageDecoder::CancelImpl and OnSuccess/OnFailed. https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:91: decoder_->RemoveJob(this); On 2015/04/08 01:30:10, Lei Zhang wrote: > This and the RemoveJob() call in ImageDecoder::Job::OnFailed() are the only two > places that can remove a job from |g_decoder|'s |image_request_id_map_|. > > If a given job has been cancelled, RemoveJob() never gets to run. Wouldn't the > job remain in the RequestMap forever and effectively be leaked? ImageDecoder::CancelImpl also does. However, I can see the confusion, so I've moved cancel's removal from the map into Job::Cancel for consistency. It's less efficient, but clearer. https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:207: // There should only be one. On 2015/04/08 01:30:10, Lei Zhang wrote: > If a caller does Start(), Start(), Cancel(), wouldn't there be 2? I wrote that off since it seems like a "Bad Idea", because there's no way of associating a decoded image with the original compressed one. But I guess it's currently supported, so might as well continue doing so. Fixed to support this case. https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.cc:304: image_request_id_map_.erase(it++); On 2015/04/08 01:30:10, Lei Zhang wrote: > Why keep going? Isn't there only a single instance of |job| in > |image_request_id_map_| ? Done. https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... File chrome/browser/image_decoder.h (right): https://codereview.chromium.org/1068743002/diff/40001/chrome/browser/image_de... chrome/browser/image_decoder.h:80: // ImageRequest's rask runner. On 2015/04/08 01:30:10, Lei Zhang wrote: > typo Done.
Ping! Turns out, I am in a bit of a rush (last minute thing). I'll be out in 2 days (Thursday, Sydney time) and won't be back for a week and a half. No corp laptop, so I'll be unable to merge this into M43. Someone else may have to take over this CL.
Sorry for the delay, I was unexpectedly OOO most of last week. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:43: // Runs on the requests's task runner. nit: sed/requests's/request's https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:75: base::Bind(&ImageDecoder::Job::OnSuccess, this, decoded_image)); Why are NotifyDecodeSucceeded and OnSuccess both needed? Couldn't this method do something like this? g_decoder.Pointer()->RemoveJob(this); request_->task_runner()->PostTask( FROM_HERE, base::Bind(&ImageRequest::OnImageDecoded, base::Unretained(request_), decoded_image)); https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:303: image_request_id_map_.erase(it++); nit: does it++ need to be incremented here since there's a break right after? https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... File chrome/browser/image_decoder.h (right): https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:84: class Job; Write a short description of what this class is used for. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:105: using RequestMap = std::map<int, scoped_refptr<Job>>; JobMap? https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:130: RequestMap image_request_id_map_; job_id_map_?
https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:43: // Runs on the requests's task runner. On 2015/04/13 18:19:30, twellington wrote: > nit: sed/requests's/request's Done. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:75: base::Bind(&ImageDecoder::Job::OnSuccess, this, decoded_image)); On 2015/04/13 18:19:30, twellington wrote: > Why are NotifyDecodeSucceeded and OnSuccess both needed? Couldn't this method do > something like this? > > g_decoder.Pointer()->RemoveJob(this); > request_->task_runner()->PostTask( > FROM_HERE, base::Bind(&ImageRequest::OnImageDecoded, > base::Unretained(request_), decoded_image)); > This is race condition #2. If you post ImageRequest::OnImageDecoded, it's possible for the ImageRequest to be deleted before the task is run, leading to a use-after-free. Job::OnSuccess needs to run on the task runner to ensure the ImageRequest exists when ImageRequest::OnImageDecoded run. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:303: image_request_id_map_.erase(it++); On 2015/04/13 18:19:30, twellington wrote: > nit: does it++ need to be incremented here since there's a break right after? Done. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... File chrome/browser/image_decoder.h (right): https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:84: class Job; On 2015/04/13 18:19:30, twellington wrote: > Write a short description of what this class is used for. Done. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:105: using RequestMap = std::map<int, scoped_refptr<Job>>; On 2015/04/13 18:19:30, twellington wrote: > JobMap? Done. https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.h:130: RequestMap image_request_id_map_; On 2015/04/13 18:19:30, twellington wrote: > job_id_map_? Done.
lgtm (I'm not a committer yet or an OWNER of anything, so Lei still needs to finish reviewing) https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... File chrome/browser/image_decoder.cc (right): https://codereview.chromium.org/1068743002/diff/60001/chrome/browser/image_de... chrome/browser/image_decoder.cc:75: base::Bind(&ImageDecoder::Job::OnSuccess, this, decoded_image)); On 2015/04/14 00:52:13, Anand Mistry wrote: > On 2015/04/13 18:19:30, twellington wrote: > > Why are NotifyDecodeSucceeded and OnSuccess both needed? Couldn't this method > do > > something like this? > > > > g_decoder.Pointer()->RemoveJob(this); > > request_->task_runner()->PostTask( > > FROM_HERE, base::Bind(&ImageRequest::OnImageDecoded, > > base::Unretained(request_), decoded_image)); > > > > This is race condition #2. If you post ImageRequest::OnImageDecoded, it's > possible for the ImageRequest to be deleted before the task is run, leading to a > use-after-free. Job::OnSuccess needs to run on the task runner to ensure the > ImageRequest exists when ImageRequest::OnImageDecoded run. Acknowledged.
It's now too late for me to make any changes to this, or merge it into M43 myself. I'll be off for 2 weeks with no access to my desktop. You can either take over this change, or I can submit it and get someone else to merge it.
On 2015/04/15 07:26:20, Anand Mistry wrote: > It's now too late for me to make any changes to this, or merge it into M43 > myself. I'll be off for 2 weeks with no access to my desktop. You can either > take over this change, or I can submit it and get someone else to merge it. I think it's critical for M43 (a clusterfuzz test caught a related heap-use-after-free error). I can take over the patch and make any other necessary changes. I'm not a committer yet, but I'm sure I can find someone to volunteer to merge it back.
ping?
On 2015/04/16 13:41:42, twellington wrote: > ping? Are you pinging thestig or amistry? amistry is now on leave and away from his desktop. So feel free to take over the patch.
On 2015/04/16 21:59:10, benwells wrote: > On 2015/04/16 13:41:42, twellington wrote: > > ping? > > Are you pinging thestig or amistry? > > amistry is now on leave and away from his desktop. So feel free to take over the > patch. I was pinging thestig.
https://codereview.chromium.org/1067593005/ landed instead. Thanks for getting the ball rolling though. |