|
|
DescriptionAdd raster timing to ImageDecodeBench
When making changes to an image decoder, those changes may also effect
how the decoded image is used. This means raster behavior may change as
well.
An example is not correcting mis-reported alpha. Raster may now go down
the has-alpha path where it wouldn't before.
Since image decode timing is tied to the decoded image's usage, that
needs to be tracked as well.
BUG=724657
Patch Set 1 #Patch Set 2 : Rebase #Patch Set 3 : Add includes which were accidentally removed #
Total comments: 4
Patch Set 4 : Re-adding code that didn't get migrated to the new CL #Patch Set 5 : Clarify decoded information. Use 'i' for loop variables #Patch Set 6 : Rebase. Remove blink:: and use fprintf() / exit() like upstream #
Total comments: 6
Patch Set 7 : Rebase #Patch Set 8 : Error when packet size and raster timing are both specified #Patch Set 9 : Rebase #Patch Set 10 : Create decoder inside TimeRaster #
Total comments: 26
Patch Set 11 : Rebase #Patch Set 12 : Error immediately on failure. Simplify comment & variable name #Patch Set 13 : Remove const from variables that are close to their usage. Move variables closer to usage #Patch Set 14 : Rebase #Messages
Total messages: 54 (41 generated)
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
cblume@chromium.org changed reviewers: + noel@chromium.org, scroggo@chromium.org
PTAL This is from a previous CL: https://codereview.chromium.org/2880953002/ I separated that CL into smaller parts. This part adds raster timing.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-device-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device-xcode-...) mac_chromium_compile_dbg_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: android_arm64_dbg_recipe on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/android_arm6...)
https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:312: // Force enough parsing to later determine the image's size. I find this comment confusing. Calling FrameBufferAtIndex(0) parses enough to determine the size immediately. If you only want the size, you can call if (decoder->IsSizeAvailable()) { size = decoder->Size(); } That said, if the size is not available, you probably do not want to continue anyway, so this could be: if (!decoder->IsSizeAvailable()) { printf("failed to decode\n"); return; } const auto size = decoder->Size(); What should we do if the decode fails? We've already ensured that there is a valid size, but it could still fail during parsing, before we decode anything, in which case I think FrameBufferAtIndex will not actually decode anything. It could also fail during decode, in which case we probably don't want to time it. Then the frame will be FramePartial, but since it's guaranteed to have transparent pixels (and be treated that way), I don't think it makes sense to time for the comparison we want. https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:322: for (size_t iteration = 0; iteration < iterations; ++iteration) { nit: I would change "iteration" to "i". It makes it much easier to read this line.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:312: // Force enough parsing to later determine the image's size. On 2017/05/22 12:50:14, scroggo_chromium wrote: > I find this comment confusing. Calling FrameBufferAtIndex(0) parses enough to > determine the size immediately. If you only want the size, you can call > > if (decoder->IsSizeAvailable()) { > size = decoder->Size(); > } > > That said, if the size is not available, you probably do not want to continue > anyway, so this could be: > > if (!decoder->IsSizeAvailable()) { > printf("failed to decode\n"); > return; > } > > const auto size = decoder->Size(); > > What should we do if the decode fails? We've already ensured that there is a > valid size, but it could still fail during parsing, before we decode anything, > in which case I think FrameBufferAtIndex will not actually decode anything. It > could also fail during decode, in which case we probably don't want to time it. > Then the frame will be FramePartial, but since it's guaranteed to have > transparent pixels (and be treated that way), I don't think it makes sense to > time for the comparison we want. Done. I think you are right that if we don't have FrameComplete we don't want to continue with the timing. I'll do a similar error-on-failure if the first frame could not be completely decoded. https://codereview.chromium.org/2893023003/diff/40001/third_party/WebKit/Sour... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:322: for (size_t iteration = 0; iteration < iterations; ++iteration) { On 2017/05/22 12:50:14, scroggo_chromium wrote: > nit: I would change "iteration" to "i". It makes it much easier to read this > line. Done.
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
I just want to remind you that it is helpful for your reviewers if you stop rebasing along the way. I try to look at what has changed from the last CL I reviewed, and I see changes that I'm already reviewing in another CL. https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:325: exit(3); Right now it looks like we pass the following values to exit: 1: error in the arguments 2: error reading the file 3: error decoding Should we use different numbers for failures to decode the size, versus incomplete image, etc? https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:401: packet_size = strtol(argv[3], &end, 10); What should happen if a user combines packet_size with raster? An error?
Sorry about combining the rebase with code changes. I realized the mistake as I was uploading the patch. https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:325: exit(3); On 2017/05/23 15:57:09, scroggo_chromium wrote: > Right now it looks like we pass the following values to exit: > 1: error in the arguments > 2: error reading the file > 3: error decoding > > Should we use different numbers for failures to decode the size, versus > incomplete image, etc? ImageDecodeBench currently (prior to this patch) reuses exit values for categories. For example, both failing to open the file [1] and failing to read the file [2] will exit(2). Both are in the category of file errors. [1] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/testi... [2] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/testi... Like wise for various command line parsing errors, which exit(1). I'm okay with different error codes if we want. But maybe we want to do it in a separate patch since this mirrors existing behavior. https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:401: packet_size = strtol(argv[3], &end, 10); On 2017/05/23 15:57:09, scroggo_chromium wrote: > What should happen if a user combines packet_size with raster? An error? Packet size should be silently ignored in that case. When reading the args, --color-correct and --raster alter argc and argv so other parsing still works. We'll end up parsing packet_size, but we decode the full image (non-packeted) once and use it to time raster. Conceptually, packet size shouldn't matter when timing raster. So silently ignoring it seemed the most correct.
lgtm https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:401: packet_size = strtol(argv[3], &end, 10); On 2017/05/23 18:07:31, cblume wrote: > On 2017/05/23 15:57:09, scroggo_chromium wrote: > > What should happen if a user combines packet_size with raster? An error? > > Packet size should be silently ignored in that case. > When reading the args, --color-correct and --raster alter argc and argv so other > parsing still works. > > We'll end up parsing packet_size, but we decode the full image (non-packeted) > once and use it to time raster. > > Conceptually, packet size shouldn't matter when timing raster. So silently > ignoring it seemed the most correct. My vote is for an error. Since it shouldn't matter, the user may be doing something wrong if they tried to use both.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Noel@ I think this is ready for your approval. (Although this relates to image decoder, it is in platforms/testing/ and Noel@ covers both areas nicely.) https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/100001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:401: packet_size = strtol(argv[3], &end, 10); On 2017/05/23 18:41:18, scroggo_chromium wrote: > On 2017/05/23 18:07:31, cblume wrote: > > On 2017/05/23 15:57:09, scroggo_chromium wrote: > > > What should happen if a user combines packet_size with raster? An error? > > > > Packet size should be silently ignored in that case. > > When reading the args, --color-correct and --raster alter argc and argv so > other > > parsing still works. > > > > We'll end up parsing packet_size, but we decode the full image (non-packeted) > > once and use it to time raster. > > > > Conceptually, packet size shouldn't matter when timing raster. So silently > > ignoring it seemed the most correct. > > My vote is for an error. Since it shouldn't matter, the user may be doing > something wrong if they tried to use both. Yeah, I think you're right. I added an error for it.
Note: We will want to finish the upstream changes so I don't submit-after-lgtm.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: chromeos_daisy_chromium_compile_only_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromeos_daisy_...) chromium_presubmit on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...) linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_chromeos_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_compile_dbg_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_tsan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:12: // ./out/Release/image_decode_bench file [--raster] [iterations] Remove this change, just basic usage here. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:350: const bool all_data_received = true; bool all_data_received = true https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:354: printf("failed to decode size\n"); Errors during benchmarking are FATAL everywhere in the benchmarking code. Please use ... if (!decoder->IsSizeAvailable()) { fprintf(stderr, "failed to decode size\n"); exit(3); } https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:356: } } const auto size = decoder->Size(); https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:361: printf("failed to decode first frame\n"); Errors during benchmarking are FATAL, fprintf(stderr,...) and exit(3) https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:376: for (double iteration_time : timings) { iteration_time -> raster_draw_time? https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:377: printf("%f,\n", iteration_time); This outputs are stream of raster draw times. I expect this is a sequence of IID random variables, with some sample mean and variance. Why don't we compute the mean and variance here? If I wrote a bench that testing 200 images say, this code would currently produce 200 x [iterations] rows of data, which is unwieldy to deal with to say the least. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:470: TimePacketedDecode(data.Get(), apply_color_correction, packet_size, TimePacketedDecode is not in this patch. I agree with Leon: having the current patch code intermixed with code from some other patch is not good. My approval would apply to changes in this CL that are not included in the CL. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:473: TimeDecode(data.Get(), apply_color_correction, iterations); Ditto.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:12: // ./out/Release/image_decode_bench file [--raster] [iterations] On 2017/05/25 13:01:59, noel gordon wrote: > Remove this change, just basic usage here. Done. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:350: const bool all_data_received = true; On 2017/05/25 13:01:59, noel gordon wrote: > bool all_data_received = true I've removed the const. But are you sure? The only reason there is a variable at all is to give a name to the "true" as a sort of self-documentation. The value never changes because it isn't intended to be an actual variable. I have a separate bit of work (will be a separate CL) to change this parameter from a bool to an enum so it will be self-documenting. It is amazing how much code calls ImageDecoder::SetData either directly or indirectly. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:354: printf("failed to decode size\n"); On 2017/05/25 13:01:59, noel gordon wrote: > Errors during benchmarking are FATAL everywhere in the benchmarking code. > Please use ... > > if (!decoder->IsSizeAvailable()) { > fprintf(stderr, "failed to decode size\n"); > exit(3); > } Done. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:356: } On 2017/05/25 13:01:59, noel gordon wrote: > } > > const auto size = decoder->Size(); Done. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:361: printf("failed to decode first frame\n"); On 2017/05/25 13:01:59, noel gordon wrote: > Errors during benchmarking are FATAL, fprintf(stderr,...) and exit(3) Done. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:376: for (double iteration_time : timings) { On 2017/05/25 13:01:59, noel gordon wrote: > iteration_time -> raster_draw_time? Done. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:377: printf("%f,\n", iteration_time); On 2017/05/25 13:01:59, noel gordon wrote: > This outputs are stream of raster draw times. I expect this is a sequence of > IID random variables, with some sample mean and variance. > > Why don't we compute the mean and variance here? > > If I wrote a bench that testing 200 images say, this code would currently > produce 200 x [iterations] rows of data, which is unwieldy to deal with to say > the least. I completely agree that it is unwieldy. But I think it is an unfortunate requirement in order to correctly make decisions using the data. Anscombe's quartet [1] has the exact same mean and variance but is drastically different between data sets. This was recently extended by some researchers at Autodesk from 4 data sets to 13 and now with even more wild differences [2]. And while that is theoretically nice and all, it may actually have already caused us problems. While investigating this, I happened upon some odd data that I only noticed because I visualized it [3]. It looks like in an iteration we're only marking the final frame as complete and thus not using the cache of the other frames. If [3] is correct, this is a major bug and a perfect example of why we need to visualize. Only reporting mean & variance is not enough. [1] https://en.wikipedia.org/wiki/Anscombe%27s_quartet [2] https://www.autodeskresearch.com/publications/samestats [3] https://docs.google.com/a/chromium.org/spreadsheets/d/1wBWGmSvWF1gnRMUzW3A8od... https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:470: TimePacketedDecode(data.Get(), apply_color_correction, packet_size, On 2017/05/25 13:01:59, noel gordon wrote: > TimePacketedDecode is not in this patch. I agree with Leon: having the current > patch code intermixed with code from some other patch is not good. My approval > would apply to changes in this CL that are not included in the CL. These are not real changes to TimePacketedDecode/TimeDecode. They are just now inside the } else { branch for if (time_raster) { The code review tool picked up that TimeDecode() was only indented. But it didn't pick up that TimePacketedDecode() was only indented because it seems like: -if (packeted_size) { +if (time_raster) { ... +if (packeted_size) { This CL should only contain changes needed for raster. Anything else would have been a mistake.
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:350: const bool all_data_received = true; On 2017/05/30 21:55:08, cblume wrote: > On 2017/05/25 13:01:59, noel gordon wrote: > > bool all_data_received = true > > I've removed the const. > > But are you sure? Yes I am sure. > The only reason there is a variable at all is to give a name > to the "true" as a sort of self-documentation. Agree it is for self-documentation. > The value never changes because it isn't intended to be an actual variable. line 295 right: constexpr bool total_all_data_received = true; line 313 right: bool all_data_received = position == data->size(); line 350 left: const bool all_data_received = true; If we could do consistently, that'd be good. Where it is used as self-documentation, the const doesn't buy us much. Where it separated by many lines from the point of use, a const can help. Point is, please be consistent. > I have a separate bit of work (will be a separate CL) to change this parameter > from a bool to an enum so it will be self-documenting. It is amazing how much > code calls ImageDecoder::SetData either directly or indirectly. Admirable, but maybe a P4. Main P0 would be to focus on getting skia decoders enabled in blink. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:356: } On 2017/05/30 21:55:08, cblume wrote: > On 2017/05/25 13:01:59, noel gordon wrote: > > } > > > > const auto size = decoder->Size(); > > Done. Ok, next step would be to move it where used. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:377: printf("%f,\n", iteration_time); On 2017/05/30 21:55:08, cblume wrote: > On 2017/05/25 13:01:59, noel gordon wrote: > > This outputs are stream of raster draw times. I expect this is a sequence of > > IID random variables, with some sample mean and variance. > > > > Why don't we compute the mean and variance here? > > > > If I wrote a bench that testing 200 images say, this code would currently > > produce 200 x [iterations] rows of data, which is unwieldy to deal with to say > > the least. > > I completely agree that it is unwieldy. > But I think it is an unfortunate requirement in order to correctly make > decisions using the data. > > Anscombe's quartet [1] has the exact same mean and variance but is drastically > different between data sets. > > This was recently extended by some researchers at Autodesk from 4 data sets to > 13 and now with even more wild differences [2]. The fact that a small data set of samples, with different shapes, end up with the same sample mean is uninteresting. Perturb one point in each data set by a small amount, and none of the data sets agree on the sample mean. Statistical estimation requires that we invoke the Central Limit Theorem (CLT), which says sufficient samples must be taken for the sample mean to converge the actual population mean of the data with low variance. The cost of image decode X is always the same, but variations in the measurement system (your computer) introduce variance in the measured samples of X. The actual measured data points are X + E, where E is a random variable and X is constant. E is noise in the sampled data, and the statistical variance of E is the sample variance. The estimation goal is take to sufficient samples so we could estimate the population mean, and that requires that the measurement system is also controlled, so the E term is controlled. There should be nothing else running on your system when you measure, all applications should be quit, screen savers turned off, correct power management settings, etc. When I measure, I do that, the only thing running on my machine(s) is the benchmark tool. I have never had the variational issues you have mentioned elsewhere. We tend to take 500+ samples per image, and compute the sample mean (aka compute the average). By the CLT, if we have controlled the measurement environment, the result will have low sample variance. If N is the number of samples, the sample variance of E is proportion to 1 / (N-1), which -> 0 as N increases. This fact allows the CLT to converge to the population mean. Averaging (ie computing the sample mean) is related to low-pass filtering. The number of taps in your filter is N, and as N increases (more samples), the filter output becomes constant since the filter will only allow the DC frequency component of the data to pass through to the output (the sample mean). Any variation of the data about the DC component is completely filtered out. If there is anything to visualize, it would be the CLT at work on your data. If there are sources of variation (noise) coming from your system, then that should be controlled / removed. If you do not do that, then the only option is to significantly increase N so the CLT can converge, because the mathematics assures us it will always converge to the population mean, X. Plot the sample mean and variance as a function of N. Aside: Non Linear Filtering and the Median Often people doing measurement are impatient (small N), and there measurements are known to have outliers (impulsive noise). In that case compute the sample median. Given a data sample: 5, 6, 4, 5, 6, 35, 4, 4, 5 sort the data, and pick the middle value. 4, 4, 4, 5, 5, 5, 6, 6, 35 result: 5, the sample median. This operation is median filtering, it is non-linear, but it has the property the impulsive noise is completely removed from the data. Linear filters can do this too (due to the CLT) but require a much larger N to do so. > And while that is theoretically nice and all, it may actually have already > caused us problems. While investigating this, I happened upon some odd data that > I only noticed because I visualized it [3]. It looks like in an iteration we're > only marking the final frame as complete and thus not using the cache of the > other frames. So are you telling me that if I decode a PNG, a single frame image, the frame is not marked frame complete? Perhaps the problem is related to animated image frames only, which is either a bug in our decoders (doubtful) or a bug in the changes you have made to the image benchmark tool. > If [3] is correct, this is a major bug and a perfect example of why we need to > visualize. Visualizing is good to help identify and remove such bugs, but ... > Only reporting mean & variance is not enough. once you have removed any bugs, the https://en.wikipedia.org/wiki/Central_limit_theorem says that the sample mean and variance are all that is needed to reliably estimate the population mean. > [1] https://en.wikipedia.org/wiki/Anscombe%27s_quartet > [2] https://www.autodeskresearch.com/publications/samestats > [3] > https://docs.google.com/a/chromium.org/spreadsheets/d/1wBWGmSvWF1gnRMUzW3A8od...
https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:350: const bool all_data_received = true; On 2017/05/31 16:31:29, noel gordon wrote: > line 295 right: > constexpr bool total_all_data_received = true; > line 313 right: > bool all_data_received = position == data->size(); > line 350 left: > const bool all_data_received = true; > Point is, please be consistent. I'm embarrassed :) Consolidated to non-const(expr). > > I have a separate bit of work (will be a separate CL) to change this parameter > > from a bool to an enum so it will be self-documenting. It is amazing how much > > code calls ImageDecoder::SetData either directly or indirectly. > > Admirable, but maybe a P4. Main P0 would be to focus on getting skia decoders > enabled in blink. I strongly agree. That's why I haven't gotten to the point of it having a CL. It's ultra-backburner. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:356: } On 2017/05/31 16:31:29, noel gordon wrote: > On 2017/05/30 21:55:08, cblume wrote: > > On 2017/05/25 13:01:59, noel gordon wrote: > > > } > > > > > > const auto size = decoder->Size(); > > > > Done. > > Ok, next step would be to move it where used. Done. Also removed the const since it is now close. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:377: printf("%f,\n", iteration_time); On 2017/05/31 16:31:29, noel gordon wrote: > Statistical estimation requires that we invoke the Central Limit Theorem (CLT), > which says sufficient samples must be taken for the sample mean to converge the > actual population mean of the data with low variance. I agree with what you are saying that we can get the mean & variance. I feel like if I only had those I would be missing important information still. Although, perhaps this isn't always true. For example, when timing raster when we corrected alpha I saw that the min values were the same. But there was added noise when not correcting alpha, which was a clue to me. If I simply took enough samples to get a small variance, I would be ignoring the fact that the noise was important. It clued me into what may have been causing a different mean. There is a similar story for the only-Complete-the-last-frame. However, I do strongly agree that this might be so unwieldy that 1.) it discourages use of this tool at all, and 2.) maybe we don't always need it. What do you think about me adding a command line parameter to output this? That way by default it can only output the mean & variance. But if you want to look under the hood you can. > So are you telling me that if I decode a PNG, a single frame image, the frame is > not marked frame complete? Perhaps the problem is related to animated image > frames only, which is either a bug in our decoders (doubtful) or a bug in the > changes you have made to the image benchmark tool. For single-frame images, once decoded, the frame is marked as complete. For multi-frame images, only the last frame seems to be marked as complete. It may only be a bug in my changes. I hope that is the case. It showed up in https://codereview.chromium.org/2880953002/ patch set 11. Patch set 12 creates a new image decoder for each iteration because the frame cache was not actually timing decodes. But that mistake is why I noticed at all. I'll be investigating it further soon.
The CQ bit was checked by cblume@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... File third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp (right): https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:350: const bool all_data_received = true; On 2017/05/31 22:49:32, cblume wrote: > On 2017/05/31 16:31:29, noel gordon wrote: > > line 295 right: > > constexpr bool total_all_data_received = true; > > line 313 right: > > bool all_data_received = position == data->size(); > > line 350 left: > > const bool all_data_received = true; > > Point is, please be consistent. > > I'm embarrassed :) > Consolidated to non-const(expr). Thanks. > > > I have a separate bit of work (will be a separate CL) to change this > parameter > > > from a bool to an enum so it will be self-documenting. It is amazing how > much > > > code calls ImageDecoder::SetData either directly or indirectly. > > > > Admirable, but maybe a P4. Main P0 would be to focus on getting skia decoders > > enabled in blink. > > I strongly agree. That's why I haven't gotten to the point of it having a CL. > It's ultra-backburner. Nod. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:356: } On 2017/05/31 22:49:32, cblume wrote: > On 2017/05/31 16:31:29, noel gordon wrote: > > On 2017/05/30 21:55:08, cblume wrote: > > > On 2017/05/25 13:01:59, noel gordon wrote: > > > > } > > > > > > > > const auto size = decoder->Size(); > > > > > > Done. > > > > Ok, next step would be to move it where used. > > Done. > Also removed the const since it is now close. Good. https://codereview.chromium.org/2893023003/diff/180001/third_party/WebKit/Sou... third_party/WebKit/Source/platform/testing/ImageDecodeBench.cpp:377: printf("%f,\n", iteration_time); On 2017/05/31 22:49:32, cblume wrote: > On 2017/05/31 16:31:29, noel gordon wrote: > > Statistical estimation requires that we invoke the Central Limit Theorem > (CLT), > > which says sufficient samples must be taken for the sample mean to converge > the > > actual population mean of the data with low variance. > > I agree with what you are saying that we can get the mean & variance. Well, we agree with the CLT :) > I feel like if I only had those I would be missing important information still. > Although, perhaps this isn't always true. > > For example, when timing raster when we corrected alpha I saw that the min > values were the same. But there was added noise when not correcting alpha, which > was a clue to me. > > If I simply took enough samples to get a small variance, I would be ignoring the > fact that the noise was important. It clued me into what may have been causing a > different mean. > > There is a similar story for the only-Complete-the-last-frame. OK. > However, I do strongly agree that this might be so unwieldy that 1.) it > discourages use of this tool at all, and 2.) maybe we don't always need it. > > What do you think about me adding a command line parameter to output this? That > way by default it can only output the mean & variance. But if you want to look > under thf e hood you can. Sounds like a good idea to me. We should be able to look under the hood, to see the variance if wanted, and to see the per-run decode timings (eg., 500 decode times for a given image) if wanted. > > So are you telling me that if I decode a PNG, a single frame image, the frame > is > > not marked frame complete? Perhaps the problem is related to animated image > > frames only, which is either a bug in our decoders (doubtful) or a bug in the > > changes you have made to the image benchmark tool. > > For single-frame images, once decoded, the frame is marked as complete. > For multi-frame images, only the last frame seems to be marked as complete. > > It may only be a bug in my changes. I hope that is the case. > It showed up in https://codereview.chromium.org/2880953002/ patch set 11. > Patch set 12 creates a new image decoder for each iteration because the frame > cache was not actually timing decodes. > But that mistake is why I noticed at all. > > I'll be investigating it further soon. Yes please investigate the source of this error. All frames should be marked complete, there should be no partial frame decodes, since we have all the encoded data of valid images, and should be sending all that data to the image decoder. |