|
|
DescriptionAdd color correction benchmark - with comparison to qcms
--colorImages is empty by default so this won't affect the runtime of
the bots. To run locally, use --colorImages <path> and
--benchType skcolorcodec.
Two takeaways so far:
Color correction is (currently) slower than jpeg decodes.
Our reference solution is slower than qcms.
TBR=reed@google.com
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002
Committed: https://skia.googlesource.com/skia/+/2cee902847a940c4bab56b42dabbb3721be3f9ac
Patch Set 1 #
Total comments: 22
Patch Set 2 : Response to comments #
Total comments: 7
Patch Set 3 : More responses #
Total comments: 2
Patch Set 4 : #Patch Set 5 : Rebase #
Messages
Total messages: 27 (12 generated)
Description was changed from ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. BUG=skia: ========== to ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002 ==========
msarett@google.com changed reviewers: + scroggo@google.com
https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:14: #if !defined(GOOGLE3) nit: My general preference would be to have a separate flag for QCMS, which gets set to false/undefined when GOOGLE3 is defined. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:21: DEFINE_bool(xform_only, false, "Only bench the color xform, do not include the decode time"); nit: "bench" -> "time" https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:51: if (srcSpace) { Should this be if (!srcSpace)? https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:54: sk_sp<SkColorSpace> dstSpace = SkColorSpace::NewICC(dstProfile->data(), dstProfile->size()); Do you want this to be part of the timing? Otherwise you could create this once outside the loop. (I assume in practice it doesn't need to be done for every decode?) https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:112: static void xform_only(SkData* encoded, void* dst, void* srcRow, const SkImageInfo& info, nit: Maybe comment out srcRow, since you're not using it? https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:115: if (srcSpace) { !srcSpace https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:124: // Transform in place Why is this in place, but not some others? https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:186: void ColorCodecBench::onDraw(int n, SkCanvas* canvas) { nit: no need for the name "canvas" here, which is unused. https://codereview.chromium.org/2035793002/diff/1/bench/nanobench.cpp File bench/nanobench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/nanobench.cpp#newcode932 bench/nanobench.cpp:932: return new ColorCodecBench(SkOSPath::Basename(path.c_str()).c_str(), encoded); nit: Should we std::move(encoded)? Also, should there be an error message if the data could not be read from the file? https://codereview.chromium.org/2035793002/diff/1/include/codec/SkCodec.h File include/codec/SkCodec.h (right): https://codereview.chromium.org/2035793002/diff/1/include/codec/SkCodec.h#new... include/codec/SkCodec.h:723: // FIXME: Remove this when we are done comparing with qcms. this -> these?
https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:14: #if !defined(GOOGLE3) On 2016/06/02 18:51:50, scroggo wrote: > nit: My general preference would be to have a separate flag for QCMS, which gets > set to false/undefined when GOOGLE3 is defined. I'll follow up with this change - so I can fix it here and elsewhere. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:21: DEFINE_bool(xform_only, false, "Only bench the color xform, do not include the decode time"); On 2016/06/02 18:51:50, scroggo wrote: > nit: "bench" -> "time" Done. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:51: if (srcSpace) { On 2016/06/02 18:51:50, scroggo wrote: > Should this be if (!srcSpace)? Yes! Good catch. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:54: sk_sp<SkColorSpace> dstSpace = SkColorSpace::NewICC(dstProfile->data(), dstProfile->size()); On 2016/06/02 18:51:50, scroggo wrote: > Do you want this to be part of the timing? Otherwise you could create this once > outside the loop. (I assume in practice it doesn't need to be done for every > decode?) Good point, let's create this outside the loop. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:112: static void xform_only(SkData* encoded, void* dst, void* srcRow, const SkImageInfo& info, On 2016/06/02 18:51:49, scroggo wrote: > nit: Maybe comment out srcRow, since you're not using it? Done. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:115: if (srcSpace) { On 2016/06/02 18:51:50, scroggo wrote: > !srcSpace Done. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:124: // Transform in place On 2016/06/02 18:51:50, scroggo wrote: > Why is this in place, but not some others? I decided to time the "xform only" benches in place. I think this makes more sense than keeping an entire src buffer and dst buffer. I'm not sure if the extra memory pressure would impact the timing (probably not), but it doesn't match Chrome's use case (where they only have a single dst buffer). When we time the decode and the xform together, we can do exactly what chrome does. First decode into a row buffer, then xform to the dst buffer. If this is too strange, I think we can make both src->dst without worrying too much about it changing anything. https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:186: void ColorCodecBench::onDraw(int n, SkCanvas* canvas) { On 2016/06/02 18:51:50, scroggo wrote: > nit: no need for the name "canvas" here, which is unused. Done. https://codereview.chromium.org/2035793002/diff/1/bench/nanobench.cpp File bench/nanobench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/nanobench.cpp#newcode932 bench/nanobench.cpp:932: return new ColorCodecBench(SkOSPath::Basename(path.c_str()).c_str(), encoded); On 2016/06/02 18:51:50, scroggo wrote: > nit: Should we std::move(encoded)? > > Also, should there be an error message if the data could not be read from the > file? Yes and yes. Done for both. https://codereview.chromium.org/2035793002/diff/1/include/codec/SkCodec.h File include/codec/SkCodec.h (right): https://codereview.chromium.org/2035793002/diff/1/include/codec/SkCodec.h#new... include/codec/SkCodec.h:723: // FIXME: Remove this when we are done comparing with qcms. On 2016/06/02 18:51:50, scroggo wrote: > this -> these? Done.
https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:124: // Transform in place On 2016/06/02 19:34:39, msarett wrote: > On 2016/06/02 18:51:50, scroggo wrote: > > Why is this in place, but not some others? > > I decided to time the "xform only" benches in place. I think this makes more > sense than keeping an entire src buffer and dst buffer. I'm not sure if the > extra memory pressure would impact the timing (probably not), but it doesn't > match Chrome's use case (where they only have a single dst buffer). > > When we time the decode and the xform together, we can do exactly what chrome > does. First decode into a row buffer, then xform to the dst buffer. > > If this is too strange, I think we can make both src->dst without worrying too > much about it changing anything. Looking at this a little harder, won't the second iteration transform the output from the first? And the third one will transform the output from the second? Don't you want to transform the same/original data each time? FWIW, I don't think you ever need an entire dst buffer - you only need one row, right? We're never going to look at the output of this test, so can you just write over it? Or might the transformation do some blending? https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:62: codec->getScanlines(srcRow, 1, 0); This will be awfully slow on interlaced PNG, and the output will be upside down/scrambled for bottom-up BMP/interlaced GIF. The latter may be fine for this benchmark, since you're not looking at the output, but it does make me think about how color transforms will interact with incrementalDecode... One nice thing about incrementalDecode is that the client doesn't need to care about the SkScanlineOrder - the codec knows the whole block of destination memory and writes to it in the right place. But if the client doesn't know what lines were written, they won't know which lines to transform. Some possible solutions: - do the transformation on every row - inefficient when the image is partially decoded, since we'll transform each row on each pass - make incrementalDecode() somehow report the rows it decoded into - also helps with the filling problem - but I don't yet know what that API would look like - pass the destination space to the codec, so it can do the transformation as it decodes. (This is more or less what Chrome does now, right?) - this seems the cleanest to me, but I thought we had reasons not to do it that way? - one is that the same image may be drawn to different destination spaces - e.g. two different monitors with different profiles Generally it's not clear to me when the transformation will be applied, and whether or not it will be in place. Maybe I'm missing something that makes this simpler...? https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:183: sk_sp<SkColorSpace> dstSpace = nullptr; Why not do all this setup in onDelayedSetup? If nanobench decides it only needs one loop to time this benchmark, this will still be included in the time. https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:198: fProc(fEncoded.get(), fDst.get(), fSrcRow.get(), fInfo, dstProfile, What if you made fProc call a member method? Then you wouldn't need to pass 6 parameters, and you wouldn't need to be cast dstProfile. OTOH, then fProc could mess with all the member variables in ways you do not intend.
https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/1/bench/ColorCodecBench.cpp#n... bench/ColorCodecBench.cpp:124: // Transform in place On 2016/06/02 20:42:31, scroggo wrote: > On 2016/06/02 19:34:39, msarett wrote: > > On 2016/06/02 18:51:50, scroggo wrote: > > > Why is this in place, but not some others? > > > > I decided to time the "xform only" benches in place. I think this makes more > > sense than keeping an entire src buffer and dst buffer. I'm not sure if the > > extra memory pressure would impact the timing (probably not), but it doesn't > > match Chrome's use case (where they only have a single dst buffer). > > > > When we time the decode and the xform together, we can do exactly what chrome > > does. First decode into a row buffer, then xform to the dst buffer. > > > > If this is too strange, I think we can make both src->dst without worrying too > > much about it changing anything. > > Looking at this a little harder, won't the second iteration transform the output > from the first? And the third one will transform the output from the second? > Don't you want to transform the same/original data each time? > You're right, I didn't think past the first loop iteration... I'll decode into a srcBuffer and then xform into the dstBuffer. > FWIW, I don't think you ever need an entire dst buffer - you only need one row, > right? We're never going to look at the output of this test, so can you just > write over it? Or might the transformation do some blending? You're right. I think I'll keep the dstBuffer though, just to match the actual use case as close as we can. https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:62: codec->getScanlines(srcRow, 1, 0); On 2016/06/02 20:42:31, scroggo wrote: > This will be awfully slow on interlaced PNG, and the output will be upside > down/scrambled for bottom-up BMP/interlaced GIF. The latter may be fine for this > benchmark, since you're not looking at the output, but it does make me think > about how color transforms will interact with incrementalDecode... > > One nice thing about incrementalDecode is that the client doesn't need to care > about the SkScanlineOrder - the codec knows the whole block of destination > memory and writes to it in the right place. > > But if the client doesn't know what lines were written, they won't know which > lines to transform. > > Some possible solutions: > - do the transformation on every row > - inefficient when the image is partially decoded, since we'll transform > each row on each pass > - make incrementalDecode() somehow report the rows it decoded into > - also helps with the filling problem > - but I don't yet know what that API would look like > - pass the destination space to the codec, so it can do the transformation as it > decodes. (This is more or less what Chrome does now, right?) > - this seems the cleanest to me, but I thought we had reasons not to do it > that way? > - one is that the same image may be drawn to different destination spaces > - e.g. two different monitors with different profiles > > Generally it's not clear to me when the transformation will be applied, and > whether or not it will be in place. Maybe I'm missing something that makes this > simpler...? "pass the destination space to the codec, so it can do the transformation as it decodes. (This is more or less what Chrome does now, right?)" Yes I think this needs to happen inside the codec, the way Chrome does it right now. We already have a way to pass the destination space to the codec (since there is an SkColorSpace on SkImageInfo). Maybe I'm wrong to write benchmarks before it's integrated with our codecs. Because maybe we'll need new ones after... But I'm actually thinking about integrating with Chrome's codecs first. I think this is a good way to measure the impact that we'll have there (starting with jpeg in particular). I'm really far away from thinking about BMP and GIF, I don't think anybody color corrects those anyway. Interlaced PNG/JPEG is an interesting thought. It looks Chrome will redo the correction every time a row is updated. ------------------------------------------- I also think the color xtransform logic needs to exist outside of the codecs, for other uses. Which I guess is why it's floating around in src/core right now. https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:183: sk_sp<SkColorSpace> dstSpace = nullptr; On 2016/06/02 20:42:31, scroggo wrote: > Why not do all this setup in onDelayedSetup? If nanobench decides it only needs > one loop to time this benchmark, this will still be included in the time. Even better. Done. https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:198: fProc(fEncoded.get(), fDst.get(), fSrcRow.get(), fInfo, dstProfile, On 2016/06/02 20:42:31, scroggo wrote: > What if you made fProc call a member method? Then you wouldn't need to pass 6 > parameters, and you wouldn't need to be cast dstProfile. OTOH, then fProc could > mess with all the member variables in ways you do not intend. I've made this change, and I think it makes things a lot cleaner :)
lgtm https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/20001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:62: codec->getScanlines(srcRow, 1, 0); On 2016/06/02 22:17:47, msarett wrote: > On 2016/06/02 20:42:31, scroggo wrote: > > This will be awfully slow on interlaced PNG, and the output will be upside > > down/scrambled for bottom-up BMP/interlaced GIF. The latter may be fine for > this > > benchmark, since you're not looking at the output, but it does make me think > > about how color transforms will interact with incrementalDecode... > > > > One nice thing about incrementalDecode is that the client doesn't need to care > > about the SkScanlineOrder - the codec knows the whole block of destination > > memory and writes to it in the right place. > > > > But if the client doesn't know what lines were written, they won't know which > > lines to transform. > > > > Some possible solutions: > > - do the transformation on every row > > - inefficient when the image is partially decoded, since we'll transform > > each row on each pass > > - make incrementalDecode() somehow report the rows it decoded into > > - also helps with the filling problem > > - but I don't yet know what that API would look like > > - pass the destination space to the codec, so it can do the transformation as > it > > decodes. (This is more or less what Chrome does now, right?) > > - this seems the cleanest to me, but I thought we had reasons not to do it > > that way? > > - one is that the same image may be drawn to different destination spaces > > - e.g. two different monitors with different profiles > > > > Generally it's not clear to me when the transformation will be applied, and > > whether or not it will be in place. Maybe I'm missing something that makes > this > > simpler...? > > "pass the destination space to the codec, so it can do the transformation as it > decodes. (This is more or less what Chrome does now, right?)" > > Yes I think this needs to happen inside the codec, the way Chrome does it right > now. Whew! That makes the most sense to me, with the caveat of I don't know what to do if the same image is used on multiple screens with different properties. That's probably an uncommon use case, though (and we don't handle it today). The simplest approach will be to decode twice, which is not perfect, but maybe it's okay. > We already have a way to pass the destination space to the codec (since > there is an SkColorSpace on SkImageInfo). Of course! I had been trying to wrap my head around putting the color space in the image info, but I think this demonstrates why it makes sense. > > Maybe I'm wrong to write benchmarks before it's integrated with our codecs. > Because maybe we'll need new ones after... No, I think it does make sense to write these benchmarks now. Even if it changes later you can find out how to improve the current code, which will likely still apply. > But I'm actually thinking about > integrating with Chrome's codecs first. I think this is a good way to measure > the impact that we'll have there (starting with jpeg in particular). I think that's the right approach. SkCodec still needs to finish incremental decoding and to support animation. And there's still some Chromium plumbing work to be done. > > I'm really far away from thinking about BMP and GIF, I don't think anybody color > corrects those anyway. Probably not, although we'll want to do it eventually. > > Interlaced PNG/JPEG is an interesting thought. It looks Chrome will redo the > correction every time a row is updated. > > ------------------------------------------- > > I also think the color xtransform logic needs to exist outside of the codecs, > for other uses. Which I guess is why it's floating around in src/core right > now. sgtm https://codereview.chromium.org/2035793002/diff/40001/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/40001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:21: , fDstSpaceQCMS(nullptr) This method is only declared if !GOOGLE3, so this needs to do the same.
https://codereview.chromium.org/2035793002/diff/40001/bench/ColorCodecBench.cpp File bench/ColorCodecBench.cpp (right): https://codereview.chromium.org/2035793002/diff/40001/bench/ColorCodecBench.c... bench/ColorCodecBench.cpp:21: , fDstSpaceQCMS(nullptr) On 2016/06/03 14:25:36, scroggo wrote: > This method is only declared if !GOOGLE3, so this needs to do the same. Yup, thanks!
The CQ bit was checked by msarett@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2035793002/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/2035793002/60001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: Build-Win-MSVC-x86_64-Debug-Trybot on client.skia.compile (JOB_FAILED, http://build.chromium.org/p/client.skia.compile/builders/Build-Win-MSVC-x86_6...)
The CQ bit was checked by msarett@google.com to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2035793002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/2035793002/80001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by msarett@google.com
The patchset sent to the CQ was uploaded after l-g-t-m from scroggo@google.com Link to the patchset: https://codereview.chromium.org/2035793002/#ps80001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2035793002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/2035793002/80001
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: skia_presubmit-Trybot on client.skia.fyi (JOB_FAILED, http://build.chromium.org/p/client.skia.fyi/builders/skia_presubmit-Trybot/bu...)
Description was changed from ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002 ========== to ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. TBR=reed@google.com BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002 ==========
The CQ bit was checked by msarett@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2035793002/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/2035793002/80001
Message was sent while issue was closed.
Description was changed from ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. TBR=reed@google.com BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002 ========== to ========== Add color correction benchmark - with comparison to qcms --colorImages is empty by default so this won't affect the runtime of the bots. To run locally, use --colorImages <path> and --benchType skcolorcodec. Two takeaways so far: Color correction is (currently) slower than jpeg decodes. Our reference solution is slower than qcms. TBR=reed@google.com BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2035793002 Committed: https://skia.googlesource.com/skia/+/2cee902847a940c4bab56b42dabbb3721be3f9ac ==========
Message was sent while issue was closed.
Committed patchset #5 (id:80001) as https://skia.googlesource.com/skia/+/2cee902847a940c4bab56b42dabbb3721be3f9ac |