|
|
Created:
5 years, 11 months ago by Noel Gordon Modified:
5 years, 10 months ago CC:
chromium-reviews, Stephen White, Mike Lawther (Google) Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
DescriptionAvoid divisions creating sample points in the float cube LUT builder
Cube samplers create input sample points in some order. Document the QCMS
order (x,y,z) which is the same as OSX ColorSync and cube LUT interchange
standards such as the Adobe Cube LUT Specification 1.0, 2013. The samples
are RGB order with R changing the slowest, B the fastest.
Hoist the division out of the sample point creation loop: takes 0.0003939
seconds on MacAir for 32^3 sample points, 0.0001619 with hoisting. That's
fine, but small beans compared to the overall costs of cube creation. The
majorty of the work is in qcms_chain_transform (8ms) where floating-point
is used to maintain transform precision.
The output ColorCube LUT produced by qcms_chain_transform is indexed into
transform in RGB order (time cost ~0). Add comments about order.
BUG=443863
Committed: https://crrev.com/5bc47112d6be524ab82b8f9afc0b67a832e32380
Cr-Commit-Position: refs/heads/master@{#314425}
Patch Set 1 #Patch Set 2 : Compute the inverse. #
Total comments: 7
Patch Set 3 : Patch for landing. #
Messages
Total messages: 24 (5 generated)
noel@chromium.org changed reviewers: + sugoi@chromium.org
reed@google.com changed reviewers: + reed@google.com
drive-by: is there an "official" repo for this library? If so, should we submit this sort of fix there as well?
https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... File third_party/qcms/src/transform.c (right): https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b You're going to have "samples * samples * samples * 3" number of multiplications here and you're paying for int to float conversions. You could reduce that to only "samples" number of multiplications by doing: float* multLut = malloc(samples*sizeof(float)); // Should fit in cache for (x = 0; x < samples; ++x) { multLut = x * inverse; } for (x = 0; x < samples; ++x) { for (y = 0; y < samples; ++y) { for (z = 0; z < samples; ++z) { src[l++] = multLut[x]; src[l++] = multLut[y]; src[l++] = multLut[z]; } } } free(multLut); Also, is "samples" likely to change between calls? If not, then multLut could be static, and only be reallocated and recomputed when "samples" changes.
On 2015/01/27 14:10:50, reed1 wrote: > drive-by: is there an "official" repo for this library? Per the README.chromium the code drop came from mozilla source. A github was added later but it has no pulse. No real "official" repro, so we record our changes locally for now. When/if we spot security issues, mozilla is advised of course.
https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... File third_party/qcms/src/transform.c (right): https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/27 15:42:58, sugoi1 wrote: > Also, is "samples" likely to change between calls? If not, then multLut could be > static, and only be reallocated and recomputed when "samples" changes. yes, "samples" change b/w calls. https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/27 15:42:58, sugoi1 wrote: > You're going to have "samples * samples * samples * 3" number of multiplications > here and you're paying for int to float conversions. And what followed seems fine, but the cost of this function would still be 8ms I believe since the sample creation costs here are tiny overall. Perhaps I could better explain; someone asked me a question. The point of this patch is not really the hoist, but to present a diff that 1) shows how the sampling code works so we all look at it, 2) describes where time goes, 3) documents sample order (x,y,z) [exactly like ColorSync, Adobe, LCMS, etc] and 4) documents output LUT format order (RGB[A]). This to help resolve other questions: Why does Skia sample (z,y,x) order (viz., backwards compared to libs)? Why does it use BGR output order? And should we care about this in Skia, or should we fix sample order and output format to match the library user user-case? (just imagine I want to swap out qcms for ColorSync on mac, say). Bug is 443863, make it easier to use for users. I don't personally care what the sample order / output format is, but Skia might and color-correction library users of Skia might, so I thought I'd better advise via this bug - thoughts skia-peeps?
https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... File third_party/qcms/src/transform.c (right): https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/28 05:57:29, noel gordon wrote: > On 2015/01/27 15:42:58, sugoi1 wrote: > > > Also, is "samples" likely to change between calls? If not, then multLut could > be > > static, and only be reallocated and recomputed when "samples" changes. > > yes, "samples" change b/w calls. I understand that it *can* change, but if that represents the size of the cube of the 3D lut, why would we change this number frequently? Do we use multiple sizes of cubes? Is "samples" used to define something else than the size of a cube? https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/28 05:57:29, noel gordon wrote: The 443863 bug is described as an SkColorCube issue but I don't think it can be limited to that. If we're using an SkFilter tree for the GPU case and libqcms as a software fallback, then image would need to be in a different format for each case. If you look at JPEGImageDecoder, you'll see (you probably wrote this) that, in order to use libqcms, there's some temporary swizzling happening in the decoder. JPEGImageDecoder, and every other decoder, will output to the format skia uses and everything that touches pixels in Blink/Chromium will expect pixels to be stored that way. Wouldn't it be much easier to get the fairly small libqcms to use Skia/Blink/Chrome's color format than to change all of the rest of the code? If we eventually use another library, then we'd do what we do currently in JPEGImageDecoder when using libqcms, which is to swizzle in and out of the library when the formats don't match.
https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... File third_party/qcms/src/transform.c (right): https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/28 16:44:40, sugoi1 wrote: > On 2015/01/28 05:57:29, noel gordon wrote: > > On 2015/01/27 15:42:58, sugoi1 wrote: > > > > > Also, is "samples" likely to change between calls? If not, then multLut > could > > be > > > static, and only be reallocated and recomputed when "samples" changes. > > > > yes, "samples" change b/w calls. > > I understand that it *can* change, but if that represents the size of the cube > of the 3D lut, why would we change this number frequently? Do we use multiple > sizes of cubes? I might use a smaller or larger cube depending on circumstances (image size, pre-rendering, printing ...) but have not experimented with that as yet. At least for now, I expect this number won't change frequently. We could optimize speed around here sure, but as we've noted here and off-line, our time might better spent elsewhere in this code. > Is "samples" used to define something else than the size of a cube? Nope - "samples" and "cubeSize" are the same thing, and we've used both terms interchangeably in discussions / code without confusion thus far (which is ok with me), sorry if there was any confusion.
https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... File third_party/qcms/src/transform.c (right): https://codereview.chromium.org/863233003/diff/20001/third_party/qcms/src/tra... third_party/qcms/src/transform.c:1135: src[l++] = z * inverse; // b On 2015/01/28 16:44:40, sugoi1 wrote: > On 2015/01/28 05:57:29, noel gordon wrote: > > The 443863 bug is described as an SkColorCube issue but I don't think it can be > limited to that. 443863 is about cube data format I think. Some code might help, so I changed SkColorCube as follows: https://gist.github.com/anonymous/1d62b4ead69bbebb5814 Didn't test the software path. The GPU path seems to color-correct properly when I provide (x,y,z) sampled, RGBA order cube data from qcms. > If we're using an SkFilter tree for the GPU case and libqcms as > a software fallback, then image would need to be in a different format for each > case. We must decode / handle images in Skia platform format always: rule #1, no exceptions :) > If you look at JPEGImageDecoder, you'll see (you probably wrote this) > that, in order to use libqcms, there's some temporary swizzling happening in the > decoder. JPEGImageDecoder, and every other decoder, will output to the format > skia uses and everything that touches pixels in Blink/Chromium will expect > pixels to be stored that way. Yeap. That temporary swizzling happens when we bake profiles into decoded image frames. If decoded frames need qcms-ing, the decoders swizzle (at no cost mind) to RGBA for input to qcms and we make qcms output platform format (rule #1). > Wouldn't it be much easier to get the fairly small > libqcms to use Skia/Blink/Chrome's color format than to change all of the rest > of the code? The only thing that'd change is the color cube data format. > If we eventually use another library, then we'd do what we do > currently in JPEGImageDecoder when using libqcms, which is to swizzle in and out > of the library when the formats don't match. One thing to note: all libraries support RGB input, so swapping libraries means no code changes at ImageDecoder level. That said, the swizzle code there will go away once we stop baking color profiles into image frames at decode time. In summary, I'm not proposing to change image frame format, only the cube data format (gist above). Point of this patch was to explore the issues. I'm also fine with leaving things as is, and when I create cube data in qcms for Skia use, I use (z,y,x) sampling, BGRA order. If that's still your preference, then let's do that. (good questions btw, thanks for those).
The current patch is orthogonal to the above, so we can go ahead and submit. New patch uploaded. r+?
On 2015/01/31 04:29:26, noel gordon wrote: > The current patch is orthogonal to the above, so we can go ahead and submit. > New patch uploaded. r+? Sure, LGTM, you can submit this code if you want.
The CQ bit was checked by noel@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/863233003/40001
The CQ bit was unchecked by commit-bot@chromium.org
No LGTM from a valid reviewer yet. Only full committers are accepted. Even if an LGTM may have been provided, it was from a non-committer, _not_ a full super star committer. See http://www.chromium.org/getting-involved/become-a-committer Note that this has nothing to do with OWNERS files.
The CQ bit was checked by reed@google.com
lgtm
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/863233003/40001
Message was sent while issue was closed.
Committed patchset #3 (id:40001)
Message was sent while issue was closed.
Patchset 3 (id:??) landed as https://crrev.com/5bc47112d6be524ab82b8f9afc0b67a832e32380 Cr-Commit-Position: refs/heads/master@{#314425}
Message was sent while issue was closed.
On 2015/02/03 23:01:38, I haz the power (commit-bot) wrote: > In summary, I'm not proposing to change image frame format, only the cube data > format (gist above). Point of this patch was to explore the issues. I'm also > fine with leaving things as is, and when I create cube data in qcms for Skia > use, I use (z,y,x) sampling, BGRA order. If that's still your preference, then > let's do that. ... I guess I will just have to assume the this backwards sampling is skia-team's preference. I note it's not clear from the code comments in SkColorCubeFilter{.cpp,.h} that the expected sampling order is (z,y,x). Anyho ...
Message was sent while issue was closed.
On 2015/02/04 01:38:30, noel gordon wrote: > On 2015/02/03 23:01:38, I haz the power (commit-bot) wrote: > > > In summary, I'm not proposing to change image frame format, only the cube data > > format (gist above). Point of this patch was to explore the issues. I'm also > > fine with leaving things as is, and when I create cube data in qcms for Skia > > use, I use (z,y,x) sampling, BGRA order. If that's still your preference, then > > let's do that. > > ... I guess I will just have to assume the this backwards sampling is > skia-team's preference. I note it's not clear from the code comments in > SkColorCubeFilter{.cpp,.h} that the expected sampling order is (z,y,x). Anyho > ... Hmmm, my approval was just perfunctory so it could land. I have not considered that convention one way or the other.
Message was sent while issue was closed.
On 2015/02/04 01:43:15, reed1 wrote: > On 2015/02/04 01:38:30, noel gordon wrote: > > ... I guess I will just have to assume the this backwards sampling is > > skia-team's preference. I note it's not clear from the code comments in > > SkColorCubeFilter{.cpp,.h} that the expected sampling order is (z,y,x). Anyho > > ... > > Hmmm, my approval was just perfunctory so it could land. I have not considered > that convention one way or the other. For now, when I create color cube data in qcms for Skia use, I will use the (z,y,x) sampling, BGRA order that SkColorCubeFIlter currently expects. Want to change that to industry-standard ordering in future? https://gist.github.com/anonymous/1d62b4ead69bbebb5814 and the 3- or 4-sided patch dance, and skia gm test rebaselines, etc, etc. /me moving on the qcms patches that depended on the ordering decision. |