Description
make LUT calculation 3x faster
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel
Committed: https://crrev.com/924d0f75f79193e850ffe9b62849767f14ebab5d
Cr-Commit-Position: refs/heads/master@{#421946}
Patch Set 1
Total comments: 3
Messages
Total messages: 19 (8 generated)
Description was changed from ========== make LUT calculation 3x faster ========== to ========== make LUT calculation 3x faster CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ==========
The CQ bit was checked by hubbe@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
hubbe@chromium.org changed reviewers: + danakj@chromium.org
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Ping?
https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.cc
File cc/output/color_lut_cache.cc (right):

https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.c...
cc/output/color_lut_cache.cc:59: samples[u].set_x(y * inverse);
Wait.. this is setting samples[0] once for each value of u, once for each value of v, once for each value of u. But only keeping the value for y=v=lut_samples-1.

This seems like you can make this exponentially faster?
https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.cc
File cc/output/color_lut_cache.cc (right):

https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.c...
cc/output/color_lut_cache.cc:59: samples[u].set_x(y * inverse);
On 2016/09/29 20:09:08, danakj wrote:
> Wait.. this is setting samples[0] once for each value of u, once for each value
> of v, once for each value of u. But only keeping the value for
> y=v=lut_samples-1.
>
> This seems like you can make this exponentially faster?

It's not *keeping* any of the values.
We iterate over all y, u, v values, but batch-transform one "row" of values and save the output into lutp.

I could calculate y * inverse and u * inverse in an outer loop, but I assume that the compiler is smart enough to do that, and even if it is not, I don't think it would save much time.
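For readers without the diff open, the row-at-a-time loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in color_lut_cache.cc: `Point3`, `TransformBatch`, and `BuildLut` are hypothetical names, the loop order (v outermost, u innermost) is an assumption based on `samples[u]` being indexed by u, and the "transform" is a placeholder that just swaps x and z. It assumes `lut_samples >= 2`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical point type; the real code uses its own point class.
struct Point3 {
  float x, y, z;
};

// Hypothetical stand-in for the real color transform. It converts a batch
// of points in place; here it only swaps x and z so the effect is visible.
void TransformBatch(Point3* points, std::size_t count) {
  for (std::size_t i = 0; i < count; ++i) {
    float tmp = points[i].x;
    points[i].x = points[i].z;
    points[i].z = tmp;
  }
}

// Builds a lut_samples^3 table (3 floats per entry), one row at a time.
// Assumes lut_samples >= 2.
std::vector<float> BuildLut(int lut_samples) {
  const float inverse = 1.0f / (lut_samples - 1);
  const std::size_t n = static_cast<std::size_t>(lut_samples);
  std::vector<float> lut;
  lut.reserve(n * n * n * 3);
  // Scratch buffer holding exactly one row; it is overwritten for every
  // (v, y) pair, so no earlier row is "kept" in it.
  std::vector<Point3> samples(n);
  for (int v = 0; v < lut_samples; ++v) {
    for (int y = 0; y < lut_samples; ++y) {
      for (int u = 0; u < lut_samples; ++u) {
        samples[u] = {y * inverse, u * inverse, v * inverse};
      }
      // One batched transform call per row, then copy the row out.
      TransformBatch(samples.data(), samples.size());
      for (const Point3& p : samples) {
        lut.push_back(p.x);
        lut.push_back(p.y);
        lut.push_back(p.z);
      }
    }
  }
  return lut;
}
```

Calling `BuildLut(17)` would produce 17 * 17 * 17 * 3 floats. The scratch buffer is reused for each row, which is why only the final row's inputs remain in `samples` after the loops finish.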
I don't know that this makes calculating 3x faster.. it would use 3x less memory, but memory allocation is not O(# of bytes) really is it?

https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.cc
File cc/output/color_lut_cache.cc (right):

https://codereview.chromium.org/2371413002/diff/1/cc/output/color_lut_cache.c...
cc/output/color_lut_cache.cc:59: samples[u].set_x(y * inverse);
On 2016/09/29 20:31:52, hubbe wrote:
> It's not *keeping* any of the values.
> We iterate over all y,u,v, values, but batch-transform one "row" of values and
> save the output into lutp.

Ah, the transform() happens for each y and v also, I see.

> I could calculate y * inverse and u * inverse in an outer loop, but I assume
> that the compiler is smart enough to do that, and even if it is not, it wouldn't
> save that much time I think.
LGTM anyway
On 2016/09/29 20:38:07, danakj wrote:
> I don't know that this makes calculating 3x faster.. it would use 3x less
> memory, but memory allocation is not O(# of bytes) really is it?

Most of the time is spent in transform(), which is called with data.size()....
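One possible reading of this exchange, since the diff itself is not shown here: if the batched transform processes every element of the buffer it is handed, then the total work is (number of rows) x (buffer size), and an oversized buffer costs time as well as memory. The sketch below only illustrates that arithmetic. `TransformAll` and `CountTransformedPoints` are hypothetical names, and the 3x-oversized buffer is an assumption inferred from the "3x less memory" remark.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical batch transform: touches every component it is handed and
// returns how many points (xyz triples) it processed.
std::size_t TransformAll(std::vector<float>& points_xyz) {
  for (float& c : points_xyz) c = 1.0f - c;  // placeholder per-component work
  return points_xyz.size() / 3;
}

// Counts how many points get transformed while filling a LUT row by row,
// when the scratch buffer holds buffer_points points per row.
std::size_t CountTransformedPoints(int lut_samples, std::size_t buffer_points) {
  std::vector<float> buffer(buffer_points * 3, 0.0f);
  std::size_t total = 0;
  for (int v = 0; v < lut_samples; ++v) {
    for (int y = 0; y < lut_samples; ++y) {
      // The transform is sized by the buffer, not by lut_samples.
      total += TransformAll(buffer);
    }
  }
  return total;
}
```

With `lut_samples = 4`, a buffer of 4 points gives 4 * 4 * 4 = 64 transformed points, while a buffer of 12 points gives 192, three times the work for the same table.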
The CQ bit was checked by hubbe@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On Thu, Sep 29, 2016 at 1:39 PM, <hubbe@chromium.org> wrote:
> Most of the time is spent in transform(), which is called with
> data.size()....

oooh.. hehe woops. :) Cool! Maybe it should use lut_samples, then the compiler might optimize that better.

https://codereview.chromium.org/2371413002/
Message was sent while issue was closed.
Committed patchset #1 (id:1)
Description was changed from ========== make LUT calculation 3x faster CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ========== to ========== make LUT calculation 3x faster CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel Committed: https://crrev.com/924d0f75f79193e850ffe9b62849767f14ebab5d Cr-Commit-Position: refs/heads/master@{#421946} ==========
Patchset 1 (id:??) landed as https://crrev.com/924d0f75f79193e850ffe9b62849767f14ebab5d Cr-Commit-Position: refs/heads/master@{#421946}