Issue 1634763002: Multithreaded dng host implementation

ebrauer

Description was changed from ========== WIP: This is how I like to do the multithreaded ...

4 years, 11 months ago (2016-01-25 18:23:19 UTC) #1

ebrauer

Description was changed from ========== WIP: This is how I like to do the multithreaded ...

4 years, 11 months ago (2016-01-25 18:25:21 UTC) #2

ebrauer

ebrauer@google.com changed reviewers: + msarett@google.com, scroggo@google.com

4 years, 11 months ago (2016-01-25 18:25:21 UTC) #3

ebrauer

ebrauer@google.com changed reviewers: + djsollen@google.com

4 years, 11 months ago (2016-01-25 19:38:40 UTC) #5

mtklein

mtklein@google.com changed reviewers: + mtklein@google.com

4 years, 11 months ago (2016-01-25 19:50:58 UTC) #7

mtklein

Are you sure this is a situation that's best handled by multithreading, as opposed to ...

4 years, 11 months ago (2016-01-25 19:50:59 UTC) #8

ebrauer

On 2016/01/25 19:50:59, mtklein wrote: > Are you sure this is a situation that's best ...

4 years, 11 months ago (2016-01-25 20:15:43 UTC) #9

mtklein

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp File src/codec/SkRawCodec.cpp (right): https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp#newcode89 src/codec/SkRawCodec.cpp:89: threadPool.add(func); Something like, threadPool.add([&task, this, taskIndex, threadArea, tileSize] { ...

4 years, 11 months ago (2016-01-25 21:02:01 UTC) #10

msarett

"> Are you sure this is a situation that's best handled by multithreading, as > ...

4 years, 11 months ago (2016-01-25 21:49:00 UTC) #11

adaubert

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp File src/codec/SkRawCodec.cpp (right): https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp#newcode306 src/codec/SkRawCodec.cpp:306: SkAutoTDelete<SkDngHost> host(fHost.release()); This change is not necessary. https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp#newcode399 src/codec/SkRawCodec.cpp:399: ...

4 years, 11 months ago (2016-01-26 09:26:38 UTC) #12

ebrauer

On 2016/01/25 21:49:00, msarett wrote: > "> Are you sure this is a situation that's ...

4 years, 11 months ago (2016-01-26 09:50:48 UTC) #13

ebrauer

4 years, 11 months ago (2016-01-26 09:53:08 UTC) #14

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cpp
File src/codec/SkRawCodec.cpp (right):

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:71: SkTaskGroup threadPool;
On 2016/01/25 19:50:59, mtklein wrote:
> This isn't really an independent thread pool; it represents a group of tasks
> that are multiplexed onto a global threadpool.  Creating it does not create a
> thread pool, nor does destroying it destroy any threads.
> 
> You should be fine to break your problem up into as many small parts as
> necessary.  It'll usually perform better if you add more tasks to the task
> group than there are cpu threads, to allow for uneven task runtime.

Changed the name. The dng_task has a maximum we must not exceed.
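
To illustrate the point about adding more tasks than there are CPU threads, here is a minimal sketch. It does not use Skia's SkTaskGroup; `run_tasks` and `parallel_sum` are hypothetical stand-ins built on `std::thread`, and the work-claiming scheme is an assumption chosen only to show why small tasks even out uneven runtimes.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a task group (NOT Skia's SkTaskGroup): runs all
// tasks on at most numThreads worker threads and returns when they finish.
inline void run_tasks(const std::vector<std::function<void()>>& tasks,
                      unsigned numThreads) {
    assert(numThreads > 0);
    std::atomic<std::size_t> next{0};
    auto worker = [&] {
        // Each worker claims the next unclaimed task, so a worker that drew a
        // cheap task simply picks up another one instead of idling.
        for (std::size_t i = next.fetch_add(1); i < tasks.size();
             i = next.fetch_add(1)) {
            tasks[i]();
        }
    };
    const std::size_t count = std::min<std::size_t>(numThreads, tasks.size());
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < count; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
}

// Sums 0..n-1 by splitting the range into numTasks chunks, one task each.
inline long parallel_sum(int n, int numTasks, unsigned numThreads) {
    assert(numTasks > 0);
    std::vector<long> partial(numTasks, 0);
    std::vector<std::function<void()>> tasks;
    const int chunk = (n + numTasks - 1) / numTasks;
    for (int t = 0; t < numTasks; ++t) {
        tasks.push_back([&partial, t, chunk, n] {
            const int begin = t * chunk;
            const int end = std::min(n, begin + chunk);
            for (int i = begin; i < end; ++i) partial[t] += i;
        });
    }
    run_tasks(tasks, numThreads);
    long total = 0;
    for (long p : partial) total += p;
    return total;
}
```

Calling `parallel_sum(1000, 16, 4)` splits the work into 16 tasks for 4 threads, so no thread is stuck with one oversized chunk.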

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:87: std::function<void()> func =
std::bind(&dng_area_task::ProcessOnThread, &task,
On 2016/01/25 19:50:59, mtklein wrote:
> Generally I'd rather see a lambda for this than std::bind.
> You can usually pass them right into SkTaskGroup.add() without giving them a
> name.

Done.

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:89: threadPool.add(func);
On 2016/01/25 21:02:01, mtklein wrote:
> Something like,
> 
> threadPool.add([&task, this, taskIndex, threadArea, tileSize] {
>     task.ProcessOnThread(taskIndex, threadArea, tileSize, this->Sniffer());
> });

Done.
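
For readers comparing the two styles, a small self-contained sketch of the same call queued once with std::bind and once with a lambda. `AreaTask` and `make_calls` are hypothetical stand-ins, not the dng_sdk or Skia types; only the call shape is meant to match the suggestion above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for dng_area_task; it just records what it was asked
// to process so the two call styles can be compared.
struct AreaTask {
    std::vector<int> processed;
    void ProcessOnThread(int taskIndex, int area, int tileSize) {
        processed.push_back(taskIndex * 1000 + area + tileSize);
    }
};

// Queues the same member call twice: first via std::bind, then via a lambda.
inline std::vector<std::function<void()>> make_calls(AreaTask& task) {
    assert(task.processed.empty());
    std::vector<std::function<void()>> queue;
    const int taskIndex = 1, area = 20, tileSize = 3;
    // std::bind: the arguments are copied into the bound object.
    queue.push_back(std::bind(&AreaTask::ProcessOnThread, &task,
                              taskIndex, area, tileSize));
    // Lambda: the capture list states explicitly what is copied (the ints)
    // and what is referenced (the task).
    queue.push_back([&task, taskIndex, area, tileSize] {
        task.ProcessOnThread(taskIndex, area, tileSize);
    });
    return queue;
}
```

Both entries do the same work; the lambda makes the by-reference capture of the task visible at the call site, which matters when the closure outlives the current scope.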

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:306: SkAutoTDelete<SkDngHost> host(fHost.release());
On 2016/01/26 09:26:37, adaubert wrote:
> This change is not necessary.

Done.

https://codereview.chromium.org/1634763002/diff/20001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:399: SkAutoTDelete<SkDngHost> fHost;
On 2016/01/26 09:26:38, adaubert wrote:
> This change is not necessary.

Done.

adaubert

https://codereview.chromium.org/1634763002/diff/60001/src/codec/SkRawCodec.cpp File src/codec/SkRawCodec.cpp (right): https://codereview.chromium.org/1634763002/diff/60001/src/codec/SkRawCodec.cpp#newcode101 src/codec/SkRawCodec.cpp:101: ++taskIndex; ++taskIndex; should be in the inner scope.

4 years, 11 months ago (2016-01-26 14:20:24 UTC) #15

ebrauer

https://codereview.chromium.org/1634763002/diff/60001/src/codec/SkRawCodec.cpp File src/codec/SkRawCodec.cpp (right): https://codereview.chromium.org/1634763002/diff/60001/src/codec/SkRawCodec.cpp#newcode101 src/codec/SkRawCodec.cpp:101: ++taskIndex; On 2016/01/26 14:20:24, adaubert wrote: > ++taskIndex; should ...

4 years, 11 months ago (2016-01-26 14:28:17 UTC) #16

msarett

4 years, 11 months ago (2016-01-26 21:34:03 UTC) #17

“We have benchmarks for Linux and manually tested on Android N with the Files
app.  In both cases we have experienced a speedup of 2x. (See CL/112784559)”

Can you link to this CL?  I’m not really sure what you’re referring to.

“I can see your concern with this approach, but the code was not dramatically
slow when we decode a smaller preview, but on the full size rendering it took a
long time.  I think the general use case is to decode a small preview to show
the image in e.g. the Files app and to render the full image if the user tries
to open the image in e.g. Fotos, for which our multithreading approach on a
single image seems to be a good option.”

I’m not sure I really understand what you’re saying here.  My understanding is
that we will always decode the preview (using Piex) when possible.  And when
it’s not possible, we will always decode the full image (using dng_sdk).  So we
cannot choose whether to decode a preview or the full image depending on the
situation?  We choose depending on if Piex supports the image?

I do agree that the multi-threading approach is more practical when we know that
we are only decoding a single image.  And that use case is also important.


In general, I’m feeling better about this, given that the dng_sdk seems to be
designed to encourage multi-threaded decodes.  Maybe this should have some
influence on how we choose to design SkCodec?

Still 32 (or 8) threads seems like a lot, if we are assuming that clients will
be multi-threading decodes at the BitmapFactory level.  Of course, I am just
speculating.  It’d be great if we could test how this impacts multiple decodes
in parallel using BitmapFactory.  Is there a way to parallelize your test using
the Files app?

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cpp
File src/codec/SkRawCodec.cpp (right):

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:44: return dng_point((areaSize.v + tileSize.v - 1) /
tileSize.v,
nit:

This is non-critical, but can we use a ceil_div() helper function?  SkGifCodec
actually already has one.  Can we share it in SkCodecPriv.h?
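
A minimal sketch of what such a helper could look like. The name `ceil_div` comes from the comment above; the body and the precondition are assumptions, not the code in SkGifCodec.

```cpp
#include <cassert>

// Rounds the quotient a / b up to the next integer.
// Assumed precondition: a >= 0 and b > 0.
inline int ceil_div(int a, int b) {
    assert(a >= 0 && b > 0);
    return (a + b - 1) / b;
}
```

With 256-pixel tiles, `ceil_div(4000, 256)` gives 16 and `ceil_div(3000, 256)` gives 12, so the last tile in each direction is only partially covered.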

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:60: if (tilesInThread.h < tilesInArea.h) {
So, when possible, we will always split into tiles horizontally (as much as
possible) before splitting into tiles vertically?

Is there a reason for this?  Or does it not matter?

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:65: ThrowProgramError("num_tiles_per_thread calculation
is wrong.");
Can we signal an error here without throwing an exception?  (Or maybe not,
because this is used in overriding a function in the dng_sdk?)

If it's code that will never be reached, you might just omit the else case?

Or maybe more Skia-like:
SkASSERT(false);
return {0,0};

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:76: const int maxThreads =
static_cast<int>(kMaxMPThreads);
So this is 32 on 64-bit systems and 8 on 32-bit systems?

Can you explain why we need to get this value from the dng_sdk?

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:79: const dng_point tileSize(task.FindTileSize(area));
Can you add a comment explaining how the dng_sdk determines the tile size?

Side note: It would be useful for me to know some typical values of "area" and
"tileSize" (and how those compare to the image size).  I've looked at this
function in dng_sdk and find it a little confusing.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:108: uint32 PerformAreaTaskThreads() override {
I'm not sure how this is used, but the comments in dng_host.h make it seem like
this should return the actual number of threads (rather than the max).

ebrauer

4 years, 11 months ago (2016-01-26 22:26:53 UTC) #18

On 2016/01/26 21:34:03, msarett wrote:
> “We have benchmarks for Linux and manually tested on Android N with the Files
> app.  In both cases we have experienced a speedup of 2x. (See CL/112784559)”
> 
> Can you link to this CL?  I’m not really sure what you’re referring to.
I was referring to this cl: https://critique.corp.google.com/#review/112784559
> 
> “I can see your concern with this approach, but the code was not dramatically
> slow when we decode a smaller preview, but on the full size rendering it took
> a long time.  I think the general use case is to decode a small preview to show
> the image in e.g. the Files app and to render the full image if the user tries
> to open the image in e.g. Fotos, for which our multithreading approach on a
> single image seems to be a good option.”
> 
> I’m not sure I really understand what you’re saying here.  My understanding is
> that we will always decode the preview (using Piex) when possible.  And when
> it’s not possible, we will always decode the full image (using dng_sdk).  So
> we cannot choose whether to decode a preview or the full image depending on the
> situation?  We choose depending on if Piex supports the image?
I just wanted to point out that DNGs from Android do not contain preview images,
so yes, in that case we use the dng_sdk.
However, it is possible to render a smaller image instead of the full-size one,
which is faster.
> 
> I do agree that the multi-threading approach is more practical when we know
> that we are only decoding a single image.  And that use case is also important.
> 
> 
> In general, I’m feeling better about this, given that the dng_sdk seems to be
> designed to encourage multi-threaded decodes.  Maybe this should have some
> influence on how we choose to design SkCodec?
Basically, the SkTaskGroup and its thread pool could also influence the number
of parallel tasks.
> 
> Still 32 (or 8) threads seems like a lot, if we are assuming that clients will
> be multi-threading decodes at the BitmapFactory level.  Of course, I am just
> speculating.  It’d be great if we could test how this impacts multiple decodes
> in parallel using BitmapFactory.  Is there a way to parallelize your test
> using the Files app?
Let's see. I can't work on this from home.
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cpp
> File src/codec/SkRawCodec.cpp (right):
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:44: return dng_point((areaSize.v + tileSize.v - 1) /
> tileSize.v,
> nit:
> 
> This is non-critical, but can we use a ceil_div() helper function?  SkGifCodec
> actually already has one.  Can we share it in SkCodecPriv.h?
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:60: if (tilesInThread.h < tilesInArea.h) {
> So, when possible, we will always split into tiles horizontally (as much as
> possible) before splitting into tiles vertically?
Yes, that is the goal here.
> 
> Is there a reason for this?  Or does it not matter?
The reason was better locality of reference, which mboehme asked for in the CL I
referenced above.
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:65: ThrowProgramError("num_tiles_per_thread calculation
> is wrong.");
> Can we signal an error here without throwing an exception?  (Or maybe not,
> because this is used in overriding a function in the dng_sdk?)
> 
> If it's code that will never be reached, you might just omit the else case?
> 
> Or maybe more Skia-like:
> SkASSERT(false);
> return {0,0};
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:76: const int maxThreads =
> static_cast<int>(kMaxMPThreads);
> So this is 32 on 64-bit systems and 8 on 32-bit systems?
> 
> Can you explain why we need to get this value from the dng_sdk?
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:79: const dng_point tileSize(task.FindTileSize(area));
> Can you add a comment explaining how the dng_sdk determines the tile size?
> 
> Side note: It would be useful for me to know some typical values of "area" and
> "tileSize" (and how those compare to the image size).  I've looked at this
> function in dng_sdk and find it a little confusing.
> 
> https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
> src/codec/SkRawCodec.cpp:108: uint32 PerformAreaTaskThreads() override {
> I'm not sure how this is used, but the comments in dng_host.h make it seem like
> this should return the actual number of threads (rather than the max).

ebrauer

4 years, 11 months ago (2016-01-26 22:50:59 UTC) #19

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cpp
File src/codec/SkRawCodec.cpp (right):

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:44: return dng_point((areaSize.v + tileSize.v - 1) /
tileSize.v,
On 2016/01/26 21:34:03, msarett wrote:
> nit:
> 
> This is non-critical, but can we use a ceil_div() helper function?  SkGifCodec
> actually already has one.  Can we share it in SkCodecPriv.h?

Would it be Ok to add a Fixit?

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:60: if (tilesInThread.h < tilesInArea.h) {
On 2016/01/26 21:34:03, msarett wrote:
> So, when possible, we will always split into tiles horizontally (as much as
> possible) before splitting into tiles vertically?
> 
> Is there a reason for this?  Or does it not matter?

I think it's preferable to make them as wide as possible -- i.e. to increase
tiles_in_thread.h as long as possible while keeping tiles_in_thread.v at 1 --
because it gives better locality of reference.
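
The widen-first strategy described here can be sketched roughly as follows. This is a reconstruction from the discussion, not the patch itself: `TileCount` is a hypothetical stand-in for dng_point, the function names are illustrative, and `std::logic_error` replaces the dng_sdk's ThrowProgramError so the example stays self-contained.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for dng_point: v = vertical count, h = horizontal count.
struct TileCount {
    int v;
    int h;
};

// Number of tasks needed when each task covers a block of tilesInTask tiles.
inline int num_tasks_required(TileCount tilesInTask, TileCount tilesInArea) {
    assert(tilesInTask.v > 0 && tilesInTask.h > 0);
    return ((tilesInArea.v + tilesInTask.v - 1) / tilesInTask.v) *
           ((tilesInArea.h + tilesInTask.h - 1) / tilesInTask.h);
}

// Grows the per-task block until at most maxTasks tasks are needed. The block
// is widened first and only made taller once it spans the full width, so each
// task touches long horizontal runs of tiles.
inline TileCount num_tiles_per_task(int maxTasks, TileCount tilesInArea) {
    TileCount tilesInTask = {1, 1};
    while (num_tasks_required(tilesInTask, tilesInArea) > maxTasks) {
        if (tilesInTask.h < tilesInArea.h) {
            ++tilesInTask.h;
        } else if (tilesInTask.v < tilesInArea.v) {
            ++tilesInTask.v;
        } else {
            // A single block already covers the whole area, so this is only
            // reachable when maxTasks < 1.
            throw std::logic_error("num_tiles_per_task: maxTasks must be >= 1");
        }
    }
    return tilesInTask;
}
```

For a grid of 16 tiles across and 12 tiles down (roughly a 4000x3000 area with 256x256 tiles, assuming 4000 is the width), a limit of 32 tasks yields blocks of 8x1 tiles, while a limit of 8 tasks yields blocks that span the full width and are 2 tiles tall.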

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:65: ThrowProgramError("num_tiles_per_thread calculation
is wrong.");
On 2016/01/26 21:34:03, msarett wrote:
> Can we signal an error here without throwing an exception?  (Or maybe not,
> because this is used in overriding a function in the dng_sdk?)
> 
> If it's code that will never be reached, you might just omit the else case?
> 
> Or maybe more Skia-like:
> SkASSERT(false);
> return {0,0};
It might be reached on program errors or invalid data. This prevents the
function from producing erroneous results if the caller should ever pass in
max_threads == 0. I think I like the assert.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:76: const int maxThreads =
static_cast<int>(kMaxMPThreads);
On 2016/01/26 21:34:03, msarett wrote:
> So this is 32 on 64-bit systems and 8 on 32-bit systems?
> 
> Can you explain why we need to get this value from the dng_sdk?

I think we should use task.MaxThreads(), because the task that's passed in to
PerformAreaTask() may request to be processed on a smaller number of threads,
though it seems to be either 8 or 32.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:79: const dng_point tileSize(task.FindTileSize(area));
On 2016/01/26 21:34:03, msarett wrote:
> Can you add a comment explaining how the dng_sdk determines the tile size?
> 
> Side note: It would be useful for me to know some typical values of "area" and
> "tileSize" (and how those compare to the image size).  I've looked at this
> function in dng_sdk and find it a little confusing.

area is typically something like (4000, 3000), while tileSize is (256, 256). In
every case I have seen, tileSize was equal to maxTileSize in the dng_sdk.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:108: uint32 PerformAreaTaskThreads() override {
On 2016/01/26 21:34:03, msarett wrote:
> I'm not sure how this is used, but the comments in dng_host.h make it seem like
> this should return the actual number of threads (rather than the max).

The documentation for PerformAreaTaskThreads() isn't completely clear, but my
understanding is that it's supposed to return the maximum number of threads that
PerformAreaTask() is potentially able to use. (This also makes sense given that
PerformAreaTaskThreads() is used in multiple places to check whether
multithreading is possible or not.)

adaubert

The CQ bit was checked by adaubert@google.com to run a CQ dry run

4 years, 11 months ago (2016-01-27 13:25:01 UTC) #20

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1634763002/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1634763002/120001

4 years, 11 months ago (2016-01-27 13:25:07 UTC) #21

commit-bot: I haz the power

Note for Reviewers: The CQ is waiting for an approval. If you believe that the ...

4 years, 11 months ago (2016-01-27 13:25:08 UTC) #22

msarett

4 years, 11 months ago (2016-01-27 15:34:43 UTC) #23

Just a few nits.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cpp
File src/codec/SkRawCodec.cpp (right):

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:60: if (tilesInThread.h < tilesInArea.h) {
On 2016/01/26 22:50:58, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > So, when possible, we will always split into tiles horizontally (as much as
> > possible) before splitting into tiles vertically?
> > 
> > Is there a reason for this?  Or does it not matter?
> 
> I think it's preferable to make them as wide as possible -- i.e. to increase
> tiles_in_thread.h as long as possible while keeping tiles_in_thread.v at 1 --
> because it gives better locality of reference.

Great!  Can you add a comment explaining this?

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:65: ThrowProgramError("num_tiles_per_thread calculation
is wrong.");
On 2016/01/26 22:50:59, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > Can we signal an error here without throwing an exception?  (Or maybe not,
> > because this is used in overriding a function in the dng_sdk?)
> > 
> > If it's code that will never be reached, you might just omit the else case?
> > 
> > Or maybe more Skia-like:
> > SkASSERT(false);
> > return {0,0};
> It might be reached on program errors or invalid data. This prevents the
> function from producing erroneous results if the caller should ever pass in
> max_threads == 0. I think I like the assert.

Sorry for being unclear here.

If reaching this code indicates programmer error, then let's use SkASSERT.  This
will assert in Debug mode and do nothing in Release mode.

If reaching this code might indicate invalid input data, we need to indicate
that the decode failed.  AFAICT, the only way to do this is using an exception? 
That we will catch in SkRawCodec and return a failure?
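
The distinction drawn here, an assert for programmer error versus an exception for bad input that is caught at the codec boundary, can be sketched like this. All names (`Result`, `rows_per_strip`, `decode`) are hypothetical and unrelated to the real SkRawCodec or dng_sdk interfaces; plain `assert` stands in for SkASSERT.

```cpp
#include <cassert>
#include <stdexcept>

enum class Result { kSuccess, kInvalidInput };

// Hypothetical decode step: invalid input data is reported by throwing,
// because the check must also hold in release builds.
inline int rows_per_strip(int imageHeight, int stripCount) {
    if (stripCount <= 0) {
        throw std::runtime_error("invalid strip count in input data");
    }
    return (imageHeight + stripCount - 1) / stripCount;
}

// Hypothetical codec boundary: converts the exception into a failure result
// so callers never see the exception.
inline Result decode(int imageHeight, int stripCount, int* rowsOut) {
    // Programmer error: checked in debug builds only, like SkASSERT.
    assert(rowsOut != nullptr);
    try {
        *rowsOut = rows_per_strip(imageHeight, stripCount);
        return Result::kSuccess;
    } catch (const std::exception&) {
        return Result::kInvalidInput;
    }
}
```

The assert disappears when NDEBUG is defined, which is why it is only suitable for conditions that correct calling code can never trigger.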

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:76: const int maxThreads =
static_cast<int>(kMaxMPThreads);
On 2016/01/26 22:50:59, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > So this is 32 on 64-bit systems and 8 on 32-bit systems?
> > 
> > Can you explain why we need to get this value from the dng_sdk?
> 
> I think we should use task.MaxThreads(), because the task that's passed in to
> PerformAreaTask() may request to be processed on a smaller number of threads,
> though it seems to be either 8 or 32.

sgtm.  Thanks for the comment.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:79: const dng_point tileSize(task.FindTileSize(area));
On 2016/01/26 22:50:58, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > Can you add a comment explaining how the dng_sdk determines the tile size?
> > 
> > Side note: It would be useful for me to know some typical values of "area"
> > and "tileSize" (and how those compare to the image size).  I've looked at this
> > function in dng_sdk and find it a little confusing.
> 
> area is typically something like (4000, 3000) while tileSize is (256, 256). I
> have always seen tileSize to be that which is maxTileSize in the dng sdk.

Can you add a comment that is something like this?

// tileSize is typically 256x256.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:108: uint32 PerformAreaTaskThreads() override {
On 2016/01/26 22:50:58, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > I'm not sure how this is used, but the comments in dng_host.h make it seem
> > like this should return the actual number of threads (rather than the max).
> 
> The documentation for PerformAreaTaskThreads() isn't completely clear, but my
> understanding is that it's supposed to return the maximum number of threads
> that PerformAreaTask() is potentially able to use. (This also makes sense given
> that PerformAreaTaskThreads() is used in multiple places to check whether
> multithreading is possible or not.)

Yeah I noticed that they mostly compare it to 1, so we definitely need to return
a value greater than 1.

They also seem to use it to determine the size of the AreaTask.

I don't have insight into how this might matter, but I think the code looks fine
as is.

ebrauer

4 years, 11 months ago (2016-01-27 15:54:37 UTC) #24

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cpp
File src/codec/SkRawCodec.cpp (right):

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:44: return dng_point((areaSize.v + tileSize.v - 1) /
tileSize.v,
On 2016/01/26 22:50:59, ebrauer wrote:
> On 2016/01/26 21:34:03, msarett wrote:
> > nit:
> > 
> > This is non-critical, but can we use a ceil_div() helper function?
> > SkGifCodec actually already has one.  Can we share it in SkCodecPriv.h?
> 
> Would it be Ok to add a Fixit?

Done.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:60: if (tilesInThread.h < tilesInArea.h) {
On 2016/01/27 15:34:42, msarett wrote:
> On 2016/01/26 22:50:58, ebrauer wrote:
> > On 2016/01/26 21:34:03, msarett wrote:
> > > So, when possible, we will always split into tiles horizontally (as much
> > > as possible) before splitting into tiles vertically?
> > > 
> > > Is there a reason for this?  Or does it not matter?
> > 
> > I think it's preferable to make them as wide as possible -- i.e. to increase
> > tiles_in_thread.h as long as possible while keeping tiles_in_thread.v at 1 --
> > because it gives better locality of reference.
> 
> Great!  Can you add a comment explaining this?

Done.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:65: ThrowProgramError("num_tiles_per_thread calculation
is wrong.");
On 2016/01/27 15:34:42, msarett wrote:
> On 2016/01/26 22:50:59, ebrauer wrote:
> > On 2016/01/26 21:34:03, msarett wrote:
> > > Can we signal an error here without throwing an exception?  (Or maybe not,
> > > because this is used in overriding a function in the dng_sdk?)
> > > 
> > > If it's code that will never be reached, you might just omit the else
> > > case?
> > > 
> > > Or maybe more Skia-like:
> > > SkASSERT(false);
> > > return {0,0};
> > It might be reached on program errors or invalid data. This prevents the
> > function from producing erroneous results if the caller should ever pass in
> > max_threads == 0. I think I like the assert.
> 
> Sorry for being unclear here.
> 
> If reaching this code indicates programmer error, then let's use SkASSERT.
> This will assert in Debug mode and do nothing in Release mode.
> 
> If reaching this code might indicate invalid input data, we need to indicate
> that the decode failed.  AFAICT, the only way to do this is using an
> exception?  That we will catch in SkRawCodec and return a failure?

Discussed this with Anton and we agreed that it is better to be safe here and
throw.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:76: const int maxThreads =
static_cast<int>(kMaxMPThreads);
On 2016/01/27 15:34:42, msarett wrote:
> On 2016/01/26 22:50:59, ebrauer wrote:
> > On 2016/01/26 21:34:03, msarett wrote:
> > > So this is 32 on 64-bit systems and 8 on 32-bit systems?
> > > 
> > > Can you explain why we need to get this value from the dng_sdk?
> > 
> > I think we should use task.MaxThreads(), because the task that's passed in to
> > PerformAreaTask() may request to be processed on a smaller number of
> > threads, though it seems to be either 8 or 32.
> 
> sgtm.  Thanks for the comment.

Done.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:79: const dng_point tileSize(task.FindTileSize(area));
On 2016/01/27 15:34:42, msarett wrote:
> On 2016/01/26 22:50:58, ebrauer wrote:
> > On 2016/01/26 21:34:03, msarett wrote:
> > > Can you add a comment explaining how the dng_sdk determines the tile size?
> > > 
> > > Side note: It would be useful for me to know some typical values of "area"
> > > and "tileSize" (and how those compare to the image size).  I've looked at
> > > this function in dng_sdk and find it a little confusing.
> > 
> > area is typically something like (4000, 3000) while tileSize is (256, 256).
> > I have always seen tileSize to be that which is maxTileSize in the dng sdk.
> 
> Can you add a comment that is something like this?
> 
> // tileSize is typically 256x256.

Done.

https://codereview.chromium.org/1634763002/diff/80001/src/codec/SkRawCodec.cp...
src/codec/SkRawCodec.cpp:108: uint32 PerformAreaTaskThreads() override {
On 2016/01/27 15:34:42, msarett wrote:
> On 2016/01/26 22:50:58, ebrauer wrote:
> > On 2016/01/26 21:34:03, msarett wrote:
> > > I'm not sure how this is used, but the comments in dng_host.h make it seem
> > > like this should return the actual number of threads (rather than the max).
> > 
> > The documentation for PerformAreaTaskThreads() isn't completely clear, but
> > my understanding is that it's supposed to return the maximum number of
> > threads that PerformAreaTask() is potentially able to use. (This also makes
> > sense given that PerformAreaTaskThreads() is used in multiple places to
> > check whether multithreading is possible or not.)
> 
> Yeah I noticed that they mostly compare it to 1, so we definitely need to
> return a value greater than 1.
> 
> They also seem to use it to determine the size of the AreaTask.
> 
> I don't have insight into how this might matter, but I think the code looks
> fine as is.

Acknowledged.

ebrauer

Description was changed from ========== WIP: This is how I like to do the multithreaded ...

4 years, 11 months ago (2016-01-27 16:01:03 UTC) #25

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1634763002/160001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1634763002/160001

4 years, 11 months ago (2016-01-27 16:04:56 UTC) #28

commit-bot: I haz the power

Description was changed from ========== It derives the dng_host and overrides PerformAreaTask() to split the ...

4 years, 11 months ago (2016-01-27 16:21:05 UTC) #29

commit-bot: I haz the power

4 years, 11 months ago (2016-01-27 16:21:06 UTC) #30

Message was sent while issue was closed.

Committed patchset #9 (id:160001) as
https://skia.googlesource.com/skia/+/b84b5b42c1d848aa2b87216b9bda2b8c9e5781c1

Issue 1634763002: Multithreaded dng host implementation (Closed)

Description

Patch Set 1 #

Patch Set 2 : Fixes the tile organization. #

Patch Set 3 : Uses lambda instead of bind now. #

Patch Set 4 : Simplifies the thread area calculation. #

Patch Set 5 : Addressed comment from adaubert. #

Patch Set 6 : Addressed a couple of comments. #

Patch Set 7 : Differentiates between internal sub-tasks and threads. #

Patch Set 8 : Synced with internal code. #

Patch Set 9 : Resolves comments #