Created: 4 years, 9 months ago by prashant.n
Modified: 4 years, 8 months ago
CC: blink-worker-reviews_chromium.org, chromium-reviews, darin-cc_chromium.org, jam, kinuko+watch, mkwst+moarreviews-renderer_chromium.org, mlamouri+watch-content_chromium.org
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Target Ref: refs/pending/heads/master
Project: chromium
Visibility: Public.
Description

content: Implement dynamic priorities for raster threads.
With the current implementation for raster threads, suppose a background
task is being run by a background priority thread; if this task is
rescheduled as a foreground task, executing it on a raster thread with
the same low priority delays its completion.
This patch implements changing the priority of a raster thread while a
task is being executed by that thread, so that task execution can be
sped up or slowed down. The raster thread's priority is changed if an
already running task is rescheduled with a new category. This patch
handles speeding up a background task that is rescheduled as a
foreground task; the implementation is supported only on Android for now.
This patch implements
- N static priority threads having normal priority,
- 1 dynamic priority thread having background priority.
BUG=
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel
Patch Set 1 #
Patch Set 2 : Test mismatch. #
Patch Set 3 : Test delays. #
Patch Set 4 : Don't check yet. #
Patch Set 5 : #
Patch Set 6 : test this. #
Patch Set 7 : android 1F + 1B #
Patch Set 8 : preparing for checkin. #
Total comments: 2
Patch Set 9 : fixed build error. #
Patch Set 10 : modified RWP::start function. #
Patch Set 11 : nits. #
Patch Set 12 : android only. #
Patch Set 13 : stop. #
Patch Set 14 : analyse playback #
Patch Set 15 : Test reschedule task. #
Patch Set 16 : Traces corrected. #
Depends on Patchset:

Messages
Total messages: 70 (8 generated)
Description was changed from

==========
[WIP] content: Dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
==========

to

==========
[WIP] content: Dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
==========
prashant.n@samsung.com changed reviewers: + ericrk@chromium.org, reveman@chromium.org, vmpstr@chromium.org
WIP patch, should we have something like this?
The preferred sandbox used by chrome doesn't allow us to dynamically change the priority of threads. We can only lower it. Android currently doesn't have this sandbox restriction but I think that's unfortunate and it will hopefully change in the future. I'm not convinced that this is an improvement over the existing system with one low priority background thread. More importantly, I don't think we should rely on the ability to dynamically change priority of threads as that's not something that should be allowed in the sandbox.
On 2016/02/26 19:16:41, reveman wrote:
> The preferred sandbox used by chrome doesn't allow us to dynamically change the
> priority of threads. We can only lower it. Android currently doesn't have this
> sandbox restriction but I think that's unfortunate and it will hopefully change
> in the future.
>
> I'm not convinced that this is an improvement over the existing system with one
> low priority background thread. More importantly, I don't think we should rely
> on the ability to dynamically change priority of threads as that's not something
> that should be allowed in the sandbox.

Even if we could change priorities, I have some concerns:

- 1 foreground thread and a bunch of low-pri background threads is not ideal - if there's enough foreground work, we want to scale to N foreground threads ASAP. If we've queued up N background tasks, then we'd have to wait for these to complete before we could go beyond 1 foreground task.
- If a background task is blocking a foreground task that could run on the same thread, would it need to run with higher priority to get out of the way of the foreground task? Otherwise it seems like it might take even longer to get to the desired N foreground threads.
reveman@,

1. Here I'm trying to change the priority of the current thread from the thread itself. If changing the priority of a given thread from low to high or high to low is not possible, or not advisable due to sandbox restrictions, then we stop pursuing dynamic priorities for threads. But I guess we can safely change the priority of a given thread from itself. (I may be wrong.) This is the first patch, which changes the thread priority before starting execution of a task.

2. My further idea was to change the priority while a task is being executed. E.g., suppose a task is being executed by a background thread and, before execution finishes, the task is needed at high priority. In such a case, increasing the priority of the runner thread would help finish the task as quickly as possible.

ericrk@,

For every thread the categories are checked in the following sequence - only if there is no task in the given category does it move on to the next category: NONCONCURRENT_FOREGROUND, FOREGROUND, BACKGROUND.

I agree with your point: when N threads are executing background tasks and executing those tasks takes a long time, we are left with only 1 foreground task thread as per this patch. Problem 1 - cancelling the tasks would waste the effort already invested in executing the background tasks. Problem 2 - foreground tasks would have to wait for background tasks, as you explained.

Just a few thoughts: in such a case, creating a few more threads on the fly would help, and once the tasks are executed, the thread count could be brought back within a threshold limit. We could have a maximum number of threads equal to the number of cores, and the minimum could be N, as per the current logic.
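The category ordering described above (a worker drains NONCONCURRENT_FOREGROUND first, then FOREGROUND, then BACKGROUND, moving on only when the current category is empty) can be sketched as a fixed-order queue scan. This is an illustrative sketch only; the enum values and queue container here are hypothetical stand-ins, not the actual cc::TaskGraphRunner code.

```cpp
#include <array>
#include <cassert>
#include <deque>

// Category order a worker scans: it moves to the next category only when
// the current one has no tasks. Names mirror the ones in this thread.
enum TaskCategory {
  TASK_CATEGORY_NONCONCURRENT_FOREGROUND = 0,
  TASK_CATEGORY_FOREGROUND = 1,
  TASK_CATEGORY_BACKGROUND = 2,
  kNumTaskCategories = 3,
};

struct TaskQueues {
  // int stands in for a Task pointer/handle in this sketch.
  std::array<std::deque<int>, kNumTaskCategories> queues;

  // Pops the next task, preferring earlier (higher-priority) categories.
  // Returns false when no category has work.
  bool PopNextTask(int* task_out, TaskCategory* category_out) {
    for (int c = 0; c < kNumTaskCategories; ++c) {
      if (!queues[c].empty()) {
        *task_out = queues[c].front();
        queues[c].pop_front();
        *category_out = static_cast<TaskCategory>(c);
        return true;
      }
    }
    return false;
  }
};
```

With this ordering, a background task already queued is only picked up once every foreground queue is empty, which is why a backlog of background work can delay newly foregrounded tiles.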
Changing the priority of a thread from within the thread before a task is executed is not a good approach, and we cannot set thread priority from outside the thread per the sandbox restriction and the current implementation. Should we create threads on the fly and delete a thread after its task is done, when the next task is not in that category and the thread count limit is exceeded?
On 2016/03/01 at 17:03:53, prashant.n wrote: > Changing priority of thread within thread before task is getting executed is not a good approach and we cannot set thread priority from outside thread as per sandbox restriction or current implementation. > > Should we have thread creation on the fly and delete the thread after task is done and next task is not in that category and num threads limit is exceeded? Sorry, I'm failing to see the motivation for changing the code in the first place. Is there some discussion I've missed?
> Sorry, I'm failing to see the motivation for changing the code in the first
> place. Is there some discussion I've missed?

I've been referring to crbug.com/504515. My motivation to change the code was:

1. With 1 background thread we could not use CPU cores efficiently when the remaining cores are idle and background tasks are being run. For this we should have N foreground task threads (normal priority) and M background threads (background priority). This would be better than the existing implementation. Suppose N = 4 and M = 4 and the number of cores is 8; we will have many raster threads, but this will ensure efficient use of cores.

2. The above solution still leaves one problem. If there is a task t1 being executed by a background thread and, after the user has scrolled, t1 is needed at high priority, then without increasing the priority of the runner thread it cannot be finished quickly. So increasing the priority of the thread from outside the thread is quite needed.
On 2016/03/02 at 06:26:04, prashant.n wrote:
> > Sorry, I'm failing to see the motivation for changing the code in the first
> > place. Is there some discussion I've missed?
>
> I've been referring to crbug.com/504515. My motivation to change code was -
>
> 1. With 1 background thread we could not use CPU cores efficiently when remaining cores of CPU are idle and background tasks are being run.

That's intentional. We don't want to waste too much power on background tasks that are not necessarily needed.

> For this we should have N foreground task threads (normal priority) and M background threads (background priority). So this will be better than existing implementation. Suppose N = 4 and M = 4 and no. of cores is 8, we will have many raster threads, but this will ensure efficient use of cores.

I'm not sure we want efficient use of cores in these situations. For foreground work maximizing usage of cores makes sense but for background work I think we'd rather minimize power usage.

> 2. The above solution still leaves one problem. If there is a task t1 being executed by background thread and after user has scrolled, t1 is needed on high priority, then without increasing priority of runner thread, it cannot be finished quickly. So increasing priority of the thread from outside of thread is quite needed.

Yes, this problem does exist in theory but I think we should first collect some data showing that this is a problem in practice before we add more complexity to the raster-worker-pool implementation to solve it.
On 2016/03/02 15:24:22, reveman wrote:
> > 1. With 1 background thread we could not use CPU cores efficiently when
> > remaining cores of CPU are idle and background tasks are being run.
>
> That's intentional. We don't want to waste too much power on background tasks
> that are not necessarily needed.

Ok, I understand now. The tiles in the NOW bin are the only ones categorized as foreground tasks; we may be interested in SOON bin tiles too, perhaps with a runner thread priority in between normal and background.

> I'm not sure we want efficient use of cores in these situations. For foreground
> work maximizing usage of cores makes sense but for background work I think we'd
> rather minimize power usage.
>
> Yes, this problem does exist in theory but I think we should first collect some
> data showing that this is a problem in practice before we add more complexity to
> the raster-worker-pool implementation to solve it.

I've been working on it; I'll try to get valid data soon.
Description was changed from

==========
[WIP] content: Dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
==========

to

==========
[WIP] content: Dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel
==========
reveman@, patch set 2 gives data showing the problem, and the occurrence is frequent, even on simple pages like google.com. If this is justified, can I file a bug to work on this?
Check with vmpstr@ and ericrk@ who are making changes to raster work.
On 2016/03/04 20:10:34, reveman wrote:
> Check with vmpstr@ and ericrk@ who are making changes to raster work.

vmpstr@ and ericrk@, I'm interested in working on this.
On 2016/03/05 03:06:46, prashant.n wrote:
> On 2016/03/04 20:10:34, reveman wrote:
> > Check with vmpstr@ and ericrk@ who are making changes to raster work.
>
> vmpstr@ and ericrk@, I'm interested in working on this.

When you say that patch set 2 gives the data showing that the problem/occurrence is frequent, can you explain a bit more - I see the log statements you've inserted, but I'd like a bit more detail.

It seems that you are logging when low priority background tasks are not completed by the time they get bumped to high priority, indicating that we have a backlog of background work. While running more background threads would potentially alleviate this, it does so at the cost of blocking future foreground work that may come in, especially on pages with a lot of invalidations. This is an issue we've run into in the past when trying to run more threads - if we end up running 4 background threads and new foreground work comes in (due to an invalidation), these 4 background threads may starve the foreground work (and may not even be needed in the case of an invalidation).

I guess what I'm saying is that I'm not currently too concerned if we see that some background work is not always finished by the time it is bumped to foreground, unless we are also seeing drops in some metric we care about (smoothness, framerate, etc...) - we are intentionally underutilizing cores for background work to ensure foreground work does not get blocked...

Do you have specific test sites where we see a drop in smoothness or frame-rate due to under-utilization of background threads? I took a quick look at telemetry and didn't see any significant changes when this CL landed, but we could have definitely missed something.
On 2016/03/07 18:51:56, ericrk wrote:
> When you say that patchset2 gives the data showing that the problem/occurrence
> is frequent, can you explain a bit more - I see the log statements you've
> inserted, but I'd like a bit more detail.

Yes. The logs appear when running tasks get rescheduled with a different priority. Most of the time we see low priority tasks being rescheduled at high priority.

> It seems that you are logging when low priority background tasks are not
> completed by the time they get bumped to high-priority, indicating that we
> have a backlog of background work. While running more background threads would
> potentially alleviate this, it does so at the cost of blocking future foreground
> work that may come in, especially on pages with a lot of invalidations. This is
> an issue we've run into in the past when trying to run more threads - if we end
> up running 4 background threads and new foreground work comes in (due to an
> invalidation), these 4 background threads may starve the foreground work (and
> may not even be needed in the case of an invalidation).

Please ignore the code implemented in patch set 1; it does not implement what I intend.

> I guess what I'm saying is that I'm not currently too concerned if we see that
> some background work is not always finished by the time it is bumped to
> foreground, unless we are also seeing drops in some metric we care about
> (smoothness, framerate, etc...) - we are intentionally underutilizing cores for
> background work to ensure foreground work does not get blocked...

I can see that some background raster tasks take a lot of time, up to 35 ms. I'll check how much time a running task whose priority is changed takes, which will clarify the intent.

> Do you have specific test sites where we see a drop in smoothness or frame-rate
> due to under-utilization of background threads? I took a quick look at telemetry
> and didn't see any significant changes when this CL landed, but we could have
> definitely missed something.

I've not checked any specific site, but my previous analysis of fast pinch zoom led me to this, and google.com also shows the problem. Maybe I'll check the actual times by which raster tasks are delayed, which would give us a better picture.
1. Data for two sites (contents zoomed and flung from top to bottom). It is observed that background priority threads take longer to finish tasks (data shown only for those tasks which are rescheduled with a different priority), which could have been finished earlier if the threads had normal priority. In the logs below, entries ending in "DELAYED TASK ****" are our point of interest.

2. For log entries ending with "WASTED" we need to devise some other solution - something like cancellation.

google.com

PRAS::RasterTask [0x81cbeab0], Thread Priority = background, Total Execution Time = 11.7 ms, Old Cat = background, old_cat_time = 6.38 ms, New Cat = foreground, new_cat_time = 5.34 ms DELAYED TASK ****
PRAS::RasterTask [0x812ebda8], Thread Priority = normal, Total Execution Time = 9.03 ms, Old Cat = foreground, old_cat_time = 7.9 ms, New Cat = background, new_cat_time = 1.13 ms WASTED
PRAS::RasterTask [0x814fb3f8], Thread Priority = normal, Total Execution Time = 7.6 ms, Old Cat = foreground, old_cat_time = 1.19 ms, New Cat = background, new_cat_time = 6.41 ms WASTED
PRAS::RasterTask [0x8118f148], Thread Priority = background, Total Execution Time = 5.55 ms, Old Cat = background, old_cat_time = 1.13 ms, New Cat = foreground, new_cat_time = 4.42 ms DELAYED TASK ****
PRAS::RasterTask [0x829220c0], Thread Priority = background, Total Execution Time = 3.14 ms, Old Cat = background, old_cat_time = 1.74 ms, New Cat = foreground, new_cat_time = 1.4 ms DELAYED TASK ****
PRAS::RasterTask [0x812c16e0], Thread Priority = normal, Total Execution Time = 8.18 ms, Old Cat = foreground, old_cat_time = 7.02 ms, New Cat = background, new_cat_time = 1.16 ms WASTED
PRAS::RasterTask [0x826cc900], Thread Priority = background, Total Execution Time = 28.1 ms, Old Cat = background, old_cat_time = 28 ms, New Cat = foreground, new_cat_time = 0.122 ms DELAYED TASK ****
PRAS::RasterTask [0x82608410], Thread Priority = background, Total Execution Time = 15.4 ms, Old Cat = background, old_cat_time = 3.05 ms, New Cat = foreground, new_cat_time = 12.4 ms DELAYED TASK ****
PRAS::RasterTask [0x826b8268], Thread Priority = background, Total Execution Time = 6.32 ms, Old Cat = background, old_cat_time = 5.31 ms, New Cat = foreground, new_cat_time = 1.01 ms DELAYED TASK ****
PRAS::RasterTask [0x81186ff0], Thread Priority = background, Total Execution Time = 20.9 ms, Old Cat = background, old_cat_time = 10.8 ms, New Cat = foreground, new_cat_time = 10.1 ms DELAYED TASK ****
PRAS::RasterTask [0x811cce80], Thread Priority = background, Total Execution Time = 14.6 ms, Old Cat = background, old_cat_time = 7.11 ms, New Cat = foreground, new_cat_time = 7.45 ms DELAYED TASK ****

theverge.com

PRAS::RasterTask [0x820eea68], Thread Priority = background, Total Execution Time = 16.3 ms, Old Cat = background, old_cat_time = 15.3 ms, New Cat = foreground, new_cat_time = 0.977 ms DELAYED TASK ****
PRAS::RasterTask [0x82091e50], Thread Priority = background, Total Execution Time = 2.69 ms, Old Cat = background, old_cat_time = 1.86 ms, New Cat = foreground, new_cat_time = 0.824 ms DELAYED TASK ****
PRAS::RasterTask [0x810f47e8], Thread Priority = background, Total Execution Time = 663 ms, Old Cat = background, old_cat_time = 187 ms, New Cat = foreground, new_cat_time = 477 ms DELAYED TASK ****
PRAS::RasterTask [0x7d845d50], Thread Priority = background, Total Execution Time = 7.78 ms, Old Cat = background, old_cat_time = 0.336 ms, New Cat = foreground, new_cat_time = 7.45 ms DELAYED TASK ****
PRAS::RasterTask [0x80d3cf20], Thread Priority = background, Total Execution Time = 8.82 ms, Old Cat = background, old_cat_time = 7.72 ms, New Cat = foreground, new_cat_time = 1.1 ms DELAYED TASK ****
PRAS::RasterTask [0x829a0dd8], Thread Priority = background, Total Execution Time = 4.55 ms, Old Cat = background, old_cat_time = 2.5 ms, New Cat = foreground, new_cat_time = 2.04 ms DELAYED TASK ****
PRAS::RasterTask [0x81132030], Thread Priority = background, Total Execution Time = 3.14 ms, Old Cat = background, old_cat_time = 0.854 ms, New Cat = foreground, new_cat_time = 2.29 ms DELAYED TASK ****
PRAS::RasterTask [0x82089f90], Thread Priority = background, Total Execution Time = 27.4 ms, Old Cat = background, old_cat_time = 24.2 ms, New Cat = foreground, new_cat_time = 3.23 ms DELAYED TASK ****
PRAS::RasterTask [0x81113240], Thread Priority = background, Total Execution Time = 5.04 ms, Old Cat = background, old_cat_time = 0.214 ms, New Cat = foreground, new_cat_time = 4.82 ms DELAYED TASK ****
PRAS::RasterTask [0x810e2738], Thread Priority = background, Total Execution Time = 2.78 ms, Old Cat = background, old_cat_time = 1.1 ms, New Cat = foreground, new_cat_time = 1.68 ms DELAYED TASK ****
PRAS::RasterTask [0x811dedb8], Thread Priority = background, Total Execution Time = 19.3 ms, Old Cat = background, old_cat_time = 1.44 ms, New Cat = foreground, new_cat_time = 17.9 ms DELAYED TASK ****
PRAS::RasterTask [0x7d85d910], Thread Priority = background, Total Execution Time = 5.31 ms, Old Cat = background, old_cat_time = 0.244 ms, New Cat = foreground, new_cat_time = 5.07 ms DELAYED TASK ****
PRAS::RasterTask [0x81180da0], Thread Priority = background, Total Execution Time = 1.68 ms, Old Cat = background, old_cat_time = 1.19 ms, New Cat = foreground, new_cat_time = 0.488 ms DELAYED TASK ****
PRAS::RasterTask [0x7d846438], Thread Priority = background, Total Execution Time = 10.7 ms, Old Cat = background, old_cat_time = 8.73 ms, New Cat = foreground, new_cat_time = 2.01 ms DELAYED TASK ****
As per https://code.google.com/p/chromium/codesearch#chromium/src/sandbox/linux/secc... we can change the priority of a thread from the current process. Correct me if I understood wrong.
On 2016/03/08 at 15:22:21, prashant.n wrote: > As per https://code.google.com/p/chromium/codesearch#chromium/src/sandbox/linux/secc... > > we can change priority of thread from current process. Correct me if I understood wrong. You can change priority from the current thread but I thought it was restricted to only decreasing the priority. I could be wrong. Please try increasing the priority of a thread on Linux to verify that it actually works. Btw, I'm not sure how to interpret the data above. Sure, some background tasks become foreground tasks while running and other foreground tasks might incorrectly have higher priority relative to this task in that case but I'm not sure how much of a difference this actually has on the user experience. Some data showing that our telemetry smoothness tests improve if we adjust thread priority would be useful.
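To make the experiment reveman suggests concrete: on Linux and Android the nice value is per-thread, and `setpriority(PRIO_PROCESS, 0, ...)` addresses the calling thread. The sketch below is a standalone probe, not the CL's code; lowering priority (raising the nice value) is generally allowed, while raising it back in an unprivileged or sandboxed process typically fails with EPERM/EACCES unless RLIMIT_NICE permits it.

```cpp
#include <sys/resource.h>
#include <cerrno>

// Changes the calling thread's nice value. On Linux, PRIO_PROCESS with
// who == 0 names the calling thread, since the kernel nice value is
// per-thread. Returns true on success.
bool SetCurrentThreadNice(int nice_value) {
  errno = 0;
  return setpriority(PRIO_PROCESS, 0, nice_value) == 0;
}

// Reads the calling thread's nice value. getpriority() can legitimately
// return -1, so errno must be cleared first to detect errors.
int GetCurrentThreadNice() {
  errno = 0;
  return getpriority(PRIO_PROCESS, 0);
}
```

Running `SetCurrentThreadNice(10)` followed by `SetCurrentThreadNice(0)` in a plain unprivileged process is a quick way to verify whether raising priority actually works, which is the question being discussed here.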
> Btw, I'm not sure how to interpret the data above.

Thread Priority = priority of the thread, i.e. normal or background (this does not change in the current code)
Total Execution Time = total time needed to finish the task
Old Cat = task category of the running task
old_cat_time = time elapsed until the same task is rescheduled
New Cat = task category when it is rescheduled
new_cat_time = time elapsed from when the task is rescheduled until it finishes on the existing runner thread

On theverge.com, new_cat_time sometimes comes to ~477 ms. If we change the runner thread's priority, it would be <= what we currently see.

Yes, smoothness data would prove the intent. I'll share it tomorrow.
When I re-read the comment at

https://code.google.com/p/chromium/codesearch#chromium/src/sandbox/linux/secc...

it would give an EPERM error if |who| is not 0. But when we want to change the priority of a given thread, we need to pass the thread id. Let me check further.
On 2016/03/09 04:42:02, prashant.n wrote:
> When I re-read the comment at
>
> https://code.google.com/p/chromium/codesearch#chromium/src/sandbox/linux/secc...
>
> it would give EPERM error if |who| is not 0.
>
> But when we want to change the priority for the given thread, we need to pass
> thread id. Let me check further.

The above comment may relate only to the main thread, so we don't need to worry. I've modified the code for Android. I have not yet verified whether the priority actually changes, but per the requests made it looks like it is changing. See the logs below.

Current = current priority
Default = default priority
Low = low priority, used to slow down
High = high priority, used to speed up

PRAS::RasterWorkerPool::Start NumRasterThreads = 2
PRAS:: [0x7cc428f8] Current = normal, { Default = background, Low = background, High = normal}. Runner speeded up ^^^^^^^^^
PRAS:: [0x7cc428f8] Current = background, { Default = background, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc420a8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc3e930] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc413e8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc413e8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc420a8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc3e930] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc428f8] Current = normal, { Default = background, Low = background, High = normal}. Runner speeded up ^^^^^^^^^
PRAS:: [0x7cc428f8] Current = background, { Default = background, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc420a8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc420a8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc413e8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc3e930] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc413e8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc413e8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc420a8] Current = background, { Default = normal, Low = background, High = normal}. Runner slowed down.
PRAS:: [0x7cc413e8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc3e930] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.
PRAS:: [0x7cc420a8] Current = normal, { Default = normal, Low = background, High = normal}. Restored to default.

However, fling felt a little better on a Galaxy S4.
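The { Default, Low, High } bookkeeping in the log above boils down to picking a target priority from the running task's category and restoring the default afterwards. A minimal sketch of that mapping follows; the struct and names are hypothetical illustrations (the real patch would use the platform thread-priority API on Android), with priorities represented here as nice values where lower means higher priority.

```cpp
#include <cassert>

// Hypothetical mirror of the per-thread {Default, Low, High} triple from
// the log: the worker runs at |high_nice| while executing a foreground
// task, at |low_nice| for a background task, and returns to
// |default_nice| between tasks.
struct ThreadPriorityPlan {
  int default_nice;
  int low_nice;
  int high_nice;
};

enum class TaskCategory { kForeground, kBackground };

// Target nice value while a task of |category| runs on this thread.
int NiceWhileRunning(const ThreadPriorityPlan& plan, TaskCategory category) {
  return category == TaskCategory::kForeground ? plan.high_nice
                                               : plan.low_nice;
}
```

A dynamic background thread would have Default = Low = a background nice value and High = 0: it is sped up only when its running task is rescheduled as foreground, matching the "Runner speeded up" / "Restored to default" lines in the log.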
https://codereview.chromium.org/1784623005/ is the perf tryjob on Android. I tested smoothness.top_25_smooth. I don't know exactly what all the fields mean, but more or less the patch looks good. Maybe you can check the perf results. For easy reference, the links are given below.

P.N. In my patch I've disabled GPU rasterization for testing background threads.

android_s5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
android_nexus9 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
android_nexus6 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
android_nexus5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
android_nexus9 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
android_nexus7 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
Description was changed from

==========
[WIP] content: Dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel
==========

to

==========
Test dynamic priorities for raster threads.

Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available.

This patch implements
- 1 normal priority thread not changing priority dedicated for foreground tasks.
- N background priority threads which change the priority depending on task being executed.

BUG=
CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel
==========
On 2016/03/10 at 09:29:00, prashant.n wrote: > https://codereview.chromium.org/1784623005/ is the perf tryjob on android. I tested smoothness.top_25_smooth. I don't know exactly what all fields mean, but more or less patch looks good. May be you can check the perf results. For easy reference I'm giving below the links. > > P.N. In my patch I've disabled gpu rasterization for testing background threads. > > android_s5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... > android_nexus9 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... > android_nexus6 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... > android_nexus5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... > android_nexus9 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... > android_nexus7 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... You're changing the number of threads used for raster as part of that patch so I don't see how we can use these results.
> You're changing the number of threads used for raster as part of that patch so I
> don't see how we can use these results.

Hmm, good catch. I will change it to the default number of threads (so 1 foreground and 1 dynamic background), but to use background threads I need to disable GPU rasterization. Any idea of the GPU vs. CPU rasterization benefit?

On Android, I verified the priority change takes effect.
On 2016/03/10 18:07:28, prashant.n wrote: > > You're changing the number of threads used for raster as part of that patch so > I > > don't see how we can use these results. > > Hmm. Good catch. I will change it to default num threads (so 1 foreground and 1 > dynamic background), but to use background threads, I need to disable gpu > rasterization. Any idea on gpu vs. cpu rasterization benefit. > > On Android, I verified priority changed takes effect. GPU rasterization can give a fair amount of benefit over CPU, depending on the content - in tests I've done (on desktop), GPU rasterization can give similar or better performance than multi-thread SW raster, with lower overall power usage. That said, there is still some content for which GPU raster is problematic. I'm currently doing work that will allow GPU raster to effectively use background threads for image decode, which should make your investigation valid for both cases. That said, as long as you disable GPU for both runs you're comparing, the comparison should be valid.
> I'm currently doing work that will allow GPU raster to effectively use
> background threads for image decode, which should make your investigation valid
> for both cases.

I was thinking of creating low-priority raster tasks as background, non-concurrent tasks when GPU rasterization is enabled, so that even when using the GPU we don't waste many CPU/GPU cycles on background tasks.
Results for 1 foreground thread and 1 dynamic thread at https://codereview.chromium.org/1787453003/ For easy reference - android_s5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... android_one http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... android_nexus6 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... android_nexus5 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03... android_nexus7 http://storage.googleapis.com/chromium-telemetry/html-results/results-2016-03...
Have a look at the patch; it's not final though.
https://codereview.chromium.org/1739993004/diff/140001/cc/raster/task_graph_r... File cc/raster/task_graph_runner.cc (right): https://codereview.chromium.org/1739993004/diff/140001/cc/raster/task_graph_r... cc/raster/task_graph_runner.cc:8: #include <iomanip> Remove. https://codereview.chromium.org/1739993004/diff/140001/cc/raster/task_graph_r... cc/raster/task_graph_runner.cc:23: void Task::AttachWorker(TaskWorker* worker) { AttachTaskWorker/DetachTaskWorker
Description was changed from ========== Test dynamic priorities for raster threads. Implement dynamic priorities for raster threads depending on task category. Having only one background thread would not be able to use remaining CPU cores when available. This patch implements - 1 normal priority thread not changing priority dedicated for foreground tasks. - N background priority threads which change the priority depending on task being executed. BUG= CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ========== to ========== content: Implement dynamic priorities for raster threads. With current implementation for raster threads, suppose background task is being run by background priority thread and if this task is rescheduled as foreground task, then executing the task by raster thread with same priority delays the completion of task. This patch implements changing the priority of raster thread while task is being executed by that thread, so that task execution can be speeded up or slowed down. The raster thread's priority is changed if already running task is rescheduled with new category. This patch handles speeding up of background task, which is rescheduled as foreground task and implementation is supported only on Android platform as of now. This patch implements - N static priority threads having normal priority, - 1 dynamic priority thread having background priority. BUG= CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ==========
prashant.n@samsung.com changed reviewers: + jam@chromium.org
PTAL. I'll implement Windows/Linux/FreeBSD support and slowing down of tasks in subsequent patches.
IMO the current version of this patch is much too complicated for the benefit it's hoping to provide. This needs to be reduced to a small implementation detail of the RasterWorkerPool class before we can consider it. The PlatformThread API changes should be in a separate patch.
On 2016/03/15 01:49:30, reveman wrote:
> IMO the current version of this patch is much too complicated for the benefit
> it's hoping to provide. This needs to be reduced to a small implementation
> detail of the RasterWorkerPool class before we can consider it. The
> PlatformThread API changes should be in a separate patch.

Actually, the decision to skip an already-running task is taken in task_graph_work_queue.cc, and the work queue contains only tasks (it is unaware of threads), so a task needs to carry information about which thread is executing it (abstracted in this patch by TaskWorker). With the current implementation, extending this to tasks other than raster tasks would be easier. I'll check whether we can make the changes more minimal. I'll create separate patches for the PlatformThread changes and for this.
While writing a unit test and verifying it on Linux, I found that on Linux we cannot change priority because we don't have sufficient privileges.

However, on Android it works well.
On 2016/03/15 at 13:46:37, prashant.n wrote:
> While writing unit test and verifying it on linux, I got that on linux we cannot change priority due to not having sufficient previledges.
>
> However on Android it works well.

That's what I was suspecting. I don't think this is worthwhile unless it works across all platforms. I'd rather see work being done to reduce the size of tasks.
> That's what I was suspecting. I don't think this is worthwhile unless it works across
> all platforms.

But at least on Android (and Windows, which I'll check soon) it works well and improves perf.

> I'd rather see work being done to reduce the size of tasks.

Yes. For RequestSlowdown I was thinking of working on this. After this patch I'm going to work on that.
As per https://code.google.com/p/chromium/codesearch#chromium/src/base/threading/pla..., it would work on Windows. So there are two platforms that can support this: Android and Windows. Do let me know if it is okay to continue with this. Anyways I'm focusing more on reducing tasks and maintaining states of it.
> I'm focusing more on reducing tasks and maintaining states of it. task size*
On 2016/03/16 at 05:29:14, prashant.n wrote: > > I'm focusing more on reducing tasks and maintaining states of it. > task size* I would prefer if we didn't use different logic for different platforms unless absolutely necessary. Let's focus on reducing the size of tasks and we can revisit these type of dynamic thread priority adjustments later if necessary. Thanks for being thorough and checking what platforms this is and is not supported on.
On 2016/03/16 14:59:15, reveman wrote:
> On 2016/03/16 at 05:29:14, prashant.n wrote:
> > > I'm focusing more on reducing tasks and maintaining states of it.
> > task size*
>
> I would prefer if we didn't use different logic for different platforms unless
> absolutely necessary. Let's focus on reducing the size of tasks and we can
> revisit these type of dynamic thread priority adjustments later if necessary.
> Thanks for being thorough and checking what platforms this is and is not
> supported on.

Yes. The logic I was thinking of for slowing down can also be used for speeding up, which will keep the changes in non-platform-layer code.

I was thinking of splitting raster buffer playback into small atomic work items and storing the needed data in the raster task. These work items can be executed one at a time (we can progressively make sure that work items are small enough that running one consumes few CPU/GPU cycles), and before executing the next work item we check whether to continue/cancel/reschedule the task. If it is to be rescheduled, the same raster task can be picked up by a thread of the appropriate priority. Here we keep thread priorities predefined; we only need to implement the raster task's work items being executed by different threads, sequentially.

E.g. if there are 5 work items for a given raster task, suppose 2 work items have been executed by the current thread, the 3rd is being executed, and the raster task's category changes at the same time; then the current thread continues to work only on the 3rd item, and items 4 and 5 are scheduled to run on a thread of the needed priority.

If this looks okay, I'll create a prototype for it.
prashant.n@samsung.com changed reviewers: - jam@chromium.org
On 2016/03/16 at 17:18:47, prashant.n wrote: > On 2016/03/16 14:59:15, reveman wrote: > > On 2016/03/16 at 05:29:14, prashant.n wrote: > > > > I'm focusing more on reducing tasks and maintaining states of it. > > > task size* > > > > I would prefer if we didn't use different logic for different platforms unless > > absolutely necessary. Let's focus on reducing the size of tasks and we can > > revisit these type of dynamic thread priority adjustments later if necessary. > > Thanks for being thorough and checking what platforms this is and is not > > supported on. > > Yes. The logic I was thinking for slow down can also be used for speed up, which will make changes in non-platform layer code. > > I was thinking of splitting raster buffer playback into small atomic task/work items and storing needed data in raster task. These work items can be executed one at time (we can progressively make sure that work items are small so that running it would consume few cpu/gpu cycles.) and before executing next work item check whether to continue/cancel/reschedule the task. If to be rescheduled, the same raster task can be worked upon by another appropriate priority thread. Here we'll keep thread priorities predefined. Only we need to implement raster task's work items to be executed by different threads, sequentially. > > e.g. If there are 5 work items for the given raster task, and suppose 2 work items are executed by current thread and 3rd is being executed and suppose raster task's category is changed at the same time, then current thread can continue to work on only on 3rd item, 4 and 5 would be scheduled to be run by needed priority thread.) > > If this looks okay, I'll create a prototype for it. Raster tasks are not typically that large. I think it would be much better to focus on image decode tasks as they can be really large. Note that to split tile raster into multiple tasks we need to first refactor or remove the TileTaskRunner interface.
On 2016/03/16 18:20:51, reveman wrote: > On 2016/03/16 at 17:18:47, prashant.n wrote: > > On 2016/03/16 14:59:15, reveman wrote: > > > On 2016/03/16 at 05:29:14, prashant.n wrote: > > > > > I'm focusing more on reducing tasks and maintaining states of it. > > > > task size* > > > > > > I would prefer if we didn't use different logic for different platforms > unless > > > absolutely necessary. Let's focus on reducing the size of tasks and we can > > > revisit these type of dynamic thread priority adjustments later if > necessary. > > > Thanks for being thorough and checking what platforms this is and is not > > > supported on. > > > > Yes. The logic I was thinking for slow down can also be used for speed up, > which will make changes in non-platform layer code. > > > > I was thinking of splitting raster buffer playback into small atomic task/work > items and storing needed data in raster task. These work items can be executed > one at time (we can progressively make sure that work items are small so that > running it would consume few cpu/gpu cycles.) and before executing next work > item check whether to continue/cancel/reschedule the task. If to be rescheduled, > the same raster task can be worked upon by another appropriate priority thread. > Here we'll keep thread priorities predefined. Only we need to implement raster > task's work items to be executed by different threads, sequentially. > > > > e.g. If there are 5 work items for the given raster task, and suppose 2 work > items are executed by current thread and 3rd is being executed and suppose > raster task's category is changed at the same time, then current thread can > continue to work on only on 3rd item, 4 and 5 would be scheduled to be run by > needed priority thread.) > > > > If this looks okay, I'll create a prototype for it. > > Raster tasks are not typically that large. I think it would be much better to > focus on image decode tasks as they can be really large. 
> > Note that to split tile raster into multiple tasks we need to first refactor or > remove the TileTaskRunner interface. FYI, both vmpstr@ and myself are doing significant work to pull image decodes and scaling out of both SW and GPU raster tasks. It might make sense to hold off on trying to split up image decodes until after this work is completed. In software, splitting up image decodes/scaling may not be too tricky (at least scaling seems like it could be tiled). On the GPU side of things, splitting image decodes is a bit trickier. GPU image decode and upload is pretty much a black box to CC. This is by design, we don't want CC to worry about whether we are decoding to yuv, compressed textures, generating mips, etc.. Allowing this to be split up would involve pushing changes to the Skia API as well. Happy to hear thoughts you have in this area.
On 2016/03/16 18:48:08, ericrk wrote: > On 2016/03/16 18:20:51, reveman wrote: > > On 2016/03/16 at 17:18:47, prashant.n wrote: > > > On 2016/03/16 14:59:15, reveman wrote: > > > > On 2016/03/16 at 05:29:14, prashant.n wrote: > > > > > > I'm focusing more on reducing tasks and maintaining states of it. > > > > > task size* > > > > > > > > I would prefer if we didn't use different logic for different platforms > > unless > > > > absolutely necessary. Let's focus on reducing the size of tasks and we can > > > > revisit these type of dynamic thread priority adjustments later if > > necessary. > > > > Thanks for being thorough and checking what platforms this is and is not > > > > supported on. > > > > > > Yes. The logic I was thinking for slow down can also be used for speed up, > > which will make changes in non-platform layer code. > > > > > > I was thinking of splitting raster buffer playback into small atomic > task/work > > items and storing needed data in raster task. These work items can be > executed > > one at time (we can progressively make sure that work items are small so that > > running it would consume few cpu/gpu cycles.) and before executing next work > > item check whether to continue/cancel/reschedule the task. If to be > rescheduled, > > the same raster task can be worked upon by another appropriate priority > thread. > > Here we'll keep thread priorities predefined. Only we need to implement raster > > task's work items to be executed by different threads, sequentially. > > > > > > e.g. If there are 5 work items for the given raster task, and suppose 2 work > > items are executed by current thread and 3rd is being executed and suppose > > raster task's category is changed at the same time, then current thread can > > continue to work on only on 3rd item, 4 and 5 would be scheduled to be run by > > needed priority thread.) > > > > > > If this looks okay, I'll create a prototype for it. > > > > Raster tasks are not typically that large. 
I think it would be much better to > > focus on image decode tasks as they can be really large. > > > > Note that to split tile raster into multiple tasks we need to first refactor > or > > remove the TileTaskRunner interface. > > FYI, both vmpstr@ and myself are doing significant work to pull image decodes > and scaling out of both SW and GPU raster tasks. It might make sense to hold off > on trying to split up image decodes until after this work is completed. > > In software, splitting up image decodes/scaling may not be too tricky (at least > scaling seems like it could be tiled). On the GPU side of things, splitting > image decodes is a bit trickier. GPU image decode and upload is pretty much a > black box to CC. This is by design, we don't want CC to worry about whether we > are decoding to yuv, compressed textures, generating mips, etc.. Allowing this > to be split up would involve pushing changes to the Skia API as well. > > Happy to hear thoughts you have in this area. +1 I'd prefer that we hold off on splitting up decoding work until the work we've been doing had a chance to become a bit more stable.
On 2016/03/16 18:52:06, vmpstr wrote: > On 2016/03/16 18:48:08, ericrk wrote: > > On 2016/03/16 18:20:51, reveman wrote: > > > On 2016/03/16 at 17:18:47, prashant.n wrote: > > > > On 2016/03/16 14:59:15, reveman wrote: > > > > > On 2016/03/16 at 05:29:14, prashant.n wrote: > > > > > > > I'm focusing more on reducing tasks and maintaining states of it. > > > > > > task size* > > > > > > > > > > I would prefer if we didn't use different logic for different platforms > > > unless > > > > > absolutely necessary. Let's focus on reducing the size of tasks and we > can > > > > > revisit these type of dynamic thread priority adjustments later if > > > necessary. > > > > > Thanks for being thorough and checking what platforms this is and is not > > > > > supported on. > > > > > > > > Yes. The logic I was thinking for slow down can also be used for speed up, > > > which will make changes in non-platform layer code. > > > > > > > > I was thinking of splitting raster buffer playback into small atomic > > task/work > > > items and storing needed data in raster task. These work items can be > > executed > > > one at time (we can progressively make sure that work items are small so > that > > > running it would consume few cpu/gpu cycles.) and before executing next work > > > item check whether to continue/cancel/reschedule the task. If to be > > rescheduled, > > > the same raster task can be worked upon by another appropriate priority > > thread. > > > Here we'll keep thread priorities predefined. Only we need to implement > raster > > > task's work items to be executed by different threads, sequentially. > > > > > > > > e.g. If there are 5 work items for the given raster task, and suppose 2 > work > > > items are executed by current thread and 3rd is being executed and suppose > > > raster task's category is changed at the same time, then current thread can > > > continue to work on only on 3rd item, 4 and 5 would be scheduled to be run > by > > > needed priority thread.) 
> > > > > > > > If this looks okay, I'll create a prototype for it. > > > > > > Raster tasks are not typically that large. I think it would be much better > to > > > focus on image decode tasks as they can be really large. > > > > > > Note that to split tile raster into multiple tasks we need to first refactor > > or > > > remove the TileTaskRunner interface. > > > > FYI, both vmpstr@ and myself are doing significant work to pull image decodes > > and scaling out of both SW and GPU raster tasks. It might make sense to hold > off > > on trying to split up image decodes until after this work is completed. > > > > In software, splitting up image decodes/scaling may not be too tricky (at > least > > scaling seems like it could be tiled). On the GPU side of things, splitting > > image decodes is a bit trickier. GPU image decode and upload is pretty much a > > black box to CC. This is by design, we don't want CC to worry about whether we > > are decoding to yuv, compressed textures, generating mips, etc.. Allowing this > > to be split up would involve pushing changes to the Skia API as well. > > > > Happy to hear thoughts you have in this area. > > +1 I'd prefer that we hold off on splitting up decoding work until the work > we've been doing had a chance to become a bit more stable. I was thinking to split up raster tasks first as I saw they consume much time in heavy sites like 55bbs.com, theverge.com, etc (many times > 16.66 ms). May be this work can go in parallel with what you have been working on for image decode tasks.
On 2016/03/17 02:17:05, prashant.n wrote: > On 2016/03/16 18:52:06, vmpstr wrote: > > On 2016/03/16 18:48:08, ericrk wrote: > > > On 2016/03/16 18:20:51, reveman wrote: > > > > On 2016/03/16 at 17:18:47, prashant.n wrote: > > > > > On 2016/03/16 14:59:15, reveman wrote: > > > > > > On 2016/03/16 at 05:29:14, prashant.n wrote: > > > > > > > > I'm focusing more on reducing tasks and maintaining states of it. > > > > > > > task size* > > > > > > > > > > > > I would prefer if we didn't use different logic for different > platforms > > > > unless > > > > > > absolutely necessary. Let's focus on reducing the size of tasks and we > > can > > > > > > revisit these type of dynamic thread priority adjustments later if > > > > necessary. > > > > > > Thanks for being thorough and checking what platforms this is and is > not > > > > > > supported on. > > > > > > > > > > Yes. The logic I was thinking for slow down can also be used for speed > up, > > > > which will make changes in non-platform layer code. > > > > > > > > > > I was thinking of splitting raster buffer playback into small atomic > > > task/work > > > > items and storing needed data in raster task. These work items can be > > > executed > > > > one at time (we can progressively make sure that work items are small so > > that > > > > running it would consume few cpu/gpu cycles.) and before executing next > work > > > > item check whether to continue/cancel/reschedule the task. If to be > > > rescheduled, > > > > the same raster task can be worked upon by another appropriate priority > > > thread. > > > > Here we'll keep thread priorities predefined. Only we need to implement > > raster > > > > task's work items to be executed by different threads, sequentially. > > > > > > > > > > e.g. 
If there are 5 work items for the given raster task, and suppose 2 > > work > > > > items are executed by current thread and 3rd is being executed and suppose > > > > raster task's category is changed at the same time, then current thread > can > > > > continue to work on only on 3rd item, 4 and 5 would be scheduled to be run > > by > > > > needed priority thread.) > > > > > > > > > > If this looks okay, I'll create a prototype for it. > > > > > > > > Raster tasks are not typically that large. I think it would be much better > > to > > > > focus on image decode tasks as they can be really large. > > > > > > > > Note that to split tile raster into multiple tasks we need to first > refactor > > > or > > > > remove the TileTaskRunner interface. > > > > > > FYI, both vmpstr@ and myself are doing significant work to pull image > decodes > > > and scaling out of both SW and GPU raster tasks. It might make sense to hold > > off > > > on trying to split up image decodes until after this work is completed. > > > > > > In software, splitting up image decodes/scaling may not be too tricky (at > > least > > > scaling seems like it could be tiled). On the GPU side of things, splitting > > > image decodes is a bit trickier. GPU image decode and upload is pretty much > a > > > black box to CC. This is by design, we don't want CC to worry about whether > we > > > are decoding to yuv, compressed textures, generating mips, etc.. Allowing > this > > > to be split up would involve pushing changes to the Skia API as well. > > > > > > Happy to hear thoughts you have in this area. > > > > +1 I'd prefer that we hold off on splitting up decoding work until the work > > we've been doing had a chance to become a bit more stable. > > I was thinking to split up raster tasks first as I saw they consume much time in > heavy sites like http://55bbs.com, http://theverge.com, etc (many times > 16.66 ms). 
May be > this work can go in parallel with what you have been working on for image decode > tasks. There definitely may be sites with slow raster tasks, but note that depending on the settings (medium filter quality, gpu raster, etc...), we may be decoding / scaling images within raster tasks, making them look longer than expected. We should make sure this isn't the case before doing too much work here. Try tracing these sites with the "blink" and "cc.debug" categories - if there are image decodes happening during raster, you should see them with those categories enabled. Thanks!
> There definitely may be sites with slow raster tasks, but note that depending on
> the settings (medium filter quality, gpu raster, etc...), we may be decoding /
> scaling images within raster tasks, making them look longer than expected. We
> should make sure this isn't the case before doing too much work here. Try
> tracing these sites with the "blink" and "cc.debug" categories - if there are
> image decodes happening during raster, you should see them with those categories
> enabled. Thanks!

I checked with tracing that on average one raster task takes 0.5 ms to run (on Linux, i7, 12 GB RAM), and on slower devices like the Samsung Galaxy S4 it consumes even more.

My point for raster tasks is: if we save on average 0.1 ms from one running task, we could save at most 0.4 ms with 4 raster threads while tasks are recategorized as background tasks, which might give us one needed raster task run and could reduce white patches when not all "now" tiles are rasterized within the frame time.

I'll analyse more soon.
On 2016/03/19 02:57:47, prashant.n wrote: > > There definitely may be sites with slow raster tasks, but note that depending > on > > the settings (medium filter quality, gpu raster, etc...), we may be decoding / > > scaling images within raster tasks, making them look longer than expected. We > > should make sure this isn't the case before doing too much work here. Try > > tracing these sites with the "blink" and "cc.debug" categories - if there are > > image decodes happening during raster, you should see them with those > categories > > enabled. Thanks! > > I had checked with tracing that on an average 1 raster task takes 0.5 ms to run > (on platform linux i7, 12 GB ram) and on slower devices like samsung galaxy s4, > it even consumes more. > > My point for raster tasks is - if we get on an average 0.1 ms from 1 running > task, we could get at max 0.4 ms with 4 raster threads while tasks are > recategorized as background tasks, which might give us 1 needed raster task run, > which could reduce white patches when not all "now" tiles are rasterized within > frame time. > > I'll analyse more soon. I'm not sure how you're proposing to split raster tasks into smaller tasks (if that's what you're proposing)? We can have smaller raster tasks by making smaller tiles. However, we have found that there is a fairly expensive overhead associated with every new raster task. In fact, it is usually faster to have larger tiles. If I'm misunderstanding things, please let me know.
On 2016/03/21 at 19:55:17, vmpstr wrote: > On 2016/03/19 02:57:47, prashant.n wrote: > > > There definitely may be sites with slow raster tasks, but note that depending > > on > > > the settings (medium filter quality, gpu raster, etc...), we may be decoding / > > > scaling images within raster tasks, making them look longer than expected. We > > > should make sure this isn't the case before doing too much work here. Try > > > tracing these sites with the "blink" and "cc.debug" categories - if there are > > > image decodes happening during raster, you should see them with those > > categories > > > enabled. Thanks! > > > > I had checked with tracing that on an average 1 raster task takes 0.5 ms to run > > (on platform linux i7, 12 GB ram) and on slower devices like samsung galaxy s4, > > it even consumes more. > > > > My point for raster tasks is - if we get on an average 0.1 ms from 1 running > > task, we could get at max 0.4 ms with 4 raster threads while tasks are > > recategorized as background tasks, which might give us 1 needed raster task run, > > which could reduce white patches when not all "now" tiles are rasterized within > > frame time. > > > > I'll analyse more soon. > > I'm not sure how you're proposing to split raster tasks into smaller tasks (if that's what you're proposing)? We can have smaller raster tasks by making smaller tiles. However, we have found that there is a fairly expensive overhead associated with every new raster task. In fact, it is usually faster to have larger tiles. If I'm misunderstanding things, please let me know. Yes, tasks would have to be split in a way that doesn't add significant overhead. X number of scanlines per task instead of smaller tiles is probably better but I'm not sure that's good enough. I'm guessing we'd have to make some changes to skia to make this work well.
On 2016/03/21 20:50:32, reveman wrote: > On 2016/03/21 at 19:55:17, vmpstr wrote: > > On 2016/03/19 02:57:47, prashant.n wrote: > > > > There definitely may be sites with slow raster tasks, but note that > depending > > > on > > > > the settings (medium filter quality, gpu raster, etc...), we may be > decoding / > > > > scaling images within raster tasks, making them look longer than expected. > We > > > > should make sure this isn't the case before doing too much work here. Try > > > > tracing these sites with the "blink" and "cc.debug" categories - if there > are > > > > image decodes happening during raster, you should see them with those > > > categories > > > > enabled. Thanks! > > > > > > I had checked with tracing that on an average 1 raster task takes 0.5 ms to > run > > > (on platform linux i7, 12 GB ram) and on slower devices like samsung galaxy > s4, > > > it even consumes more. > > > > > > My point for raster tasks is - if we get on an average 0.1 ms from 1 running > > > task, we could get at max 0.4 ms with 4 raster threads while tasks are > > > recategorized as background tasks, which might give us 1 needed raster task > run, > > > which could reduce white patches when not all "now" tiles are rasterized > within > > > frame time. > > > > > > I'll analyse more soon. > > > > I'm not sure how you're proposing to split raster tasks into smaller tasks (if > that's what you're proposing)? We can have smaller raster tasks by making > smaller tiles. However, we have found that there is a fairly expensive overhead > associated with every new raster task. In fact, it is usually faster to have > larger tiles. If I'm misunderstanding things, please let me know. > > Yes, tasks would have to be split in a way that doesn't add significant > overhead. X number of scanlines per task instead of smaller tiles is probably > better but I'm not sure that's good enough. I'm guessing we'd have to make some > changes to skia to make this work well. Yes. 
we need to make Skia changes. Creating smaller tiles is not going to give the ultimate solution when the number of layers grows larger. I've shared my approach in a Google doc. Kindly have a look at it.
> ultimate solution when the number of layers grows to larger or recorded operations are more...
From the attached traces (scrolled theverge.com from top to bottom), it can be seen that a few Skia playback operations take a long time (up to 5 ms). If we split these as per the proposed approach, we can easily speed up or slow down work on tasks. The attached skia_plaback_traces.png image shows a playback operation that took a long time (> 5 ms), while playing back an individual recorded operation takes at most ~1 ms. I'm still analysing. Maybe we can first focus on a split-up delimited by recorded-op playback, and later concentrate on making the recorded-op playback itself smaller/split. (To implement this we need to split up the SkPicture::playback API and many ::Raster functions in cc/. The implementation may depend on ThreadTicks, which is not supported on iOS.)
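A rough sketch of the split-up playback loop described above: play back recorded ops one at a time and check after each op whether a per-slice time budget is exhausted, returning where to resume. This is an assumption about the shape of the change, not the actual SkPicture::playback API, and std::chrono::steady_clock stands in for base::ThreadTicks:

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Hypothetical recorded op; in Skia this would be a draw command from
// the recording, not a std::function.
using Op = std::function<void()>;

// Plays back ops starting at |start| until the time |budget| for this
// slice is used up. Returns the index to resume from on the next slice
// (ops.size() means playback finished). Between slices the scheduler
// could re-prioritize, reschedule, or cancel the remaining work.
size_t PlaybackSlice(const std::vector<Op>& ops,
                     size_t start,
                     std::chrono::microseconds budget) {
  auto t0 = std::chrono::steady_clock::now();
  for (size_t i = start; i < ops.size(); ++i) {
    ops[i]();  // play back one recorded op
    if (std::chrono::steady_clock::now() - t0 >= budget)
      return i + 1;  // budget spent; resume here next time
  }
  return ops.size();  // all ops played back
}
```

The cost of the per-op clock read is part of the overhead being weighed in this thread; checking only every N ops would amortize it.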
On 2016/03/23 at 13:21:50, prashant.n wrote: > From the attached traces (scrolled theverge.com from top to bottom), it can be found that few skia playback operations take longer time (upto 5 ms). If we split this as per proposed approach, we can easily speed up or slow down working on tasks. The attached skia_plaback_traces.png image shows playback operation which took longer time (> 5 ms), but playing back individual recorded operation takes max upto ~ 1ms. I'm still analysing. May be we can focus on split up first delimited by recorded op playback and later concentrate on making recorded op playback smaller/splitted. (To implement this we need to split up the skpicture::playback api and many ::Raster functions in cc/. The implementation may depend on Threadticks, which is not supported on iOS.) I'm skeptical. Sounds like a lot of work and a lot of added complexity for a relatively small gain.
> I'm skeptical. Sounds like a lot of work and a lot of added complexity for a > relatively small gain. I believe this approach would help more: slowing down or cancelling unneeded tasks would bring a lot of benefit along with speeding up tasks, which the codebase does not currently handle. With this approach concurrency would increase. And yes, there's a lot of work. But I think unless we solve these kinds of problems, heavy pages will not have better visible scrolling (fewer white patches).
On 2016/03/23 18:46:03, prashant.n wrote: > > I'm skeptical. Sounds like a lot of work and a lot of added complexity for a > > relatively small gain. > > > I believe this approach would help more as slowing down or cancelling not needed > tasks would make benefit a lot along with speeding up the tasks, which is not > currently handled in codebase. With this approach concurrency would increase. > And yes there's lot of work. But I think without we solve these kinda proglems, > heavy pages would not have better visible scrolling (less white patches). A few questions: - Can you attach the trace of theverge.com? I'd be curious to look at it. - What platform / Version of Chrome are you tracing with? Is this with Gpu raster on or off?
> A few questions: > - Can you attach the trace of theverge.com? I'd be curious to look at it. > - What platform / Version of Chrome are you tracing with? Is this with Gpu > raster on or off? I had attached traces from theverge.com only, but my codebase was a little old. I'll attach traces on ToT soon.
> I'd attached traces from http://theverge.com only, https://codereview.chromium.org/1739993004/patch/260001/270002 are the traces I added, but I took only SkBigPicture playback split-up traces for verification purposes. The platform I used is Linux (Ubuntu 14.04), i7 (4 cores, 8 HT), 16 GB RAM. Here the worst-case time for a raster task is ~5 ms, while on a Galaxy S4 I've seen it > 500 ms a few times. I'll be taking detailed traces on Monday. I'm thinking of adding a Skia patch for split-up playback. Do let me know your opinion on that. Once that is verified we can decide whether to implement further or not. Can you please point me to how to take Skia traces? (I've hard-coded SkEventTracer::SetInstance(new skia::SkChromiumEventTracer()); to take the traces.)
Skia changes used for testing are uploaded at https://codereview.chromium.org/1828233003/
> A few questions: > - Can you attach the trace of theverge.com? I'd be curious to look at it. > - What platform / Version of Chrome are you tracing with? Is this with Gpu > raster on or off? Traces (cc, cc.debug) for theverge.com (scrolled top to bottom) are attached in crbug/598145. galaxy_s4_theverge.com_trace.json - Android (GPU raster off); trace_desktop_theverge.com_trace.json.gz - Desktop - Linux Ubuntu 14.04, Intel i7, 16 GB RAM (GPU raster off). The following 3 trace events were added to check whether a task is rescheduled (patch set 15): Task::Rescheduled::Speedup, Task::Rescheduled::Slowdown, Task::Rescheduled::Same. From the traces it can be seen that there are many speedup/slowdown reschedules.
There is a mistake in patch set 15; I'll take the data once again.
Patchset #16 (id:300001) has been deleted
The trace events added are Task::Rescheduled::Speedup, Task::Rescheduled::Slowdown, Task::Rescheduled::Same, Task::Scheduled::New, and Task::Rescheduled::Cancelled. 1. With patch set 16, over multiple runs on theverge.com, I observed that the count of tasks whose rescheduled category changed (speedup/slowdown) or which are no longer needed (cancelled) is in the range of 0.5% to 4% of the total tasks scheduled. That number looks too small to justify the changes proposed in the approach. Also, with progressive rastering, there will be some cost added by checking the lock every threshold interval. So as pointed out by reveman@, the change would be too complicated for a small gain. But it would improve concurrency, and on low-end devices rescheduling/cancelling tasks would help. 2. Benchmark runs with smoothness tests showed better results with patch set 7. Given 1 and 2, I'm a little confused. What is your opinion on continuing this task?
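For clarity on how the reschedule counts above could be bucketed, here is a minimal sketch that classifies a reschedule by comparing the old and new task categories. The enum and function names are illustrative, not cc/ code:

```cpp
#include <string>

// Hypothetical task categories; cc/ has a richer TaskCategory enum.
enum class Category { kForeground, kBackground };

// Maps an (old, new) category pair to the trace event names used in
// patch set 16. Background -> foreground means the task should be
// sped up; foreground -> background means it can be slowed down.
std::string ClassifyReschedule(Category old_cat, Category new_cat) {
  if (old_cat == new_cat)
    return "Task::Rescheduled::Same";
  if (old_cat == Category::kBackground && new_cat == Category::kForeground)
    return "Task::Rescheduled::Speedup";
  return "Task::Rescheduled::Slowdown";
}
```

Counting each bucket over a run gives the 0.5% to 4% figure quoted above.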
Please ignore the traces at crbug/598145.
I checked touch pinch-zoom and touch scrolling cases with the traces in patch set 16 on a Galaxy S4 device and observed the following: 1. Flinging shows speedups and slowdowns for 0.2% to 2% of total tasks. 2. Pinch zoom shows cancels for 1% to 4.5% of total tasks. So the task split-up would help in these gestures.