Issue 2461973002: [Command buffer] Feedback loop detection between texture and framebuffer attachments

yunchao

Description was changed from ========== Feedback loop detection for multiple drawbuffers BUG= ========== to ========== ...

4 years, 1 month ago (2016-10-31 09:37:55 UTC) #1

yunchao

The CQ bit was checked by yunchao.he@intel.com to run a CQ dry run

4 years, 1 month ago (2016-10-31 09:38:08 UTC) #2

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2461973002/1

4 years, 1 month ago (2016-10-31 09:38:24 UTC) #3

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 1 month ago (2016-10-31 09:45:46 UTC) #4

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: cast_shell_android on master.tryserver.chromium.android (JOB_FAILED, https://build.chromium.org/p/tryserver.chromium.android/builders/cast_shell_android/builds/154879)

4 years, 1 month ago (2016-10-31 09:45:47 UTC) #5

yunchao

The CQ bit was checked by yunchao.he@intel.com to run a CQ dry run

4 years, 1 month ago (2016-10-31 10:13:16 UTC) #7

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2461973002/20001

4 years, 1 month ago (2016-10-31 10:13:33 UTC) #8

yunchao

The CQ bit was checked by yunchao.he@intel.com to run a CQ dry run

4 years, 1 month ago (2016-10-31 10:34:59 UTC) #11

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2461973002/40001

4 years, 1 month ago (2016-10-31 10:35:08 UTC) #12

yunchao

Description was changed from ========== Feedback loop detection for multiple drawbuffers BUG= CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel ========== to ...

4 years, 1 month ago (2016-10-31 11:04:09 UTC) #13

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 1 month ago (2016-10-31 11:49:30 UTC) #14

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 1 month ago (2016-10-31 11:49:31 UTC) #15

yunchao

Description was changed from ========== [Command buffer] Feedback loop detection between texture and framebuffer attachments. ...

4 years, 1 month ago (2016-10-31 13:44:54 UTC) #16

yunchao

yunchao.he@intel.com changed reviewers: + kbr@chromium.org, qiankun.miao@intel.com, zmo@chromium.org

4 years, 1 month ago (2016-10-31 13:44:54 UTC) #17

yunchao

The pr at https://github.com/KhronosGroup/WebGL/pull/2121 will fail in latest Chromium. However, with this patch applied, the ...

4 years, 1 month ago (2016-10-31 13:46:25 UTC) #18

yunchao

Description was changed from ========== [Command buffer] Feedback loop detection between texture and framebuffer attachments. ...

4 years, 1 month ago (2016-10-31 13:53:54 UTC) #19

yunchao

Description was changed from ========== [Command buffer] Feedback loop detection between texture and framebuffer attachments ...

4 years, 1 month ago (2016-10-31 13:54:58 UTC) #20

Zhenyao Mo

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc File gpu/command_buffer/service/gles2_cmd_decoder.cc (right): https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc#newcode8678 gpu/command_buffer/service/gles2_cmd_decoder.cc:8678: if (feature_info_->IsWebGL1OrES2Context()) { We are able to expose MRT ...

4 years, 1 month ago (2016-10-31 18:03:47 UTC) #21

yunchao

Thanks for your review, Zhenyao. Please see the comments inline. https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc File gpu/command_buffer/service/gles2_cmd_decoder.cc (right): https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc#newcode8700 ...

4 years, 1 month ago (2016-11-01 15:25:22 UTC) #22

Zhenyao Mo

zmo@chromium.org changed reviewers: + piman@chromium.org

4 years, 1 month ago (2016-11-01 22:55:45 UTC) #23

Zhenyao Mo

piman, kbr: do you have any suggestions in this? https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc File gpu/command_buffer/service/gles2_cmd_decoder.cc (right): https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc#newcode8700 gpu/command_buffer/service/gles2_cmd_decoder.cc:8700: ...

4 years, 1 month ago (2016-11-01 22:55:46 UTC) #24

piman, kbr: do you have any suggestions in this?

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
File gpu/command_buffer/service/gles2_cmd_decoder.cc (right):

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
gpu/command_buffer/service/gles2_cmd_decoder.cc:8700: if
(CheckDrawingFeedbackLoopsHelper(draw_buffer))
On 2016/11/01 15:25:22, yunchao wrote:
> On 2016/10/31 18:03:46, Zhenyao Mo wrote:
> > We will want to optimize on this.  This is not trivia and we should be able
to
> > avoid doing it per draw call. i.e., we should be able to only do it when
> related
> > states changed, including drawBuffers, framebufferTex calls, etc.
> 
> The cache info is not easy if we try to optimize on this, we need to cache
lots
> of infos. In addition, if we cache the info, we still need to either 1)
iterate
> and calculate the current drawbuffers mask (to compare), iterate the
framebuffer
> attachments masks, iterate the active texture/sampler binding points masks, or
> 2) we set dirty states for all these related call sites. The code will be much
> more complicated to detect feedback loop. If anyone is dirty, we still need to
> do the validation.
> 
> For the current code, I think it will early 'continue' for common cases. we
> still only have COLOR_ATTACHMENT0 as draw buffer for the majority of draw
calls.
> Even for MRT, most of the for loop will early 'continue' too in
> CheckDrawFeedbackLoopsHelper(). 
> 
> WDYT, Zhenyao?
> 
> If you still think the optimization can bring in a great improvement against
> current implementation, I will have a try. 

I think we just need one state, indicating if a feedback loop exists between
drawing dst / sampling src.

Then we could do some rough dirty-bit updating:

1) At program link succeed time, gather which texture binding points are used by
this program, i.e., a set of (active_texture, sampler_type)
2) set dirty flag to the state if the following happens
* drawBuffers()
* framebufferTexture{2|3}D (if the binding point is in set from step 1)
* changing of the current program (useProgram that changed the program,
linkProgram of the currently used program)
* bindTexture (which changed the binding) if the binding point is in set from
step 1)

piman

On Tue, Nov 1, 2016 at 3:55 PM, <zmo@chromium.org> wrote: > piman, kbr: do you ...

4 years, 1 month ago (2016-11-02 02:33:22 UTC) #25

On Tue, Nov 1, 2016 at 3:55 PM, <zmo@chromium.org> wrote:

> piman, kbr: do you have any suggestions in this?
>
>
> https://codereview.chromium.org/2461973002/diff/40001/gpu/
> command_buffer/service/gles2_cmd_decoder.cc
> File gpu/command_buffer/service/gles2_cmd_decoder.cc (right):
>
> https://codereview.chromium.org/2461973002/diff/40001/gpu/
> command_buffer/service/gles2_cmd_decoder.cc#newcode8700
> gpu/command_buffer/service/gles2_cmd_decoder.cc:8700: if
> (CheckDrawingFeedbackLoopsHelper(draw_buffer))
> On 2016/11/01 15:25:22, yunchao wrote:
> > On 2016/10/31 18:03:46, Zhenyao Mo wrote:
> > > We will want to optimize on this. This is not trivia and we should
> be able to
> > > avoid doing it per draw call. i.e., we should be able to only do it
> when
> > related
> > > states changed, including drawBuffers, framebufferTex calls, etc.
> >
> > The cache info is not easy if we try to optimize on this, we need to
> cache lots
> > of infos. In addition, if we cache the info, we still need to either
> 1) iterate
> > and calculate the current drawbuffers mask (to compare), iterate the
> framebuffer
> > attachments masks, iterate the active texture/sampler binding points
> masks, or
> > 2) we set dirty states for all these related call sites. The code will
> be much
> > more complicated to detect feedback loop. If anyone is dirty, we still
> need to
> > do the validation.
> >
> > For the current code, I think it will early 'continue' for common
> cases. we
> > still only have COLOR_ATTACHMENT0 as draw buffer for the majority of
> draw calls.
> > Even for MRT, most of the for loop will early 'continue' too in
> > CheckDrawFeedbackLoopsHelper().
> >
> > WDYT, Zhenyao?
> >
> > If you still think the optimization can bring in a great improvement
> against
> > current implementation, I will have a try.
>
> I think we just need one state, indicating if a feedback loop exists
> between drawing dst / sampling src.
>
> Then we could do some rough dirty-bit updating:
>
> 1) At program link succeed time, gather which texture binding points are
> used by this program, i.e., a set of (active_texture, sampler_type)
> 2) set dirty flag to the state if the following happens
> * drawBuffers()
> * framebufferTexture{2|3}D (if the binding point is in set from step 1)
> * changing of the current program (useProgram that changed the program,
> linkProgram of the currently used program)
> * bindTexture (which changed the binding) if the binding point is in set
> from step 1)
>

In GLES2DecoderImpl::PrepareTexturesForRender we're already looping through
textures bound to program uniforms. Maybe it would be very good to merge
this logic with that - turning it around (for each texture check if it's
also an attachment, instead of for each attachment check if it's a bound
texture). In particular, we have an upper bound on the number of
attachments, so we can look them up outside the loop into a fixed size
buffer on the stack - and the very common case will be to only have 1 of
them making the inner loop cheap.

We used to do dirty bit tracking for unrenderable textures, but we ended up
removing it - see https://codereview.chromium.org/1505343003 - because of
the added complexity, and it didn't seem to significantly affect
performance. Of course, death by a 1000 cuts can be a thing. I think if we
do the above transformation, we should be in a pretty good state (it may
even be cheaper than the current code because we don't need to loop over
textures multiple times).


> https://codereview.chromium.org/2461973002/
>

-- 
You received this message because you are subscribed to the Google Groups
"Chromium-reviews" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to chromium-reviews+unsubscribe@chromium.org.

piman

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc File gpu/command_buffer/service/gles2_cmd_decoder.cc (right): https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/service/gles2_cmd_decoder.cc#newcode8691 gpu/command_buffer/service/gles2_cmd_decoder.cc:8691: for (uint32_t ii = 0; ii < group_->max_draw_buffers(); ++ii) ...

4 years, 1 month ago (2016-11-02 02:33:57 UTC) #26

yunchao

Thanks for your review and your suggestions, zmo@ and piman@. Since the related crbug is ...

4 years, 1 month ago (2016-11-04 15:30:01 UTC) #27

Thanks for your review and your suggestions, zmo@ and piman@. Since the related
crbug is in WebGL 2.0.1, it is not an urgent task. Besides, in order to get
better performance, this revision may need to take some time. So I would like to
lower the priority.

I will revisit this as soon as I go through all undefined behaviors in ES3 spec.

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
File gpu/command_buffer/service/gles2_cmd_decoder.cc (right):

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
gpu/command_buffer/service/gles2_cmd_decoder.cc:8678: if
(feature_info_->IsWebGL1OrES2Context()) {
On 2016/10/31 18:03:46, Zhenyao Mo wrote:
> We are able to expose MRT in WebGL1 through extension, so you will need to
> consider that possibility.

Done.

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
gpu/command_buffer/service/gles2_cmd_decoder.cc:8691: for (uint32_t ii = 0; ii <
group_->max_draw_buffers(); ++ii) {
On 2016/11/02 02:33:57, piman wrote:
> Do we need to handle depth attachment vs depth texture too? (and ditto
stencil).

I think so.

https://codereview.chromium.org/2461973002/diff/40001/gpu/command_buffer/serv...
gpu/command_buffer/service/gles2_cmd_decoder.cc:8700: if
(CheckDrawingFeedbackLoopsHelper(draw_buffer))
On 2016/11/01 22:55:46, Zhenyao Mo wrote:
> On 2016/11/01 15:25:22, yunchao wrote:
> > On 2016/10/31 18:03:46, Zhenyao Mo wrote:
> > > We will want to optimize on this.  This is not trivia and we should be
able
> to
> > > avoid doing it per draw call. i.e., we should be able to only do it when
> > related
> > > states changed, including drawBuffers, framebufferTex calls, etc.
> > 
> > The cache info is not easy if we try to optimize on this, we need to cache
> lots
> > of infos. In addition, if we cache the info, we still need to either 1)
> iterate
> > and calculate the current drawbuffers mask (to compare), iterate the
> framebuffer
> > attachments masks, iterate the active texture/sampler binding points masks,
or
> > 2) we set dirty states for all these related call sites. The code will be
much
> > more complicated to detect feedback loop. If anyone is dirty, we still need
to
> > do the validation.
> > 
> > For the current code, I think it will early 'continue' for common cases. we
> > still only have COLOR_ATTACHMENT0 as draw buffer for the majority of draw
> calls.
> > Even for MRT, most of the for loop will early 'continue' too in
> > CheckDrawFeedbackLoopsHelper(). 
> > 
> > WDYT, Zhenyao?
> > 
> > If you still think the optimization can bring in a great improvement against
> > current implementation, I will have a try. 
> 
> I think we just need one state, indicating if a feedback loop exists between
> drawing dst / sampling src.
> 
> Then we could do some rough dirty-bit updating:
> 
> 1) At program link succeed time, gather which texture binding points are used
by
> this program, i.e., a set of (active_texture, sampler_type)
> 2) set dirty flag to the state if the following happens
> * drawBuffers()
> * framebufferTexture{2|3}D (if the binding point is in set from step 1)
> * changing of the current program (useProgram that changed the program,
> linkProgram of the currently used program)
> * bindTexture (which changed the binding) if the binding point is in set from
> step 1)

That's a pretty good suggestion. I will update the code here and find out a good
solution after I go through all undefined behavior in ES3 spec.

yunchao

The CQ bit was checked by yunchao.he@intel.com to run a CQ dry run

4 years, 1 month ago (2016-11-04 15:30:46 UTC) #28

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.org/2461973002/60001

4 years, 1 month ago (2016-11-04 15:31:05 UTC) #29

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 1 month ago (2016-11-04 17:11:37 UTC) #30

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 1 month ago (2016-11-04 17:11:38 UTC) #31

yunchao

I found that ANGLE folks have fixed this bug in ANGLE project. The failed test ...

3 years, 10 months ago (2017-02-15 07:16:19 UTC) #32

Zhenyao Mo

3 years, 10 months ago (2017-02-15 22:54:56 UTC) #33

On 2017/02/15 07:16:19, yunchao wrote:
> I found that ANGLE folks have fixed this bug in ANGLE project. 
> The failed test (rendering-sampling-feedback-loop.html) can pass in Chromium
> with command line option --use-gl=angle --use-passthrough-cmd-decoder.
> So I think it is not necessary to do the feedback loop detection in command
> buffer,
> considering that passthrough command buffer may take effect in near future.
>  
> What do you think, Zhenyao and Ken. If you think it is valuable to going
> forward,
> I can continue to work on this patch. It will not take too much effort.

Sure.  Let's halt this and wait for MANGLE.

Issue 2461973002: [Command buffer] Feedback loop detection between texture and framebuffer attachments

Description

Patch Set 1 : Feedback loop detection for multiple drawbuffers #

Patch Set 2 : Addressed zmo@'s feedback: MRT can be supported in WebGL1 or ES2 by extension #

Messages