|
|
Chromium Code Reviews
DescriptionV4L2VDA: do not allocate input buffers in GPU child thread.
Allocating input buffers can be very slow. Doing it in GPU
child thread can cause gpu process watchdog timeout. Move
it to decoder thread.
BUG=chrome-os-partner:56452
TEST=Run VDA unittest and play video.
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
Committed: https://crrev.com/f160854f267cb8bc627b2043182d11161eb478f2
Cr-Commit-Position: refs/heads/master@{#434910}
Patch Set 1 #
Total comments: 4
Patch Set 2 : address Pawel's comments #
Total comments: 5
Patch Set 3 : address Kuang-che's comments #
Total comments: 2
Patch Set 4 : address Kuang-che's comment #
Total comments: 2
Patch Set 5 : address Kuang-che's comment #Messages
Total messages: 27 (9 generated)
Description was changed from ========== V4L2VDA: do not allocate input buffers in GPU child thread. Allocating input buffers can be very slow. Doing it in GPU child thread can cause gpu process watchdog timeout. Move it to decoder thread. BUG=chrome-os-partner:56452 TEST=Run VDA unittest and play video. ========== to ========== V4L2VDA: do not allocate input buffers in GPU child thread. Allocating input buffers can be very slow. Doing it in GPU child thread can cause gpu process watchdog timeout. Move it to decoder thread. BUG=chrome-os-partner:56452 TEST=Run VDA unittest and play video. CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel ==========
wuchengli@chromium.org changed reviewers: + kcwu@chromium.org, posciak@chromium.org
PTAL.
wuchengli@chromium.org changed reviewers: + owenlin@chromium.org
V4L2SVDA already allocates in decoder thread. This changes V4L2VDA to be the same as V4L2SVDA.
lgtm % 1 nit. https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... media/gpu/v4l2_video_decode_accelerator.cc:313: if (!StartDevicePoll()) Add a comment that StartDevicePoll with NOTIFY_ERROR,
https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... media/gpu/v4l2_video_decode_accelerator.cc:313: if (!StartDevicePoll()) On 2016/11/24 07:44:23, Owen Lin wrote: > Add a comment that StartDevicePoll with NOTIFY_ERROR, Two other StartDevicePoll doesn't have this comment. I'll keep the code as is to be consistent.
The CQ bit was checked by wuchengli@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... media/gpu/v4l2_video_decode_accelerator.cc:287: decoder_cmd_supported_ = IsDecoderCmdSupported(); Could this and the subscribe event ioctl above be run on decoder thread as well?
The CQ bit was unchecked by wuchengli@chromium.org
PTAL. https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/1/media/gpu/v4l2_video_decode... media/gpu/v4l2_video_decode_accelerator.cc:287: decoder_cmd_supported_ = IsDecoderCmdSupported(); On 2016/11/24 08:28:35, Pawel Osciak wrote: > Could this and the subscribe event ioctl above be run on decoder thread as well? Done.
https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:206: DCHECK_EQ(decoder_state_, kUninitialized); Two questions, 1. should we add kInitializing state. Without this, we cannot detect Initialize() twice before InitializeTask done. 2. should we add lock for decoder_state_? https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:298: DCHECK(decoder_thread_.task_runner()->BelongsToCurrentThread()); DCHECK_EQ(decoder_state_, kUninitialized);
https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:206: DCHECK_EQ(decoder_state_, kUninitialized); On 2016/11/28 06:31:14, kcwu wrote: > Two questions, > 1. should we add kInitializing state. > Without this, we cannot detect Initialize() twice before InitializeTask done. I don't think it's very useful to add a state to just detect Initialize() is called twice before InitializeTask done. > > 2. should we add lock for decoder_state_? Decoder thread is started in Initialize(). That's why Initialize touches many variables that should be only used on decode thread (e.g. SetupFormats). If Initialize is called after decode thread starts, the result is really undefined.
Done. PTAL. https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:206: DCHECK_EQ(decoder_state_, kUninitialized); On 2016/11/28 07:04:49, wuchengli wrote: > On 2016/11/28 06:31:14, kcwu wrote: > > Two questions, > > 1. should we add kInitializing state. > > Without this, we cannot detect Initialize() twice before InitializeTask done. > I don't think it's very useful to add a state to just detect Initialize() is > called twice before InitializeTask done. > > > > 2. should we add lock for decoder_state_? > Decoder thread is started in Initialize(). That's why Initialize touches many > variables that should be only used on decode thread (e.g. SetupFormats). If > Initialize is called after decode thread starts, the result is really undefined. As discussed, moved decoder_state_ = kInitialized; back to Initialize(). https://codereview.chromium.org/2530493003/diff/20001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:298: DCHECK(decoder_thread_.task_runner()->BelongsToCurrentThread()); On 2016/11/28 06:31:14, kcwu wrote: > DCHECK_EQ(decoder_state_, kUninitialized); Done.
https://codereview.chromium.org/2530493003/diff/40001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/40001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:300: DCHECK_EQ(decoder_state_, kUninitialized); You added line 286, then this DCHECK will fail.
Done. PTAL. https://codereview.chromium.org/2530493003/diff/40001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.cc (right): https://codereview.chromium.org/2530493003/diff/40001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.cc:300: DCHECK_EQ(decoder_state_, kUninitialized); On 2016/11/28 10:53:51, kcwu wrote: > You added line 286, then this DCHECK will fail. Thanks for catching this.
lgtm; nits https://codereview.chromium.org/2530493003/diff/60001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/2530493003/diff/60001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.h:139: kInitialized, // InitializeTask() succeeds; ready to start decoding. this is still "Initialize() returned true"
https://codereview.chromium.org/2530493003/diff/60001/media/gpu/v4l2_video_de... File media/gpu/v4l2_video_decode_accelerator.h (right): https://codereview.chromium.org/2530493003/diff/60001/media/gpu/v4l2_video_de... media/gpu/v4l2_video_decode_accelerator.h:139: kInitialized, // InitializeTask() succeeds; ready to start decoding. On 2016/11/29 03:51:46, kcwu wrote: > this is still "Initialize() returned true" Done.
The CQ bit was checked by wuchengli@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from owenlin@chromium.org, kcwu@chromium.org Link to the patchset: https://codereview.chromium.org/2530493003/#ps80001 (title: "address Kuang-che's comment")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch.
Bot data: {"patchset_id": 80001, "attempt_start_ts": 1480396709061960,
"parent_rev": "67ccabbf49b1cd8789b418130c1ca5edc6b970f7", "commit_rev":
"14cd05cef8765cb2bb23accd7c97f5d431bad6e7"}
Message was sent while issue was closed.
Committed patchset #5 (id:80001)
Message was sent while issue was closed.
Description was changed from ========== V4L2VDA: do not allocate input buffers in GPU child thread. Allocating input buffers can be very slow. Doing it in GPU child thread can cause gpu process watchdog timeout. Move it to decoder thread. BUG=chrome-os-partner:56452 TEST=Run VDA unittest and play video. CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel ========== to ========== V4L2VDA: do not allocate input buffers in GPU child thread. Allocating input buffers can be very slow. Doing it in GPU child thread can cause gpu process watchdog timeout. Move it to decoder thread. BUG=chrome-os-partner:56452 TEST=Run VDA unittest and play video. CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Committed: https://crrev.com/f160854f267cb8bc627b2043182d11161eb478f2 Cr-Commit-Position: refs/heads/master@{#434910} ==========
Message was sent while issue was closed.
Patchset 5 (id:??) landed as https://crrev.com/f160854f267cb8bc627b2043182d11161eb478f2 Cr-Commit-Position: refs/heads/master@{#434910}
Message was sent while issue was closed.
On 2016/11/29 06:19:47, commit-bot: I haz the power wrote: > Patchset 5 (id:??) landed as > https://crrev.com/f160854f267cb8bc627b2043182d11161eb478f2 > Cr-Commit-Position: refs/heads/master@{#434910} GPU process WDOG timeout occurs after 10 seconds. It what scenario does allocating buffers take ~10 seconds? What kind of buffers are we allocating here, how big are they, and how long does it normally take? How often do we allocate these buffers? I guess what I am trying to ask is if this is exposing another issue where v4l2 buffer allocation is [always? sometimes?] super slow on elm. |
