|
|
Created:
4 years, 5 months ago by dshwang Modified:
3 years, 8 months ago Reviewers:
Ken Russell (switch to Gerrit), dnicoara, fbarchard, Daniele Castagna, aleksandar.stojiljkovic, ccameron, piman, danakj, inactive_dshwang_plz_cc_intel CC:
Aaron Boodman, abarth-chromium, ben+mojo_chromium.org, cc-bugs_chromium.org, chromium-reviews, darin (slow to review), darin-cc_chromium.org, feature-media-reviews_chromium.org, jam, kalyank, ozone-reviews_chromium.org, piman+watch_chromium.org, posciak+watch_chromium.org, qsr+mojo_chromium.org, viettrungluu+watch_chromium.org, yzshen+watch_chromium.org, enne. hubbe_chromium.org Base URL:
https://chromium.googlesource.com/chromium/src.git@master Target Ref:
refs/pending/heads/master Project:
chromium Visibility:
Public. |
Descriptionmedia: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos
LUMINANCE_F16 has following issues
1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated.
2. GpuMemoryBuffer cannot support LUMINANCE_F16.
3. LUMINANCE_F16 requires cpu int-to-float conversion.
This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues.
Rationale to choose RG_88, instead of R_16
1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf)
2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2)
This CL introduces kUseRGTexture feature to use RG_88,
instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms,
LUMINANCE_F16 will be removed.
BUG=445071, 624436, 622133
TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG
media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P
content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4*
chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4
chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management
chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture
chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture
CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel
Patch Set 1 : build fix #
Total comments: 6
Patch Set 2 : add custom bilinear filter and pass all unittests #Patch Set 3 : before flag-controlled for RG8 #Patch Set 4 : introduce --disable-half-float-conversion-texture flag #
Total comments: 26
Patch Set 5 : resolve hubbe's concerns except for test. separate the CLs. #
Total comments: 9
Patch Set 6 : add pixel tests for RG88 and resolve hubbe's comments #
Total comments: 6
Patch Set 7 : add more pixel test, and resolve concerns #
Total comments: 4
Patch Set 8 : rebase to ToT #Messages
Total messages: 75 (38 generated)
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_optional_gpu_tests_rel;tryserver.chromium.mac:mac_optional_gpu_tests_rel;tryserver.chromium.win:win_optional_gpu_tests_rel ==========
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 CQ_INCLUDE_TRYBOTS=tryserver.chromium.linux:linux_optional_gpu_tests_rel;tryserver.chromium.mac:mac_optional_gpu_tests_rel;tryserver.chromium.win:win_optional_gpu_tests_rel ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ==========
dnicoara, dcastagna, hubbe, could you review? hubbe suggested Y16 (or Z16) use LUMINANCE_F16 in https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c4 However, RG88 is better than LUMINANCE_F16 as the description explains. After this CL, gpu_memory_buffer_video_frame_pool.cc can handle 9/10-bit h264 videos as mentioned in https://bugs.chromium.org/p/chromium/issues/detail?id=445071#c29 Y16 will use RG88 also, which is introdueced in https://codereview.chromium.org/2113243003/
Patchset #2 (id:20001) has been deleted
Hi DS! Could you please split this CL? You could start just adding and testing BufferFormat::RG_88.
On 2016/07/06 17:38:21, Daniele Castagna wrote: > Hi DS! > Could you please split this CL? > > You could start just adding and testing BufferFormat::RG_88. Hi Daniele, Ok, let me split GMB code.
https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); This doesn't work! The hardware will have done bilinear interpolation on the r and g values before we get here, which will completely mess up the lower bits.
Thank you for feedback. https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); On 2016/07/06 17:54:17, hubbe wrote: > This doesn't work! > The hardware will have done bilinear interpolation on the r and g values before > we get here, which will completely mess up the lower bits. Ah, that's right. I have some idea. 1. Use R16 and doesn't support GpuMemoryBuffer. It's better than LUMINANCE_F16. 2. bilinear filter by custom shader. It can support GpuMemoryBuffer. e.g. https://www.opengl.org/discussion_boards/showthread.php/162345-Bilinear-filte... 3. Use only nearest filter. might not be acceptable. 4. Use linear filter and just ignore "mess up the lower bits", because higher bit is still correct and image is almost correct. might not be acceptable. hubbe, kbr, what do you think? I prefer #2 because it can support GpuMemoryBuffer. YUV shader will be a bit complicated though.
Both Arm and Intel provide half float SIMD conversions, if performance is a concern.
https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); On 2016/07/06 18:12:19, dshwang wrote: > On 2016/07/06 17:54:17, hubbe wrote: > > This doesn't work! > > The hardware will have done bilinear interpolation on the r and g values > before > > we get here, which will completely mess up the lower bits. > > Ah, that's right. I have some idea. > > 1. Use R16 and doesn't support GpuMemoryBuffer. It's better than LUMINANCE_F16. > > 2. bilinear filter by custom shader. It can support GpuMemoryBuffer. > e.g. > https://www.opengl.org/discussion_boards/showthread.php/162345-Bilinear-filte... > > 3. Use only nearest filter. might not be acceptable. > > 4. Use linear filter and just ignore "mess up the lower bits", because higher > bit is still correct and image is almost correct. might not be acceptable. > > hubbe, kbr, what do you think? I prefer #2 because it can support > GpuMemoryBuffer. YUV shader will be a bit complicated though. I don't know enough about how these shaders work, and in particular when scaling is expected to be done. The highest quality needs to be retained; regressing video quality is not acceptable. hubbe@ is the expert in this area and I defer to him. I'm not sure whether manual bilinear interpolation should be done at all at this point, and if it should be, whether it should be done in YUV space (caution needs to be taken when blending the V component: http://softpixel.com/~cwright/programming/colorspace/yuv/ ) or RGB space.
On 2016/07/06 18:12:20, dshwang wrote: > Thank you for feedback. > > https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc > File cc/output/shader.cc (right): > > https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... > cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / > float(max_channel_value); > On 2016/07/06 17:54:17, hubbe wrote: > > This doesn't work! > > The hardware will have done bilinear interpolation on the r and g values > before > > we get here, which will completely mess up the lower bits. > > Ah, that's right. I have some idea. > > 1. Use R16 and doesn't support GpuMemoryBuffer. It's better than LUMINANCE_F16. > Why is it better? Is it more widely supported? Does it use less power? Last I checked, R16 requires GLES3, which chrome is not using yet, and getting it to work in GLES2 was tricky. > 2. bilinear filter by custom shader. It can support GpuMemoryBuffer. > e.g. > https://www.opengl.org/discussion_boards/showthread.php/162345-Bilinear-filte... > This converts static texture fetches to dynamic texture fetches, which are much slower on some platforms. Also, 12 dynamic texture fetches is too much for some low-end hardware, so you'd need a fallback path. It may be feasible, but we'd need to do a fair amount of testing to make sure it's efficient enough. > 3. Use only nearest filter. might not be acceptable. > Not acceptable. > 4. Use linear filter and just ignore "mess up the lower bits", because higher > bit is still correct and image is almost correct. might not be acceptable. > Then why use 16 bits at all? Just convert it to 8 bits and don't worry about it. (Which is of course what we do today if half-floats are not supported.) > hubbe, kbr, what do you think? I prefer #2 because it can support > GpuMemoryBuffer. YUV shader will be a bit complicated though.
Hi, Thank you for your feedback. I took time to investigate and write my result in https://bugs.chromium.org/p/chromium/issues/detail?id=624436 Could you give me feedback again? https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); > On 2016/07/06 18:12:19, dshwang wrote: >> 1. RB8 and R16UI are better than LUMINANCE_F16. On 2016/07/06 18:44:59, hubbe wrote: > Why is it better? > Is it more widely supported? > Does it use less power? > > Last I checked, R16 requires GLES3, which chrome is not using yet, and getting > it to work in GLES2 was tricky. Because LUMINANCE_F16 requires CPU int-to-float conversion for all pixels. https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... So less power. In additioin, RG8 is more widely supported than GL_OES_texture_half_float. e.g. mesa However R16UI requires GLES3.
https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); On 2016/07/07 15:49:48, dshwang wrote: > > On 2016/07/06 18:12:19, dshwang wrote: > >> 1. RB8 and R16UI are better than LUMINANCE_F16. > > On 2016/07/06 18:44:59, hubbe wrote: > > Why is it better? > > Is it more widely supported? > > Does it use less power? > > > > Last I checked, R16 requires GLES3, which chrome is not using yet, and getting > > it to work in GLES2 was tricky. > > Because LUMINANCE_F16 requires CPU int-to-float conversion for all pixels. > https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... > > So less power. CPU conversion is very fast and does not require a lot of power. Obviously it requires *some* power though. Both intel and arm has instructions that converts integers to half-floats, so it should be pretty fast if written right. > In additioin, RG8 is more widely supported than GL_OES_texture_half_float. e.g. > mesa > However R16UI requires GLES3. RG8 is well supported, but if you have to do the filtering yourself, it makes things complicated, slow and expensive. With R16UI you also have to do the filtering yourself. It's entirely possible that doing the filtering yourself is more efficient than converting to half-float, but it is far from obvious, also, cpu-time is threaded while GPU-time is not, so trading cpu time for gpu time is usually a bad trade unless you make some fairly substantial gains in efficiency. I still think half-floats is the way to go, but it is possible that RG88 + shader-filtering would be better, however, you really need to give me numbers that proves that, simply stating it is not enough.
https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); On 2016/07/07 17:28:09, hubbe wrote: > On 2016/07/07 15:49:48, dshwang wrote: > > > On 2016/07/06 18:12:19, dshwang wrote: > > >> 1. RB8 and R16UI are better than LUMINANCE_F16. > > > > On 2016/07/06 18:44:59, hubbe wrote: > > > Why is it better? > > > Is it more widely supported? > > > Does it use less power? > > > > > > Last I checked, R16 requires GLES3, which chrome is not using yet, and > getting > > > it to work in GLES2 was tricky. > > > > Because LUMINANCE_F16 requires CPU int-to-float conversion for all pixels. > > > https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... > > > > So less power. > > CPU conversion is very fast and does not require a lot of power. > Obviously it requires *some* power though. Both intel and arm has > instructions that converts integers to half-floats, so it should be > pretty fast if written right. Note: after this CL, 9/10-bit h264 videos goes to non-copy path. https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... LUMINANCE_F16 requires temp memory allocation, int-to-half_float conversion and copy. > > In additioin, RG8 is more widely supported than GL_OES_texture_half_float. > e.g. > > mesa > > However R16UI requires GLES3. > > RG8 is well supported, but if you have to do the filtering yourself, it makes > things complicated, slow and expensive. With R16UI you also have to do the > filtering yourself. It's entirely possible that doing the filtering yourself is > more efficient than converting to half-float, but it is far from obvious, also, > cpu-time is threaded while GPU-time is not, so trading cpu time for gpu time is > usually a bad trade unless you make some fairly substantial gains in > efficiency. DIY filtering with nearest filter is not expensive than built-in bilinear filter in theory, because GPU driver has to do same thing. For example, Mesa uses exactly same shader in user space. https://github.com/freedreno/mesa/blob/master/src/mesa/swrast/s_texfilter.c#L110 It doesn't move workload from CPU to GPU. GPU workload is same but redundant CPU workload is removed.
On 2016/07/07 17:48:59, dshwang wrote: > https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc > File cc/output/shader.cc (right): > > https://codereview.chromium.org/2122573003/diff/60001/cc/output/shader.cc#new... > cc/output/shader.cc:2110: return ((rg.g * 65536.0) + (rg.r * 256.0)) / > float(max_channel_value); > On 2016/07/07 17:28:09, hubbe wrote: > > On 2016/07/07 15:49:48, dshwang wrote: > > > > On 2016/07/06 18:12:19, dshwang wrote: > > > >> 1. RB8 and R16UI are better than LUMINANCE_F16. > > > > > > On 2016/07/06 18:44:59, hubbe wrote: > > > > Why is it better? > > > > Is it more widely supported? > > > > Does it use less power? > > > > > > > > Last I checked, R16 requires GLES3, which chrome is not using yet, and > > getting > > > > it to work in GLES2 was tricky. > > > > > > Because LUMINANCE_F16 requires CPU int-to-float conversion for all pixels. > > > > > > https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... > > > > > > So less power. > > > > CPU conversion is very fast and does not require a lot of power. > > Obviously it requires *some* power though. Both intel and arm has > > instructions that converts integers to half-floats, so it should be > > pretty fast if written right. > > Note: after this CL, 9/10-bit h264 videos goes to non-copy path. > https://cs.chromium.org/chromium/src/cc/resources/video_resource_updater.cc?c... > > LUMINANCE_F16 requires temp memory allocation, int-to-half_float conversion and > copy. > > > > In additioin, RG8 is more widely supported than GL_OES_texture_half_float. > > e.g. > > > mesa > > > However R16UI requires GLES3. > > > > RG8 is well supported, but if you have to do the filtering yourself, it makes > > things complicated, slow and expensive. With R16UI you also have to do the > > filtering yourself. It's entirely possible that doing the filtering yourself > is > > more efficient than converting to half-float, but it is far from obvious, > also, > > cpu-time is threaded while GPU-time is not, so trading cpu time for gpu time > is > > usually a bad trade unless you make some fairly substantial gains in > > efficiency. > > DIY filtering with nearest filter is not expensive than built-in bilinear filter > in theory, because GPU driver has to do same thing. For example, Mesa uses > exactly same shader in user space. > https://github.com/freedreno/mesa/blob/master/src/mesa/swrast/s_texfilter.c#L110 > > It doesn't move workload from CPU to GPU. GPU workload is same but redundant CPU > workload is removed. You can't extrapolate how all GPUs work from a non-production driver for a single GPU. In my experience, the hardware has optimized instructions for doing texture lookups and filtering, and when you do the same thing by hand it is much slower. I did a bunch of test in this area when I implemented gl_helper_scaling.cc 3 years ago. However, those tests did not specifically test how half-floats perform, so it's possible that things are different for half-floats than it is for 8-bit data, but I don't actually think so. The only way we will know is to benchmark it on a variety of systems and see how they perform.
> > CPU conversion is very fast and does not require a lot of power. > Obviously it requires *some* power though. Both intel and arm has > instructions that converts integers to half-floats, so it should be > pretty fast if written right. > AVX2 and Neon both support float to half float, so its a 2 step conversion for general case. Current media code uses a trick and stomps an exponent into the upper bits, then adjusts for it with with a shader. Thats reasonably fast for both C and SIMD. I'd rather we dont use tricks and write a reasonably fast int16 to float16 converter. Note that Microsoft have defined some 10 bit Direct X formats, but they're more like NV12 and YUY2 in layout, and the NV12 style one uses the upper 10 bits, not the lower 10 bits. https://msdn.microsoft.com/en-us/library/windows/desktop/bb970578(v=vs.85).aspx -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ==========
hi, long time no see I push forward this CL, because 1. alecksandar shows rg8 is better perf and power than half-float. https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c29 2. hubbe@ and kbr@ agree on rg8. https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c30 3. dcastagna@ agree on gbm support is important. https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c23 I add custom bilinear filter and following tests guarantee the custom bilinear filter result is same to OpenGL filter. IntersectingQuadGLPixelTest/0.YUVVideoQuads (../../cc/output/renderer_pixeltest.cc:961) IntersectingQuadGLPixelTest/1.YUVVideoQuads (../../cc/output/renderer_pixeltest.cc:961) VideoGLRendererPixelHiLoTest.ClippedYUVRect/0 (../../cc/output/renderer_pixeltest.cc:1105) VideoGLRendererPixelHiLoTest.OffsetYUVRect (../../cc/output/renderer_pixeltest.cc:1154) VideoGLRendererPixelHiLoTest.SimpleYUVARect (../../cc/output/renderer_pixeltest.cc:1295) VideoGLRendererPixelHiLoTest.SimpleYUVRect/0 (../../cc/output/renderer_pixeltest.cc:1105) In addition, Blink pixel tests are passed also, which means pixel results are not affected by this CL. https://codereview.chromium.org/2173563002/diff/200001/third_party/WebKit/Lay... hubbe@, could you review again? In addition, https://codereview.chromium.org/2113243003/ blocks this CL.
Just make sure that the RG8 change is flag-controlled so we can test it out properly. /Hubbe On Tue, Sep 20, 2016 at 8:04 AM, <dongseong.hwang@intel.com> wrote: > hi, long time no see > > I push forward this CL, because > 1. alecksandar shows rg8 is better perf and power than half-float. > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c29 > 2. hubbe@ and kbr@ agree on rg8. > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c30 > 3. dcastagna@ agree on gbm support is important. > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c23 > > I add custom bilinear filter and following tests guarantee the custom > bilinear > filter result is same to OpenGL filter. > IntersectingQuadGLPixelTest/0.YUVVideoQuads > (../../cc/output/renderer_pixeltest.cc:961) > IntersectingQuadGLPixelTest/1.YUVVideoQuads > (../../cc/output/renderer_pixeltest.cc:961) > VideoGLRendererPixelHiLoTest.ClippedYUVRect/0 > (../../cc/output/renderer_pixeltest.cc:1105) > VideoGLRendererPixelHiLoTest.OffsetYUVRect > (../../cc/output/renderer_pixeltest.cc:1154) > VideoGLRendererPixelHiLoTest.SimpleYUVARect > (../../cc/output/renderer_pixeltest.cc:1295) > VideoGLRendererPixelHiLoTest.SimpleYUVRect/0 > (../../cc/output/renderer_pixeltest.cc:1105) > > In addition, Blink pixel tests are passed also, which means pixel results > are > not affected by this CL. > https://codereview.chromium.org/2173563002/diff/200001/third_party/WebKit/ > LayoutTests/TestExpectations > > hubbe@, could you review again? > In addition, https://codereview.chromium.org/2113243003/ blocks this CL. > > > > > https://codereview.chromium.org/2122573003/ > -- You received this message because you are subscribed to the Google Groups "Chromium-reviews" group. To unsubscribe from this group and stop receiving emails from it, send an email to chromium-reviews+unsubscribe@chromium.org.
Patchset #2 (id:40001) has been deleted
Patchset #1 (id:1) has been deleted
The CQ bit was checked by dongseong.hwang@chromium.org to run a CQ dry run
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management CQ_INCLUDE_TRYBOTS=tryserver.blink:linux_blink_rel ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management ==========
dongseong.hwang@intel.com changed reviewers: - dongseong.hwang@chromium.org
The CQ bit was unchecked by dongseong.hwang@intel.com
The CQ bit was checked by dongseong.hwang@chromium.org to run a CQ dry run
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --disable-half-float-conversion-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management --disable-half-float-conversion-texture ==========
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --disable-half-float-conversion-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management --disable-half-float-conversion-texture ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces --disable-half-float-conversion-texture to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --disable-half-float-conversion-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management --disable-half-float-conversion-texture ==========
hubbe@, could you review again? I introduce --disable-half-float-conversion-texture to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. VideoResourceUpdaterTestWithRG unittest covers RG_88 while VideoResourceUpdaterTestWithF16 unittest covers LUMINANCE_F16. I chehck all combinations work well chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --disable-half-float-conversion-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management --disable-half-float-conversion-texture
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_chromeos_ozone_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
THis CL is rather large, is it possible to split it into smaller CLs? Maybe one for the image format, one for the gpu memory buffer and one for to bring everything together? https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc File cc/base/switches.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc#ne... cc/base/switches.cc:48: const char kDisableHalfFloatConversionTexture[] = Please use the features API instead. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2116: if (highbit_texture_ == HIGHBIT_RG88) { Can we make it so that the if and else branches produce functions with the same prorotypes? The idea is that the code *after* the if statement can just call the function without knowing which one it is. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2175: vec2 GetUV(vec2 uv_tex_coord) { I think it would be more efficient if this was a an 8888 texture so that we could look up U and V at the same time. However, we could do that in a separate patch. (We'd need to swizzle the values a bit on upload, but that's nearly free...) https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h File cc/output/shader.h (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h#new... cc/output/shader.h:766: enum HighbitTexture { There is no need for three values here, because the shader doesn't need to know if we're using F16 or Y8, the shader does exactly the same thing for those two cases.
> This CL is rather large, is it possible to split it into smaller CLs? Maybe one > for the image format, one for the gpu memory buffer and one for to bring > everything together? This CL can be separated by 3 CLs as you mention. 1. Y16 introduce https://codereview.chromium.org/2113243003/ 2. GB_88 for GpuMemoryBuffer 3. anything else As you agree on direction, let me separate this CL. https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc File cc/base/switches.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc#ne... cc/base/switches.cc:48: const char kDisableHalfFloatConversionTexture[] = On 2016/09/27 20:49:51, hubbe wrote: > Please use the features API instead. Got it. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2116: if (highbit_texture_ == HIGHBIT_RG88) { On 2016/09/27 20:49:51, hubbe wrote: > Can we make it so that the if and else branches produce functions with the same > prorotypes? The idea is that the code *after* the if statement can just call the > function without knowing which one it is. GetY() and GetUV() is kind of public function. Following code just use them without knowing which one it is. You might want to go lower layer such as RGTextureLookupBilinear(). However, it requires texture size which only HIGHBIT_RG88 branch know. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2175: vec2 GetUV(vec2 uv_tex_coord) { On 2016/09/27 20:49:51, hubbe wrote: > I think it would be more efficient if this was a an 8888 texture so that we > could look up U and V at the same time. However, we could do that in a separate > patch. (We'd need to swizzle the values a bit on upload, but that's nearly > free...) That's good idea. 8888 texture could be fallback option of RG88. What I mean is 8888 replacing both F16 and scaling down when RG88 is not supported. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h File cc/output/shader.h (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h#new... cc/output/shader.h:766: enum HighbitTexture { On 2016/09/27 20:49:51, hubbe wrote: > There is no need for three values here, because the shader doesn't need to know > if we're using F16 or Y8, the shader does exactly the same thing for those two > cases. |resource_multiplier_location| and |resource_offset_location| are used by only HIGHBIT_LUMINANCE_F16. To make it easy to remove HIGHBIT_LUMINANCE_F16 path later, I highlight HIGHBIT_LUMINANCE_F16 specific code.
https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc File cc/base/switches.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc#ne... cc/base/switches.cc:48: const char kDisableHalfFloatConversionTexture[] = On 2016/09/28 10:33:27, dshwang wrote: > On 2016/09/27 20:49:51, hubbe wrote: > > Please use the features API instead. > > Got it. Usually people don't reply until they have actually implemented the change. (Did you forget to upload?) https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2116: if (highbit_texture_ == HIGHBIT_RG88) { On 2016/09/28 10:33:27, dshwang wrote: > On 2016/09/27 20:49:51, hubbe wrote: > > Can we make it so that the if and else branches produce functions with the > same > > prorotypes? The idea is that the code *after* the if statement can just call > the > > function without knowing which one it is. > > GetY() and GetUV() is kind of public function. Following code just use them > without knowing which one it is. > > You might want to go lower layer such as RGTextureLookupBilinear(). However, it > requires texture size which only HIGHBIT_RG88 branch know. Hmm, you're right. I was hoping to make this a bit cleaner, but it's not easy to do so unfortunately. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2117: head += " uniform int max_channel_value;\n"; How about reusing resource_multiplier? The calling code should probably set it to 1/max_channel_value to avoid a division in this code. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2122: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); should be (rg.g * 65280.0) + (rg.r * 255.0) This highlights the need for tests for this code. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2133: min(clamp_rect.zw, tex_coord + vec2(unit_texel.x, 0.))); This doesn't actually work, because there is no guarantee that the calculation is accurate enough that adding 1/resolution will give us the next pixel. We need to snap tex_coord to the middle of a pixel to be sure that it works. Also, we're clamping text_coord.y the same way as we did before. The compiler might be able to figure that out, but it's probably better to not rely on it. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2136: max(clamp_rect.xy, Do we really need to do the clamping for every sub-sample? That's not actually what the "normal" GetY() code does... https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2251: float y_raw = GetY(); GetUV and GetAlpha takes a parameter, probably GetY() should too? https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h File cc/output/shader.h (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h#new... cc/output/shader.h:766: enum HighbitTexture { On 2016/09/28 10:33:27, dshwang wrote: > On 2016/09/27 20:49:51, hubbe wrote: > > There is no need for three values here, because the shader doesn't need to > know > > if we're using F16 or Y8, the shader does exactly the same thing for those two > > cases. > > |resource_multiplier_location| and |resource_offset_location| are used by only > HIGHBIT_LUMINANCE_F16. To make it easy to remove HIGHBIT_LUMINANCE_F16 path > later, I highlight HIGHBIT_LUMINANCE_F16 specific code. True I suppose. But you'll need the resource_multiplier for RG88 too, right?
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces --disable-half-float-conversion-texture to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --disable-half-float-conversion-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management --disable-half-float-conversion-texture ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture ==========
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ==========
Thank you for reviewing. I resolved all your concern except for additional test. I'm working to create new test. hubbe@, In parallel, could you review again? https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc File cc/base/switches.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/base/switches.cc#ne... cc/base/switches.cc:48: const char kDisableHalfFloatConversionTexture[] = On 2016/09/28 17:57:44, hubbe wrote: > On 2016/09/28 10:33:27, dshwang wrote: > > On 2016/09/27 20:49:51, hubbe wrote: > > > Please use the features API instead. > > > > Got it. > > Usually people don't reply until they have actually implemented the change. (Did > you forget to upload?) Usually I did :) I want to listen to your opinion about shader, so I asked without new patch. your answer was very helpful! I added kUseRGTexture in new patch set. const base::Feature kUseRGTexture{"use-rg-texture", base::FEATURE_DISABLED_BY_DEFAULT}; https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2117: head += " uniform int max_channel_value;\n"; On 2016/09/28 17:57:44, hubbe wrote: > How about reusing resource_multiplier? > The calling code should probably set it to 1/max_channel_value to avoid a > division in this code. It needs 2048 hardcode or interact code between video_resource_updater to gl_renderer, while gl_renderer can handle it by itself. resource_multiplier is defined in video_resource_updater by external_resources.multiplier = 2048.0 / max_input_value; Let me change |max_channel_value| to |inverse_max_input_value| https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2122: return ((rg.g * 65536.0) + (rg.r * 256.0)) / float(max_channel_value); On 2016/09/28 17:57:44, hubbe wrote: > should be (rg.g * 65280.0) + (rg.r * 255.0) > > This highlights the need for tests for this code. oh, you are right. I'm extending renderer_pixeltest test to cover both LUMINANCE_F16 and RG_88, but it will take time because I need to figure out how PixelTest create context_provider which support GL_HALF_FLOAT_OES and GL_RG_EXT. Now I suffer folloing error. [11470:11471:0929/214423:284857029184:ERROR:texture_manager.cc(2246)] [GroupMarkerNotSet(crbug.com/242999)!:D0A0867E7B210000]GL ERROR :GL_INVALID_ENUM : glTexImage2D: type was GL_HALF_FLOAT_OES https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2133: min(clamp_rect.zw, tex_coord + vec2(unit_texel.x, 0.))); On 2016/09/28 17:57:44, hubbe wrote: > This doesn't actually work, because there is no guarantee that the calculation > is accurate enough that adding 1/resolution will give us the next pixel. > > We need to snap tex_coord to the middle of a pixel to be sure that it works. > > Also, we're clamping text_coord.y the same way as we did before. The compiler > might be able to figure that out, but it's probably better to not rely on it. Ok, I added snapping code inspired by http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter27.html However, the link say """ One additional tip is that it is not necessary to snap texture coordinates to the center of the nearest texel as long as texture lookups do not perform any type of filtering. This means setting filtering to GL_NEAREST in OpenGL or D3DTEXF_POINT in Direct3D. This way, coordinate snapping is performed automatically during texture lookups. """ The author thinks if border of tex coord looks up left texel, (border of tex coord + n/width) always looks up left texel. WDYT? https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2136: max(clamp_rect.xy, On 2016/09/28 17:57:44, hubbe wrote: > Do we really need to do the clamping for every sub-sample? > That's not actually what the "normal" GetY() code does... make sense. it's not really needed. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2251: float y_raw = GetY(); On 2016/09/28 17:57:44, hubbe wrote: > GetUV and GetAlpha takes a parameter, probably GetY() should too? to make consistent, let me change it to get |v_yaTexCoord| https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h File cc/output/shader.h (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.h#new... cc/output/shader.h:766: enum HighbitTexture { On 2016/09/28 17:57:44, hubbe wrote: > On 2016/09/28 10:33:27, dshwang wrote: > > On 2016/09/27 20:49:51, hubbe wrote: > > > There is no need for three values here, because the shader doesn't need to > > know > > > if we're using F16 or Y8, the shader does exactly the same thing for those > two > > > cases. > > > > |resource_multiplier_location| and |resource_offset_location| are used by only > > HIGHBIT_LUMINANCE_F16. To make it easy to remove HIGHBIT_LUMINANCE_F16 path > > later, I highlight HIGHBIT_LUMINANCE_F16 specific code. > > True I suppose. But you'll need the resource_multiplier for RG88 too, right? no, RG88 don't need it.
https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2133: min(clamp_rect.zw, tex_coord + vec2(unit_texel.x, 0.))); On 2016/09/29 18:52:28, dshwang wrote: > On 2016/09/28 17:57:44, hubbe wrote: > > This doesn't actually work, because there is no guarantee that the calculation > > is accurate enough that adding 1/resolution will give us the next pixel. > > > > We need to snap tex_coord to the middle of a pixel to be sure that it works. > > > > Also, we're clamping text_coord.y the same way as we did before. The compiler > > might be able to figure that out, but it's probably better to not rely on it. > > Ok, I added snapping code inspired by > http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter27.html > > However, the link say > """ > One additional tip is that it is not necessary to snap texture coordinates to > the center of the nearest texel as long as texture lookups do not perform any > type of filtering. This means setting filtering to GL_NEAREST in OpenGL or > D3DTEXF_POINT in Direct3D. This way, coordinate snapping is performed > automatically during texture lookups. > """ > > The author thinks if border of tex coord looks up left texel, (border of tex > coord + n/width) always looks up left texel. > WDYT? If snapping is not needed, I'd like to understand why. https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2175: vec2 GetUV(vec2 uv_tex_coord) { On 2016/09/28 10:33:27, dshwang wrote: > On 2016/09/27 20:49:51, hubbe wrote: > > I think it would be more efficient if this was a an 8888 texture so that we > > could look up U and V at the same time. However, we could do that in a > separate > > patch. (We'd need to swizzle the values a bit on upload, but that's nearly > > free...) > > That's good idea. 8888 texture could be fallback option of RG88. What I mean is > 8888 replacing both F16 and scaling down when RG88 is not supported. Almost everything supports RG88 I think. https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.cc File cc/output/gl_renderer.cc (right): https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.... cc/output/gl_renderer.cc:2370: if (use_color_lut) { I don't really like this change, as it requires this code to know more about the shader. I prefer that it just checks which variables are available and sets them whenever possible. Not sure if the cc/ owners agree with me or not though. https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2122: return ((rg.g * 65280.0) + (rg.r * 255.0)) * inverse_max_input_value; We can save an instruction here by dividing inverse_max_input_value by 255 in gl_renderer.cc. https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2130: float s1 = GetFloat(TextureLookup(sampler, snap_tex_coord).xy); I would probably just include the TextureLookup() call in GetFloat() https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2157: vec2 uv_clamped = I think you forgot to undo this, GetUV() gets uv_clamped now, right?
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ==========
dongseong.hwang@intel.com changed reviewers: + danakj@chromium.org
The CQ bit was checked by dongseong.hwang@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
hubbe@, I resolved all your comments and add new pixel test for RG88 code. could you review again? https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/120001/cc/output/shader.cc#ne... cc/output/shader.cc:2133: min(clamp_rect.zw, tex_coord + vec2(unit_texel.x, 0.))); On 2016/09/29 23:36:50, hubbe wrote: > On 2016/09/29 18:52:28, dshwang wrote: > > On 2016/09/28 17:57:44, hubbe wrote: > > > This doesn't actually work, because there is no guarantee that the > calculation > > > is accurate enough that adding 1/resolution will give us the next pixel. > > > > > > We need to snap tex_coord to the middle of a pixel to be sure that it works. > > > > > > Also, we're clamping text_coord.y the same way as we did before. The > compiler > > > might be able to figure that out, but it's probably better to not rely on > it. > > > > Ok, I added snapping code inspired by > > http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter27.html > > > > However, the link say > > """ > > One additional tip is that it is not necessary to snap texture coordinates to > > the center of the nearest texel as long as texture lookups do not perform any > > type of filtering. This means setting filtering to GL_NEAREST in OpenGL or > > D3DTEXF_POINT in Direct3D. This way, coordinate snapping is performed > > automatically during texture lookups. > > """ > > > > The author thinks if border of tex coord looks up left texel, (border of tex > > coord + n/width) always looks up left texel. > > WDYT? > > If snapping is not needed, I'd like to understand why. I found snapping is needed when I made pixel test. check new patch https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.cc File cc/output/gl_renderer.cc (right): https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.... cc/output/gl_renderer.cc:2370: if (use_color_lut) { On 2016/09/29 23:36:51, hubbe wrote: > I don't really like this change, as it requires this code to know more about the > shader. I prefer that it just checks which variables are available and sets them > whenever possible. Not sure if the cc/ owners agree with me or not I think gl_renderer is controller of shader.cc. gl_renderer should know every detail of shader.cc. Moreover, it makes it easier to read code also as we can explicitly know which uniforms is used in specific case. This explicitness will help to prevent unnecessary bug. hubbe, danakj, what do you think? https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2122: return ((rg.g * 65280.0) + (rg.r * 255.0)) * inverse_max_input_value; On 2016/09/29 23:36:51, hubbe wrote: > We can save an instruction here by dividing inverse_max_input_value by 255 in > gl_renderer.cc. Done. now uniform name is resource_multiplier as you explained :) https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2130: float s1 = GetFloat(TextureLookup(sampler, snap_tex_coord).xy); On 2016/09/29 23:36:51, hubbe wrote: > I would probably just include the TextureLookup() call in GetFloat() Done. https://codereview.chromium.org/2122573003/diff/140001/cc/output/shader.cc#ne... cc/output/shader.cc:2157: vec2 uv_clamped = On 2016/09/29 23:36:51, hubbe wrote: > I think you forgot to undo this, GetUV() gets uv_clamped now, right? oops, thx! https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1152: file_path = base::FilePath(FILE_PATH_LITERAL("yuv_stripes_rg88.png")); RG88 path in shader.cc cannot produce exactly same pixels, but looks very similar. https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1184: base::FilePath(FILE_PATH_LITERAL("yuv_stripes_clipped_rg88.png")); ditto
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: linux_chromium_asan_rel_ng on master.tryserver.chromium.linux (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...)
I think we're getting pretty close, but I'm an owner of this code... https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1152: file_path = base::FilePath(FILE_PATH_LITERAL("yuv_stripes_rg88.png")); On 2016/10/05 15:13:42, dshwang wrote: > RG88 path in shader.cc cannot produce exactly same pixels, but looks very > similar. The values in these files differ quite a lot actually, and they differ by more than one in lots of places. This is more than I would expect and would seem to indicate that there is a bug somewhere. Also, instead of explaining what you're doing with comments in the code review, use comments in the code instead. https://codereview.chromium.org/2122573003/diff/160001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/160001/cc/output/shader.cc#ne... cc/output/shader.cc:2111: return ((rg.g * 256.0) + rg.r) * resource_multiplier; Instead of multiplying by resource_multiplier here, we can do it after mixing, which saves a few multiplications. In fact, we can do it in yuv2rgb, same as we do for half-float textures, and when we're not using a LUT, we can simply multiply it into the YUV->RGB matrix.
enne@chromium.org changed reviewers: + enne@chromium.org
https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.cc File cc/output/gl_renderer.cc (right): https://codereview.chromium.org/2122573003/diff/140001/cc/output/gl_renderer.... cc/output/gl_renderer.cc:2370: if (use_color_lut) { On 2016/10/05 at 15:13:39, dshwang wrote: > On 2016/09/29 23:36:51, hubbe wrote: > > I don't really like this change, as it requires this code to know more about the > > shader. I prefer that it just checks which variables are available and sets them > > whenever possible. Not sure if the cc/ owners agree with me or not > > I think gl_renderer is controller of shader.cc. gl_renderer should know every detail of shader.cc. Moreover, it makes it easier to read code also as we can explicitly know which uniforms is used in specific case. This explicitness will help to prevent unnecessary bug. I agree with this. GLRenderer is the single consumer of shaders. It owns them. It sets uniforms. So, it has to know what's available in a program and what's not. There's really no abstraction. If it appears that there's an abstraction it's mostly to avoid repeating code and not to hide details of the shaders.
enne@chromium.org changed reviewers: - enne@chromium.org
Description was changed from ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ========== to ========== media: replace LUMINANCE_F16 by RG_88 for 9/10-bit h264 videos LUMINANCE_F16 has following issues 1. GL_LUMINANCE (as well as GL_ALPHA) is deprecated. 2. GpuMemoryBuffer cannot support LUMINANCE_F16. 3. LUMINANCE_F16 requires cpu int-to-float conversion. This CL replaces LUMINANCE_F16 by RG_88 to fix all above issues. Rationale to choose RG_88, instead of R_16 1. GpuMemoryBuffer cannot support R_16. (e.g. Mesa supports R8 and GR88 dmabuf) 2. GL_EXT_texture_rg is more widely used than GL_EXT_color_buffer_half_float (e.g. Mesa v11.2) This CL introduces kUseRGTexture feature to use RG_88, instead of LUMINANCE_F16. When we make sure RG_88 is better on all platforms, LUMINANCE_F16 will be removed. BUG=445071, 624436, 622133 TEST=existing cc_unittests. e.g. VideoGLRendererPixelHiLoTest, VideoResourceUpdaterTestWithRG media_unittests --gtest_filter=PipelineIntegrationTest.BasicPlaybackHi10P content_browsertests --gtest_filter=*VideoBearHighBitDepthMp4* chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=use-rg-texture chrome file:///<chromium>/src/media/test/data/bear-320x180-hi10p.mp4 --enable-features=video-color-management,use-rg-texture CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_precise_blink_rel ==========
The CQ bit was checked by dongseong.hwang@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
hubbe, I took investigation to explain why the pixel results are different. I think RG result is also right. See below for more detail. https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/160001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1152: file_path = base::FilePath(FILE_PATH_LITERAL("yuv_stripes_rg88.png")); On 2016/10/05 18:03:25, hubbe wrote: > On 2016/10/05 15:13:42, dshwang wrote: > > RG88 path in shader.cc cannot produce exactly same pixels, but looks very > > similar. > > The values in these files differ quite a lot actually, and they differ by more > than one in lots of places. This is more than I would expect and would seem to > indicate that there is a bug somewhere. Good point. To verity it, I added |disable_one_component_textures| path in the pixel test, in which SkCanvasVideoRenderer::ConvertVideoFrameToRGBPixels() converts PIXEL_FORMAT_YUV420P10 to RGBA and gl_renderer draws it as TextureDrawQuad. Surprisingly, the results has some charicteristics * very different color space * more similar 2d-linear filter result to HighbitTexture::RG88 For color space, this path doesn't respect video color profile, so color space is so different. For example, chrome rendering of media/test/data/bear-320x180-hi10p-vp9.webm by this path is very yellowish. For 2d-linear filter, the gpu default 2d-linear filter seems to overblur the image, because * |disable_one_component_textures| result is more similar to HighbitTexture::RG88 * If disabling GL_LINEAR in gl_renderer, the result is more similar to HighbitTexture::RG88 In my opinion, we can live with this different rendering result. > Also, instead of explaining what you're doing with comments in the code review, > use comments in the code instead. move explanation to comment https://codereview.chromium.org/2122573003/diff/160001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/160001/cc/output/shader.cc#ne... cc/output/shader.cc:2111: return ((rg.g * 256.0) + rg.r) * resource_multiplier; On 2016/10/05 18:03:25, hubbe wrote: > Instead of multiplying by resource_multiplier here, we can do it after mixing, > which saves a few multiplications. In fact, we can do it in yuv2rgb, same as we > do for half-float textures, and when we're not using a LUT, we can simply > multiply it into the YUV->RGB matrix. > Good point! Done. https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1188: // crbug.com/622133 unfortunately, 3 pathes have different results. Note: Y8 and LUMINANCE_F16 have same results. I think HighbitTexture::RG88 result is right. we will fix HighbitTexture::RGBA_8888 result in next CL. https://codereview.chromium.org/2122573003/diff/180001/cc/output/shader.cc File cc/output/shader.cc (right): https://codereview.chromium.org/2122573003/diff/180001/cc/output/shader.cc#ne... cc/output/shader.cc:2127: // https://cs.chromium.org/chromium/src/third_party/mesa/src/src/mesa/swrast/s_t... now this shader is logically same to Mesa implementation.
https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1188: // crbug.com/622133 On 2016/10/07 12:35:01, dshwang wrote: > unfortunately, 3 pathes have different results. Note: Y8 and LUMINANCE_F16 have > same results. > I think HighbitTexture::RG88 result is right. > we will fix HighbitTexture::RGBA_8888 result in next CL. to compare easy per configuration, I upload the all images in https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c36 The nearest filter of GLRenderer::DrawYUVVideoQuad is very similar to HighbitTexture::RG88 result in terms of 2d-linear filter. GLRenderer::DrawYUVVideoQuad probably has bluring bug, because this pixel test draws 200x200 texture on 200x200 display. Both result of nearest and linear have to be same.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... File cc/output/renderer_pixeltest.cc (right): https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... cc/output/renderer_pixeltest.cc:1188: // crbug.com/622133 On 2016/10/07 13:00:32, dshwang wrote: > On 2016/10/07 12:35:01, dshwang wrote: > > unfortunately, 3 pathes have different results. Note: Y8 and LUMINANCE_F16 > have > > same results. > > I think HighbitTexture::RG88 result is right. > > we will fix HighbitTexture::RGBA_8888 result in next CL. > > to compare easy per configuration, I upload the all images in > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c36 > > The nearest filter of GLRenderer::DrawYUVVideoQuad is very similar to > HighbitTexture::RG88 result in terms of 2d-linear filter. > GLRenderer::DrawYUVVideoQuad probably has bluring bug, because this pixel test > draws 200x200 texture on 200x200 display. Both result of nearest and linear have > to be same. I understood why it has unnecessary bluring. I submitted new CL to fix this existing bug. https://codereview.chromium.org/2400033004/ After landing the another CL, I'll rebase this CL to ToT.
On 2016/10/07 17:55:17, dshwang wrote: > https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... > File cc/output/renderer_pixeltest.cc (right): > > https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... > cc/output/renderer_pixeltest.cc:1188: // crbug.com/622133 > On 2016/10/07 13:00:32, dshwang wrote: > > On 2016/10/07 12:35:01, dshwang wrote: > > > unfortunately, 3 pathes have different results. Note: Y8 and LUMINANCE_F16 > > have > > > same results. > > > I think HighbitTexture::RG88 result is right. > > > we will fix HighbitTexture::RGBA_8888 result in next CL. > > > > to compare easy per configuration, I upload the all images in > > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c36 > > > > The nearest filter of GLRenderer::DrawYUVVideoQuad is very similar to > > HighbitTexture::RG88 result in terms of 2d-linear filter. > > GLRenderer::DrawYUVVideoQuad probably has bluring bug, because this pixel test > > draws 200x200 texture on 200x200 display. Both result of nearest and linear > have > > to be same. > > I understood why it has unnecessary bluring. I submitted new CL to fix this > existing bug. https://codereview.chromium.org/2400033004/ > After landing the another CL, I'll rebase this CL to ToT. I think this other CL is wrong, and if this CL works more like the other CL, then this CL is wrong too.
On 2016/10/07 18:13:03, hubbe wrote: > On 2016/10/07 17:55:17, dshwang wrote: > > > https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... > > File cc/output/renderer_pixeltest.cc (right): > > > > > https://codereview.chromium.org/2122573003/diff/180001/cc/output/renderer_pix... > > cc/output/renderer_pixeltest.cc:1188: // crbug.com/622133 > > On 2016/10/07 13:00:32, dshwang wrote: > > > On 2016/10/07 12:35:01, dshwang wrote: > > > > unfortunately, 3 pathes have different results. Note: Y8 and LUMINANCE_F16 > > > have > > > > same results. > > > > I think HighbitTexture::RG88 result is right. > > > > we will fix HighbitTexture::RGBA_8888 result in next CL. > > > > > > to compare easy per configuration, I upload the all images in > > > https://bugs.chromium.org/p/chromium/issues/detail?id=624436#c36 > > > > > > The nearest filter of GLRenderer::DrawYUVVideoQuad is very similar to > > > HighbitTexture::RG88 result in terms of 2d-linear filter. > > > GLRenderer::DrawYUVVideoQuad probably has bluring bug, because this pixel > test > > > draws 200x200 texture on 200x200 display. Both result of nearest and linear > > have > > > to be same. > > > > I understood why it has unnecessary bluring. I submitted new CL to fix this > > existing bug. https://codereview.chromium.org/2400033004/ > > After landing the another CL, I'll rebase this CL to ToT. > > I think this other CL is wrong, and if this CL works more like the other CL, > then this CL is wrong too. I'd like to run this CL through our power lab, but it doesn't currently apply cleanly. (Conflicts in gl_renderer.cc) Would you mind fixing the conflicts and upload a new version?
The CQ bit was checked by dongseong.hwang@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2016/10/12 21:59:16, hubbe wrote: > I'd like to run this CL through our power lab, but it doesn't currently apply > cleanly. (Conflicts in gl_renderer.cc) Would you mind fixing the conflicts and > upload a new version? Hi, different time zone (Finland) causes latency. I'm glad to hear you testing it by power lab. Thank you in advance. I rebased it to ToT.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: mac_chromium_rel_ng on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
This approach does not lgtm. We should not simulate 1=channel images using 2-channel images. It is possible to create R16 unorm GMBs. We should use these. I verified this on Mac with IOSurfaces, and I can send you a test application which demonstrates this. Alternatively, unless we prove that the short<->half-float conversion is the critical bottleneck for a critical use-case, then we should consider using R16F (half-float single channel).
ccameron@chromium.org changed reviewers: + piman@chromium.org
On 2016/10/17 22:28:31, ccameron wrote: > This approach does not lgtm. We should not simulate 1=channel images using > 2-channel images. > > It is possible to create R16 unorm GMBs. We should use these. I verified this on > Mac with IOSurfaces, and I can send you a test application which demonstrates > this. IOSurfaces has R16 format but linux kernel (i.e. chromeos) doesn't support R16 format for native DMA-BUF https://github.com/torvalds/linux/blob/master/include/uapi/drm/drm_fourcc.h#L45 I'm not sure that linux community will agree on R16 introduction. If so, I'm not sure how long will it take. Note: Mesa allocates format agnostic DMA-BUF for texture, which means R16 dma-buf is not very needed in current linux graphics stack. https://cs.chromium.org/chromium/src/third_party/mesa/src/src/mesa/drivers/dr... > Alternatively, unless we prove that the short<->half-float conversion is the > critical bottleneck for a critical use-case, then we should consider using R16F > (half-float single channel). R16F requires half-float conversion data also. TexImage2D(R16F) doesn't convert uint16 to half-float automatically, while TexImage2D(R8) converts uint8 to half-float automatically.
On 2016/10/17 22:28:31, ccameron wrote: > This approach does not lgtm. We should not simulate 1=channel images using > 2-channel images. > > It is possible to create R16 unorm GMBs. We should use these. I verified this on > Mac with IOSurfaces, and I can send you a test application which demonstrates > this. > > Alternatively, unless we prove that the short<->half-float conversion is the > critical bottleneck for a critical use-case, then we should consider using R16F > (half-float single channel). ccameron@, this is very important and most probably related to another work - depth camera capture and rendering; https://crrev.com/2121043002. Could you please clarify the concern there, why it is not ok to use RG88 for 16 bit plane and in which cases it is considered as a hack? I'll give a short explanation there about measurements, how we concluded that RG88 used for Depth16 internally only is acceptable (hack) if we hide this implementation detail from end user... Agreed with dcastagna@ that we should hold the review there until we clarify this. Thanks a lot for help.
On 2016/10/18 19:24:57, aleksandar.stojiljkovic wrote: > On 2016/10/17 22:28:31, ccameron wrote: > > This approach does not lgtm. We should not simulate 1=channel images using > > 2-channel images. > > > > It is possible to create R16 unorm GMBs. We should use these. I verified this > on > > Mac with IOSurfaces, and I can send you a test application which demonstrates > > this. > > > > Alternatively, unless we prove that the short<->half-float conversion is the > > critical bottleneck for a critical use-case, then we should consider using > R16F > > (half-float single channel). > > ccameron@, this is very important and most probably related to another work - > depth camera capture and rendering; https://crrev.com/2121043002. > Could you please clarify the concern there, why it is not ok to use RG88 for 16 > bit plane and in which cases it is considered as a hack? > I'll give a short explanation there about measurements, how we concluded that > RG88 used for Depth16 internally only is acceptable (hack) if we hide this > implementation detail from end user... Agreed with dcastagna@ that we should > hold the review there until we clarify this. > Thanks a lot for help. I've done a few runs (on a single machine) through our power analysis lab, and so far I have not been able to observe any difference between running with --enable-features=use-rg-texture and running without it. I will keep trying and run it on a few more machines to be sure though. Note that we've optimized the half-float conversion a fair amount since you tested this.
On 2016/10/17 22:28:31, ccameron wrote: > This approach does not lgtm. We should not simulate 1=channel images using > 2-channel images. > > It is possible to create R16 unorm GMBs. We should use these. I verified this on > Mac with IOSurfaces, and I can send you a test application which demonstrates > this. > > Alternatively, unless we prove that the short<->half-float conversion is the > critical bottleneck for a critical use-case, then we should consider using R16F > (half-float single channel). Chris, as mentioned by Dongseong, there's a fundamental problem where the Linux kernel doesn't support R16 DMA buffers, so GMBs with that format wouldn't be portable. I'd like to not block this work on getting that change into the Linux kernel.
hubbe@chromium.org changed reviewers: - hubbe@chromium.org
On 2016/10/18 12:03:16, dshwang wrote: > On 2016/10/17 22:28:31, ccameron wrote: > > This approach does not lgtm. We should not simulate 1=channel images using > > 2-channel images. > > > > It is possible to create R16 unorm GMBs. We should use these. I verified this > on > > Mac with IOSurfaces, and I can send you a test application which demonstrates > > this. > > IOSurfaces has R16 format but linux kernel (i.e. chromeos) doesn't support R16 > format for native DMA-BUF > https://github.com/torvalds/linux/blob/master/include/uapi/drm/drm_fourcc.h#L45 DMA-BUF doesn't care about the format. It's a way to share buffers (a list of memory pages) and as such the format that is being transported is completely irrelevant. So you argument doesn't make sense. > > I'm not sure that linux community will agree on R16 introduction. If so, I'm not > sure how long will it take. > > Note: Mesa allocates format agnostic DMA-BUF for texture, which means R16 > dma-buf is not very needed in current linux graphics stack. > https://cs.chromium.org/chromium/src/third_party/mesa/src/src/mesa/drivers/dr... > > > Alternatively, unless we prove that the short<->half-float conversion is the > > critical bottleneck for a critical use-case, then we should consider using > R16F > > (half-float single channel). > > R16F requires half-float conversion data also. TexImage2D(R16F) doesn't convert > uint16 to half-float automatically, while TexImage2D(R8) converts uint8 to > half-float automatically.
On 2017/04/01 00:37:47, marcheu wrote: > On 2016/10/18 12:03:16, dshwang wrote: > > On 2016/10/17 22:28:31, ccameron wrote: > > > This approach does not lgtm. We should not simulate 1=channel images using > > > 2-channel images. > > > > > > It is possible to create R16 unorm GMBs. We should use these. I verified > this > > on > > > Mac with IOSurfaces, and I can send you a test application which > demonstrates > > > this. > > > > IOSurfaces has R16 format but linux kernel (i.e. chromeos) doesn't support R16 > > format for native DMA-BUF > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/drm_fourcc.h#L45 > > DMA-BUF doesn't care about the format. It's a way to share buffers (a list of > memory pages) and as such the format that is being transported is completely > irrelevant. So you argument doesn't make sense. > > > > > I'm not sure that linux community will agree on R16 introduction. If so, I'm > not > > sure how long will it take. As DS mentioned, it is format agnostic. Good to revisit this. Meanwhile, DRM_FORMAT_R16 got added - interestingly, to the same line 45 [1]. Another patch [2], is introducing the usage of GL_R16_EXT (available through core desktop and EXT_texture_norm16 extension) so we could rebase this patch to use GL_R16_EXT. [1] DRM_FORMAT_R16 https://github.com/torvalds/linux/commit/fd056f05b9fcba35b77ec5b20e553cc48884... [2] https://crrev.com/2767063002/ > > > > Note: Mesa allocates format agnostic DMA-BUF for texture, which means R16 > > dma-buf is not very needed in current linux graphics stack. > > > https://cs.chromium.org/chromium/src/third_party/mesa/src/src/mesa/drivers/dr... > > > > > Alternatively, unless we prove that the short<->half-float conversion is the > > > critical bottleneck for a critical use-case, then we should consider using > > R16F > > > (half-float single channel). > > > > R16F requires half-float conversion data also. TexImage2D(R16F) doesn't > convert > > uint16 to half-float automatically, while TexImage2D(R8) converts uint8 to > > half-float automatically. |