|
|
DescriptionOptimize QuadF's PointIsInTriangle function
PointIsInTriangle() is still the hotspot of many CSS animation cases.Take Nexus10 with latest chromium build for example:
4.0% case: poster-circle.html
1.5% case: morphing-cubes.html
This patch changes the algorithm of testing "Point In Triangle".
1. Using double-precision float point.
2. A much readable algorithm.
3. Better performance due to remove 2 div operation
R=danakj@chromium.org
Committed: https://src.chromium.org/viewvc/chrome?view=rev&revision=290290
Patch Set 1 #
Total comments: 12
Patch Set 2 : #Patch Set 3 : #Messages
Total messages: 22 (0 generated)
Thanks for the patch! https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc File ui/gfx/geometry/quad_f.cc (right): https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:74: // Quick test doubles have same sign style nit: comments should be full sentences including periods, here and below. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:75: #define DOUBLE_SIGN_MASK UINT64_C(0x8000000000000000) Can you use "static const uint64 kDoubleSignMask" instead? https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:76: #define DOUBLE_SAME_SIGN(d1, d2) \ Is the performance changed if you use a static method to do this? With some temporary uint64 vars to make it easier to read? https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:77: (((reinterpret_cast<__uint64_t&>(d1)) ^ \ i think the extra () around the reinterpret_cast makes this harder to read/count braces. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:78: (reinterpret_cast<__uint64_t&>(d2))) & \ Is doing XOR and AND really all that better than doing >= 0? I don't see either of these operations listed in your analysis of changes to the method. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:84: if(DOUBLE_SAME_SIGN(u, v)) return false; This is checking if they have different signs not the same sign right? It returns true if they are different signs. Can the name reflect that?
+enne FYI.
NOTRY=true would bypass our try bots and land the CL untested, please don't use that :)
https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc File ui/gfx/geometry/quad_f.cc (right): https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:74: // Quick test doubles have same sign On 2014/08/11 14:09:26, danakj wrote: > style nit: comments should be full sentences including periods, here and below. Remove comments due to algorithm is updated https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:75: #define DOUBLE_SIGN_MASK UINT64_C(0x8000000000000000) On 2014/08/11 14:09:26, danakj wrote: > Can you use "static const uint64 kDoubleSignMask" instead? Remove "Mask" due to algorithm is updated https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:76: #define DOUBLE_SAME_SIGN(d1, d2) \ On 2014/08/11 14:09:26, danakj wrote: > Is the performance changed if you use a static method to do this? With some > temporary uint64 vars to make it easier to read? Compiler should help inlining *short* static method, So, I don't think performance would be changed. For this case, Remove "Micro" due to algorithm is updated. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:77: (((reinterpret_cast<__uint64_t&>(d1)) ^ \ On 2014/08/11 14:09:26, danakj wrote: > i think the extra () around the reinterpret_cast makes this harder to read/count > braces. Same as above. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:78: (reinterpret_cast<__uint64_t&>(d2))) & \ On 2014/08/11 14:09:26, danakj wrote: > Is doing XOR and AND really all that better than doing >= 0? I don't see either > of these operations listed in your analysis of changes to the method. The micro-bench(on armv7) result shows manual "XOR and AND MASK" is ~30% faster than this method - "if (u>0 && v<0) || (u<0 && v>0)" for this case. But, Manual "XOR and AND MASK" is not a good way because there is a standard function - "std::signbit()". For the algorithm about testing PointIsInTriangle, Testing Sign bit is insufficient Sometimes. Considering one point is on the edge of triangle, we may get Positive zero or Negative Zero after CrossProduct. In this case, the sign bit doesn't make sense. I had updated algorithm. https://codereview.chromium.org/458173002/diff/1/ui/gfx/geometry/quad_f.cc#ne... ui/gfx/geometry/quad_f.cc:84: if(DOUBLE_SAME_SIGN(u, v)) return false; On 2014/08/11 14:09:26, danakj wrote: > This is checking if they have different signs not the same sign right? It > returns true if they are different signs. Can the name reflect that? Oh yes, it's really a bad name. The Macro would be removed due to updated algorithm.
Changes in patch set2: 1. Updated PointIsInTriangle algorithm. Sometimes, Testing Sign bit is insufficient. Considering one point is on the edge of triangle, we may get Positive Zero or Negative Zero after CrossProduct. In this case, the sign bit doesn't make sense. This issue is fixed by multiply each pairs of normals, then compare with zero Benchmark: Case: Random generated 1000 points around triangle, then Do testing with many loops. Result: * On ARMv7(32 bit), almost the same performance as original single-precision based method. and performance is improved 20% if I using single-precision based multiply. * On ARMv8(64 bit), 8% improvement due to wider register, +20% improvements after changing to single-precision float point. Btw, I'd like using double-precision here, following the discussion: https://groups.google.com/a/chromium.org/forum/#!topic/graphics-dev/aSvAbzcgbos
On 2014/08/14 07:02:39, kui.zheng wrote: > Changes in patch set2: > 1. Updated PointIsInTriangle algorithm. Sometimes, Testing Sign bit is > insufficient. Considering one point is on the edge of triangle, we may get > Positive Zero or Negative Zero after CrossProduct. In this case, the sign bit > doesn't make sense. This issue is fixed by multiply each pairs of normals, then > compare with zero Which unit tests do we have that cover this important distinction? If none, can you add some? > Benchmark: > Case: Random generated 1000 points around triangle, then Do testing with many > loops. > Result: > * On ARMv7(32 bit), almost the same performance as original single-precision > based method. and performance is improved 20% if I using single-precision based > multiply. > * On ARMv8(64 bit), 8% improvement due to wider register, +20% improvements > after changing to single-precision float point. > > Btw, I'd like using double-precision here, following the discussion: > https://groups.google.com/a/chromium.org/forum/#!topic/graphics-dev/aSvAbzcgbos
On 2014/08/14 17:01:14, danakj wrote: > On 2014/08/14 07:02:39, kui.zheng wrote: > > Changes in patch set2: > > 1. Updated PointIsInTriangle algorithm. Sometimes, Testing Sign bit is > > insufficient. Considering one point is on the edge of triangle, we may get > > Positive Zero or Negative Zero after CrossProduct. In this case, the sign bit > > doesn't make sense. This issue is fixed by multiply each pairs of normals, > then > > compare with zero > > Which unit tests do we have that cover this important distinction? If none, can > you add some? > Sorry, I'm not good at writing English :). let me make it clear: The changes I described above is about patch set 1 and patch set 2. The extreme case mentioned above(point on the edge) had been covered by quad_unittest of gfx_unittest. (see line 300 - 330 https://code.google.com/p/chromium/codesearch#chromium/src/ui/gfx/geometry/qu...), actually: * The original code is all right. * This is a bug of my patch set 1, because I only testing sign-bit for performance. * In patch set 2, I updated algorithm and fixed the bug. The Benchmark result is about original code and patch set 2. we can see: * Better performance on armv7 and armv8. * Higher precision. * Algorithm readability is better.
Ah ok, the test failed on patchset 1. Thanks. LGTM
The CQ bit was checked by kui.zheng@arm.com
The CQ bit was unchecked by kui.zheng@arm.com
The CQ bit was checked by kui.zheng@arm.com
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/kui.zheng@arm.com/458173002/20001
FYI, CQ is re-trying this CL (attempt #1). The failing builders are: linux_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/linux_gpu/builds/...) mac_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/mac_gpu/builds/42945) win_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/win_gpu/builds/48125) android_aosp on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_aosp/bu...) android_chromium_gn_compile_rel on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_chromiu...) android_clang_dbg on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_clang_d...) android_dbg on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_dbg/bui...) chromium_presubmit on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/chromium_presub...) linux_chromium_chromeos_rel_swarming on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_gn_rel on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) linux_chromium_rel_swarming on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_...) ios_dbg_simulator on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_dbg_simulator...) ios_rel_device on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device/bu...) ios_rel_device_ninja on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device_ni...) mac_chromium_compile_dbg on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_swarming on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...) win8_chromium_rel on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win8_chromium_rel...) win_chromium_compile_dbg on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_comp...) win_chromium_rel_swarming on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_...) win_chromium_x64_rel_swarming on tryserver.chromium.win (http://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_...)
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: mac_gpu on tryserver.chromium.gpu (http://build.chromium.org/p/tryserver.chromium.gpu/builders/mac_gpu/builds/42970) ios_dbg_simulator on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_dbg_simulator...) ios_rel_device on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device/bu...) ios_rel_device_ninja on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device_ni...) mac_chromium_compile_dbg on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) mac_chromium_rel_swarming on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
On 2014/08/18 08:37:25, I haz the power (commit-bot) wrote: > Try jobs failed on following builders: > mac_gpu on tryserver.chromium.gpu > (http://build.chromium.org/p/tryserver.chromium.gpu/builders/mac_gpu/builds/42970) > ios_dbg_simulator on tryserver.chromium.mac > (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_dbg_simulator...) > ios_rel_device on tryserver.chromium.mac > (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device/bu...) > ios_rel_device_ninja on tryserver.chromium.mac > (http://build.chromium.org/p/tryserver.chromium.mac/builders/ios_rel_device_ni...) > mac_chromium_compile_dbg on tryserver.chromium.mac > (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_comp...) > mac_chromium_rel_swarming on tryserver.chromium.mac > (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...) Patch set 2's AUTHORS file breaks builders. Remove it because my first patch had been merged days ago.
The CQ bit was checked by danakj@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/kui.zheng@arm.com/458173002/40001
FYI, CQ is re-trying this CL (attempt #1). The failing builders are: android_dbg_triggered_tests on tryserver.chromium.linux (http://build.chromium.org/p/tryserver.chromium.linux/builders/android_dbg_tri...) mac_chromium_rel_swarming on tryserver.chromium.mac (http://build.chromium.org/p/tryserver.chromium.mac/builders/mac_chromium_rel_...)
Message was sent while issue was closed.
Committed patchset #3 (40001) as 290290
Message was sent while issue was closed.
A revert of this CL (patchset #3 id:40001) has been created in https://codereview.chromium.org/682793003/ by danakj@chromium.org. The reason for reverting is: Caused perf regressions: https://code.google.com/p/chromium/issues/detail?id=404968 . |