|
|
DescriptionInline PrePaintTreeWalk private methods and static functions
This reduces overhead of function calls, including function prologues
and epilogues, handling of parameters and return values (especially
for LayoutRect parameters).
The methods and functions are very hot but most of them are called
from only one place.
Also added 'static' keyword to local functions to prevent the compiler
from generating full functions for them.
BUG=678597
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2
Review-Url: https://codereview.chromium.org/2640863003
Cr-Commit-Position: refs/heads/master@{#444672}
Committed: https://chromium.googlesource.com/chromium/src/+/b297e151503a1eef82426df4f13d6285b7d94d4f
Patch Set 1 #
Total comments: 3
Patch Set 2 : Rebase #Patch Set 3 : ALWAYS_INLINE #
Depends on Patchset: Messages
Total messages: 28 (17 generated)
Description was changed from ========== Inline PrePaintTreeWalk private methods and static functions This reduces overhead of function calls, including function prologues and epilogues, handling of parameters and return values (especially for LayoutRect parameters). The methods and functions are very hot but most of them are called from only one place. Also added 'static' keyword to local functions to prevent the compiler from generating full functions for them. BUG=678597 ========== to ========== Inline PrePaintTreeWalk private methods and static functions This reduces overhead of function calls, including function prologues and epilogues, handling of parameters and return values (especially for LayoutRect parameters). The methods and functions are very hot but most of them are called from only one place. Also added 'static' keyword to local functions to prevent the compiler from generating full functions for them. BUG=678597 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2 ==========
The CQ bit was checked by wangxianzhu@chromium.org to run a CQ dry run
wangxianzhu@chromium.org changed reviewers: + pdr@chromium.org
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.cpp (left): https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.cpp:358: } // namespace Alternately we can enclose all local static functions in namespace {}, but 'static' seems more common than namespace {}. Wdyt?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: ios-device on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device/builds...) ios-device-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-device-xcode-...) ios-simulator on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator/bui...) ios-simulator-xcode-clang on master.tryserver.chromium.mac (JOB_FAILED, http://build.chromium.org/p/tryserver.chromium.mac/builders/ios-simulator-xco...)
The CQ bit was checked by wangxianzhu@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.h (right): https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.h:118: inline static void updatePaintOffsetTranslation( After seeing your other inline patch, I was experimenting with this exact change too. It looks like "static inline" might have no effect in the common case: http://stackoverflow.com/questions/7762731/whats-the-difference-between-stati... Do you know if this has any effect on the generated code? Is there an easy way to test that this change works?
https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... File third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.h (right): https://codereview.chromium.org/2640863003/diff/1/third_party/WebKit/Source/c... third_party/WebKit/Source/core/paint/PaintPropertyTreeBuilder.h:118: inline static void updatePaintOffsetTranslation( On 2017/01/18 18:51:58, pdr. wrote: > After seeing your other inline patch, I was experimenting with this exact change > too. > > It looks like "static inline" might have no effect in the common case: > http://stackoverflow.com/questions/7762731/whats-the-difference-between-stati... > > Do you know if this has any effect on the generated code? Is there an easy way > to test that this change works? I looked into the generated code in a release build. The added 'inline's did inline these functions and reduces many instructions [1] for each function call. I think the compiler doesn't inline these static methods by default because they may be referenced from a friend class outside of the compilation unit. Each of such functions contributed about 1% of total PrePaint time for paint-offset-changes.html. I think most of the cost was the overhead of the function calls as many functions just check condition and return for the test. I think the stackoverflow article is about local static functions. I think it's correct if a local static function is small and is called from few places in the file. We can omit the inlines for the static local functions. [1] I will add more details soon.
*Sorry I posted wrong information above. Actually the 'inline's didn't work. ALWAYS_INLINE works.* More details about generated code of PaintPropertyTreeBuilder::updatePropertiesForChildren() before this CL: ... updateOverflowClip(object, context); 32: 48 89 df mov %rbx,%rdi 35: 4c 89 f6 mov %r14,%rsi 38: e8 00 00 00 00 callq 3d <blink::PaintPropertyTreeBuilder::updatePropertiesForChildren(blink::LayoutObject const&, blink::PaintPropertyTreeBuilderContext&)+0x3d> updatePerspective(object, context); 3d: 48 89 df mov %rbx,%rdi 40: 4c 89 f6 mov %r14,%rsi 43: e8 00 00 00 00 callq 48 <blink::PaintPropertyTreeBuilder::updatePropertiesForChildren(blink::LayoutObject const&, blink::PaintPropertyTreeBuilderContext&)+0x48> updateSvgLocalToBorderBoxTransform(object, context); 48: 48 89 df mov %rbx,%rdi 4b: 4c 89 f6 mov %r14,%rsi 4e: e8 00 00 00 00 callq 53 <blink::PaintPropertyTreeBuilder::updatePropertiesForChildren(blink::LayoutObject const&, blink::PaintPropertyTreeBuilderContext&)+0x53> updateScrollAndScrollTranslation(object, context); 53: 48 89 df mov %rbx,%rdi 56: 4c 89 f6 mov %r14,%rsi 59: e8 00 00 00 00 callq 5e <blink::PaintPropertyTreeBuilder::updatePropertiesForChildren(blink::LayoutObject const&, blink::PaintPropertyTreeBuilderContext&)+0x5e> updateOutOfFlowContext(object, context); 5e: 48 89 df mov %rbx,%rdi 61: 4c 89 f6 mov %r14,%rsi 64: e8 00 00 00 00 callq 69 <blink::PaintPropertyTreeBuilder::updatePropertiesForChildren(blink::LayoutObject const&, blink::PaintPropertyTreeBuilderContext&)+0x69> ... In updateOverflowClip(): 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: 41 57 push %r15 6: 41 56 push %r14 8: 41 55 push %r13 a: 41 54 push %r12 c: 53 push %rbx d: 48 81 ec 88 00 00 00 sub $0x88,%rsp 14: 49 89 f6 mov %rsi,%r14 17: 49 89 fd mov %rdi,%r13 .... 4a7: 48 81 c4 88 00 00 00 add $0x88,%rsp 4ae: 5b pop %rbx 4af: 41 5c pop %r12 4b1: 41 5d pop %r13 4b3: 41 5e pop %r14 4b5: 41 5f pop %r15 4b7: 5d pop %rbp 4b8: c3 retq
My profiling on Linux showed speedup of the following functions, with ALWAYS_INLINEs: - PaintPropertyTreeBuilder::updatePropertiesForSelf(): 6.7% -> 4.8% - PaintPropertyTreeBuilder::updatePropertiesForChildren(): 4.7% -> 4.0% Percentages are percentages of the functions' time in total PrePaint time.
The CQ bit was checked by wangxianzhu@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
On 2017/01/18 at 20:39:57, wangxianzhu wrote: > My profiling on Linux showed speedup of the following functions, with ALWAYS_INLINEs: > - PaintPropertyTreeBuilder::updatePropertiesForSelf(): 6.7% -> 4.8% > - PaintPropertyTreeBuilder::updatePropertiesForChildren(): 4.7% -> 4.0% > > Percentages are percentages of the functions' time in total PrePaint time. LGTM! Thank you for the explanation and investigation. Are you running ninja with -v to get the clang command line, then adding -S to output the assembly?
On 2017/01/18 21:46:11, pdr. wrote: > On 2017/01/18 at 20:39:57, wangxianzhu wrote: > > My profiling on Linux showed speedup of the following functions, with > ALWAYS_INLINEs: > > - PaintPropertyTreeBuilder::updatePropertiesForSelf(): 6.7% -> 4.8% > > - PaintPropertyTreeBuilder::updatePropertiesForChildren(): 4.7% -> 4.0% > > > > Percentages are percentages of the functions' time in total PrePaint time. > > LGTM! Thank you for the explanation and investigation. > > Are you running ninja with -v to get the clang command line, then adding -S to > output the assembly? What I did was more complicated: got the command line, added -g and executed, then used objdump --disassemble --demangle --source to disassemble the generated .o file. -S would be much simpler. :)
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by wangxianzhu@chromium.org
The CQ bit was unchecked by wangxianzhu@chromium.org
Will land this after several hours so that we can distinguish the performance impacts of crrev.com/2633383004 and this CL.
The CQ bit was checked by wangxianzhu@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
CQ is committing da patch. Bot data: {"patchset_id": 40001, "attempt_start_ts": 1484810242068000, "parent_rev": "0cf301238f4cc06710e045c0dc4fcfa7a6fd5bf0", "commit_rev": "b297e151503a1eef82426df4f13d6285b7d94d4f"}
Message was sent while issue was closed.
Description was changed from ========== Inline PrePaintTreeWalk private methods and static functions This reduces overhead of function calls, including function prologues and epilogues, handling of parameters and return values (especially for LayoutRect parameters). The methods and functions are very hot but most of them are called from only one place. Also added 'static' keyword to local functions to prevent the compiler from generating full functions for them. BUG=678597 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2 ========== to ========== Inline PrePaintTreeWalk private methods and static functions This reduces overhead of function calls, including function prologues and epilogues, handling of parameters and return values (especially for LayoutRect parameters). The methods and functions are very hot but most of them are called from only one place. Also added 'static' keyword to local functions to prevent the compiler from generating full functions for them. BUG=678597 CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_layout_tests_slimming_paint_v2 Review-Url: https://codereview.chromium.org/2640863003 Cr-Commit-Position: refs/heads/master@{#444672} Committed: https://chromium.googlesource.com/chromium/src/+/b297e151503a1eef82426df4f13d... ==========
Message was sent while issue was closed.
Committed patchset #3 (id:40001) as https://chromium.googlesource.com/chromium/src/+/b297e151503a1eef82426df4f13d... |