|
|
Created:
6 years, 9 months ago by mtklein_C Modified:
6 years, 9 months ago CC:
skia-review_googlegroups.com Base URL:
https://skia.googlesource.com/skia.git@master Visibility:
Public. |
DescriptionSkPaint: eliminate some dead bytes in 64-bit build.
+ memcpy-based copy constructor was hiding this gap -> manual copy constructor.
+ Split tests for finer-grained failures.
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13856
Committed: http://code.google.com/p/skia/source/detail?r=13887
Committed: http://code.google.com/p/skia/source/detail?r=13927
Patch Set 1 #
Total comments: 2
Patch Set 2 : all pointers, then all flat #Patch Set 3 : early return instead #Patch Set 4 : fix android #Patch Set 5 : no bytewise for ctor, copy ctor, =, or == #Patch Set 6 : fix bugs #Patch Set 7 : tighter #
Messages
Total messages: 30 (0 generated)
https://codereview.chromium.org/203203003/diff/1/include/core/SkPaint.h File include/core/SkPaint.h (right): https://codereview.chromium.org/203203003/diff/1/include/core/SkPaint.h#newco... include/core/SkPaint.h:1050: SkTypeface* fTypeface; Maybe we can be even more aggressive/clear, and move *all* pointers first, so we don't accidentally create another gap if we gain/lose another int/float in the future.
https://codereview.chromium.org/203203003/diff/1/include/core/SkPaint.h File include/core/SkPaint.h (right): https://codereview.chromium.org/203203003/diff/1/include/core/SkPaint.h#newco... include/core/SkPaint.h:1050: SkTypeface* fTypeface; On 2014/03/18 15:04:01, reed1 wrote: > Maybe we can be even more aggressive/clear, and move *all* pointers first, so we > don't accidentally create another gap if we gain/lose another int/float in the > future. SGTM. Done.
lgtm
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/203203003/20001
The CQ bit was unchecked by commit-bot@chromium.org
Retried try job too often on Build-Mac10.8-Clang-x86-Release-Trybot for step(s) BuildTests http://108.170.219.164:10117/buildstatus?builder=Build-Mac10.8-Clang-x86-Rele...
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/203203003/20001
The CQ bit was unchecked by commit-bot@chromium.org
Retried try job too often on Build-Mac10.8-Clang-x86-Release-Trybot for step(s) BuildTests http://108.170.219.164:10117/buildstatus?builder=Build-Mac10.8-Clang-x86-Rele...
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/203203003/40001
Message was sent while issue was closed.
Change committed as 13856
Message was sent while issue was closed.
A revert of this CL has been created in https://codereview.chromium.org/204543002/ by jcgregorio@google.com. The reason for reverting is: Causing RunTest failures on Android..
Message was sent while issue was closed.
On 2014/03/19 14:36:16, jcgregorio wrote: > A revert of this CL has been created in > https://codereview.chromium.org/204543002/ by mailto:jcgregorio@google.com. > > The reason for reverting is: Causing RunTest failures on Android.. Sorry about that. Will have a look.
Message was sent while issue was closed.
Tangent: we had somebody ranting a year and a half ago that we overused memset()/memcpy(), and that individual assignment was faster up to a few score bytes. I think they proved it, too, with matrices in CC. Investigate as a target for some microopt?
Message was sent while issue was closed.
On 2014/03/20 10:39:58, tomhudson wrote: > Tangent: we had somebody ranting a year and a half ago that we overused > memset()/memcpy(), and that individual assignment was faster up to a few score > bytes. I think they proved it, too, with matrices in CC. Investigate as a target > for some microopt? +1
Message was sent while issue was closed.
On 2014/03/20 10:57:20, mtklein wrote: > On 2014/03/20 10:39:58, tomhudson wrote: > > Tangent: we had somebody ranting a year and a half ago that we overused > > memset()/memcpy(), and that individual assignment was faster up to a few score > > bytes. I think they proved it, too, with matrices in CC. Investigate as a > target > > for some microopt? > > +1 Slightly more verbose +1 now that I'm sitting down: memcpy and memset (and bzero) are more general that how we're using them for these cases, and because they're general they're very convenient. The most obvious improvement I'd make if we wanted to keep using something like memcpy or memset for this sort of thing is something like this: memcpy(void*, const void*, size_t) -> static_memcpy<size_t>(void*, const void*) memset(void*, uint8_t, size_t) -> static_memset<size_t>(void*, uint8_t) We always know the size we're copying at compile time, but using memcpy itself can't take advantage of that. An inlined static_memcpy could be as fast or faster than setting each field independently, particularly if we get into specializing on different values of <size_t> to take larger strides (32-bit, 64-bit) instead of going byte by byte (all decided at compile time). That said, I don't personally mind setting all fields explicitly. It's a little bit of human effort in bookkeeping, sure, but we're seeing here it's probably better for us to be thinking about the fields individually more rather than as a bulk.
Message was sent while issue was closed.
On 2014/03/20 11:28:11, mtklein wrote: > On 2014/03/20 10:57:20, mtklein wrote: > > On 2014/03/20 10:39:58, tomhudson wrote: > > > Tangent: we had somebody ranting a year and a half ago that we overused > > > memset()/memcpy(), and that individual assignment was faster up to a few > score > > > bytes. I think they proved it, too, with matrices in CC. Investigate as a > > target > > > for some microopt? > > > > +1 > > Slightly more verbose +1 now that I'm sitting down: > > memcpy and memset (and bzero) are more general that how we're using them for > these cases, and because they're general they're very convenient. > > The most obvious improvement I'd make if we wanted to keep using something like > memcpy or memset for this sort of thing is something like this: > memcpy(void*, const void*, size_t) -> static_memcpy<size_t>(void*, const > void*) > memset(void*, uint8_t, size_t) -> static_memset<size_t>(void*, uint8_t) > We always know the size we're copying at compile time, but using memcpy itself > can't take advantage of that. An inlined static_memcpy could be as fast or > faster than setting each field independently, particularly if we get into > specializing on different values of <size_t> to take larger strides (32-bit, > 64-bit) instead of going byte by byte (all decided at compile time). > > That said, I don't personally mind setting all fields explicitly. It's a little > bit of human effort in bookkeeping, sure, but we're seeing here it's probably > better for us to be thinking about the fields individually more rather than as a > bulk. Sorry for the mail flood. Just wanted to add that I don't think it's particularly important to panic and change all existing code right away. I'd only advocate updating hot spots we see on a profile, and of course changing our habits for new code.
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/203203003/60001
Message was sent while issue was closed.
Change committed as 13887
Message was sent while issue was closed.
A revert of this CL has been created in https://codereview.chromium.org/206623005/ by mtklein@google.com. The reason for reverting is: Huh, some Android tests are still failing despite the fix. IntelRhB, Xoom... that's weird..
Mike, want to take another look? This now has changed constructor, copy constructor, operator=, and operator== all to consistently deal with fields instead of memcpy/memcmp.
lgtm
The CQ bit was checked by mtklein@google.com
CQ is trying da patch. Follow status at https://skia-tree-status.appspot.com/cq/mtklein@chromium.org/203203003/110001
Message was sent while issue was closed.
Change committed as 13927 |