Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(48)

Issue 678423002: Reland "Integrate SIMD optimisations for zlib" (Closed)

Created:
6 years, 1 month ago by robert.bradford
Modified:
5 years, 11 months ago
Reviewers:
Nico, agl
CC:
chromium-reviews, cbentzel+watch_chromium.org
Base URL:
https://chromium.googlesource.com/chromium/src.git@master
Project:
chromium
Visibility:
Public.

Description

Reland "Integrate SIMD optimisations for zlib" This version uses a "pthread_once" implementation, using Windows synchronisation primitives, imported from tcmalloc. Previous CLs: https://codereview.chromium.org/677713002/ https://codereview.chromium.org/552123005 This version of the CL also runs fine on Windows Server 2003. These optimisations have been published on zlib mailing list and at https://github.com/jtkukunas/zlib/ This change merges the following optimisation patches: - "For x86, add CPUID check." - "Adds SSE2 optimized hash shifting to fill_window." - "add SSE4.2 optimized hash function" - "add PCLMULQDQ optimized CRC folding" From Jim Kukunas <james.t.kukunas@linux.intel.com>; and adapts them to the current zlib version in Chromium. The optimisations are enabled at runtime if all the necessary CPU features are present. As the optimisations require extra cflags to enable the compiler to use the instructions the optimisations are held in their own static library with a stub implementation to allow linking on other platforms. TEST=net_unittests(GZipUnitTest) passes, Chrome functions and performance improvement seen on RoboHornet benchmark on Linux Desktop BUG=401517 Committed: https://crrev.com/02a95e3084f979084fa8586e1718a6e6dd4c22da Cr-Commit-Position: refs/heads/master@{#302799}

Patch Set 1 #

Patch Set 2 : Use GetProcAddress to load InitOnceExecuteOnce (fix running on XP) #

Patch Set 3 : Tweak formatting and add comment about why we need GetProcAddress #

Total comments: 4

Patch Set 4 : Cache the value from GetProcAddress #

Patch Set 5 : Switch to a lock free implementation on Windows #

Patch Set 6 : Import "pthread_once()" using Windows primitives from tcmalloc #

Unified diffs Side-by-side diffs Delta from patch set Stats (+1209 lines, -33 lines) Patch
M net/spdy/spdy_framer_test.cc View 4 chunks +166 lines, -7 lines 0 comments Download
M third_party/zlib/BUILD.gn View 2 chunks +15 lines, -0 lines 0 comments Download
M third_party/zlib/README.chromium View 1 chunk +16 lines, -0 lines 0 comments Download
M third_party/zlib/crc32.c View 2 chunks +27 lines, -0 lines 0 comments Download
A third_party/zlib/crc_folding.c View 1 chunk +493 lines, -0 lines 0 comments Download
M third_party/zlib/deflate.h View 2 chunks +12 lines, -0 lines 0 comments Download
M third_party/zlib/deflate.c View 19 chunks +113 lines, -26 lines 0 comments Download
A third_party/zlib/fill_window_sse.c View 1 chunk +175 lines, -0 lines 0 comments Download
A third_party/zlib/simd_stub.c View 5 1 chunk +35 lines, -0 lines 0 comments Download
A third_party/zlib/x86.h View 5 1 chunk +13 lines, -0 lines 0 comments Download
A third_party/zlib/x86.c View 1 2 3 4 5 1 chunk +112 lines, -0 lines 0 comments Download
M third_party/zlib/zlib.gyp View 3 chunks +23 lines, -0 lines 0 comments Download
M third_party/zlib/zutil.h View 1 chunk +6 lines, -0 lines 0 comments Download
M tools/msan/blacklist.txt View 1 chunk +3 lines, -0 lines 0 comments Download

Messages

Total messages: 21 (4 generated)
robert.bradford
Hi Adam, third time's the charm :-) . I was able to setup a Windows ...
6 years, 1 month ago (2014-10-29 14:52:20 UTC) #1
robert.bradford
Actually +agl this time, fumbled the original form.
6 years, 1 month ago (2014-10-29 15:02:38 UTC) #3
agl
https://codereview.chromium.org/678423002/diff/40001/third_party/zlib/x86.c File third_party/zlib/x86.c (right): https://codereview.chromium.org/678423002/diff/40001/third_party/zlib/x86.c#newcode65 third_party/zlib/x86.c:65: #if (_WIN32_WINNT >= 0x0600) I assume that this #if ...
6 years, 1 month ago (2014-10-30 01:25:43 UTC) #4
robert.bradford
New version uploaded, PTAL when you can. Thanks. https://codereview.chromium.org/678423002/diff/40001/third_party/zlib/x86.c File third_party/zlib/x86.c (right): https://codereview.chromium.org/678423002/diff/40001/third_party/zlib/x86.c#newcode65 third_party/zlib/x86.c:65: #if ...
6 years, 1 month ago (2014-10-30 14:31:50 UTC) #5
agl
Adding "volatile" and using a CAS operation where the return value is ignored does not ...
6 years, 1 month ago (2014-10-31 17:08:10 UTC) #6
robert.bradford
On 2014/10/31 17:08:10, agl wrote: > Adding "volatile" and using a CAS operation where the ...
6 years, 1 month ago (2014-10-31 18:46:12 UTC) #7
robert.bradford
> Wouldn't this be insufficient for actually protecting against a data race on > x86_cpu_simd_enable, ...
6 years, 1 month ago (2014-11-03 14:58:58 UTC) #8
robert.bradford
Hi Adam, PTAL my latest patch version: Looking into this a bit further I found ...
6 years, 1 month ago (2014-11-04 16:14:45 UTC) #10
agl
lgtm
6 years, 1 month ago (2014-11-04 23:07:45 UTC) #11
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/678423002/120001
6 years, 1 month ago (2014-11-05 14:09:18 UTC) #13
commit-bot: I haz the power
Committed patchset #6 (id:120001)
6 years, 1 month ago (2014-11-05 14:59:43 UTC) #14
commit-bot: I haz the power
Patchset 6 (id:??) landed as https://crrev.com/02a95e3084f979084fa8586e1718a6e6dd4c22da Cr-Commit-Position: refs/heads/master@{#302799}
6 years, 1 month ago (2014-11-05 15:00:19 UTC) #15
dcheng
On 2014/11/05 at 15:00:19, commit-bot wrote: > Patchset 6 (id:??) landed as https://crrev.com/02a95e3084f979084fa8586e1718a6e6dd4c22da > Cr-Commit-Position: ...
6 years, 1 month ago (2014-11-05 18:19:14 UTC) #16
recbradford
On 2014/11/05 18:19:14, dcheng wrote: > On 2014/11/05 at 15:00:19, commit-bot wrote: > > Patchset ...
6 years, 1 month ago (2014-11-05 20:58:40 UTC) #17
Noel Gordon
On 2014/11/05 20:58:40, recbradford wrote: > On 2014/11/05 18:19:14, dcheng wrote: > > On 2014/11/05 ...
6 years, 1 month ago (2014-11-06 03:05:55 UTC) #18
Nico
Why do we want these changes? Neither the CL description nor the bug really say ...
6 years, 1 month ago (2014-11-18 19:16:45 UTC) #20
robert.bradford
6 years, 1 month ago (2014-11-20 12:37:47 UTC) #21
On 2014/11/18 19:16:45, Nico wrote:
> Why do we want these changes? Neither the CL description nor the bug really
say
> (except a vague "it's recommended" and "robohornet improvements on linux
> desktop" without actual numbers), and they are cost maintenance: There were
two
> changes trying to get this building in the win/gn build (and it's still not
> right), and it currently doesn't work right on our clang/win bot either.
> 
> All of that is fairly easy to fix, but I don't think we should add assembly /
> intrinsic reimplementations without there being a really strong reason for
doing
> so.

Hi Nico, i'm sorry for the inconvenience caused with the Windows builds. I
didn't even realise that clang on Windows was a possibility before your original
feedback on the first version of the patch.

I've asked our performance team to provide some more metrics showing the value
of these optimisations. These compression optimisations are also the ground work
for some patches under development to provide SIMD acceleration for
decompression which will have a wide impact.

Powered by Google App Engine
This is Rietveld 408576698