Description
Add atomic load/store, fetch_add, fence, and is-lock-free lowering.
Loads/stores w/ type i8, i16, and i32 are converted to
plain load/store instructions and lowered w/ the plain
lowerLoad/lowerStore. Atomic stores are followed by an mfence
for sequential consistency.
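As a rough source-level picture of the store-then-fence shape described above, the following C++11 sketch pairs a plain store with a full fence. The helper names store_then_fence and load_plain are hypothetical and are not part of this change. A relaxed store followed by a seq_cst fence only mirrors the shape of the lowering. It is not claimed to be identical to a seq_cst store in the abstract C++ memory model, and the exact instructions a compiler emits may differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: a plain store followed by a full fence, mirroring
// the "store, then mfence" shape described in the commit message.
inline void store_then_fence(std::atomic<uint32_t> &cell, uint32_t value) {
  cell.store(value, std::memory_order_relaxed);          // plain store
  std::atomic_thread_fence(std::memory_order_seq_cst);   // full fence
}

// Hypothetical helper: the load side needs no trailing fence in this scheme.
inline uint32_t load_plain(const std::atomic<uint32_t> &cell) {
  return cell.load(std::memory_order_seq_cst);
}
```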
For 64-bit types, use movq to do 64-bit memory
loads/stores (vs the usual load/store being broken into
separate 32-bit load/stores). For a store, this means first
bitcasting the i64 -> f64 (which splits the load of the value
to be stored into two 32-bit ops) and then storing in a single
op. For a load, load into f64 then bitcast back to i64 (which
splits after the atomic load). This follows what GCC does for
C++11 std::atomic<uint64_t> load/store methods (uses movq
when -mfpmath=sse). This introduces some redundancy between
movq and movsd, but the convention seems to be to use movq
when working with integer quantities. Otherwise, movsd
could work too. The difference seems to be in whether or
not the XMM register's upper 64 bits are filled with 0.
Zero-extending could help avoid partial register stalls.
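The 64-bit path above can be approximated in C++ with SSE2 intrinsics, as a sketch only. The function names load64_via_xmm and store64_via_xmm are hypothetical. The sketch assumes an x86 target with SSE2 and an 8-byte-aligned address, and the compiler is not obliged to emit exactly one movq per access. The value is assembled or split in 32-bit halves on the register side, and memory is touched with a single 64-bit access.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: one 64-bit memory read into an XMM register
// (upper 64 bits zeroed), then split into 32-bit halves after the load.
inline uint64_t load64_via_xmm(const uint64_t *addr) {
  assert((reinterpret_cast<uintptr_t>(addr) & 7) == 0);  // assumed alignment
  __m128i v = _mm_loadl_epi64(reinterpret_cast<const __m128i *>(addr));
  uint32_t lo = static_cast<uint32_t>(_mm_cvtsi128_si32(v));
  uint32_t hi = static_cast<uint32_t>(_mm_cvtsi128_si32(_mm_srli_epi64(v, 32)));
  return (static_cast<uint64_t>(hi) << 32) | lo;
}

// Hypothetical helper: build the value from two 32-bit halves in a
// register, then write it to memory with one 64-bit store.
inline void store64_via_xmm(uint64_t *addr, uint64_t value) {
  assert((reinterpret_cast<uintptr_t>(addr) & 7) == 0);  // assumed alignment
  int lo = static_cast<int>(static_cast<uint32_t>(value));
  int hi = static_cast<int>(static_cast<uint32_t>(value >> 32));
  __m128i v = _mm_set_epi32(0, 0, hi, lo);  // arguments run high lane to low lane
  _mm_storel_epi64(reinterpret_cast<__m128i *>(addr), v);
}
```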
Handle up to i32 fetch_add. TODO: add i64 via a cmpxchg loop.
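For reference, the i64 TODO amounts to the usual compare-exchange retry loop. A minimal C++11 sketch of that shape follows. The name fetch_add_via_cas is hypothetical, and this is source-level illustration rather than the lowering itself (on 32-bit x86 the loop body would typically be built around lock cmpxchg8b).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: fetch_add built from a compare-exchange loop.
// Returns the value held before the addition, like fetch_add does.
inline uint64_t fetch_add_via_cas(std::atomic<uint64_t> &cell, uint64_t delta) {
  uint64_t expected = cell.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads 'expected' with the current
  // value, so each retry recomputes the sum from fresh data.
  while (!cell.compare_exchange_weak(expected, expected + delta,
                                     std::memory_order_seq_cst,
                                     std::memory_order_relaxed)) {
  }
  return expected;
}
```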
TODO: add some runnable crosstests to make sure that this
doesn't do funny things to integer bit patterns that happen
to look like signaling NaNs and quiet NaNs. However, the system
clang would not know how to handle "llvm.nacl.*" if we choose to
target that level directly via .ll files. Alternatively, (a) use
old-school __sync methods (__sync_fetch_and_add w/ 0 to load) or
(b) require buildbot's clang/gcc to support C++11...
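A crosstest along the lines of option (b) might look roughly like the sketch below. It assumes a C++11 toolchain. The names roundtrips_exactly, kSignalingNaNBits and kQuietNaNBits are hypothetical, and the two constants are just one representative double-precision encoding of each NaN kind.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Integer bit patterns that, read as an IEEE-754 double, encode NaNs.
constexpr uint64_t kSignalingNaNBits = 0x7FF0000000000001ULL;
constexpr uint64_t kQuietNaNBits = 0x7FF8000000000000ULL;

// Hypothetical check: store a 64-bit pattern atomically, load it back,
// and confirm that every bit survived the round trip.
inline bool roundtrips_exactly(uint64_t bits) {
  std::atomic<uint64_t> cell{0};
  cell.store(bits, std::memory_order_seq_cst);
  return cell.load(std::memory_order_seq_cst) == bits;
}
```

If the 64-bit path quieted a signaling NaN on the way through an FP register, the first pattern would come back with a different mantissa bit and the check would fail.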
BUG= https://code.google.com/p/nativeclient/issues/detail?id=3882
R=stichnot@chromium.org
Committed: https://gerrit.chromium.org/gerrit/gitweb?p=native_client/pnacl-subzero.git;a=commit;h=5cd240d
Patch Set 1
Patch Set 2 : Handle atomic rmw add up to i32 for now
Patch Set 3 : use movq for cast
Patch Set 4 : beef up test a bit
Total comments: 1
Patch Set 5 : make sure atomic loads aren't optimized out
Patch Set 6 : test atomic rmw is not elided also
Total comments: 10
Patch Set 7 : review
Patch Set 8 : add a comment about xadd
Patch Set 9 : move _xadd fakedef to common _xadd code
Total comments: 7
Patch Set 10 : review
Patch Set 11 : cleanup more
Patch Set 12 : couple more daring tests
Patch Set 13 : fix test todo now that separate fix landed
Patch Set 14 : test 64 errors more
Total comments: 2
Patch Set 15 : check width
Total comments: 3
Patch Set 16 : add LOCK prefix to the usage part of comment
Patch Set 17 : change comment
Messages
Total messages: 14 (0 generated)