|
|
DescriptionUse a different variant of CpuFeatures::FlushICache asm with clang.
This variant avoids a constant pool entry, which can be problematic
when LTO'ing. It is also slightly shorter.
R=bmeurer@chromium.org,Jacob.Bramley@arm.com
BUG=chromium:453195
LOG=n
Committed: https://crrev.com/0c05bdfd09ea5e01e5d8e6de1260ef43446ab15f
Cr-Commit-Position: refs/heads/master@{#27474}
Patch Set 1 #
Total comments: 2
Messages
Total messages: 21 (6 generated)
ben@strongloop.com changed reviewers: + ben@strongloop.com
https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc File src/arm/cpu-arm.cc (right): https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc#newcode51 src/arm/cpu-arm.cc:51: register uint32_t scno asm("r7") = __ARM_NR_cacheflush; I don't think it's safe to assume that the compiler will actually load the value into the register this way. A peephole optimizer can't see into the asm block and won't realize that the register is actually used. It may opt to remove the load.
https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc File src/arm/cpu-arm.cc (right): https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc#newcode51 src/arm/cpu-arm.cc:51: register uint32_t scno asm("r7") = __ARM_NR_cacheflush; On 2015/03/07 18:34:49, bnoordhuis wrote: > I don't think it's safe to assume that the compiler will actually load the value > into the register this way. > > A peephole optimizer can't see into the asm block and won't realize that the > register is actually used. It may opt to remove the load. Please see https://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html#Local-Reg-Vars : "However, using the variable as an input or output operand to the asm guarantees that the specified register is used for that operand." Note also that r0, r1, r2 are already being set in the same way. (Clang has the same inline asm semantics as GCC, so it works the same way.)
On 2015/03/07 21:26:39, pcc1 wrote: > https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc > File src/arm/cpu-arm.cc (right): > > https://codereview.chromium.org/986643004/diff/1/src/arm/cpu-arm.cc#newcode51 > src/arm/cpu-arm.cc:51: register uint32_t scno asm("r7") = __ARM_NR_cacheflush; > On 2015/03/07 18:34:49, bnoordhuis wrote: > > I don't think it's safe to assume that the compiler will actually load the > value > > into the register this way. > > > > A peephole optimizer can't see into the asm block and won't realize that the > > register is actually used. It may opt to remove the load. > > Please see https://gcc.gnu.org/onlinedocs/gcc/Local-Reg-Vars.html#Local-Reg-Vars > : > > "However, using the variable as an input or output operand to the asm guarantees > that the specified register is used for that operand." > > Note also that r0, r1, r2 are already being set in the same way. (Clang has the > same inline asm semantics as GCC, so it works the same way.) Apologies, I missed that it's used as a constraint. (For some reason, I read it as 'nr'.)
LGTM I'm not comfortable with having two implementations that are effectively the same, but I can't think of a better solution. Note that your Clang path is marginally better than the existing GCC one, since the compiler can decide how best to initialise scno. The GCC problem only applies with -mthumb, so we could use your new version with -marm targets if you decide that the complexity is worthwhile. The system call will dominate the performance cost here, so it doesn't make a lot of difference. What I wanted to do was implement the whole thing in pure assembly. The problem is trivial — it just forwards arguments and executes `svc` — but it's awkward to express without using compiler-specific tricks (like `__attribute__((naked))`). I'm curious about the constant-pool problem, though. Clang will use constant pools itself, so it must have some way of handling them.
On 2015/03/09 10:14:26, jbramley wrote: > I'm curious about the constant-pool problem, though. Clang will use constant > pools itself, so it must have some way of handling them. Clang will emit constant pools for ldr= at the end of the .text section. With LTO, the .text section could be large enough that the distance to the constant exceeds the maximum displacement. (In principle, I imagine clang could be fixed to emit the pool at the end of the function, but still in principle the function could be large enough to trigger the same problem, especially with inlining.)
The CQ bit was checked by pcc@google.com
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/986643004/1
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/1056)
bmeurer: Ping.
The CQ bit was checked by bmeurer@chromium.org
lgtm
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/986643004/1
The CQ bit was unchecked by commit-bot@chromium.org
Try jobs failed on following builders: v8_presubmit on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_presubmit/builds/1553)
The CQ bit was checked by bmeurer@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/986643004/1
Message was sent while issue was closed.
Committed patchset #1 (id:1)
Message was sent while issue was closed.
Patchset 1 (id:??) landed as https://crrev.com/0c05bdfd09ea5e01e5d8e6de1260ef43446ab15f Cr-Commit-Position: refs/heads/master@{#27474} |