|
|
Created:
6 years, 8 months ago by oetuaho-nv Modified:
6 years, 8 months ago CC:
v8-dev Base URL:
git://github.com/v8/v8.git@master Visibility:
Public. |
DescriptionARM: Do not set FPSCR when converting to clamped uint8
Setting the FPSCR flags is expensive on some CPUs. Get rid of repeated
setting of the FPSCR by relying on the correct default flags being set
when doing uint8 clamping. Also use vcvt_u32_f64 instead of vcvt_s32_f64,
which enables removing the check against zero (vcvt_u32_f64 will clamp to
zero).
To be on the safe side, add asserts to check that the VFP rounding mode
flags are set to default as expected.
This increases performance of a hot loop repeatedly setting
Uint8ClampedArray values on some CPUs by as much as a factor of 12.
BUG=v8:3253
LOG=N
R=jacob.bramley@arm.com, rmcilroy@chromium.org, ulan@chromium.org
Committed: https://code.google.com/p/v8/source/detail?r=20676
Committed: https://code.google.com/p/v8/source/detail?r=20755
Patch Set 1 #
Total comments: 4
Patch Set 2 : Address nits #Patch Set 3 : Fix initial FPSCR state in ARM simulator #Patch Set 4 : Use stop instead of assert to avoid infinite recursion #Patch Set 5 : Rebase on top of separate simulator fix #Messages
Total messages: 21 (0 generated)
This CL implements the alternative, simpler approach from discussions at: https://codereview.chromium.org/222403002/ Did not find any cases where non-default rounding mode would have been interfering with the code so far.
For the record: lgtm (I'm not an owner though, so that doesn't actually help you.) https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.cc File src/arm/macro-assembler-arm.cc (right): https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.... src/arm/macro-assembler-arm.cc:3815: if (emit_debug_code()) { I'd be tempted to omit this, since everything else depends on it too and we (now) check it in VFPEnsureFPSCRState, but I won't object if you insist.
https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.cc File src/arm/macro-assembler-arm.cc (right): https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.... src/arm/macro-assembler-arm.cc:3815: if (emit_debug_code()) { On 2014/04/09 14:18:26, jbramley wrote: > I'd be tempted to omit this, since everything else depends on it too and we > (now) check it in VFPEnsureFPSCRState, but I won't object if you insist. Yes, I thought about leaving this out too, but this is the only place in the code where the vcvt variant with kFPSCRRounding set seems to be called, so it doesn't seem too out-of-place to double-check. But if there's a strong opposite opinion from someone we can surely throw this out.
Looks good other than my nits - thanks. https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.cc File src/arm/macro-assembler-arm.cc (right): https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.... src/arm/macro-assembler-arm.cc:3815: if (emit_debug_code()) { On 2014/04/09 14:24:32, oetuaho-nv wrote: > On 2014/04/09 14:18:26, jbramley wrote: > > I'd be tempted to omit this, since everything else depends on it too and we > > (now) check it in VFPEnsureFPSCRState, but I won't object if you insist. > > Yes, I thought about leaving this out too, but this is the only place in the > code where the vcvt variant with kFPSCRRounding set seems to be called, so it > doesn't seem too out-of-place to double-check. But if there's a strong opposite > opinion from someone we can surely throw this out. This may be the only place vcvt uses the fpscr's rounding mode, but all the other FP arithmetic operations depend on fpscr's rounding mode being correct, so I don't think it's worth adding this check here (but adding it to VFPEnsureFPSCRState seems like a good move). https://codereview.chromium.org/230473005/diff/1/src/arm/macro-assembler-arm.... src/arm/macro-assembler-arm.cc:3821: // value in input_reg < 255. nit - "For inputs < 255 (including negative) vcvt_u32_f64 with round-to-nearest rounding mode will provide the correct."
Uploaded a new version with the nits addressed.
On 2014/04/10 09:21:52, oetuaho-nv wrote: > Uploaded a new version with the nits addressed. lgtm. I'll land this for you.
lgtm
Message was sent while issue was closed.
Committed patchset #2 manually as r20676 (presubmit successful).
Message was sent while issue was closed.
On 2014/04/11 09:22:29, rmcilroy wrote: > Committed patchset #2 manually as r20676 (presubmit successful). Looks like this broke mjsunit/ascii-regexp-subject and bunch of other tests: === mjsunit/ascii-regexp-subject === abort: Default rounding mode not set # # Fatal error in ../src/frames.cc, line 455 # CHECK(kind == Code::FUNCTION || kind == Code::OPTIMIZED_FUNCTION) failed # ==== C stack trace =============================== 1: V8_Fatal 2: v8::internal::StackFrame::ComputeType(v8::internal::StackFrameIteratorBase const*, v8::internal::StackFrame::State*) 3: v8::internal::StackFrame::GetCallerState(v8::internal::StackFrame::State*) const 4: v8::internal::StackFrameIterator::Advance() 5: ?? 6: v8::internal::Isolate::PrintStack(v8::internal::StringStream*) 7: v8::internal::Isolate::PrintStack(_IO_FILE*) 8: ?? 9: v8::internal::Runtime_Abort(int, v8::internal::Object**, v8::internal::Isolate*) 10: v8::internal::Simulator::SoftwareInterrupt(v8::internal::Instruction*) 11: v8::internal::Simulator::DecodeType7(v8::internal::Instruction*) 12: v8::internal::Simulator::InstructionDecode(v8::internal::Instruction*) 13: v8::internal::Simulator::Execute() 14: v8::internal::Simulator::CallInternal(unsigned char*) 15: v8::internal::Simulator::Call(unsigned char*, int, ...) 16: ?? 17: v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, bool*, bool) 18: v8::Script::Run() 19: v8::Shell::ExecuteString(v8::Isolate*, v8::Handle<v8::String>, v8::Handle<v8::Value>, bool, bool) 20: v8::SourceGroup::Execute(v8::Isolate*) 21: v8::Shell::RunMain(v8::Isolate*, int, char**) 22: v8::Shell::Main(int, char**) 23: main 24: __libc_start_main 25: ?? Command: /usr/local/hdd/v8/out/arm.debug/d8 --test --random-seed=-1212432547 --nocrankshaft --nohard-abort --nodead-code-elimination --nofold-constants --enable-slow-asserts --debug-code --verify-heap /usr/local/hdd/v8/test/mjsunit/mjsunit.js /usr/local/hdd/v8/test/mjsunit/ascii-regexp-subject.js
Message was sent while issue was closed.
On 2014/04/11 09:54:41, ulan wrote: > On 2014/04/11 09:22:29, rmcilroy wrote: > > Committed patchset #2 manually as r20676 (presubmit successful). > > Looks like this broke mjsunit/ascii-regexp-subject and bunch of other tests: > > === mjsunit/ascii-regexp-subject === > abort: Default rounding mode not set > > > # > # Fatal error in ../src/frames.cc, line 455 > # CHECK(kind == Code::FUNCTION || kind == Code::OPTIMIZED_FUNCTION) failed > # > > ==== C stack trace =============================== > > 1: V8_Fatal > 2: v8::internal::StackFrame::ComputeType(v8::internal::StackFrameIteratorBase > const*, v8::internal::StackFrame::State*) > 3: v8::internal::StackFrame::GetCallerState(v8::internal::StackFrame::State*) > const > 4: v8::internal::StackFrameIterator::Advance() > 5: ?? > 6: v8::internal::Isolate::PrintStack(v8::internal::StringStream*) > 7: v8::internal::Isolate::PrintStack(_IO_FILE*) > 8: ?? > 9: v8::internal::Runtime_Abort(int, v8::internal::Object**, > v8::internal::Isolate*) > 10: v8::internal::Simulator::SoftwareInterrupt(v8::internal::Instruction*) > 11: v8::internal::Simulator::DecodeType7(v8::internal::Instruction*) > 12: v8::internal::Simulator::InstructionDecode(v8::internal::Instruction*) > 13: v8::internal::Simulator::Execute() > 14: v8::internal::Simulator::CallInternal(unsigned char*) > 15: v8::internal::Simulator::Call(unsigned char*, int, ...) > 16: ?? > 17: v8::internal::Execution::Call(v8::internal::Isolate*, > v8::internal::Handle<v8::internal::Object>, > v8::internal::Handle<v8::internal::Object>, int, > v8::internal::Handle<v8::internal::Object>*, bool*, bool) > 18: v8::Script::Run() > 19: v8::Shell::ExecuteString(v8::Isolate*, v8::Handle<v8::String>, > v8::Handle<v8::Value>, bool, bool) > 20: v8::SourceGroup::Execute(v8::Isolate*) > 21: v8::Shell::RunMain(v8::Isolate*, int, char**) > 22: v8::Shell::Main(int, char**) > 23: main > 24: __libc_start_main > 25: ?? > Command: /usr/local/hdd/v8/out/arm.debug/d8 --test --random-seed=-1212432547 > --nocrankshaft --nohard-abort --nodead-code-elimination --nofold-constants > --enable-slow-asserts --debug-code --verify-heap > /usr/local/hdd/v8/test/mjsunit/mjsunit.js > /usr/local/hdd/v8/test/mjsunit/ascii-regexp-subject.js I reverted the CL in https://codereview.chromium.org/233013005/. You can run the test-suite by running "make arm.debug.check".
Message was sent while issue was closed.
On 2014/04/11 10:15:02, rmcilroy wrote: > I reverted the CL in https://codereview.chromium.org/233013005/. You can run > the test-suite by running "make arm.debug.check". Right, looks like the simulator is setting RZ rounding mode at initialization while it should be setting RN. I'll upload the fix as soon as I have it verified.
Message was sent while issue was closed.
On 2014/04/11 10:39:09, oetuaho-nv wrote: > On 2014/04/11 10:15:02, rmcilroy wrote: > > I reverted the CL in https://codereview.chromium.org/233013005/. You can run > > the test-suite by running "make arm.debug.check". > > Right, looks like the simulator is setting RZ rounding mode at initialization > while it should be setting RN. I'll upload the fix as soon as I have it > verified. I was about to say the same thing! Default NaN should start off false, to match hardware. With the simulator, I still get several failures with arm.optdebug.check but they look different.
Message was sent while issue was closed.
On 2014/04/11 10:43:38, jbramley wrote: > On 2014/04/11 10:39:09, oetuaho-nv wrote: > > On 2014/04/11 10:15:02, rmcilroy wrote: > > > I reverted the CL in https://codereview.chromium.org/233013005/. You can > run > > > the test-suite by running "make arm.debug.check". > > > > Right, looks like the simulator is setting RZ rounding mode at initialization > > while it should be setting RN. I'll upload the fix as soon as I have it > > verified. > > I was about to say the same thing! Default NaN should start off false, to match > hardware. I meant to say "also, default NaN ..." > > With the simulator, I still get several failures with arm.optdebug.check but > they look different.
Message was sent while issue was closed.
On 2014/04/11 10:44:06, jbramley wrote: > On 2014/04/11 10:43:38, jbramley wrote: > > On 2014/04/11 10:39:09, oetuaho-nv wrote: > > > On 2014/04/11 10:15:02, rmcilroy wrote: > > > > I reverted the CL in https://codereview.chromium.org/233013005/. You can > > run > > > > the test-suite by running "make arm.debug.check". > > > > > > Right, looks like the simulator is setting RZ rounding mode at > initialization > > > while it should be setting RN. I'll upload the fix as soon as I have it > > > verified. > > > > I was about to say the same thing! Default NaN should start off false, to > match > > hardware. > > I meant to say "also, default NaN ..." > > > > > With the simulator, I still get several failures with arm.optdebug.check but > > they look different. Patch set 3 has the correct initial FPSCR state in the ARM simulator, but I don't have a cause for the other crashes yet. Did they repro only on optdebug for you? Any debugging tips?
Message was sent while issue was closed.
> Patch set 3 has the correct initial FPSCR state in the ARM simulator, but I > don't have a cause for the other crashes yet. Did they repro only on optdebug > for you? Any debugging tips? Ah, gdb backtrace does reveal something: it seems to be hitting the assert, only then Abort will call back into VFPEnsureFPSCRState and the infinite recursion will run out of stack. Will have to look into whether the state is actually being set to something different somewhere, or if it's some other kind of a simulator bug.
Message was sent while issue was closed.
On 2014/04/11 13:31:24, oetuaho-nv wrote: > > Patch set 3 has the correct initial FPSCR state in the ARM simulator, but I > > don't have a cause for the other crashes yet. Did they repro only on optdebug > > for you? Any debugging tips? A simulator release build passes everything at r20676, but I get similar crashes in a debug build.
Message was sent while issue was closed.
Figured out the bug, it was actually hitting infinite recursion in code generation, since Assert called back into VFPEnsureFPSCRState. Fixed in the latest patch set by using stop instead of Assert, all tests passed on arm.optdebug. Do I need to upload a new cl so that this can be committed, or is this one still fine?
Message was sent while issue was closed.
On 2014/04/14 10:29:50, oetuaho-nv wrote: > Figured out the bug, it was actually hitting infinite recursion in code > generation, since Assert called back into VFPEnsureFPSCRState. Fixed in the > latest patch set by using stop instead of Assert, all tests passed on > arm.optdebug. > > Do I need to upload a new cl so that this can be committed, or is this one still > fine? Please upload a separate CL with the simulator fix (so that it is obvious if this causes any additional failures when it is submitted). You can continue to use this CL for the original issue though.
Message was sent while issue was closed.
The simulator fix has been there for a day now, time to commit this?
On 2014/04/15 07:51:16, oetuaho-nv wrote: > The simulator fix has been there for a day now, time to commit this? Lgtm. I'll land later today.
Message was sent while issue was closed.
Committed patchset #5 manually as r20755 (presubmit successful). |