Issue 1700473003: NEON f32 <-> f16 and f32 <-> u16

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:48:49 UTC) #1

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:49:41 UTC) #2

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:49:47 UTC) #3

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:51:55 UTC) #4

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:52:13 UTC) #5

mtklein_C

mtklein@chromium.org changed reviewers: + reed@google.com

4 years, 10 months ago (2016-02-12 22:52:38 UTC) #6

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/1 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/1

4 years, 10 months ago (2016-02-12 22:52:44 UTC) #9

commit-bot: I haz the power

Note for Reviewers: The CQ is waiting for an approval. If you believe that the ...

4 years, 10 months ago (2016-02-12 22:52:44 UTC) #10

mtklein_C

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-12 22:54:15 UTC) #11

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-13 04:52:00 UTC) #12

commit-bot: I haz the power

No LGTM from a valid reviewer yet. Please ask for an LGTM from a full ...

4 years, 10 months ago (2016-02-13 04:52:00 UTC) #13

mtklein

4 years, 10 months ago (2016-02-14 16:45:39 UTC) #14

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x):

   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	


GCC, for comparison:
   1108.07  	xferu64_bw_1_opaque_u16	
  53033.72  	xferu64_bw_1_alpha_u16	
  56324.06  	xferu64_aa_1_opaque_u16	
  63194.09  	xferu64_aa_1_alpha_u16	
    629.98  	xferu64_bw_1_opaque_f16	
  95098.56  	xferu64_bw_1_alpha_f16	
 109346.14  	xferu64_aa_1_opaque_f16	
 106094.29  	xferu64_aa_1_alpha_f16	



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	


GCC:
    597.99 ?	xferu64_bw_1_opaque_u16	nonrendering
  53036.52  	xferu64_bw_1_alpha_u16	nonrendering
  56328.17  	xferu64_aa_1_opaque_u16	nonrendering
  63196.74  	xferu64_aa_1_alpha_u16	nonrendering
    575.16  	xferu64_bw_1_opaque_f16	nonrendering
   8866.49  	xferu64_bw_1_alpha_f16	nonrendering
  11050.74  	xferu64_aa_1_opaque_f16	nonrendering
  14128.42  	xferu64_aa_1_alpha_f16	nonrendering



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:49:13 UTC) #15

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	


GCC:
    597.99 ?	xferu64_bw_1_opaque_u16	nonrendering
  53036.52  	xferu64_bw_1_alpha_u16	nonrendering
  56328.17  	xferu64_aa_1_opaque_u16	nonrendering
  63196.74  	xferu64_aa_1_alpha_u16	nonrendering
    575.16  	xferu64_bw_1_opaque_f16	nonrendering
   8866.49  	xferu64_bw_1_alpha_f16	nonrendering
  11050.74  	xferu64_aa_1_opaque_f16	nonrendering
  14128.42  	xferu64_aa_1_alpha_f16	nonrendering



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	


GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:49:27 UTC) #16

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	


GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:50:23 UTC) #17

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
on ARMv7-compatible NEON for GCC.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========
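The compile-time dispatch described above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the macro `SKETCH_HAS_NEON_F16_CVT` and all function names are hypothetical, and the fallback here is plain scalar bit manipulation standing in for the ARMv7-compatible NEON path the description mentions. The scalar float-to-half helper rounds to nearest even but, as a simplification, flushes results that would be subnormal halves to zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical macro: take the hardware path only on ARMv8 (AArch64) with Clang,
// mirroring the compiler check the description mentions.
#if defined(__aarch64__) && defined(__clang__)
    #include <arm_neon.h>
    #define SKETCH_HAS_NEON_F16_CVT 1
#else
    #define SKETCH_HAS_NEON_F16_CVT 0
#endif

// Scalar half -> float for one value (portable stand-in for the fallback path).
inline float sketch_half_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        // Zero or subnormal half: the value is man * 2^-24, exact in float.
        float f = (float)man * (1.0f / 16777216.0f);
        std::memcpy(&bits, &f, sizeof bits);
        bits |= sign;
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (man << 13);            // infinity or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13);   // rebias 15 -> 127
    }
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// Scalar float -> half, round to nearest even.
// Simplification: anything below the smallest normal half becomes signed zero.
inline uint16_t sketch_float_to_half(float f) {
    uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    uint16_t sign = (uint16_t)((x >> 16) & 0x8000u);
    uint32_t mag  = x & 0x7FFFFFFFu;
    if (mag > 0x7F800000u)  return (uint16_t)(sign | 0x7E00u);  // NaN
    if (mag >= 0x47800000u) return (uint16_t)(sign | 0x7C00u);  // >= 65536 or infinity
    if (mag < 0x38800000u)  return sign;                        // < 2^-14, flushed
    uint32_t rebased = mag - 0x38000000u;                       // rebias 127 -> 15
    uint32_t rounded = rebased + 0x0FFFu + ((rebased >> 13) & 1u);
    return (uint16_t)(sign | (rounded >> 13));
}

// Convert four halves (stored as uint16_t) to four floats.
inline void f16_to_f32_x4(const uint16_t src[4], float dst[4]) {
#if SKETCH_HAS_NEON_F16_CVT
    float16x4_t h = vreinterpret_f16_u16(vld1_u16(src));
    vst1q_f32(dst, vcvt_f32_f16(h));     // hardware f16 -> f32
#else
    for (int i = 0; i < 4; i++) dst[i] = sketch_half_to_float(src[i]);
#endif
}

// Convert four floats to four halves (stored as uint16_t).
inline void f32_to_f16_x4(const float src[4], uint16_t dst[4]) {
#if SKETCH_HAS_NEON_F16_CVT
    float16x4_t h = vcvt_f16_f32(vld1q_f32(src));   // hardware f32 -> f16
    vst1_u16(dst, vreinterpret_u16_f16(h));
#else
    for (int i = 0; i < 4; i++) dst[i] = sketch_float_to_half(src[i]);
#endif
}
```

Keeping the check in one macro means callers only ever see the two `_x4` functions, so a GCC build on ARMv8 and any ARMv7 build compile the same fallback without further conditionals.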

mtklein

4 years, 10 months ago (2016-02-14 16:51:25 UTC) #18

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t (timed on N5x).

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for uint16_t code on GCC.

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:51:43 UTC) #19

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for uint16_t code on GCC.

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:52:58 UTC) #20

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

Clang:
   1425.29  	xferu64_bw_1_opaque_u16	
   7712.89  	xferu64_bw_1_alpha_u16	
  10338.13  	xferu64_aa_1_opaque_u16	
  13750.49  	xferu64_aa_1_alpha_u16	
   1112.06  	xferu64_bw_1_opaque_f16	
   6070.07  	xferu64_bw_1_alpha_f16	
   8789.06  	xferu64_aa_1_opaque_f16	
  11975.83  	xferu64_aa_1_alpha_f16	

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 16:54:00 UTC) #21

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x, Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x, GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 17:36:47 UTC) #22

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x, Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x, GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x, Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x, GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5, GCC
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 17:37:17 UTC) #23

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x, Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x, GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5, GCC
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x (ARMv8), Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x (ARMv8), GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5 (ARMv7), Clang

N5 (ARMv7), GCC
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

4 years, 10 months ago (2016-02-14 17:41:01 UTC) #24

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x (ARMv8), Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x (ARMv8), GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5 (ARMv7), Clang

N5 (ARMv7), GCC
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.   Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x (ARMv8), Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x (ARMv8), GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5 (ARMv7), Clang
    470.13  	xferu64_bw_1_opaque_u16
  17775.88  	xferu64_bw_1_alpha_u16
  20440.19  	xferu64_aa_1_opaque_u16
  25235.11  	xferu64_aa_1_alpha_u16
    464.99  	xferu64_bw_1_opaque_f16
  10631.84  	xferu64_bw_1_alpha_f16
  13293.95  	xferu64_aa_1_opaque_f16
  18150.39  	xferu64_aa_1_alpha_f16

N5 (ARMv7), GCC
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========
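
Editorial sketch of the compile-time dispatch the description above outlines: use the ARMv8 NEON f16 -> f32 conversion only when building for AArch64 with Clang, and otherwise take a fallback. This is not the code from the patch. The names HAVE_NEON_F16_CVT, half_to_float_portable and half_to_float4 are hypothetical, and the fallback here is plain scalar C++ rather than the ARMv7-compatible NEON path the description mentions, so that the sketch builds on any target.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: mirror the description's "check for Clang" on ARMv8.
#if defined(__aarch64__) && defined(__clang__)
    #include <arm_neon.h>
    #define HAVE_NEON_F16_CVT 1
#else
    #define HAVE_NEON_F16_CVT 0
#endif

// Portable scalar IEEE half -> float (stand-in for the ARMv7 fallback).
static float half_to_float_portable(uint16_t h) {
    uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        if (man == 0) {
            bits = sign;                      // signed zero
        } else {
            // Denormal half: shift the mantissa up until the implicit bit appears.
            uint32_t shift = 0;
            while (!(man & 0x400u)) { man <<= 1; ++shift; }
            man &= 0x3FFu;
            bits = sign | ((113u - shift) << 23) | (man << 13);
        }
    } else if (exp == 31) {
        bits = sign | 0x7F800000u | (man << 13);   // infinity or NaN
    } else {
        bits = sign | ((exp + 112u) << 23) | (man << 13);  // rebias 15 -> 127
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Convert four half-floats at once, picking the path at compile time.
static void half_to_float4(const uint16_t src[4], float dst[4]) {
#if HAVE_NEON_F16_CVT
    float16x4_t h = vreinterpret_f16_u16(vld1_u16(src));
    vst1q_f32(dst, vcvt_f32_f16(h));          // single ARMv8 conversion instruction
#else
    for (int i = 0; i < 4; ++i) {
        dst[i] = half_to_float_portable(src[i]);
    }
#endif
}
```

Both paths should agree on every input; the point of the preprocessor check is only to avoid asking a compiler for intrinsics it does not provide.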

mtklein

Description was changed from ========== On ARMv8, we definitely have NEON f16 <-> f32 instructions. ...

4 years, 10 months ago (2016-02-14 17:48:26 UTC) #25

Description was changed from

==========
On ARMv8, we definitely have NEON f16 <-> f32 instructions.

... unfortunately GCC 4.9 doesn't seem to know that.  We could
work around that with inline assembly, but I don't feel like
pandering to old compilers.  Instead, check for Clang, falling back
to ARMv7-compatible NEON for ARMv8 GCC and, of course, ARMv7.

This means that on ARMv8, half-float is a faster storage format 
than uint16_t.  Though some of that is due to garbage code generation
for the uint16_t case on GCC.

N5x (ARMv8), Clang:
   1113.04  	xferu64_bw_1_opaque_u16
   7707.76  	xferu64_bw_1_alpha_u16
  10333.98  	xferu64_aa_1_opaque_u16
  13723.14  	xferu64_aa_1_alpha_u16
   1112.06  	xferu64_bw_1_opaque_f16
   6059.57  	xferu64_bw_1_alpha_f16
   8778.08  	xferu64_aa_1_opaque_f16
  11973.88  	xferu64_aa_1_alpha_f16

N5x (ARMv8), GCC:
    597.99  	xferu64_bw_1_opaque_u16
  53036.52  	xferu64_bw_1_alpha_u16
  56328.17  	xferu64_aa_1_opaque_u16
  63196.74  	xferu64_aa_1_alpha_u16
    575.16  	xferu64_bw_1_opaque_f16
   8866.49  	xferu64_bw_1_alpha_f16
  11050.74  	xferu64_aa_1_opaque_f16
  14128.42  	xferu64_aa_1_alpha_f16

N5 (ARMv7), Clang:
    470.13  	xferu64_bw_1_opaque_u16
  17775.88  	xferu64_bw_1_alpha_u16
  20440.19  	xferu64_aa_1_opaque_u16
  25235.11  	xferu64_aa_1_alpha_u16
    464.99  	xferu64_bw_1_opaque_f16
  10631.84  	xferu64_bw_1_alpha_f16
  13293.95  	xferu64_aa_1_opaque_f16
  18150.39  	xferu64_aa_1_alpha_f16

N5 (ARMv7), GCC:
   1028.12  	xferu64_bw_1_opaque_u16
  38204.10  	xferu64_bw_1_alpha_u16
  44265.87  	xferu64_aa_1_opaque_u16
  46950.93  	xferu64_aa_1_alpha_u16
    911.87  	xferu64_bw_1_opaque_f16
  11553.22  	xferu64_bw_1_alpha_f16
  15076.66  	xferu64_aa_1_opaque_f16
  20457.03  	xferu64_aa_1_alpha_f16



BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was just
TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:48:39 UTC) #26

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was just
TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

mtklein_C

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:50:06 UTC) #27

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:51:36 UTC) #28

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

In all cases, f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:51:56 UTC) #29

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
To make it a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:52:31 UTC) #30

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:53:40 UTC) #31

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16, and faster with proper ARMv8.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:54:34 UTC) #32

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 17:59:38 UTC) #33

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========
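
Editorial sketch of how the "at least competitive" claim can be checked against the Nexus 5x (ARMv8), Clang figures in the description above. It assumes the bench figures are times, so lower is better; the description does not state the unit. BenchPair, kNexus5xClang, speedup and print_speedups are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <cstdio>

// One benchmark measured with both storage formats.
struct BenchPair {
    const char* name;
    double u16;   // figure for the uint16_t variant
    double f16;   // figure for the half-float variant
};

// Figures copied from the "Nexus 5x (ARMv8), Clang" table in the description.
static const BenchPair kNexus5xClang[] = {
    {"xferu64_bw_1_alpha",   7795.90,  6064.70},
    {"xferu64_aa_1_opaque", 10327.39,  8782.47},
    {"xferu64_aa_1_alpha",  13880.62, 11970.70},
};

// Ratio above 1.0 means the f16 variant took less time than u16.
static double speedup(const BenchPair& b) {
    return b.u16 / b.f16;
}

static void print_speedups() {
    for (std::size_t i = 0; i < sizeof(kNexus5xClang) / sizeof(kNexus5xClang[0]); ++i) {
        std::printf("%-22s u16/f16 = %.3f\n",
                    kNexus5xClang[i].name, speedup(kNexus5xClang[i]));
    }
}
```

Under that assumption every ratio in this table is above 1.0, which matches the description's statement that f16 is at least competitive with u16.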

mtklein

The CQ bit was checked by mtklein@google.com to run a CQ dry run

4 years, 10 months ago (2016-02-14 18:04:46 UTC) #34

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/100001

4 years, 10 months ago (2016-02-14 18:04:49 UTC) #35

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-14 18:27:12 UTC) #36

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 10 months ago (2016-02-14 18:27:12 UTC) #37

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 21:27:53 UTC) #38

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
For a fair comparison, also adds NEON f32 <-> u16 code, which was a TODO.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-14 21:29:36 UTC) #39

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot
==========
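
For readers unfamiliar with the f32 <-> u16 half of this comparison, here is a scalar sketch of what such a conversion usually looks like. It is not the NEON code from the patch. The function names are made up for illustration, and the unorm mapping (0.0f -> 0, 1.0f -> 65535, round to nearest) is an assumption that the description does not spell out.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not a Skia API.
// Assumption: u16 channels are unorm, so [0.0f, 1.0f] maps onto [0, 65535].
inline uint16_t sketch_float_to_u16(float f) {
    // Clamp first. With this argument order a NaN input comes out as 0.
    f = std::min(std::max(0.0f, f), 1.0f);
    // Scale and round to nearest by adding 0.5 before truncating.
    return static_cast<uint16_t>(f * 65535.0f + 0.5f);
}

// Hypothetical helper, not a Skia API. Inverse of the mapping above.
inline float sketch_u16_to_float(uint16_t u) {
    return u * (1.0f / 65535.0f);
}
```

A vector version would do the same multiply, add and narrowing convert on four lanes at a time, which is presumably what makes it a fair baseline for the f16 path.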

mtklein

The CQ bit was checked by mtklein@google.com to run a CQ dry run

4 years, 10 months ago (2016-02-14 21:29:44 UTC) #40

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/100001

4 years, 10 months ago (2016-02-14 21:29:53 UTC) #41

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-14 22:26:38 UTC) #42

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot on client.skia.android (JOB_FAILED, http://build.chromium.org/p/client.skia.android/builders/Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot/builds/49)

4 years, 10 months ago (2016-02-14 22:26:39 UTC) #43

mtklein_C

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-17 17:01:05 UTC) #47

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

mtklein

The patchset sent to the CQ was uploaded after l-g-t-m from reed@google.com Link to the ...

4 years, 10 months ago (2016-02-17 17:01:46 UTC) #49

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/120001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/120001

4 years, 10 months ago (2016-02-17 17:01:56 UTC) #50

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-17 18:22:22 UTC) #52

Description was changed from

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest,
so we fall back on my ARMv7 version there.  The ARMv7 version is different enough
from the SSE version that it does not make sense to use SkNx.

f16 is at least competitive with u16.

Nexus 5 (ARMv7), GCC:
  10218.75  	xferu64_bw_1_alpha_u16	nonrendering
  12868.90  	xferu64_aa_1_opaque_u16	nonrendering
  19093.02  	xferu64_aa_1_alpha_u16	nonrendering

  11520.75  	xferu64_bw_1_alpha_f16	nonrendering
  15064.45  	xferu64_aa_1_opaque_f16	nonrendering
  20384.28  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5 (ARMv7), Clang:
  17812.26  	xferu64_bw_1_alpha_u16	nonrendering
  20440.92  	xferu64_aa_1_opaque_u16	nonrendering
  25239.75 !	xferu64_aa_1_alpha_u16	nonrendering

  10631.35  	xferu64_bw_1_alpha_f16	nonrendering
  13285.64  	xferu64_aa_1_opaque_f16	nonrendering
  18147.22  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), GCC:
   8604.82 !	xferu64_bw_1_alpha_u16	nonrendering
  12658.99  	xferu64_aa_1_opaque_u16	nonrendering
  14555.23  	xferu64_aa_1_alpha_u16	nonrendering

   8876.97  	xferu64_bw_1_alpha_f16	nonrendering
  11141.55 ?	xferu64_aa_1_opaque_f16	nonrendering
  14257.30  	xferu64_aa_1_alpha_f16	nonrendering

Nexus 5x (ARMv8), Clang:
   7795.90 ?	xferu64_bw_1_alpha_u16	nonrendering
  10327.39  	xferu64_aa_1_opaque_u16	nonrendering
  13880.62  	xferu64_aa_1_alpha_u16	nonrendering

   6064.70  	xferu64_bw_1_alpha_f16	nonrendering
   8782.47  	xferu64_aa_1_opaque_f16	nonrendering
  11970.70  	xferu64_aa_1_alpha_f16	nonrendering

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========

to

==========
NEON f32 <-> f16 and f32 <-> u16

Adds f32 <-> f16 ARMv7 and ARMv8 NEON code.
Also adds NEON f32 <-> u16 code to make the comparison fair.

The NDK GCC does not support the ARMv8 NEON intrinsics needed to go fastest, so
we use a tiny amount of inline assembly.

The ARMv7 half -> float is different enough from the SSE version that it does
not make sense to use SkNx.

Still TODO: 
ARMv7 float -> half.  Naively translating the SSE version results in 0x0000
where we'd expect a denormal output.

BUG=skia:
GOLD_TRYBOT_URL=
https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&is...
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
==========
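
To make the half <-> float discussion in the description above concrete, here is a scalar, integer-only sketch of both directions. It is not the code in SkHalf.h. The function names are hypothetical, and rounding by truncation in float -> half is an assumption chosen to keep the sketch short. The denormal branch in float -> half is the case the TODO says comes out as 0x0000 on ARMv7. One plausible cause, which this thread does not confirm, is that ARMv7 NEON arithmetic flushes denormals to zero, so a version that builds the denormal with a float multiply loses it.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not a Skia API: IEEE 754 binary16 bits -> float.
inline float sketch_half_to_float(uint16_t h) {
    uint32_t sign = (h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    if (exp == 0) {
        // Zero or denormal half: the value is mant * 2^-24, exact in a float.
        float f = mant * (1.0f / 16777216.0f);
        std::memcpy(&bits, &f, sizeof(bits));
        bits |= sign;
    } else if (exp == 31) {
        // Infinity or NaN: all-ones float exponent, keep the payload bits.
        bits = sign | 0x7F800000u | (mant << 13);
    } else {
        // Normal: rebias the exponent (127 - 15 = 112), shift mantissa by 13.
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }
    float out;
    std::memcpy(&out, &bits, sizeof(out));
    return out;
}

// Hypothetical helper, not a Skia API: float -> binary16 bits.
// Assumption: extra mantissa bits are truncated, not rounded to nearest.
inline uint16_t sketch_float_to_half(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    uint32_t sign = (bits >> 16) & 0x8000u;
    uint32_t mag  = bits & 0x7FFFFFFFu;
    uint32_t out;
    if (mag > 0x7F800000u) {
        out = 0x7E00u;                       // NaN -> a quiet half NaN
    } else if (mag >= 0x47800000u) {
        out = 0x7C00u;                       // >= 65536, or infinity
    } else if (mag >= 0x38800000u) {
        out = (mag - 0x38000000u) >> 13;     // normal half (>= 2^-14)
    } else if (mag < 0x33800000u) {
        out = 0;                             // below 2^-24: truncates to zero
    } else {
        // Denormal half, done with integer shifts only so no float
        // arithmetic (and no flush-to-zero) is involved.
        uint32_t exp  = mag >> 23;                        // 103..112
        uint32_t mant = (mag & 0x7FFFFFu) | 0x800000u;    // implicit 1 bit
        out = mant >> (126u - exp);
    }
    return static_cast<uint16_t>(sign | out);
}
```

With these definitions every finite half value round-trips through float and back unchanged, including the denormals 0x0001 through 0x03FF.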

mtklein

The CQ bit was checked by mtklein@google.com to run a CQ dry run

4 years, 10 months ago (2016-02-17 18:25:56 UTC) #53

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/140001

4 years, 10 months ago (2016-02-17 18:26:08 UTC) #54

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-17 18:31:16 UTC) #55

mtklein

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-17 18:36:02 UTC) #56

mtklein

This is probably a good time to take a(nother) look. I've changed two things: 1) ...

4 years, 10 months ago (2016-02-17 18:44:11 UTC) #57

msarett

I realize I've partially reviewed code that was already there :). https://codereview.chromium.org/1700473003/diff/140001/src/core/SkHalf.h File src/core/SkHalf.h (right): ...

4 years, 10 months ago (2016-02-17 19:46:38 UTC) #58

mtklein

https://codereview.chromium.org/1700473003/diff/140001/src/core/SkHalf.h File src/core/SkHalf.h (right): https://codereview.chromium.org/1700473003/diff/140001/src/core/SkHalf.h#newcode55 src/core/SkHalf.h:55: norm = vreinterpretq_f32_u32(vaddq_u32(vshlq_n_u32(h, 13), On 2016/02/17 19:46:38, msarett wrote: ...

4 years, 10 months ago (2016-02-17 19:52:40 UTC) #59

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-17 20:08:59 UTC) #60

commit-bot: I haz the power

Dry run: Try jobs failed on following builders: Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot on client.skia.android (JOB_FAILED, http://build.chromium.org/p/client.skia.android/builders/Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot/builds/51)

4 years, 10 months ago (2016-02-17 20:09:00 UTC) #61

mtklein

On 2016/02/17 20:09:00, commit-bot: I haz the power wrote: > Dry run: Try jobs failed ...

4 years, 10 months ago (2016-02-17 20:28:22 UTC) #62

mtklein

The CQ bit was checked by mtklein@google.com to run a CQ dry run

4 years, 10 months ago (2016-02-17 20:28:34 UTC) #63

commit-bot: I haz the power

Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/140001

4 years, 10 months ago (2016-02-17 20:28:42 UTC) #64

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-17 22:36:11 UTC) #65

commit-bot: I haz the power

Dry run: This issue passed the CQ dry run.

4 years, 10 months ago (2016-02-17 22:36:13 UTC) #66

mtklein

Gonna get this baking... happy to follow up / evolve it.

4 years, 10 months ago (2016-02-19 14:19:52 UTC) #67

mtklein

The patchset sent to the CQ was uploaded after l-g-t-m from reed@google.com Link to the ...

4 years, 10 months ago (2016-02-19 14:20:02 UTC) #69

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/140001

4 years, 10 months ago (2016-02-19 14:20:12 UTC) #70

commit-bot: I haz the power

The CQ bit was unchecked by commit-bot@chromium.org

4 years, 10 months ago (2016-02-19 16:20:43 UTC) #71

commit-bot: I haz the power

Try jobs failed on following builders: Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot on client.skia.android (JOB_TIMED_OUT, no build URL)

4 years, 10 months ago (2016-02-19 16:20:44 UTC) #72

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1700473003/140001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1700473003/140001

4 years, 10 months ago (2016-02-19 16:41:21 UTC) #74

commit-bot: I haz the power

Description was changed from ========== NEON f32 <-> f16 and f32 <-> u16 Adds f32 ...

4 years, 10 months ago (2016-02-19 17:40:26 UTC) #75

commit-bot: I haz the power

4 years, 10 months ago (2016-02-19 17:40:28 UTC) #76

Message was sent while issue was closed.

Committed patchset #8 (id:140001) as
https://skia.googlesource.com/skia/+/be8c19e8d3deac9b9585c44b9a423912dd00a75a

Issue 1700473003: NEON f32 <-> f16 and f32 <-> u16 (Closed)

Description

Patch Set 1
Patch Set 2: ARMv7 support too
Patch Set 3: fixes
Patch Set 4: q
Patch Set 5: tweak
Patch Set 6: f32 <-> u16
Patch Set 7: back off from ARMv7
Patch Set 8: armv8 asm

Messages