Chromium Code Reviews

Unified Diff: ChangeLog.txt

Issue 1934113002: Update libjpeg_turbo to 1.4.90 from https://github.com/libjpeg-turbo/ (Closed)
Base URL: https://chromium.googlesource.com/chromium/deps/libjpeg_turbo.git@master
Patch Set: Created 4 years, 8 months ago
Index: ChangeLog.txt
diff --git a/ChangeLog.txt b/ChangeLog.txt
index 3ec6c18f230c60a67c9082d0ebd71cace703d26f..4062c698d7501460b7c8b225b725e452a844a484 100644
--- a/ChangeLog.txt
+++ b/ChangeLog.txt
@@ -1,3 +1,407 @@
+1.5 beta1
+=========
+
+[1] Added full SIMD acceleration for PowerPC platforms using AltiVec VMX
+(128-bit SIMD) instructions. Although the performance of libjpeg-turbo on
+PowerPC was already good, due to the increased number of registers available
+to the compiler vs. x86, it was still possible to speed up compression by about
+3-4x and decompression by about 2-2.5x (relative to libjpeg v6b) through the
+use of AltiVec instructions.
+
+[2] Added two new libjpeg API functions (jpeg_skip_scanlines() and
+jpeg_crop_scanline()) that can be used to partially decode a JPEG image. See
+libjpeg.txt for more details.
+
+[3] The TJCompressor and TJDecompressor classes in the TurboJPEG Java API now
+implement the Closeable interface, so those classes can be used with a
+try-with-resources statement.
+
+[4] The TurboJPEG Java classes now throw unchecked idiomatic exceptions
+(IllegalArgumentException, IllegalStateException) for unrecoverable errors
+caused by incorrect API usage, and those classes throw a new checked exception
+type (TJException) for errors that are passed through from the C library.
+
+[5] Source buffers for the TurboJPEG C API functions, as well as the
+jpeg_mem_src() function in the libjpeg API, are now declared as const pointers.
+This facilitates passing read-only buffers to those functions and assures the
+caller that the source buffer will not be modified. This should not create any
+backward API or ABI incompatibilities with prior libjpeg-turbo releases.
+
+[6] The MIPS DSPr2 SIMD code can now be compiled to support either FR=0 or FR=1
+FPUs.
+
+[7] Fixed additional negative left shifts and other issues reported by the GCC
+and Clang undefined behavior sanitizers. Most of these issues affected only
+32-bit code, and none of them was known to pose a security threat, but removing
+the warnings makes it easier to detect actual security issues, should they
+arise in the future.
+
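Note on [7]: left-shifting a negative signed integer is undefined behavior in C, which is why the sanitizers flag it. The following is a minimal sketch of one common way to avoid it, not the fix actually applied in this patch. The helper name `left_shift_i32` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not taken from libjpeg-turbo): shift a possibly
 * negative 32-bit value left without invoking undefined behavior.
 * The shift is performed on the unsigned representation, where it is well
 * defined, and the result is converted back to a signed type. That final
 * conversion is implementation-defined, but it yields the expected
 * two's-complement value on mainstream compilers. */
int32_t left_shift_i32(int32_t value, int bits)
{
    assert(bits >= 0 && bits < 32);  /* shifting by >= the width is also undefined */
    return (int32_t)((uint32_t)value << bits);
}
```

For example, `left_shift_i32(-3, 4)` gives -48 on two's-complement targets, whereas the plain expression `-3 << 4` is what the undefined behavior sanitizers report.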
+[8] Removed the unnecessary .arch directive from the ARM64 NEON SIMD code.
+This directive was preventing the code from assembling using the clang
+integrated assembler.
+
+[9] Fixed a regression caused by 1.4.1[6] that prevented 32-bit and 64-bit
+libjpeg-turbo RPMs from being installed simultaneously on recent Red Hat/Fedora
+distributions. This was due to the addition of a macro in jconfig.h that
+allows the Huffman codec to determine the word size at compile time. Since
+that macro differs between 32-bit and 64-bit builds, this caused a conflict
+between the i386 and x86_64 RPMs (any differing files, other than executables,
+are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
+Since the macro is used only internally, it has been moved into jconfigint.h.
+
+[10] The x86-64 SIMD code can now be disabled at run time by setting the
+JSIMD_FORCENONE environment variable to 1 (the other SIMD implementations
+already had this capability.)
+
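Note on [10]: the entry describes setting JSIMD_FORCENONE in the environment. A sketch of doing the same from inside a test program follows. It assumes a POSIX system (for `setenv()`) and assumes the variable is set before the first libjpeg-turbo call in the process. The function name `request_no_simd` is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Ask for the non-SIMD code paths in this process by setting the
 * environment variable named in entry [10]. Returns 0 on success. */
int request_no_simd(void)
{
    int rc = setenv("JSIMD_FORCENONE", "1", 1 /* overwrite any old value */);
    if (rc == 0) {
        /* Sanity check: the value is now visible to later getenv() calls. */
        const char *value = getenv("JSIMD_FORCENONE");
        assert(value != NULL && strcmp(value, "1") == 0);
    }
    return rc;
}
```

Setting the variable in the shell before launching the program achieves the same thing and avoids any ordering concerns.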
+[11] Added a new command-line argument to TJBench (-nowrite) that prevents the
+benchmark from outputting any images. This removes any potential operating
+system overhead that might be caused by lazy writes to disk and thus improves
+the consistency of the performance measurements.
+
+[12] Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and
+x86-64 platforms. This speeds up the compression of full-color JPEGs by about
+10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and
+AMD CPUs. Additionally, this works around an issue in the clang optimizer that
+prevents it (as of this writing) from achieving the same performance as GCC
+when compiling the C version of the Huffman encoder
+(https://llvm.org/bugs/show_bug.cgi?id=16035). For the purposes of benchmarking
+or regression testing, SIMD-accelerated Huffman encoding can be disabled by
+setting the JSIMD_NOHUFFENC environment variable to 1.
+
+[13] Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
+compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
+downsampling algorithms, which are not accelerated in the 32-bit NEON
+implementation.) This speeds up the compression of full-color JPEGs by about
+75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
+Cortex-A53 and Cortex-A57 cores.
+
+[14] Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
+and 64-bit platforms.
+
+For 32-bit code, this speeds up the compression of full-color JPEGs by about
+30% on average on a typical iOS device (iPhone 4S, Cortex-A9) and by about 6-7%
+on average on a typical Android device (Nexus 5X, Cortex-A53 and Cortex-A57),
+relative to libjpeg-turbo 1.4.x. Note that the larger speedup under iOS is due
+to the fact that iOS builds use LLVM, which does not optimize the C Huffman
+encoder as well as GCC does.
+
+For 64-bit code, NEON-accelerated Huffman encoding speeds up the compression of
+full-color JPEGs by about 40% on average on a typical iOS device (iPhone 5S,
+Apple A7) and by about 7-8% on average on a typical Android device (Nexus 5X,
+Cortex-A53 and Cortex-A57), in addition to the speedup described in [13] above.
+
+For the purposes of benchmarking or regression testing, SIMD-accelerated
+Huffman encoding can be disabled by setting the JSIMD_NOHUFFENC environment
+variable to 1.
+
+[15] pkg-config (.pc) scripts are now included for both the libjpeg and
+TurboJPEG API libraries on Un*x systems. Note that if a project's build system
+relies on these scripts, then it will not be possible to build that project
+with libjpeg or with a prior version of libjpeg-turbo.
+
+[16] Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
+improve performance on CPUs with in-order pipelines. This speeds up the
+decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
+processor and by about 15% on average on a Cortex-A53 core.
+
+[17] Fixed an issue in the accelerated Huffman decoder that could have caused
+the decoder to read past the end of the input buffer when a malformed,
+specially-crafted JPEG image was being decompressed. In prior versions of
+libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
+if there were > 128 bytes of data in the input buffer. However, it is possible
+to construct a JPEG image in which a single Huffman block is over 430 bytes
+long, so this version of libjpeg-turbo activates the accelerated Huffman
+decoder only if there are > 512 bytes of data in the input buffer.
+
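Note on [17]: the reasoning is that the fast decoder must only run when more input remains than any single Huffman block can consume. The sketch below only restates the numbers from the entry. The names are hypothetical and the real decoder logic is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Values quoted in entry [17]. */
#define OLD_FAST_PATH_THRESHOLD 128  /* bytes, prior releases */
#define NEW_FAST_PATH_THRESHOLD 512  /* bytes, this release */
#define LONG_BLOCK_BYTES        430  /* the entry says a block can exceed this */

/* A block of this size overran the old margin but fits within the new one. */
static_assert(LONG_BLOCK_BYTES > OLD_FAST_PATH_THRESHOLD &&
              LONG_BLOCK_BYTES < NEW_FAST_PATH_THRESHOLD,
              "thresholds should bracket the long block size");

/* Hypothetical predicate: may the fast decoder run with this much input left? */
int fast_decoder_allowed(size_t bytes_in_buffer, size_t threshold)
{
    return bytes_in_buffer > threshold;
}
```

With 200 bytes left in the buffer, the old threshold would have allowed the fast path while the new one falls back to the slower, bounds-checked path.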
+[18] Fixed a memory leak in tjunittest encountered when running the program
+with the -yuv option.
+
+
+1.4.2
+=====
+
+[1] Fixed an issue whereby cjpeg would segfault if a Windows bitmap with a
+negative width or height was used as an input image (Windows bitmaps can have
+a negative height if they are stored in top-down order, but such files are
+rare and not supported by libjpeg-turbo.)
+
+[2] Fixed an issue whereby, under certain circumstances, libjpeg-turbo would
+incorrectly encode certain JPEG images when quality=100 and the fast integer
+forward DCT were used. This was known to cause 'make test' to fail when the
+library was built with '-march=haswell' on x86 systems.
+
+[3] Fixed an issue whereby libjpeg-turbo would crash when built with the latest
+& greatest development version of the Clang/LLVM compiler. This was caused by
+an x86-64 ABI conformance issue in some of libjpeg-turbo's 64-bit SSE2 SIMD
+routines. Those routines were incorrectly using a 64-bit mov instruction to
+transfer a 32-bit JDIMENSION argument, whereas the x86-64 ABI allows the upper
+(unused) 32 bits of a 32-bit argument's register to be undefined. The new
+Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
+structure members into a single 64-bit register, and this exposed the ABI
+conformance issue.
+
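Note on [3]: the ABI point can be modeled in plain C. The sketch below treats a `uint64_t` as the argument register. It is an illustration of the failure mode under that assumption, not the assembly change itself, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Build a register image: a 32-bit argument in the low half and
 * caller-chosen garbage in the upper half, which the ABI leaves undefined. */
uint64_t make_reg(uint32_t value, uint32_t upper_garbage)
{
    return ((uint64_t)upper_garbage << 32) | value;
}

/* Mirrors the buggy pattern: consumes all 64 bits of the register. */
uint64_t read_arg_64(uint64_t reg) { return reg; }

/* Mirrors the conforming pattern: only the low 32 bits are meaningful. */
uint32_t read_arg_32(uint64_t reg) { return (uint32_t)reg; }

/* Usage: the two reads agree only when the upper half happens to be zero. */
void abi_demo(void)
{
    uint64_t reg = make_reg(640, 0xDEADBEEFu);
    assert(read_arg_32(reg) == 640);
    assert(read_arg_64(reg) != 640);
    assert(read_arg_64(make_reg(640, 0)) == 640);
}
```

This is why the bug stayed hidden until a compiler began leaving non-zero data in the upper half of the register.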
+[4] Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
+upsampling routine that caused a buffer overflow (and subsequent segfault) when
+decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
+pixels. The "plain" upsampling routines are normally only used when
+decompressing a non-YCbCr JPEG image, but they are also used when decompressing
+a JPEG image whose scaled output height is 1.
+
+[5] Fixed various negative left shifts and other issues reported by the GCC and
+Clang undefined behavior sanitizers. None of these was known to pose a
+security threat, but removing the warnings makes it easier to detect actual
+security issues, should they arise in the future.
+
+
+1.4.1
+=====
+
+[1] tjbench now properly handles CMYK/YCCK JPEG files. Passing an argument of
+-cmyk (instead of, for instance, -rgb) will cause tjbench to internally convert
+the source bitmap to CMYK prior to compression, to generate YCCK JPEG files,
+and to internally convert the decompressed CMYK pixels back to RGB after
+decompression (the latter is done automatically if a CMYK or YCCK JPEG is
+passed to tjbench as a source image.) The CMYK<->RGB conversion operation is
+not benchmarked. NOTE: The quick & dirty CMYK<->RGB conversions that tjbench
+uses are suitable for testing only. Proper conversion between CMYK and RGB
+requires a color management system.
+
+[2] 'make test' now performs additional bitwise regression tests using tjbench,
+mainly for the purpose of testing compression from/decompression to a subregion
+of a larger image buffer.
+
+[3] 'make test' no longer tests the regression of the floating point DCT/IDCT
+by default, since the results of those tests can vary if the algorithms in
+question are not implemented using SIMD instructions on a particular platform.
+See the comments in Makefile.am for information on how to re-enable the tests
+and to specify an expected result for them based on the particulars of your
+platform.
+
+[4] The NULL color conversion routines have been significantly optimized,
+which speeds up the compression of RGB and CMYK JPEGs by 5-20% when using
+64-bit code and 0-3% when using 32-bit code, and the decompression of those
+images by 10-30% when using 64-bit code and 3-12% when using 32-bit code.
+
+[5] Fixed an "illegal instruction" error that occurred when djpeg from a
+SIMD-enabled libjpeg-turbo MIPS build was executed with the -nosmooth option on
+a MIPS machine that lacked DSPr2 support. The MIPS SIMD routines for h2v1 and
+h2v2 merged upsampling were not properly checking for the existence of DSPr2.
+
+[6] Performance has been improved significantly on 64-bit non-Linux and
+non-Windows platforms (generally 10-20% faster compression and 5-10% faster
+decompression.) Due to an oversight, the 64-bit version of the accelerated
+Huffman codec was not being compiled in when libjpeg-turbo was built on
+platforms other than Windows or Linux. Oops.
+
+[7] Fixed an extremely rare bug in the Huffman encoder that caused 64-bit
+builds of libjpeg-turbo to incorrectly encode a few specific test images when
+quality=98, an optimized Huffman table, and the slow integer forward DCT were
+used.
+
+[8] The Windows (CMake) build system now supports building only static or only
+shared libraries. This is accomplished by adding either -DENABLE_STATIC=0 or
+-DENABLE_SHARED=0 to the CMake command line.
+
+[9] TurboJPEG API functions will now return an error code if a warning is
+triggered in the underlying libjpeg API. For instance, if a JPEG file is
+corrupt, the TurboJPEG decompression functions will attempt to decompress
+as much of the image as possible, but those functions will now return -1 to
+indicate that the decompression was not entirely successful.
+
+[10] Fixed a bug in the MIPS DSPr2 4:2:2 fancy upsampling routine that caused a
+buffer overflow (and subsequent segfault) when decompressing a 4:2:2 JPEG image
+in which the right-most MCU was 5 or 6 pixels wide.
+
+
+1.4.0
+=====
+
+[1] Fixed a build issue on OS X PowerPC platforms (md5cmp failed to build
+because OS X does not provide the le32toh() and htole32() functions.)
+
+[2] The non-SIMD RGB565 color conversion code did not work correctly on big
+endian machines. This has been fixed.
+
+[3] Fixed an issue in tjPlaneSizeYUV() whereby it would erroneously return 1
+instead of -1 if componentID was > 0 and subsamp was TJSAMP_GRAY.
+
+[4] Fixed an issue in tjBufSizeYUV2() whereby it would erroneously return 0
+instead of -1 if width was < 1.
+
+[5] The Huffman encoder now uses clz and bsr instructions for bit counting on
+ARM64 platforms (see 1.4 beta1 [5].)
+
+[6] The close() method in the TJCompressor and TJDecompressor Java classes is
+now idempotent. Previously, that method would call the native tjDestroy()
+function even if the TurboJPEG instance had already been destroyed. This
+caused an exception to be thrown during finalization, if the close() method had
+already been called. The exception was caught, but it was still an expensive
+operation.
+
+[7] The TurboJPEG API previously generated an error ("Could not determine
+subsampling type for JPEG image") when attempting to decompress grayscale JPEG
+images that were compressed with a sampling factor other than 1 (for instance,
+with 'cjpeg -grayscale -sample 2x2'). Subsampling technically has no meaning
+with grayscale JPEGs, and thus the horizontal and vertical sampling factors
+for such images are ignored by the decompressor. However, the TurboJPEG API
+was being too rigid and was expecting the sampling factors to be equal to 1
+before it treated the image as a grayscale JPEG.
+
+[8] cjpeg, djpeg, and jpegtran now accept an argument of -version, which will
+print the library version and exit.
+
+[9] Referring to 1.4 beta1 [15], another extremely rare circumstance was
+discovered under which the Huffman encoder's local buffer can be overrun
+when a buffered destination manager is being used and an
+extremely-high-frequency block (basically junk image data) is being encoded.
+Even though the Huffman local buffer was increased from 128 bytes to 136 bytes
+to address the previous issue, the new issue caused even the larger buffer to
+be overrun. Further analysis reveals that, in the absolute worst case (such as
+setting alternating AC coefficients to 32767 and -32768 in the JPEG scanning
+order), the Huffman encoder can produce encoded blocks that approach double the
+size of the unencoded blocks. Thus, the Huffman local buffer was increased to
+256 bytes, which should prevent any such issue from re-occurring in the future.
+
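Note on [9]: the 256-byte figure follows from simple arithmetic if one assumes 64 coefficients per block stored as 16-bit integers, which matches the 32767 and -32768 extremes quoted in the entry. The sketch below works through that assumption. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COEFS_PER_BLOCK 64  /* one 8x8 block of DCT coefficients */

/* Assumption: each coefficient occupies 16 bits, so a block is 128 bytes. */
static_assert(COEFS_PER_BLOCK * sizeof(int16_t) == 128,
              "expected a 128-byte unencoded block");

size_t unencoded_block_bytes(void)
{
    return COEFS_PER_BLOCK * sizeof(int16_t);
}

/* Entry [9]: encoded output can approach double the unencoded size. */
size_t worst_case_encoded_bytes(void)
{
    return 2 * unencoded_block_bytes();
}
```

Under these assumptions the worst case is 256 bytes, which exceeds both the original 128-byte buffer and the 136-byte buffer from the earlier fix.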
+[10] The new tjPlaneSizeYUV(), tjPlaneWidth(), and tjPlaneHeight() functions
+were not actually usable on any platform except OS X and Windows, because
+those functions were not included in the libturbojpeg mapfile. This has been
+fixed.
+
+[11] Restored the JPP(), JMETHOD(), and FAR macros in the libjpeg-turbo header
+files. The JPP() and JMETHOD() macros were originally implemented in libjpeg
+as a way of supporting non-ANSI compilers that lacked support for prototype
+parameters. libjpeg-turbo has never supported such compilers, but some
+software packages still use the macros to define their own prototypes.
+Similarly, libjpeg-turbo has never supported MS-DOS and other platforms that
+have far symbols, but some software packages still use the FAR macro. A pretty
+good argument can be made that this is a bad practice on the part of the
+software in question, but since this affects more than one package, it's just
+easier to fix it here.
+
+[12] Fixed issues that were preventing the ARM 64-bit SIMD code from compiling
+for iOS, and included an ARMv8 architecture in all of the binaries installed by
+the "official" libjpeg-turbo SDK for OS X.
+
+
+1.3.90 (1.4 beta1)
+==================
+
+[1] New features in the TurboJPEG API:
+-- YUV planar images can now be generated with an arbitrary line padding
+(previously only 4-byte padding, which was compatible with X Video, was
+supported.)
+-- The decompress-to-YUV function has been extended to support image scaling.
+-- JPEG images can now be compressed from YUV planar source images.
+-- YUV planar images can now be decoded into RGB or grayscale images.
+-- 4:1:1 subsampling is now supported. This is mainly included for
+compatibility, since 4:1:1 is not fully accelerated in libjpeg-turbo and has no
+significant advantages relative to 4:2:0.
+-- CMYK images are now supported. This feature allows CMYK source images to be
+compressed to YCCK JPEGs and YCCK or CMYK JPEGs to be decompressed to CMYK
+destination images. Conversion between CMYK/YCCK and RGB or YUV images is not
+supported. Such conversion requires a color management system and is thus out
+of scope for a codec library.
+-- The handling of YUV images in the Java API has been significantly refactored
+and should now be much more intuitive.
+-- The Java API now supports encoding a YUV image from an arbitrary position in
+a large image buffer.
+-- All of the YUV functions now have a corresponding function that operates on
+separate image planes instead of a unified image buffer. This allows for
+compressing/decoding from or decompressing/encoding to a subregion of a larger
+YUV image. It also allows for handling YUV formats that swap the order of the
+U and V planes.
+
+[2] Added SIMD acceleration for DSPr2-capable MIPS platforms. This speeds up
+the compression of full-color JPEGs by 70-80% on such platforms and
+decompression by 25-35%.
+
+[3] If an application attempts to decompress a Huffman-coded JPEG image whose
+header does not contain Huffman tables, libjpeg-turbo will now insert the
+default Huffman tables. In order to save space, many motion JPEG video frames
+are encoded without the default Huffman tables, so these frames can now be
+successfully decompressed by libjpeg-turbo without additional work on the part
+of the application. An application can still override the Huffman tables, for
+instance to re-use tables from a previous frame of the same video.
+
+[4] The Mac packaging system now uses pkgbuild and productbuild rather than
+PackageMaker (which is obsolete and no longer supported.) This means that
+OS X 10.6 "Snow Leopard" or later must be used when packaging libjpeg-turbo,
+although the packages produced can be installed on OS X 10.5 "Leopard" or
+later. OS X 10.4 "Tiger" is no longer supported.
+
+[5] The Huffman encoder now uses clz and bsr instructions for bit counting on
+ARM platforms rather than a lookup table. This reduces the memory footprint
+by 64k, which may be important for some mobile applications. Out of four
+Android devices that were tested, two demonstrated a small overall performance
+loss (~3-4% on average) with ARMv6 code and a small gain (also ~3-4%) with
+ARMv7 code when enabling this new feature, but the other two devices
+demonstrated a significant overall performance gain with both ARMv6 and ARMv7
+code (~10-20%) when enabling the feature. Actual mileage may vary.
+
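Note on [5]: "bit counting" here means finding how many bits are needed to represent a coefficient magnitude. The sketch below shows the idea with the GCC/Clang `__builtin_clz` extension, which typically compiles to a clz or bsr instruction. It is an illustration, not the encoder's actual code, and the function names are hypothetical. A table with one byte for each of 65536 magnitudes would account for the 64k figure, though the entry does not spell out the table layout.

```c
#include <assert.h>
#include <limits.h>

/* Reference version: count the bits needed to represent v by shifting. */
int nbits_loop(unsigned int v)
{
    int n = 0;
    while (v) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Same result from a count-leading-zeros builtin (GCC/Clang extension).
 * __builtin_clz(0) is undefined, so zero is handled separately. */
int nbits_clz(unsigned int v)
{
    if (v == 0)
        return 0;
    return (int)(sizeof(unsigned int) * CHAR_BIT) - __builtin_clz(v);
}

/* Exhaustively compare both versions over every 16-bit magnitude. */
void check_nbits(void)
{
    for (unsigned int v = 0; v <= 0xFFFFu; v++)
        assert(nbits_clz(v) == nbits_loop(v));
}
```

Neither version needs a precomputed table, which is where the memory saving comes from.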
+[6] Worked around an issue with Visual C++ 2010 and later that caused incorrect
+pixels to be generated when decompressing a JPEG image to a 256-color bitmap,
+if compiler optimization was enabled when libjpeg-turbo was built. This caused
+the regression tests to fail when doing a release build under Visual C++ 2010
+and later.
+
+[7] Improved the accuracy and performance of the non-SIMD implementation of the
+floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
+The accuracy of this implementation now matches the accuracy of the SSE/SSE2
+implementation. Note, however, that the floating point DCT/IDCT algorithms are
+mainly a legacy feature. They generally do not produce significantly better
+accuracy than the slow integer DCT/IDCT algorithms, and they are quite a bit
+slower.
+
+[8] Added a new output colorspace (JCS_RGB565) to the libjpeg API that allows
+for decompressing JPEG images into RGB565 (16-bit) pixels. If dithering is not
+used, then this code path is SIMD-accelerated on ARM platforms.
+
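Note on [8]: RGB565 stores a pixel in 16 bits, with 5 bits of red, 6 of green, and 5 of blue. The sketch below shows the conventional packing without dithering. It is not the library's conversion routine, the function names are hypothetical, and the byte order in which a decoder writes the value to memory is not covered here.

```c
#include <assert.h>
#include <stdint.h>

/* Pack 8-bit R, G, B into one RGB565 value: red in the top 5 bits, green in
 * the middle 6, blue in the low 5. The low-order bits of each channel are
 * simply dropped (no dithering). */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

/* Usage: primaries and white map to the expected bit patterns. */
void check_rgb565(void)
{
    assert(pack_rgb565(255, 0, 0) == 0xF800);
    assert(pack_rgb565(0, 255, 0) == 0x07E0);
    assert(pack_rgb565(0, 0, 255) == 0x001F);
    assert(pack_rgb565(255, 255, 255) == 0xFFFF);
}
```

Because three bits of red and blue and two bits of green are discarded, smooth gradients can show banding, which is the case dithering is meant to address.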
+[9] Numerous obsolete features, such as support for non-ANSI compilers and
+support for the MS-DOS memory model, were removed from the libjpeg code,
+greatly improving its readability and making it easier to maintain and extend.
+
+[10] Fixed a segfault that occurred when calling output_message() with msg_code
+set to JMSG_COPYRIGHT.
+
+[11] Fixed an issue whereby wrjpgcom was allowing comments longer than 65k
+characters to be passed on the command line, which was causing it to generate
+incorrect JPEG files.
+
+[12] Fixed a bug in the build system that was causing the Windows version of
+wrjpgcom to be built using the rdjpgcom source code.
+
+[13] Restored 12-bit-per-component JPEG support. A 12-bit version of
+libjpeg-turbo can now be built by passing an argument of --with-12bit to
+configure (Unix) or -DWITH_12BIT=1 to cmake (Windows.) 12-bit JPEG support is
+included only for convenience. Enabling this feature disables all of the
+performance features in libjpeg-turbo, as well as arithmetic coding and the
+TurboJPEG API. The resulting library still contains the other libjpeg-turbo
+features (such as the colorspace extensions), but in general, it performs no
+faster than libjpeg v6b.
+
+[14] Added ARM 64-bit SIMD acceleration for the YCC-to-RGB color conversion
+and IDCT algorithms (both are used during JPEG decompression.) For unknown
+reasons (probably related to clang), this code cannot currently be compiled for
+iOS.
+
+[15] Fixed an extremely rare bug that could cause the Huffman encoder's local
+buffer to overrun when a very high-frequency MCU is compressed using quality
+100 and no subsampling, and when the JPEG output buffer is being dynamically
+resized by the destination manager. This issue was so rare that, even with a
+test program specifically designed to make the bug occur (by injecting random
+high-frequency YUV data into the compressor), it was reproducible only once in
+about every 25 million iterations.
+
+[16] Fixed an oversight in the TurboJPEG C wrapper: if any of the JPEG
+compression functions was called repeatedly with the same
+automatically-allocated destination buffer, then TurboJPEG would erroneously
+assume that the jpegSize parameter was equal to the size of the buffer, when in
+fact that parameter was probably equal to the size of the most recently
+compressed JPEG image. If the size of the previous JPEG image was not as large
+as the current JPEG image, then TurboJPEG would unnecessarily reallocate the
+destination buffer.
+
+
1.3.1
=====
@@ -128,9 +532,9 @@ ABI. The "age number" of the libjpeg-turbo library on Un*x systems has been
incremented by 1 to reflect this. You can disable this feature with a
configure/CMake switch in order to retain strict API/ABI compatibility with the
libjpeg v6b or v7 API/ABI (or with previous versions of libjpeg-turbo.) See
-README-turbo.txt for more details.
+README.md for more details.
-[13] Added ARM v7s architecture to libjpeg.a and libturbojpeg.a in the official
+[13] Added ARMv7s architecture to libjpeg.a and libturbojpeg.a in the official
libjpeg-turbo binary package for OS X, so that those libraries can be used to
build applications that leverage the faster CPUs in the iPhone 5 and iPad 4.
@@ -213,7 +617,7 @@ K component is assigned a component ID of 1 instead of 4. Although these files
are in violation of the spec, other JPEG implementations handle them
correctly.
-[7] Added ARM v6 and ARM v7 architectures to libjpeg.a and libturbojpeg.a in
+[7] Added ARMv6 and ARMv7 architectures to libjpeg.a and libturbojpeg.a in
the official libjpeg-turbo binary package for OS X, so that those libraries can
be used to build both OS X and iOS applications.
@@ -364,7 +768,7 @@ tjDecompressToYUV(), to replace the somewhat hackish TJ_YUV flag.
==================
[1] Added emulation of the libjpeg v7 and v8 APIs and ABIs. See
-README-turbo.txt for more details. This feature was sponsored by CamTrace SAS.
+README.md for more details. This feature was sponsored by CamTrace SAS.
[2] Created a new CMake-based build system for the Visual C++ and MinGW builds.
