Issue 23646005: Improve GIF decoding performance

Issue 23646005: Improve GIF decoding performance (Closed)

Created:
7 years, 3 months ago by Alpha Left Google

Modified:
7 years, 2 months ago

Reviewers:
Peter Kasting, Use pkasting(at)chromium.org

CC:
blink-reviews, eae+blinkwatch, dglazkov+blink, jeez

Base URL:
svn://svn.chromium.org/blink/trunk

Visibility:
Public.

More Reviews

Description

Improve GIF decoding performance This change improves LZW decoding performance for GIF decoding. BACKGROUND LZW decoding uses a dictionary to store LZW codes. The data structure stores the last character for each code and the index of last code. This allows back tracing to construct the entire code. Current implementation builds the code in reverse order in a temporary buffer and then copies it into the row buffer. ALGORITHM This change eliminates the temporary buffer and saves a memory copy. If the code length is known in advance then reconstruction can write directly to the row buffer. The index output buffer is advanced by the code length, and then reconstruct as usual. The code length is saved such that it can be used next time. Testing is done locally with a corpus of 70k images. There's no change in output and roughly result in a 20% performance gain. BUG=233076 R=pkasting@chromium.org Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=158842

Patch Set 1 #

Total comments: 48

Patch Set 2 : fixed comments #

Total comments: 13

Patch Set 3 : fixed bug #

Patch Set 4 : bad code initial dictionary #

Total comments: 10

Patch Set 5 : test with bad code #

Total comments: 2

Patch Set 6 : comments #

Patch Set 7 : comments #

Patch Set 8 : test data moved to Source/core/platform/image-decoders/testing #

Total comments: 2

Created: 7 years, 2 months ago

Download [raw] [tar.bz2]

	Unified diffs	Side-by-side diffs	Delta from patch set	Stats (+128 lines, -96 lines)			Patch
M	Source/core/platform/image-decoders/gif/GIFImageDecoder.h	View	1 2 3 4 5 6 7	2 chunks	+3 lines, -1 line	0 comments	Download
M	Source/core/platform/image-decoders/gif/GIFImageDecoder.cpp	View	1 2 3 4 5 6 7	3 chunks	+3 lines, -5 lines	0 comments	Download
M	Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp	View	1 2 3 4 5 6 7	1 chunk	+27 lines, -0 lines	2 comments	Download
M	Source/core/platform/image-decoders/gif/GIFImageReader.h	View	1 2 3 4 5 6 7	3 chunks	+10 lines, -13 lines	0 comments	Download
M	Source/core/platform/image-decoders/gif/GIFImageReader.cpp	View	1 2 3 4 5 6 7	10 chunks	+85 lines, -77 lines	0 comments	Download
A +	Source/core/platform/image-decoders/testing/bad-code.gif	View	1 2 3 4 5 6 7	Binary file		0 comments	Download
A	Source/core/platform/image-decoders/testing/bad-initial-code.gif	View	1 2 3 4 5 6 7	Binary file		0 comments	Download

Messages

Total messages: 24 (0 generated)

Expand Messages | Collapse Messages

Peter Kasting

This looks better than the number of comments would indicate -- most are either about ...

7 years, 3 months ago (2013-08-29 23:46:25 UTC) #2

This looks better than the number of comments would indicate -- most are either
about better comment clarity, or attempt to address pre-existing issues with the
code.

This change is a nice improvement overall.  We're going from "write to stack >
copy to rowbuffer > read back rowbuffer to convert to colors and write to
framebuffer" to just "write to rowbuffer > read back ...".  It seems like a
subsequent change could buy more performance by skipping the rowbuffer entirely,
and as we decode in doLZW(), converting the decoded values directly to colors
and writing straight to the framebuffer, no rowbuffer necessary.  Do you agree?

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:205: int code;
Nit: Declare these first three where they are first used.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:243: // The previous
code is not available either because this is the first
Tiny nit: It seems like your comments in this section are "almost less than 80
chars, but not quite" and the code actually managed to fit in 80 chars.

While Blink code has no line length restriction, if you're already close to
fitting in 80 chars anyway, I might go ahead and wrap the comments so that they
do fit inside that.  Makes Rietveld reviews a little easier to read, at least...

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:245: // current code
to stream. The code will have only one character.
Nit: Maybe better second sentence:

This code must therefore just be a single character, which we can write directly
to the stream.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:254: // This is a
complete code. Code length and code content are known.
Nit: How about just:

This is a pre-existing code, so we already know what it encodes.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:257: if (rowIter >=
rowBuffer.end())
Shouldn't this be '>', not ">=", since the row iterator currently points to one
past where we're going to write?  (Also in the block below)

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:262: // by the first
character of previous code.
Nit: How about:

This is a new code just being added to the dictionary.  It must encode the
contents of the previous code, plus the first character of the previous code
again.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:269: // previous
code. Write it to output and then write the previous code.
Nit: This comment seems unnecessary given the above comment.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:274: // to |avail|.
Nit: How about just:

This is an invalid code.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:279: if (code >=
MAX_BYTES || code == prefix[code])
|code| must be less than MAX_BYTES, given how it was initialized (i.e. by
masking against codemask).

|code| also cannot be equal to prefix[code].  For this to occur we would have
had to have oldcode == avail at the beginning of this loop iteration, which can
never happen since that would mean that during the previous loop iteration, code
== avail + 1, which would result in bailing out of the function in the
conditionals above.

Therefore, this conditional should be removed entirely, and the comment below
adjusted to note that "Since |code| is guaranteed to be smaller than
sizeof(suffix), ..."

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:293: if (avail <
4096) {
Nit: Use MAX_DICTIONARY_ENTRIES here in place of 4096 (see comments in header
file)

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:297: avail++;
Nit: I suggest using preincrement form wherever postincrement is not needed
(looks like 3 more spots in this patch?).  It doesn't really matter with
integral types but it's a good habit to be in for consistency with iterators and
such.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:301: // exceed the
specified maximum codeword size of 12 bits.
Nit: Remove "of 12 bits".  Here and elsewhere, I'd rather avoid recopying the
constant value into comments.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:302: if ((!(avail &
codemask)) && (avail < 4096)) {
Nit: Instead of "(avail < 4096)", the equivalent "(codesize <
MAX_DICTIONARY_ENTRY_BITS)" (see comments in header file) would be clearer.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:313: continue;
Nit: It seems slightly clearer to me to skip this, and then below the subsequent
loop, insert a conditional:

if (rowBegin != rowBuffer.begin()) {
  ... memcpy ...

But I suppose in theory this could hurt perf.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:317: for (;rowBegin +
width <= rowIter; rowBegin += width) {
Nit: Space after first semicolon.

If you don't take the previous nit about nuking the conditional above, then this
loop could be converted to "do { ... } while" format and save a comparison.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:326: // of buffer.
Nit: How about just:

Move the remaining bytes to the beginning of the buffer.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:783: // This sets the
largest possible codemask correctly at 4095.
Nit: Remove last sentence, and remove "(12)" from previous sentence.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:787: if (clearCode >=
MAX_BYTES)
This conditional cannot fail given the previous conditional; remove it.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:799:
rowBuffer.resize(m_frameContext->width() + MAX_BYTES);
(See my comment atop GIFImageReader.h as to why I want to nuke MAX_BYTES and
introduce MAX_DICTIONARY_ENTRIES.)

Use the correct value here, and add comments:

    // We want to know the longest sequence encodable by a dictionary with
    // MAX_DICTIONARY_ENTRIES entries.  If we ignore the need to encode the base
    // values themselves at the beginning of the dictionary, as well as the need
    // for a clear code or a termination code, we could use every entry to
    // encode a series of multiple values.  If the input value stream looked
    // like "AAAAA..." (a long string of just one value), the first dictionary
    // entry would encode AA, the next AAA, the next AAAA, and so forth.  Thus
    // the longest sequence would be MAX_DICTIONARY_ENTRIES + 1 values.
    //
    // However, we have to account for reserved entries.  The first |datasize|
    // bits are reserved for the base values, and the next two entries are
    // reserved for the clear code and termination code.  In theory a GIF can
    // set the datasize to 0, meaning we have just two reserved entries, making
    // the longest sequence (MAX_DICTIONARY_ENTIRES + 1) - 2 values long.  Since
    // each value is a byte, this is also the number of bytes in the longest
    // encodable sequence.
    const size_t maxBytes = MAX_DICTIONARY_ENTRIES - 1;

    // Now allocate the output buffer.  We decode directly into this buffer
    // until we have at least one row worth of data, then call outputRow().
    // This means worst case we may have (row width - 1) bytes in the buffer and
    // then decode a sequence |maxBytes| long to append.
    rowBuffer.resize(m_frameContext->width() - 1 + maxBytes);

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:804: memset(suffix,
0, sizeof(suffix));
Nit: Seems like we could save a little time on the memset and be clearer to boot
by doing:

  for (int i = 0; i < clearCode; ++i)
    suffix[i] = i;
  // Clearing the remainder of the suffix table lets us be more tolerant of bad
data.
   memset(suffix + clearCode, 0, sizeof(suffix) - clearCode);

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
File Source/core/platform/image-decoders/gif/GIFImageReader.h (right):

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:50: #define MAX_BYTES  
        4097 /* 2^MAX_LZW_BITS+1 */
Of the uses of this value in this header, all but |stack| -- which you're
removing -- really should have been using 2^MAX_LZW_BITS anyway, since they want
to have an entry for each possible dictionary code, and don't care about the
length of the sequence encodable by the dictionary.  This is also true of all
but one spot in the .cpp file.

Therefore, I suggest replacing this with:

#define MAX_DICTIONARY_ENTRY_BITS 12
#define MAX_DICTIONARY_ENTRIES 4096 // 2^MAX_DICTIONARY_ENTRY_BITS

...and then using that everywhere in this header, and almost everywhere in the
.cpp file.  The exception is in prepareToDecode(), which I comment separately
about there.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:52: #define GIF_COLORS 
           3
Nit: While here, can you name this BYTES_PER_COLORMAP_ENTRY?

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:81: typedef
Vector<unsigned char> GIFRow;
Nit: Maybe instead of declaring this here we can declare it in
GIFImageDecoder.h, so we can use it to replace the explicit Vector<unsigned
char>s that appear in that file (and the .cpp file).

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:123: unsigned short
prefix[MAX_BYTES];
These three arrays don't actually need to be MAX_BYTES long; they can simply be
(2 ^ MAX_LZW_BITS) long.

Alpha Left Google

Getting rid of the rowBuffer is possible but a bit tricky. We'll have to write ...

7 years, 3 months ago (2013-08-30 23:25:36 UTC) #3

Getting rid of the rowBuffer is possible but a bit tricky. We'll have to write
directly to the RGBA bitmap. The tricky part is that LZW decoding can exceed a
row buffer and it goes reverse, which makes it hard to get right. Also this
direct mode doesn't work with progressive GIFs, and to handle both cases will
add more complexity. It's worth trying though.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:205: int code;
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Declare these first three where they are first used.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:243: // The previous
code is not available either because this is the first
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Tiny nit: It seems like your comments in this section are "almost less than 80
> chars, but not quite" and the code actually managed to fit in 80 chars.
> 
> While Blink code has no line length restriction, if you're already close to
> fitting in 80 chars anyway, I might go ahead and wrap the comments so that
they
> do fit inside that.  Makes Rietveld reviews a little easier to read, at
least...

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:245: // current code
to stream. The code will have only one character.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Maybe better second sentence:
> 
> This code must therefore just be a single character, which we can write
directly
> to the stream.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:254: // This is a
complete code. Code length and code content are known.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: How about just:
> 
> This is a pre-existing code, so we already know what it encodes.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:257: if (rowIter >=
rowBuffer.end())
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Shouldn't this be '>', not ">=", since the row iterator currently points to
one
> past where we're going to write?  (Also in the block below)

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:262: // by the first
character of previous code.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: How about:
> 
> This is a new code just being added to the dictionary.  It must encode the
> contents of the previous code, plus the first character of the previous code
> again.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:269: // previous
code. Write it to output and then write the previous code.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: This comment seems unnecessary given the above comment.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:274: // to |avail|.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: How about just:
> 
> This is an invalid code.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:279: if (code >=
MAX_BYTES || code == prefix[code])
I removed this block of comments entirely. Having code >= avail actually has
unintended behavior which is prevented by the added check above.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:293: if (avail <
4096) {
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Use MAX_DICTIONARY_ENTRIES here in place of 4096 (see comments in header
> file)

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:297: avail++;
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: I suggest using preincrement form wherever postincrement is not needed
> (looks like 3 more spots in this patch?).  It doesn't really matter with
> integral types but it's a good habit to be in for consistency with iterators
and
> such.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:301: // exceed the
specified maximum codeword size of 12 bits.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Remove "of 12 bits".  Here and elsewhere, I'd rather avoid recopying the
> constant value into comments.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:302: if ((!(avail &
codemask)) && (avail < 4096)) {
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Instead of "(avail < 4096)", the equivalent "(codesize <
> MAX_DICTIONARY_ENTRY_BITS)" (see comments in header file) would be clearer.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:313: continue;
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: It seems slightly clearer to me to skip this, and then below the
subsequent
> loop, insert a conditional:
> 
> if (rowBegin != rowBuffer.begin()) {
>   ... memcpy ...
> 
> But I suppose in theory this could hurt perf.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:317: for (;rowBegin +
width <= rowIter; rowBegin += width) {
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Space after first semicolon.
> 
> If you don't take the previous nit about nuking the conditional above, then
this
> loop could be converted to "do { ... } while" format and save a comparison.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:326: // of buffer.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: How about just:
> 
> Move the remaining bytes to the beginning of the buffer.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:783: // This sets the
largest possible codemask correctly at 4095.
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Remove last sentence, and remove "(12)" from previous sentence.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:787: if (clearCode >=
MAX_BYTES)
On 2013/08/29 23:46:25, Peter Kasting wrote:
> This conditional cannot fail given the previous conditional; remove it.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:799:
rowBuffer.resize(m_frameContext->width() + MAX_BYTES);
On 2013/08/29 23:46:25, Peter Kasting wrote:
> (See my comment atop GIFImageReader.h as to why I want to nuke MAX_BYTES and
> introduce MAX_DICTIONARY_ENTRIES.)
> 
> Use the correct value here, and add comments:
> 
>     // We want to know the longest sequence encodable by a dictionary with
>     // MAX_DICTIONARY_ENTRIES entries.  If we ignore the need to encode the
base
>     // values themselves at the beginning of the dictionary, as well as the
need
>     // for a clear code or a termination code, we could use every entry to
>     // encode a series of multiple values.  If the input value stream looked
>     // like "AAAAA..." (a long string of just one value), the first dictionary
>     // entry would encode AA, the next AAA, the next AAAA, and so forth.  Thus
>     // the longest sequence would be MAX_DICTIONARY_ENTRIES + 1 values.
>     //
>     // However, we have to account for reserved entries.  The first |datasize|
>     // bits are reserved for the base values, and the next two entries are
>     // reserved for the clear code and termination code.  In theory a GIF can
>     // set the datasize to 0, meaning we have just two reserved entries,
making
>     // the longest sequence (MAX_DICTIONARY_ENTIRES + 1) - 2 values long. 
Since
>     // each value is a byte, this is also the number of bytes in the longest
>     // encodable sequence.
>     const size_t maxBytes = MAX_DICTIONARY_ENTRIES - 1;
> 
>     // Now allocate the output buffer.  We decode directly into this buffer
>     // until we have at least one row worth of data, then call outputRow().
>     // This means worst case we may have (row width - 1) bytes in the buffer
and
>     // then decode a sequence |maxBytes| long to append.
>     rowBuffer.resize(m_frameContext->width() - 1 + maxBytes);

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:804: memset(suffix,
0, sizeof(suffix));
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Seems like we could save a little time on the memset and be clearer to
boot
> by doing:
> 
>   for (int i = 0; i < clearCode; ++i)
>     suffix[i] = i;
>   // Clearing the remainder of the suffix table lets us be more tolerant of
bad
> data.
>    memset(suffix + clearCode, 0, sizeof(suffix) - clearCode);

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
File Source/core/platform/image-decoders/gif/GIFImageReader.h (right):

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:50: #define MAX_BYTES  
        4097 /* 2^MAX_LZW_BITS+1 */
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Of the uses of this value in this header, all but |stack| -- which you're
> removing -- really should have been using 2^MAX_LZW_BITS anyway, since they
want
> to have an entry for each possible dictionary code, and don't care about the
> length of the sequence encodable by the dictionary.  This is also true of all
> but one spot in the .cpp file.
> 
> Therefore, I suggest replacing this with:
> 
> #define MAX_DICTIONARY_ENTRY_BITS 12
> #define MAX_DICTIONARY_ENTRIES 4096 // 2^MAX_DICTIONARY_ENTRY_BITS
> 
> ...and then using that everywhere in this header, and almost everywhere in the
> .cpp file.  The exception is in prepareToDecode(), which I comment separately
> about there.

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:52: #define GIF_COLORS 
           3
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: While here, can you name this BYTES_PER_COLORMAP_ENTRY?

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:81: typedef
Vector<unsigned char> GIFRow;
On 2013/08/29 23:46:25, Peter Kasting wrote:
> Nit: Maybe instead of declaring this here we can declare it in
> GIFImageDecoder.h, so we can use it to replace the explicit Vector<unsigned
> char>s that appear in that file (and the .cpp file).

Done.

https://codereview.chromium.org/23646005/diff/1/Source/core/platform/image-de...
Source/core/platform/image-decoders/gif/GIFImageReader.h:123: unsigned short
prefix[MAX_BYTES];
On 2013/08/29 23:46:25, Peter Kasting wrote:
> These three arrays don't actually need to be MAX_BYTES long; they can simply
be
> (2 ^ MAX_LZW_BITS) long.

Done.

Peter Kasting

On 2013/08/30 23:25:36, Alpha wrote: > Getting rid of the rowBuffer is possible but a ...

7 years, 3 months ago (2013-08-30 23:30:39 UTC) #4

Peter Kasting

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right): https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp#newcode210 Source/core/platform/image-decoders/gif/GIFImageReader.cpp:210: for (const unsigned char *ch = block; bytesInBlock-- > ...

7 years, 3 months ago (2013-08-31 00:08:49 UTC) #5

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:210: for (const
unsigned char *ch = block; bytesInBlock-- > 0; ch++) {
Nit: Put * on typename

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:239: if (oldcode ==
-1) {
I just realized that because we don't check that (code < avail) here, an
attacker can sneak a bad code into the beginning of the dictionary  and cause
bad behavior later.  This was a problem with the old code as well.  (For
example, by carefully picking the first couple of codes, I think an attacker can
cause the decoder to infinite loop.  I plan to create a test GIF and check to
see who's vulnerable.)

I think the right solution is to eliminate this block; instead unconditionally
go to what's currently in the other conditional arm.  There, modify the "} else
if (code == avail) {" to be "} else if ((code == avail) && (oldcode != -1)) {".

Added bonus: shorter code.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:310: // Move the
remaining bytes to the beginning of buffer.
Nit: buffer -> the buffer

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:311: size_t
bytesToCopy = rowIter - rowBegin;
Nit: Can be const (since you seem to be willing to use local const elsewhere
around here)

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:805: // Clearing the
whole suffix table lets us be more tolerant of bad data.
I think this comment and the memset can just be nuked.  Assuming you follow my
bugfix comment above, I don't think we ever read from the unset portion of the
suffix table even in the case of bad data.

Peter Kasting

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right): https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp#newcode239 Source/core/platform/image-decoders/gif/GIFImageReader.cpp:239: if (oldcode == -1) { On 2013/08/31 00:08:49, Peter ...

7 years, 3 months ago (2013-08-31 02:07:19 UTC) #6

Alpha Left Google

I've updated the code to fix the bug for intentional bad initial code. Good catch! ...

7 years, 3 months ago (2013-09-01 01:40:21 UTC) #7

I've updated the code to fix the bug for intentional bad initial code. Good
catch!

I found that since we have guaranteed that a new code strictly references a
defined code. There's no need to clear entire suffix, suffixLength and prefix.

It is true that we're now more strict about the code. The previous version did
allow code to be greater than the defined dictionary, this means giving an
invalid value to the pixel. Even worse it means accessing an invalid |prefix|
and causes unexpected behavior.

I fixed this a while back by clearing the entire |prefix| and the problem I just
mentioned wouldn't happen in Blink's version of gif decoder. But before that the
bug certainly was there. I doubt the code was really tolerant as the comment
would suggest.

Even today Mozilla's version of GIF decoder has the same bug:
http://mxr.mozilla.org/mozilla-central/source/image/decoders/nsGIFDecoder2.cp...

The code doesn't clear suffix nor prefix. An undefined code would cause trouble.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:210: for (const
unsigned char *ch = block; bytesInBlock-- > 0; ch++) {
On 2013/08/31 00:08:49, Peter Kasting wrote:
> Nit: Put * on typename

Done.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:239: if (oldcode ==
-1) {
I removed this branch and let the other branch handle the initial code. The only
change is we need to have:

if (avail < MAX_DICTIONARY_ENTRIES && oldcode != -1)

Such that he dictionary doesn't reference the invalid code -1.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:245: if (rowIter >=
rowBuffer.end())
That is true. Since we can be sure there's no cycles in the dictionary we won't
be writing outside of the row buffer.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:310: // Move the
remaining bytes to the beginning of buffer.
On 2013/08/31 00:08:49, Peter Kasting wrote:
> Nit: buffer -> the buffer

Done.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:311: size_t
bytesToCopy = rowIter - rowBegin;
On 2013/08/31 00:08:49, Peter Kasting wrote:
> Nit: Can be const (since you seem to be willing to use local const elsewhere
> around here)

Done.

https://codereview.chromium.org/23646005/diff/7001/Source/core/platform/image...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:805: // Clearing the
whole suffix table lets us be more tolerant of bad data.
On 2013/08/31 00:08:49, Peter Kasting wrote:
> I think this comment and the memset can just be nuked.  Assuming you follow my
> bugfix comment above, I don't think we ever read from the unset portion of the
> suffix table even in the case of bad data.

Thinking about this a bit more since we're now more strict about the code and
have established that a new code always references a defined and smaller code it
is not necessary to clear the entire suffixLength as well.

Clearing prefix is also not necessary.

Alpha Left Google

So at the end I think the new behavior is better. The corrupted image will ...

7 years, 3 months ago (2013-09-01 01:42:12 UTC) #8

Peter Kasting

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp File Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp (right): https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp#newcode458 Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp:458: RefPtr<SharedBuffer> testData = readFile("/Source/web/tests/data/bad-initial-code.gif"); Nit: Might want a comment ...

7 years, 3 months ago (2013-09-03 18:49:45 UTC) #9

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
File Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp (right):

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp:458:
RefPtr<SharedBuffer> testData =
readFile("/Source/web/tests/data/bad-initial-code.gif");
Nit: Might want a comment here about what this test actually verifies.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:246: } else if (code
>= avail && oldcode != -1) {
Why use >= here instead of ==?  This allows an attacker to introduce error to
the code table (because tempCode, and thus eventually oldcode down below, will
be set to the invalid code).

The new testcase I gave you won't catch this, but you could create one with the
invalid code after the beginning of the file that would.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:252: *--rowIter =
firstchar;
Nit: This subtraction isn't necessary... maybe

codeLength = suffixLength[oldcode];
rowIter += codeLength++;
*rowIter = firstchar;

or

const int length = suffixLength[oldcode];
rowIter += length;
*rowIter = firstchar;
codeLength += length + 1;

Probably the compiler optimizes all these to the same thing, though, so it
doesn't end up mattering.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:269: // dictionary is
not in initial state.
Nit: This comment isn't actually correct (checking for oldcode != -1 is not a
check for "the dictionary being in its initial state").  How about:

Define a new codeword in the dictionary as long as we've read more than one
value from the stream.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:279: if ((!(avail &
codemask)) && (avail < MAX_DICTIONARY_ENTRIES)) {
Nit: No parens necessary around unary operator.  I wouldn't put parens around
the binary op either since you're not doing it elsewhere in this function.

Alpha Left Google

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right): https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp#newcode246 Source/core/platform/image-decoders/gif/GIFImageReader.cpp:246: } else if (code >= avail && oldcode != ...

7 years, 3 months ago (2013-09-04 23:47:28 UTC) #10

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right):

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:246: } else if (code
>= avail && oldcode != -1) {
On 2013/09/03 18:49:45, Peter Kasting wrote:
> Why use >= here instead of ==?  This allows an attacker to introduce error to
> the code table (because tempCode, and thus eventually oldcode down below, will
> be set to the invalid code).
> 
> The new testcase I gave you won't catch this, but you could create one with
the
> invalid code after the beginning of the file that would.

Nice catch. I didn't realize tempCode would mess up  things. I was planning to
handle bad code like the original but missed the tempCode. Let's make this
correct and if needed be more tolerant of bad data.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:252: *--rowIter =
firstchar;
I think my original code was the second one but I decided that doesn't matter
performance. I'd like this to better reflect the comments.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:269: // dictionary is
not in initial state.
On 2013/09/03 18:49:45, Peter Kasting wrote:
> Nit: This comment isn't actually correct (checking for oldcode != -1 is not a
> check for "the dictionary being in its initial state").  How about:
> 
> Define a new codeword in the dictionary as long as we've read more than one
> value from the stream.

Done.

https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/imag...
Source/core/platform/image-decoders/gif/GIFImageReader.cpp:279: if ((!(avail &
codemask)) && (avail < MAX_DICTIONARY_ENTRIES)) {
On 2013/09/03 18:49:45, Peter Kasting wrote:
> Nit: No parens necessary around unary operator.  I wouldn't put parens around
> the binary op either since you're not doing it elsewhere in this function.

Done.

Peter Kasting

LGTM https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp File Source/core/platform/image-decoders/gif/GIFImageReader.cpp (right): https://codereview.chromium.org/23646005/diff/33001/Source/core/platform/image-decoders/gif/GIFImageReader.cpp#newcode252 Source/core/platform/image-decoders/gif/GIFImageReader.cpp:252: *--rowIter = firstchar; On 2013/09/04 23:47:28, Alpha wrote: ...

7 years, 3 months ago (2013-09-05 06:13:01 UTC) #11

Alpha Left Google

https://codereview.chromium.org/23646005/diff/38001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp File Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp (right): https://codereview.chromium.org/23646005/diff/38001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp#newcode455 Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp:455: // The first LZW code in the image is ...

7 years, 3 months ago (2013-09-05 17:02:27 UTC) #12

commit-bot: I haz the power

CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/hclam@chromium.org/23646005/50001

7 years, 3 months ago (2013-09-05 17:04:30 UTC) #13

commit-bot: I haz the power

Can't process patch for file Source/web/tests/data/bad-code.gif. Binary file support is temporarilly disabled due to a ...

7 years, 3 months ago (2013-09-05 17:04:36 UTC) #14

Alpha Left Google

+jamesr: OWNERS approval for the added data files in Source/web

7 years, 3 months ago (2013-09-05 22:53:03 UTC) #15

jamesr

Could you put these data files somewhere else? It doesn't make sense to add files ...

7 years, 3 months ago (2013-09-05 23:27:18 UTC) #16

Alpha Left Google

On 2013/09/05 23:27:18, jamesr wrote: > Could you put these data files somewhere else? It ...

7 years, 3 months ago (2013-09-11 03:36:37 UTC) #17

jamesr

If they aren't for LayoutTests then why would you put them in LayoutTests/? Your test ...

7 years, 3 months ago (2013-09-11 21:14:17 UTC) #18

Alpha Left Google

Picking this up after vacation. It appears that other test files for images are also ...

7 years, 2 months ago (2013-10-03 01:08:06 UTC) #19

jamesr

Source/core/platform/image-decoders/testing/ ? maybe /test/ ? I don't think it makes a huge difference so long ...

7 years, 2 months ago (2013-10-03 01:09:28 UTC) #20

Alpha Left Google

Okay. I'll create that dir and gradually move files there.

7 years, 2 months ago (2013-10-03 01:10:22 UTC) #21

Alpha Left Google

-jamesr since this is only in Source/core/platform/image-decoders. pkasting: please review again as I created Source/core/platform/image-decoders/testing ...

7 years, 2 months ago (2013-10-03 01:26:58 UTC) #22

Peter Kasting

LGTM https://codereview.chromium.org/23646005/diff/64001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp File Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp (right): https://codereview.chromium.org/23646005/diff/64001/Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp#newcode456 Source/core/platform/image-decoders/gif/GIFImageDecoderTest.cpp:456: // dictionary. Decoding should fail, but not infinitely ...

7 years, 2 months ago (2013-10-03 17:52:27 UTC) #23

Alpha Left Google

7 years, 2 months ago (2013-10-03 22:49:28 UTC) #24

Message was sent while issue was closed.

Committed patchset #8 manually as r158842 (presubmit successful).

Expand Messages | Collapse Messages