Chromium Code Reviews

Created: 11 years, 9 months ago by Lei Zhang
Modified: 9 years, 7 months ago
CC: chromium-reviews@googlegroups.com
Base URL: svn://chrome-svn/chrome/trunk/src/
Visibility: Public
Description:
Do not apply the gzip filter to file names with gz/tgz/svgz extensions.
This matches Firefox's behavior.

BUG=8170
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=12467
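The extension check the description refers to can be sketched as follows. This is a hypothetical illustration, not the actual Chromium patch; the function name and the plain-string suffix matching are stand-ins for the real FilePath-based code in net/base/filter.cc:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: if the requested file name ends in .gz, .tgz, or
// .svgz, skip the gzip content filter entirely and pass the bytes through,
// so the user downloads the file as-is (matching Firefox's behavior).
bool ShouldSkipGzipFilter(const std::string& filename) {
  auto ends_with = [&](const std::string& ext) {
    return filename.size() >= ext.size() &&
           filename.compare(filename.size() - ext.size(), ext.size(), ext) ==
               0;
  };
  return ends_with(".gz") || ends_with(".tgz") || ends_with(".svgz");
}
```

The rationale: a server that names a resource `foo.tar.gz` intends the client to receive the compressed bytes, so decoding them (even when Content-Encoding says gzip) defeats the download.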
Patch Set 1 #
Patch Set 2 #
Total comments: 2
Patch Set 3 #
Patch Set 4 #
Total comments: 1
Patch Set 5 #
Patch Set 6 #

Messages
Total messages: 16 (0 generated)
Darin, can you take a look or refer me to the appropriate reviewer?
LGTM, but please get a review from huanr too. Thanks!

http://codereview.chromium.org/42452/diff/7/1007
File net/base/gzip_filter.h (right):

http://codereview.chromium.org/42452/diff/7/1007#newcode69
Line 69: GZIP_GET_NOTGZIP_HEADER
nit: NOT_GZIP
Add Jim.

Also, are we able to understand/fix why the browser crashes if the filter fails?

http://codereview.chromium.org/42452/diff/7/1009
File net/base/gzip_filter.cc (right):

http://codereview.chromium.org/42452/diff/7/1009#newcode112
Line 112: GZIP_GET_INVALID_HEADER == gzip_header_status_)) {
To quote Lei's findings in bug 8170:

"This particular Apache server erroneously sent back an uncompressed tarball, even though it specified the x-gzip encoding.

Content-Type: application/x-tar
Content-Encoding: x-gzip"

In our Filter::FixupEncodingTypes(), we work around cases like:

Content-Type: application/x-gzip
Content-Encoding: x-gzip

Our workaround is to clean up the content-encoding if we see the content-type being gzip, x-gzip, or x-gunzip. I agree with the fix here that we should not extend the workaround in FixupEncodingTypes() to clear the content-encoding for application/x-tar.

Because of this scenario, we should probably make the gzip filter fall back to pass-through for any gzip filter header error. Basically, we remove the code around possible_sdch_pass_through_ and treat GZIP_GET_INVALID_HEADER the same as the GZIP_GET_NOTGZIP_HEADER introduced in this fix.

Jim, what do you think?
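The FixupEncodingTypes() workaround huanr describes can be sketched roughly as follows. This is a hypothetical stand-in, not the real Chromium function (which operates on Filter state, not bare strings); the MIME type list mirrors the gzip/x-gzip/x-gunzip types named above:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: if the Content-Type already says the body *is* a gzip
// file, the server almost certainly added Content-Encoding: x-gzip by
// mistake (it describes the file, not a transfer encoding), so drop that
// encoding and deliver the bytes untouched.
void FixupEncodingTypes(const std::string& mime_type,
                        std::vector<std::string>* encodings) {
  if (mime_type == "application/gzip" || mime_type == "application/x-gzip" ||
      mime_type == "application/x-gunzip") {
    encodings->erase(
        std::remove(encodings->begin(), encodings->end(), "x-gzip"),
        encodings->end());
  }
}
```

The key limitation, as the thread notes, is that this header-level fixup cannot help when the Content-Type is application/x-tar: the body might genuinely be gzipped, so only the filter, looking at the actual bytes, can decide.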
On 2009/03/23 19:06:16, huanr wrote:
> Add Jim.
>
> Also are we able to understand/fix why browser crashes if filter fails?
> [...]
> Because of this scenario, we should probably fall back gzip filter to
> pass-through for any gzip filter header error.
> [...]
> Jim, what do you think?

FixupEncodingTypes() won't be able to figure this one out. We really should check what the server is sending us. In both bug 8170 and bug 5772, the server sends the following headers:

Content-Type: application/x-tar
Content-Encoding: x-gzip

However, in bug 5772 the file is gzipped, whereas in bug 8170 it is not.
On 2009/03/23 19:06:16, huanr wrote:
> Also are we able to understand/fix why browser crashes if filter fails?
Here's the backtrace that leads to the crash.
chrome.dll!DownloadThrottlingResourceHandler::OnResponseCompleted(int
request_id=9, const URLRequestStatus & status={...}, const
std::basic_string<char,std::char_traits<char>,std::allocator<char> > &
security_info="") Line 105
chrome.dll!BufferedResourceHandler::OnResponseCompleted(int request_id=9, const
URLRequestStatus & status={...}, const
std::basic_string<char,std::char_traits<char>,std::allocator<char> > &
security_info="") Line 78 + 0x29 bytes
chrome.dll!ResourceDispatcherHost::OnResponseCompleted(URLRequest *
request=0x03a19818) Line 1221 + 0x3a bytes
chrome.dll!ResourceDispatcherHost::OnResponseStarted(URLRequest *
request=0x03a19818) Line 872
chrome.dll!URLRequestJob::NotifyHeadersComplete() Line 369 + 0x2d bytes
chrome.dll!URLRequestHttpJob::NotifyHeadersComplete() Line 477
chrome.dll!URLRequestHttpJob::OnStartCompleted(int result=0) Line 407
chrome.dll!DispatchToMethod<URLRequestHttpJob,void (__thiscall
URLRequestHttpJob::*)(int),int>(URLRequestHttpJob * obj=0x03a1d0b8, void (int)*
method=0x6bfcd8a0, const Tuple1<int> & arg={...}) Line 393 + 0x11 bytes
chrome.dll!CallbackImpl<URLRequestHttpJob,void (__thiscall
URLRequestHttpJob::*)(int),Tuple1<int> >::RunWithParams(const Tuple1<int> &
params={...}) Line 571 + 0x1b bytes
chrome.dll!CallbackRunner<Tuple1<int> >::Run<int>(const int & a=0) Line 536 +
0x1c bytes
chrome.dll!net::HttpCache::Transaction::DoCallback(int rv=0) Line 591
chrome.dll!net::HttpCache::Transaction::HandleResult(int rv=0) Line 597
chrome.dll!net::HttpCache::Transaction::OnNetworkInfoAvailable(int result=0)
Line 965
chrome.dll!DispatchToMethod<net::HttpCache::Transaction,void (__thiscall
net::HttpCache::Transaction::*)(int),int>(net::HttpCache::Transaction *
obj=0x0276d020, void (int)* method=0x6bf65c30, const Tuple1<int> & arg={...})
Line 393 + 0x11 bytes
chrome.dll!CallbackImpl<net::HttpCache::Transaction,void (__thiscall
net::HttpCache::Transaction::*)(int),Tuple1<int> >::RunWithParams(const
Tuple1<int> & params={...}) Line 571 + 0x1b bytes
chrome.dll!CallbackRunner<Tuple1<int> >::Run<int>(const int & a=0) Line 536 +
0x1c bytes
chrome.dll!net::HttpNetworkTransaction::DoCallback(int rv=0) Line 361
chrome.dll!net::HttpNetworkTransaction::OnIOComplete(int result=1434) Line 367
chrome.dll!DispatchToMethod<net::HttpNetworkTransaction,void (__thiscall
net::HttpNetworkTransaction::*)(int),int>(net::HttpNetworkTransaction *
obj=0x0276cb70, void (int)* method=0x6bfaff40, const Tuple1<int> & arg={...})
Line 393 + 0xe bytes
chrome.dll!CallbackImpl<net::HttpNetworkTransaction,void (__thiscall
net::HttpNetworkTransaction::*)(int),Tuple1<int> >::RunWithParams(const
Tuple1<int> & params={...}) Line 571 + 0x17 bytes
chrome.dll!CallbackRunner<Tuple1<int> >::Run<int>(const int & a=1434) Line 536
+ 0x1c bytes
chrome.dll!net::TCPClientSocket::DoCallback(int rv=1434) Line 317
chrome.dll!net::TCPClientSocket::DidCompleteIO() Line 369
chrome.dll!net::TCPClientSocket::OnObjectSignaled(void * object=0x00000434)
Line 381
chrome.dll!base::ObjectWatcher::Watch::Run() Line 30 + 0x1c bytes
chrome.dll!MessageLoop::RunTask(Task * task=0x0271ffd0) Line 308 + 0xf bytes
chrome.dll!MessageLoop::DeferOrRunPendingTask(const MessageLoop::PendingTask &
pending_task={...}) Line 319
chrome.dll!MessageLoop::DoWork() Line 416 + 0xc bytes
chrome.dll!base::MessagePumpForIO::DoRunLoop() Line 445 + 0x19 bytes
chrome.dll!base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate *
delegate=0x037bfc0c, base::MessagePumpWin::Dispatcher * dispatcher=0x00000000)
Line 52 + 0xf bytes
chrome.dll!base::MessagePumpWin::Run(base::MessagePump::Delegate *
delegate=0x037bfc0c) Line 78 + 0x1c bytes
chrome.dll!MessageLoop::RunInternal() Line 197 + 0x2a bytes
chrome.dll!MessageLoop::RunHandler() Line 181
chrome.dll!MessageLoop::Run() Line 155
chrome.dll!base::Thread::ThreadMain() Line 159
chrome.dll!`anonymous namespace'::ThreadFunc(void * closure=0x0271b368) Line 26
+ 0xf bytes
kernel32.dll!7629e3f3()
[Frames below may be incorrect and/or missing, no symbols loaded for
kernel32.dll]
ntdll.dll!774ecfed()
ntdll.dll!774ed1ff()
I don't understand all the code involved, but I've been able to fix the crash as follows:

Add a bool in_error_state_ to DownloadThrottlingResourceHandler. Replace the NOTREACHED() in OnResponseCompleted() with code that sets in_error_state_ to true. Then in ContinueDownload(), don't create a new DownloadResourceHandler if in_error_state_ is true.
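The fix described above amounts to a small state guard. A minimal, hypothetical sketch (the real handler's methods take request IDs and status objects; this only illustrates the in_error_state_ logic):

```cpp
#include <cassert>

// Hypothetical sketch of the crash fix: instead of hitting NOTREACHED()
// when the response completes with an error, remember the error state and
// refuse to create a DownloadResourceHandler afterwards.
class DownloadThrottlingResourceHandler {
 public:
  // Called in place of the old NOTREACHED() path.
  void OnResponseCompletedWithError() { in_error_state_ = true; }

  // Returns true if a DownloadResourceHandler would be created.
  bool ContinueDownload() {
    if (in_error_state_)
      return false;  // Never start a download for a failed request.
    return true;     // The real code creates the handler here.
  }

 private:
  bool in_error_state_ = false;
};
```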
There are two issues:

a) A crash in Chrome.
b) Becoming more friendly to poorly configured Apache servers.

SUMMARY: I'd like to be sure we really fix the crash, and I'd like to be sure any change in gzip is much more than a one-off compatibility fix for a single misconfigured site.

DETAILS:

First off, the crash needs to be found and fixed. This patch appears to be an attempt to obscure the only reproducible case we have, by making sure the incorrectly decoded content can't reach later parts of our process. I'd bet a malicious server could go a tad farther to completely avoid all gzip filters, and then your fix would have no impact, and we'd still crash the browser. I think we need to hunt down and remove the crasher bug more directly. Did I by any chance misunderstand, and does this really block the cause of the crash?

With regard to changing the semantics for handling gzip content, I'm not sure we should do this, and I'm not sure this is how we should do it if we decide to make the change. Before making any change, I think we need stats showing this is a consistent problem with Apache servers, or that (all?) other browsers support auto-correction for this misconfigured response. Is there data on either of these fronts? After all, each such hack we add accumulates, making the code harder to grok for us and the next generation of compatible browsers :-(

If we have data that really makes us think we need to change the handling of this sort of Apache content, the place we currently do all fixups is the method Filter::FixupEncodingTypes(). That is the place where maximal context can be brought to bear, to make sure we make the narrowest possible extension to the protocol (i.e., we should IMO try to use content encoding per the RFCs if at all possible). When we need to add functionality, such as "sniffing and pass-through", we can do so by defining a new filter type (often a slight variant of the existing filter enumerator). For instance, this patch might use the enum currently (temporarily??) used for SDCH when it needs to add a gzip decode (thinking that perchance a proxy has stripped such encoding).
For a), I saw Lei already had some investigation, and I suggest we have another CL and review for that.

The background for b) is: Apache has this nice directive: http://httpd.apache.org/docs/1.3/mod/mod_mime.html#addencoding

With that, if you have the following line in the configuration file, which is the case in a default Apache install:

AddEncoding x-gzip .gz

it automatically adds "Content-Encoding: x-gzip" for any file with the extension .gz. On the other hand, Apache has a smarter way to sniff file content and get the content-type correctly. What does this mean?

(1) If the server has a file foo.gz, Apache will reply with:

Content-Type: application/x-gzip
Content-Encoding: x-gzip

As we know, every browser including Chrome works around this.

(2) If you have a tar file foo.tar but incorrectly add an extension to the file so it becomes foo.tar.gz, Apache correctly detects the MIME type but happily applies the 'AddEncoding x-gzip .gz' rule, and the response becomes:

Content-Type: application/x-tar
Content-Encoding: x-gzip

This could happen to any file. It would be helpful to verify this by setting up an Apache server. What if we rename a plain text file to something like foo.txt.gz?

In our Filter::FixupEncodingTypes(), whenever we see

Content-Type: application/x-gzip
Content-Encoding: x-gzip

in the response, we remove "Content-Encoding: x-gzip". But we cannot do this for Content-Type: application/x-tar, since the content may really be gzip encoded. In other words, the workaround has to be done at the filter level, where the actual content is examined.

If we can verify that, with the improper configuration setting 'AddEncoding x-gzip .gz', the Apache server will respond with "Content-Encoding: x-gzip" for any file with the .gz extension, then I am convinced that, as a general rule, the gzip filter should just fall back to pass-through if there is a header parsing failure. This is safe in the sense that it will not break existing servers.
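The misconfiguration described above can be reproduced with a minimal Apache configuration fragment. This is a sketch based on the mod_mime documentation linked above, not a configuration taken from the bug report:

```apacheconf
# Present in many default installs. Apache adds "Content-Encoding: x-gzip"
# to ANY response whose file name ends in .gz, regardless of whether the
# bytes are actually gzip-compressed.
AddEncoding x-gzip .gz

# Meanwhile Apache maps the Content-Type from the rest of the file name,
# so a mis-named foo.tar.gz that is really a plain tar file is served as:
#   Content-Type: application/x-tar
#   Content-Encoding: x-gzip
```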
And it appears most clients, including other browsers and tools like wget, do this.

If we go with this direction, the change, as I suggested in my previous reply, is the following:
- remove the newly introduced GZIP_GET_NOTGZIP_HEADER
- change the behavior of GZIP_GET_INVALID_HEADER to a pass-through fallback
- remove the previous code around possible_sdch_pass_through_

On 2009/03/23 22:17:27, jar wrote:
> There are two issues:
>
> a) A crash in Chrome
> b) Becoming more friendly to poorly configured Apache servers.
>
> SUMMARY: I'd like to be sure we really fix the crash, and I'd like to be sure
> any change in gzip is much more than a one-off compatibility fix for a single
> mis-configured site.
> [...]
> When we need to add functionality, such a "sniffing and passthrough"
> we can do so by defining a new filter type (often a slight variant of the
> existing filter enumerator). For instance, this patch might use the enum
> currently (temporarily??) used for SDCH when it needs to add a gzip decode
> (thinking that perchance a proxy has stripped such encoding).
On 2009/03/24 00:35:07, huanr wrote:
> For a), I saw Lei already had some investigation and I suggest we have another
> CL and review for that.

I'll have one soon. I just want to test a few more scenarios.

> If we go with this direction, the change as I suggest in previous reply, is the
> following:
> - removing the newly introduced GZIP_GET_NOTGZIP_HEADER
> - change the behavior of GZIP_GET_INVALID_HEADER to pass-through fallback
> - remove previous code around possible_sdch_pass_through_

This is fine by me. I added GZIP_GET_NOTGZIP_HEADER because I wanted to make the minimal change possible to fix the case of Apache erroneously reporting Content-Encoding: gzip.
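The pass-through fallback discussed here boils down to checking for the gzip magic bytes before decoding: every gzip member starts with 0x1f 0x8b (per RFC 1952). A hypothetical sketch of that sniff, not the actual gzip_filter.cc header parser:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: peek at the first two bytes of the response body.
// A real gzip stream always begins with the magic bytes 0x1f 0x8b; if they
// are absent, the filter should skip decoding and hand the raw bytes
// straight through instead of failing the request.
bool LooksLikeGzip(const std::vector<uint8_t>& data) {
  return data.size() >= 2 && data[0] == 0x1f && data[1] == 0x8b;
}
```

This is why the fallback is safe for correctly configured servers: genuinely gzipped bodies still carry the magic bytes and get decoded as before; only mislabeled bodies take the pass-through path.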
FYI, the (lamest^W simplest) crash fix is http://codereview.chromium.org/42573 CC yourself if interested.
The crash has been fixed in r12410. Patch set 3 follows Firefox's behavior and does not attempt to gunzip files that end with said extensions. This also helps fix part of bug 5772. (Apparently gnu.org's Apache server is also badly configured.)
LGTM
Can you change the description to match the final code? LGTM.
On 2009/03/25 02:25:08, huanr wrote:
> Can you change the description to match the final code? LGTM.

Done. Compiles on OS_POSIX in patch set 4.
http://codereview.chromium.org/42452/diff/14/15
File net/base/filter.cc (right):

http://codereview.chromium.org/42452/diff/14/15#newcode107
Line 107: extension = filename.Extension();
Maybe you can use FILE_PATH_LITERAL and FilePath::StringType. That way the '#if defined' is not needed.
Ah, thanks. FILE_PATH_LITERAL was the part I was missing. Patch set 5 no longer has the #if's.
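The trick the reviewer suggests can be illustrated portably. This is a hypothetical stand-in for Chromium's FILE_PATH_LITERAL and FilePath::StringType (names shortened here), showing why the per-comparison #if blocks become unnecessary: the platform split lives in one typedef/macro pair instead of at every call site.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: paths are wide strings on Windows and narrow strings
// elsewhere. The macro produces a literal of the matching character type,
// so callers compare extensions without any #if of their own.
#if defined(_WIN32)
typedef std::wstring PathStringType;
#define PATH_LITERAL(x) L##x
#else
typedef std::string PathStringType;
#define PATH_LITERAL(x) x
#endif

// Works unchanged on every platform; no #if needed here.
bool HasGzExtension(const PathStringType& extension) {
  return extension == PATH_LITERAL(".gz") ||
         extension == PATH_LITERAL(".tgz") ||
         extension == PATH_LITERAL(".svgz");
}
```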
LGTM
