Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(603)

Issue 6670085: FTP: Detect the character encoding only after the entire listing is received. (Closed)

Created:
9 years, 9 months ago by Paweł Hajdan Jr.
Modified:
9 years, 7 months ago
Reviewers:
eroman, wtc
CC:
chromium-reviews, cbentzel+watch_chromium.org, darin-cc_chromium.org
Visibility:
Public.

Description

FTP: Detect the character encoding only after the entire listing is received. Otherwise the parser could be confused extremely easily. Imagine a listing which has only ASCII characters at the beginning, but after we detect it as ASCII-encoded, some UTF-8 bytes appear. It can be made more complex though, for example the beginning might look like UTF-8, but the entire listing may use a different encoding incompatible with UTF-8 (that was the case here, see bug). This change also simplifies the directory listing parser. When we always have the entire listing when parsing, some complex code can be removed. BUG=76171 TEST=see bug Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=79490

Patch Set 1 #

Patch Set 2 : polished #

Patch Set 3 : update histograms for SERVER_UNKNOWN #

Total comments: 12

Patch Set 4 : fixes #

Patch Set 5 : test coverage #

Unified diffs Side-by-side diffs Delta from patch set Stats (+723 lines, -1071 lines) Patch
D net/ftp/ftp_directory_listing_buffer.h View 1 chunk +0 lines, -101 lines 0 comments Download
D net/ftp/ftp_directory_listing_buffer.cc View 1 chunk +0 lines, -194 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser.h View 1 3 chunks +11 lines, -23 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser.cc View 1 2 1 chunk +97 lines, -4 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_ls.h View 1 2 chunks +13 lines, -32 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_ls.cc View 3 chunks +109 lines, -118 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_ls_unittest.cc View 1 4 chunks +45 lines, -31 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_netware.h View 1 2 chunks +14 lines, -31 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_netware.cc View 3 chunks +51 lines, -70 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_netware_unittest.cc View 1 3 chunks +30 lines, -12 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_unittest.h View 1 5 chunks +17 lines, -9 lines 0 comments Download
A + net/ftp/ftp_directory_listing_parser_unittest.cc View 1 2 3 6 chunks +16 lines, -14 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_vms.h View 1 2 chunks +8 lines, -50 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_vms.cc View 1 2 3 4 chunks +91 lines, -129 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_vms_unittest.cc View 1 2 3 4 2 chunks +100 lines, -59 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_windows.h View 1 2 chunks +8 lines, -19 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_windows.cc View 1 3 chunks +45 lines, -56 lines 0 comments Download
M net/ftp/ftp_directory_listing_parser_windows_unittest.cc View 1 4 chunks +35 lines, -24 lines 0 comments Download
M net/net.gyp View 1 2 3 2 chunks +1 line, -3 lines 0 comments Download
M webkit/glue/ftp_directory_listing_response_delegate.h View 1 5 chunks +7 lines, -24 lines 0 comments Download
M webkit/glue/ftp_directory_listing_response_delegate.cc View 1 4 chunks +25 lines, -68 lines 0 comments Download

Messages

Total messages: 7 (0 generated)
Paweł Hajdan Jr.
What do you think about this change? Technically just to fix the bug a smaller ...
9 years, 9 months ago (2011-03-17 10:09:25 UTC) #1
wtc
phajdan.jr: (I didn't look at the CL.) Does this mean you want to buffer the ...
9 years, 9 months ago (2011-03-17 20:44:25 UTC) #2
Paweł Hajdan Jr.
On 2011/03/17 20:44:25, wtc wrote: > Does this mean you want to buffer the entire ...
9 years, 9 months ago (2011-03-18 08:56:28 UTC) #3
Paweł Hajdan Jr.
ping
9 years, 9 months ago (2011-03-24 07:55:16 UTC) #4
eroman
I've been out sick, or I would have responded earlier :) I agree with your ...
9 years, 9 months ago (2011-03-24 22:27:56 UTC) #5
eroman
lgtm http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listing_parser.cc File net/ftp/ftp_directory_listing_parser.cc (right): http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listing_parser.cc#newcode30 net/ftp/ftp_directory_listing_parser.cc:30: // Use first encoding that can be used ...
9 years, 9 months ago (2011-03-24 23:09:35 UTC) #6
Paweł Hajdan Jr.
9 years, 9 months ago (2011-03-26 09:47:50 UTC) #7
http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
File net/ftp/ftp_directory_listing_parser.cc (right):

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser.cc:30: // Use first encoding that can be
used to decode the text.
On 2011/03/24 23:09:35, eroman wrote:
> I wander if it would be more flexible to move this loop further out.
> 
> In other words, rather than using the first encoding the decodes the text, use
> the first encoding whose decoded text is parseable:
> 
> for each possible encoding:
>   decode to UTF16
>   split into lines
>   for each directory listing type:
>     try to parse all lines
>     if successful, return OK (else continue trying parsers/decoders)

Yes, it sounds like a good idea. I'd prefer to do it in a separate patch though,
to avoid doing too much in one CL (this is already quite big).

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
File net/ftp/ftp_directory_listing_parser_ls.cc (right):

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser_ls.cc:178: if
(!base::StringToInt64(columns[2 + column_offset], &entry.size)) {
On 2011/03/24 23:09:35, eroman wrote:
> I only glossed over this section quickly. I assume it hasn't changed from
> before, and is just a move right?

Yes, "ls -l" is just a move. VMS had some non-trivial modifications.

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
File net/ftp/ftp_directory_listing_parser_unittest.cc (right):

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser_unittest.cc:98: ASSERT_EQ(lines.size() / 8,
entries.size());
On 2011/03/24 23:09:35, eroman wrote:
> nit: these two asserts could be combined into one:
> 
> ASSERT_EQ(8 * entries.size(), lines.size());
> 
> (I am fine either way).

Done.

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser_unittest.cc:115: FtpDirectoryListingEntry
entry = entries[i];
On 2011/03/24 23:09:35, eroman wrote:
> nit: could make this |const FtpDirectoryListingEntry&|

Done.

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
File net/ftp/ftp_directory_listing_parser_vms.cc (right):

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser_vms.cc:180: namespace net {
On 2011/03/24 23:09:35, eroman wrote:
> nit: i suggest moving this up, so the anonymous namespace is also enclosed
> within net (avoids having to prefix the types with net::)

Done.

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
File net/ftp/ftp_directory_listing_parser_vms_unittest.cc (right):

http://codereview.chromium.org/6670085/diff/5002/net/ftp/ftp_directory_listin...
net/ftp/ftp_directory_listing_parser_vms_unittest.cc:86: "README.TXT;1 2
APR-2000 10:40:39",
On 2011/03/24 23:09:35, eroman wrote:
> how come the test data has changed?

I refactored the tests slightly, and removed some cases that the previous parser
would reject but now will be accepted (for simplicity and compatibility).

After seeing this comment I added an additional test case to restore some
previously existing listing lines that we should still be rejecting.

Powered by Google App Engine
This is Rietveld 408576698