Side by Side Diff: third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp

Issue 2737033003: Convert non-WHATWG text encoding to ASCII (Closed)
Patch Set: Created 3 years, 9 months ago
OLD | NEW
1 /* 1 /*
2 * Copyright (C) 2008, 2009 Google Inc. All rights reserved. 2 * Copyright (C) 2008, 2009 Google Inc. All rights reserved.
3 * 3 *
4 * Redistribution and use in source and binary forms, with or without 4 * Redistribution and use in source and binary forms, with or without
5 * modification, are permitted provided that the following conditions are 5 * modification, are permitted provided that the following conditions are
6 * met: 6 * met:
7 * 7 *
8 * * Redistributions of source code must retain the above copyright 8 * * Redistributions of source code must retain the above copyright
9 * notice, this list of conditions and the following disclaimer. 9 * notice, this list of conditions and the following disclaimer.
10 * * Redistributions in binary form must reproduce the above 10 * * Redistributions in binary form must reproduce the above
(...skipping 44 matching lines...)
55 55
56 // Should return false if the detected encoding is UTF8. This helps prevent 56 // Should return false if the detected encoding is UTF8. This helps prevent
57 // modern web sites from neglecting proper encoding labelling and simply 57 // modern web sites from neglecting proper encoding labelling and simply
58 // relying on browser-side encoding detection. Encoding detection is supposed 58 // relying on browser-side encoding detection. Encoding detection is supposed
59 // to work for web sites with legacy encoding only. Detection failure leads 59 // to work for web sites with legacy encoding only. Detection failure leads
60 // |TextResourceDecoder| to use its default encoding determined from system 60 // |TextResourceDecoder| to use its default encoding determined from system
61 // locale or TLD. 61 // locale or TLD.
62 if (encoding == UNKNOWN_ENCODING || encoding == UTF8) 62 if (encoding == UNKNOWN_ENCODING || encoding == UTF8)
63 return false; 63 return false;
64 64
65 // 7-bit encodings (except ISO-2022-JP) are not supported in WHATWG encoding 65 // 7-bit encodings (except ISO-2022-JP), and some obscure encodings not
66 // standard. Mark them as ASCII to keep the raw bytes intact. 66 // supported in WHATWG encoding standard are marked as ASCII to keep the raw
67 // bytes intact.
68 // TODO(jinsukkim): Put this conversion into CED library, and enable "WHATWG"
69 // mode.
67 switch (encoding) { 70 switch (encoding) {
68 case HZ_GB_2312: 71 case HZ_GB_2312:
69 case ISO_2022_KR: 72 case ISO_2022_KR:
70 case ISO_2022_CN: 73 case ISO_2022_CN:
71 case UTF7: 74 case UTF7:
75
76 case CHINESE_EUC_DEC:
77 case CHINESE_CNS:
78 case CHINESE_BIG5_CP950:
79 case JAPANESE_CP932:
80 case MSFT_CP874:
81 case TSCII:
82 case TAMIL_MONO:
83 case TAMIL_BI:
84 case JAGRAN:
85 case BHASKAR:
86 case HTCHANAKYA:
87 case BINARYENC:
88 case UTF8UTF8:
89 case TAM_ELANGO:
90 case TAM_LTTMBARANI:
91 case TAM_SHREE:
92 case TAM_TBOOMIS:
93 case TAM_TMNEWS:
94 case TAM_WEBTAMIL:
95 case KDDI_SHIFT_JIS:
96 case DOCOMO_SHIFT_JIS:
97 case SOFTBANK_SHIFT_JIS:
98 case KDDI_ISO_2022_JP:
99 case SOFTBANK_ISO_2022_JP:
tkent 2017/03/08 02:17:05 Should detectTextEncoding() return |false| for the
Jinsuk Kim 2017/03/08 02:27:05 Hm, returning false is not the solution for the li
72 encoding = ASCII_7BIT; 100 encoding = ASCII_7BIT;
73 break; 101 break;
74 default: 102 default:
75 break; 103 break;
76 } 104 }
77 *detectedEncoding = WTF::TextEncoding(MimeEncodingName(encoding)); 105 *detectedEncoding = WTF::TextEncoding(MimeEncodingName(encoding));
78 return true; 106 return true;
79 } 107 }
80 108
81 } // namespace blink 109 } // namespace blink
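
For readers following the patch outside the Chromium tree, the early return and the switch above can be sketched as two standalone functions. This is a minimal sketch under stated assumptions, not the CED or Blink code: `DetectedEncoding`, `ShouldReportDetection`, `NormalizeForWhatwg`, and `DemoNormalization` are hypothetical names, and the enum lists only a handful of the values that appear in the diff.

```cpp
#include <cassert>

// Hypothetical stand-in for the detector's encoding enum. Only a few of the
// values named in the diff are reproduced here.
enum class DetectedEncoding {
  kUnknown,
  kUtf8,
  kAscii7Bit,
  kShiftJis,       // example of an encoding that passes through unchanged
  kHzGb2312,
  kIso2022Kr,
  kIso2022Cn,
  kUtf7,
  kJapaneseCp932,
  kTscii,
};

// Mirrors the early return in the diff: unknown and UTF-8 results are not
// reported, so the caller falls back to its default encoding.
inline bool ShouldReportDetection(DetectedEncoding encoding) {
  return encoding != DetectedEncoding::kUnknown &&
         encoding != DetectedEncoding::kUtf8;
}

// Mirrors the switch in the diff: encodings outside the WHATWG standard are
// relabelled as 7-bit ASCII so the raw bytes stay intact.
inline DetectedEncoding NormalizeForWhatwg(DetectedEncoding encoding) {
  switch (encoding) {
    case DetectedEncoding::kHzGb2312:
    case DetectedEncoding::kIso2022Kr:
    case DetectedEncoding::kIso2022Cn:
    case DetectedEncoding::kUtf7:
    case DetectedEncoding::kJapaneseCp932:
    case DetectedEncoding::kTscii:
      return DetectedEncoding::kAscii7Bit;
    default:
      return encoding;
  }
}

// Small usage example exercising both helpers.
inline void DemoNormalization() {
  assert(!ShouldReportDetection(DetectedEncoding::kUtf8));
  assert(NormalizeForWhatwg(DetectedEncoding::kUtf7) ==
         DetectedEncoding::kAscii7Bit);
  assert(NormalizeForWhatwg(DetectedEncoding::kShiftJis) ==
         DetectedEncoding::kShiftJis);
}
```

Note that in this sketch, as in the patch, a relabelled encoding is still reported as a successful detection; whether the function should instead return false for those cases is the question raised in the inline comments.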