| Index: third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp |
| diff --git a/third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp b/third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp |
| index f0d9ccce852814c9fe7314361b4b12cbcbceec9e..1b3f33f7b40386979285a37739cd3b1b318f405e 100644 |
| --- a/third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp |
| +++ b/third_party/WebKit/Source/platform/text/TextEncodingDetector.cpp |
| @@ -62,13 +62,41 @@ bool detectTextEncoding(const char* data, |
| if (encoding == UNKNOWN_ENCODING || encoding == UTF8) |
| return false; |
| - // 7-bit encodings (except ISO-2022-JP) are not supported in WHATWG encoding |
| - // standard. Mark them as ASCII to keep the raw bytes intact. |
| + // 7-bit encodings (except ISO-2022-JP) and some obscure encodings that are |
| + // not supported by the WHATWG Encoding Standard are marked as ASCII to keep |
| + // the raw bytes intact. |
| + // TODO(jinsukkim): Put this conversion into CED library, and enable "WHATWG" |
| + // mode. |
| switch (encoding) { |
| case HZ_GB_2312: |
| case ISO_2022_KR: |
| case ISO_2022_CN: |
| case UTF7: |
| + |
| + case CHINESE_EUC_DEC: |
| + case CHINESE_CNS: |
| + case CHINESE_BIG5_CP950: |
| + case JAPANESE_CP932: |
| + case MSFT_CP874: |
| + case TSCII: |
| + case TAMIL_MONO: |
| + case TAMIL_BI: |
| + case JAGRAN: |
| + case BHASKAR: |
| + case HTCHANAKYA: |
| + case BINARYENC: |
| + case UTF8UTF8: |
| + case TAM_ELANGO: |
| + case TAM_LTTMBARANI: |
| + case TAM_SHREE: |
| + case TAM_TBOOMIS: |
| + case TAM_TMNEWS: |
| + case TAM_WEBTAMIL: |
| + case KDDI_SHIFT_JIS: |
| + case DOCOMO_SHIFT_JIS: |
| + case SOFTBANK_SHIFT_JIS: |
| + case KDDI_ISO_2022_JP: |
| + case SOFTBANK_ISO_2022_JP: |
|
tkent
2017/03/08 02:17:05
Should detectTextEncoding() return |false| for the
Jinsuk Kim
2017/03/08 02:27:05
Hm, returning false is not the solution for the li
|
| encoding = ASCII_7BIT; |
| break; |
| default: |