Chromium Code Reviews

Side by Side Diff: third_party/WebKit/Source/wtf/text/UTF8.cpp

Issue 1721373002: UTF-8 detector for pages missing encoding info (Closed)
Base URL: https://chromium.googlesource.com/chromium/src.git@master
Patch Set: Created 4 years, 9 months ago
OLD | NEW
1 /* 1 /*
2 * Copyright (C) 2007 Apple Inc. All rights reserved. 2 * Copyright (C) 2007 Apple Inc. All rights reserved.
3 * Copyright (C) 2010 Patrick Gansterer <paroga@paroga.com> 3 * Copyright (C) 2010 Patrick Gansterer <paroga@paroga.com>
4 * 4 *
5 * Redistribution and use in source and binary forms, with or without 5 * Redistribution and use in source and binary forms, with or without
6 * modification, are permitted provided that the following conditions 6 * modification, are permitted provided that the following conditions
7 * are met: 7 * are met:
8 * 1. Redistributions of source code must retain the above copyright 8 * 1. Redistributions of source code must retain the above copyright
9 * notice, this list of conditions and the following disclaimer. 9 * notice, this list of conditions and the following disclaimer.
10 * 2. Redistributions in binary form must reproduce the above copyright 10 * 2. Redistributions in binary form must reproduce the above copyright
(...skipping 424 matching lines...)
435 bool equalUTF16WithUTF8(const UChar* a, const UChar* aEnd, const char* b, const char* bEnd) 435 bool equalUTF16WithUTF8(const UChar* a, const UChar* aEnd, const char* b, const char* bEnd)
436 { 436 {
437 return equalWithUTF8Internal(a, aEnd, b, bEnd); 437 return equalWithUTF8Internal(a, aEnd, b, bEnd);
438 } 438 }
439 439
440 bool equalLatin1WithUTF8(const LChar* a, const LChar* aEnd, const char* b, const char* bEnd) 440 bool equalLatin1WithUTF8(const LChar* a, const LChar* aEnd, const char* b, const char* bEnd)
441 { 441 {
442 return equalWithUTF8Internal(a, aEnd, b, bEnd); 442 return equalWithUTF8Internal(a, aEnd, b, bEnd);
443 } 443 }
444 444
445 bool isUTF8Encoded(const char* data, size_t length)
jungshik at Google 2016/04/03 00:52:04 Without looking at the header file or the function
Jinsuk Kim 2016/04/06 04:34:11 Chose isUTF8andNotASCII
446 {
447 // This cast is necessary because U8_NEXT uses int32_ts.
448 int32_t srcLen = static_cast<int32_t>(length);
449 int32_t charIndex = 0;
450 bool markDetected = false;
jungshik at Google 2016/04/03 00:52:04 At first, it took me a while to figure out what th
Jinsuk Kim 2016/04/06 04:34:11 Done.
451
452 while (charIndex < srcLen) {
453 int32_t codePoint;
454 if (static_cast<uint8_t>(data[charIndex]) >= 0x80)
455 markDetected = true;
456 U8_NEXT(data, charIndex, srcLen, codePoint);
457 if (!U_IS_UNICODE_CHAR(codePoint))
458 return false;
459 }
460 return markDetected;
461 }
462
445 } // namespace Unicode 463 } // namespace Unicode
446 } // namespace WTF 464 } // namespace WTF
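
For readers without ICU at hand, the check added in this patch can be approximated in standalone C++. The following is only a sketch of the same idea (return true when the input is well-formed UTF-8 and contains at least one non-ASCII byte), not the code under review. The names decodeOne, isUnicodeChar and isValidUTF8WithNonASCII are hypothetical; the hand-written decoder stands in for ICU's U8_NEXT and the filter stands in for U_IS_UNICODE_CHAR, under the assumption that ill-formed sequences, surrogates and noncharacters should all be rejected. It takes size_t throughout, so it does not need the int32_t cast used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): decodes one UTF-8 sequence
// starting at data[index] and advances index past it.
// Returns the code point, or -1 if the sequence is ill-formed.
static int32_t decodeOne(const unsigned char* data, size_t length, size_t& index)
{
    unsigned char lead = data[index++];
    if (lead < 0x80)
        return lead;

    int trailCount;
    int32_t codePoint;
    int32_t minValue; // Smallest value allowed for this length; rejects overlong forms.
    if (lead >= 0xC2 && lead <= 0xDF) {
        trailCount = 1;
        codePoint = lead & 0x1F;
        minValue = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
        trailCount = 2;
        codePoint = lead & 0x0F;
        minValue = 0x800;
    } else if (lead >= 0xF0 && lead <= 0xF4) {
        trailCount = 3;
        codePoint = lead & 0x07;
        minValue = 0x10000;
    } else {
        return -1; // Stray trail byte or invalid lead byte.
    }

    for (int i = 0; i < trailCount; ++i) {
        if (index >= length || (data[index] & 0xC0) != 0x80)
            return -1; // Truncated sequence or missing trail byte.
        codePoint = (codePoint << 6) | (data[index++] & 0x3F);
    }
    if (codePoint < minValue || codePoint > 0x10FFFF)
        return -1;
    return codePoint;
}

// Hypothetical stand-in for U_IS_UNICODE_CHAR: rejects negative values,
// surrogates, and noncharacters.
static bool isUnicodeChar(int32_t c)
{
    if (c < 0 || c > 0x10FFFF)
        return false;
    if (c >= 0xD800 && c <= 0xDFFF)
        return false;
    if (c >= 0xFDD0 && c <= 0xFDEF)
        return false;
    return (c & 0xFFFE) != 0xFFFE;
}

// True only if the buffer is well-formed UTF-8 AND has at least one
// non-ASCII byte; pure ASCII input returns false.
bool isValidUTF8WithNonASCII(const char* data, size_t length)
{
    assert(data || !length);
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(data);
    size_t index = 0;
    bool sawNonASCII = false;
    while (index < length) {
        if (bytes[index] >= 0x80)
            sawNonASCII = true;
        if (!isUnicodeChar(decodeOne(bytes, length, index)))
            return false;
    }
    return sawNonASCII;
}
```

Under these assumptions, isValidUTF8WithNonASCII("hello", 5) is false because the input is pure ASCII, while the same call on the five bytes of "caf\xC3\xA9" is true. This is the behavior the review comment on line 445 is getting at when it asks for a name that says more than "is UTF-8 encoded".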
