Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(237)

Issue 1125403004: Switch language detection to use CLD2's DetectLanguageCheckUTF8 method. (Closed)

Created:
5 years, 7 months ago by Andrew Hayden (chromium.org)
Modified:
5 years, 7 months ago
CC:
chromium-reviews, dsites_google.com, riesa1
Base URL:
https://chromium.googlesource.com/chromium/src.git@master
Target Ref:
refs/pending/heads/master
Project:
chromium
Visibility:
Public.

Description

Switch language detection to use CLD2's DetectLanguageCheckUTF8 method. There is a long-standing bug in issue 444258 describing a crash in CLD2::QuadHashV3Lookup4. A hypothesis for the crashes is that invalid UTF-8 is reaching the CLD2 code layer, and this is confirmed in bug 444258. We must use the "safe" (CheckUTF8) variant of language detection instead. The existing code uses DetectLanguageSummary, but there is no "safe" variant of this method in CLD2. However, the existing code doesn't consider the extra data returned by DetectLanguageSummary (i.e., multiple language guesses with accompanying probabilities) so there's no reason to stick with it. Using the simpler and safer DetectLanguageCheckUTF8 should produce the same results, will be safer, and should have comparable performance for Chromium's use cases. BUG=444258 Committed: https://crrev.com/90bb2b934366c595c1c979e0b2363f0a822e1b92 Cr-Commit-Position: refs/heads/master@{#328947}

Patch Set 1 #

Total comments: 7

Patch Set 2 : droger@ comments #

Patch Set 3 : Remove DCHECK and reintroduce retry behavior #

Patch Set 4 : Fix compile error under CLD1 for Android #

Patch Set 5 : git cl format #

Unified diffs Side-by-side diffs Delta from patch set Stats (+21 lines, -12 lines) Patch
M components/translate/core/language_detection/language_detection_util.cc View 1 2 3 4 4 chunks +21 lines, -12 lines 0 comments Download

Messages

Total messages: 17 (3 generated)
Andrew Hayden (chromium.org)
Can I get an owner to take a look, please. Hajime, how about you? :)
5 years, 7 months ago (2015-05-07 11:47:54 UTC) #2
droger
I'm not an expert of CLD, but this LGTM. https://codereview.chromium.org/1125403004/diff/1/components/translate/core/language_detection/language_detection_util.cc File components/translate/core/language_detection/language_detection_util.cc (right): https://codereview.chromium.org/1125403004/diff/1/components/translate/core/language_detection/language_detection_util.cc#newcode115 components/translate/core/language_detection/language_detection_util.cc:115: ...
5 years, 7 months ago (2015-05-07 11:56:57 UTC) #3
Andrew Hayden (chromium.org)
Patchset 2 addresses comments from droger@ and fixes a bug in the check-for-invalid-utf8 code. https://codereview.chromium.org/1125403004/diff/1/components/translate/core/language_detection/language_detection_util.cc ...
5 years, 7 months ago (2015-05-07 12:24:05 UTC) #4
Andrew Hayden (chromium.org)
Fascinating. AutomationApiTest.Find is failing the new NOTREACHED(). I'll see if I can hunt this down.
5 years, 7 months ago (2015-05-07 13:37:31 UTC) #5
Andrew Hayden (chromium.org)
On 2015/05/07 13:37:31, Andrew Hayden wrote: > Fascinating. AutomationApiTest.Find is failing the new NOTREACHED(). I'll ...
5 years, 7 months ago (2015-05-07 14:40:51 UTC) #6
Andrew Hayden (chromium.org)
OK, new patchset and trying this again based on the outcome of bug discussion and ...
5 years, 7 months ago (2015-05-08 09:13:47 UTC) #7
Andrew Hayden (chromium.org)
Updated the change description with the new learnings from the bug.
5 years, 7 months ago (2015-05-08 09:16:30 UTC) #8
Takashi Toyoshima
Is this a tentative fix? If not, UMA on the retry counts will be useful ...
5 years, 7 months ago (2015-05-08 09:31:01 UTC) #9
Andrew Hayden (chromium.org)
On 2015/05/08 09:31:01, Takashi Toyoshima (chromium) wrote: > Is this a tentative fix? If not, ...
5 years, 7 months ago (2015-05-08 09:36:33 UTC) #10
Takashi Toyoshima
LGTM to fix the current issue, but can you keep a bug opened for investigating ...
5 years, 7 months ago (2015-05-08 12:06:23 UTC) #11
Andrew Hayden (chromium.org)
On 2015/05/08 12:06:23, Takashi Toyoshima (chromium) wrote: > LGTM to fix the current issue, but ...
5 years, 7 months ago (2015-05-08 12:15:00 UTC) #12
commit-bot: I haz the power
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1125403004/80001
5 years, 7 months ago (2015-05-08 12:15:24 UTC) #15
commit-bot: I haz the power
Committed patchset #5 (id:80001)
5 years, 7 months ago (2015-05-08 12:48:05 UTC) #16
commit-bot: I haz the power
5 years, 7 months ago (2015-05-08 12:49:07 UTC) #17
Message was sent while issue was closed.
Patchset 5 (id:??) landed as
https://crrev.com/90bb2b934366c595c1c979e0b2363f0a822e1b92
Cr-Commit-Position: refs/heads/master@{#328947}

Powered by Google App Engine
This is Rietveld 408576698