|
|
Created:
8 years, 7 months ago by Alexei Svitkine (slow) Modified:
8 years, 7 months ago CC:
chromium-reviews Base URL:
svn://svn.chromium.org/chrome/trunk/src/ Visibility:
Public. |
DescriptionDetect missing glyphs as returned by Vista in RenderTextWin.
Vista's version of Uniscribe (or possibly of the Meiryo font)
doesn't return glyphs with value wgDefault for characters
that cannot be displayed by that font. Instead, they have
value wgBlank with the fZeroWidth attribute bit set.
This change detects that and determines if there were
missing glyphs based on whether the character(s) that
correspond to the glyph are whitespace characters.
BUG=125629, 105550
TEST=See http://crbug.com/125629
Committed: http://src.chromium.org/viewvc/chrome?view=rev&revision=135206
Patch Set 1 #Patch Set 2 : #
Total comments: 3
Patch Set 3 : #
Total comments: 4
Patch Set 4 : #Messages
Total messages: 20 (0 generated)
Unfortunately, it looks like this isn't enough. There is a false positive with the following string: أي‌بي‌إم (Making an html file with that string in it and right-clicking on it to get the string to show up in the context menu triggers this.)
On 2012/05/02 16:57:51, Alexei Svitkine wrote: > Unfortunately, it looks like this isn't enough. There is a false positive with > the following string: > > أي‌بي‌إم > > (Making an html file with that string in it and right-clicking on it to get the > string to show up in the context menu triggers this.) Perhaps we can configure Uniscribe to be more aggressive with glyphs that aren't found somehow?
I've been looking, but haven't found anything promising yet. > Perhaps we can configure Uniscribe to be more aggressive with glyphs that > aren't > found somehow? > > http://codereview.chromium.org/10315007/
On 2012/05/02 18:43:31, Alexei Svitkine wrote: > I've been looking, but haven't found anything promising yet. > > > Perhaps we can configure Uniscribe to be more aggressive with glyphs that > > aren't > > found somehow? > > > > http://codereview.chromium.org/10315007/ Eh, nothing in SCRIPT_CONTROL or SCRIPT_STATE look related except maybe: http://msdn.microsoft.com/en-us/library/windows/desktop/dd374043(v=vs.85).aspx fDisplayZWG - Value indicating if nondisplayable control characters are shaped as representational glyphs for languages that need reordering or different glyph shapes, depending on the positions of the characters within a word. Possible values are defined in the following table. Typically, the characters are not displayed. They are shaped to the blank glyph and given a width of 0. Can you infer anything from other SCRIPT_FONTPROPERTIES members? (wgInvalid, wgKashida, iKashidaWidth)? How about something from the SCRIPT_PROPERTIES (fInvalidGlyph, fInvalidLogAttr, fRejectInvalid)? Or SCRIPT_LOGATTR's fInvalid? Otherwise, I'd looks at what Webkit does in these cases, and how they handle font fallback/substitution, etc.
On 2012/05/02 19:02:47, msw wrote: > On 2012/05/02 18:43:31, Alexei Svitkine wrote: > > I've been looking, but haven't found anything promising yet. > > > > > Perhaps we can configure Uniscribe to be more aggressive with glyphs that > > > aren't > > > found somehow? > > > > > > http://codereview.chromium.org/10315007/ > > Eh, nothing in SCRIPT_CONTROL or SCRIPT_STATE look related except maybe: > http://msdn.microsoft.com/en-us/library/windows/desktop/dd374043%28v=vs.85%29... > fDisplayZWG - Value indicating if nondisplayable control characters are shaped > as representational glyphs for languages that need reordering or different glyph > shapes, depending on the positions of the characters within a word. Possible > values are defined in the following table. Typically, the characters are not > displayed. They are shaped to the blank glyph and given a width of 0. > > Can you infer anything from other SCRIPT_FONTPROPERTIES members? (wgInvalid, > wgKashida, iKashidaWidth)? How about something from the SCRIPT_PROPERTIES > (fInvalidGlyph, fInvalidLogAttr, fRejectInvalid)? Or SCRIPT_LOGATTR's fInvalid? > > Otherwise, I'd looks at what Webkit does in these cases, and how they handle > font fallback/substitution, etc. I'm prototyping the solution jshin@ and I discussed - i.e. detecting if the wgBreak glyph came from whitespace or not. We can do it only for Vista and fZeroWidth wgBreak glyphs, which should minimize the performance hit of it. Let me see if I can get something that works.
Please take another look. I've implemented the approach I discussed with jshin@. It fixes both the original problem and correctly handles the false positive I identified with my original implementation.
Do we know how Webkit (or any other Uniscribe consumers) handle these cases? http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc File ui/gfx/render_text_win.cc (right): http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc#... ui/gfx/render_text_win.cc:657: // On Vista, when trying to draw Simplified Chinese via font linking, Perhaps generalize this comment if there may be other plaform/font triggers. Consider mentioning the bug. http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc#... ui/gfx/render_text_win.cc:665: DCHECK_NE(0U, range.length()); Use DCHECK_GT? Also DCHECK_LT/LE(range.end(), run->range.end()))?
On 2012/05/02 22:19:09, msw wrote: > Do we know how Webkit (or any other Uniscribe consumers) handle these cases? Perhaps Jungshik can elaborate on this more, but my understanding is that there are two Windows implementation of font fallback - the one used by Chromium and the one used by Safari. The one used by Chromium does not use font linking, Uniscribe, etc at all - it uses various hard-coded lists of fonts to handle different ranges of characters. So it doesn't have this problem, presumably. The one used by Safari does use font linking, in addition to Uniscribe (similar to what we do in RenderText), but in preference to both of the above, it also uses IMLangFontLink2 interface - which is basically, from my understanding, embedding IE's text rendering engine that can combine different fonts together to create a virtual font that combines them. Since I didn't see that code special case this Uniscribe return value, presumably it may get the correct result through. When I implemented font fallback in RenderText I consciously decided to avoid using IMLangFontLink2 because pulling in IE's text rendering stuff into Chrome sounds terrible. > > http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc > File ui/gfx/render_text_win.cc (right): > > http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc#... > ui/gfx/render_text_win.cc:657: // On Vista, when trying to draw Simplified > Chinese via font linking, > Perhaps generalize this comment if there may be other plaform/font triggers. > Consider mentioning the bug. > > http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc#... > ui/gfx/render_text_win.cc:665: DCHECK_NE(0U, range.length()); > Use DCHECK_GT? Also DCHECK_LT/LE(range.end(), run->range.end()))?
On 2012/05/03 03:56:34, Alexei Svitkine wrote: > On 2012/05/02 22:19:09, msw wrote: > > Do we know how Webkit (or any other Uniscribe consumers) handle these cases? > > Perhaps Jungshik can elaborate on this more, but my understanding is that there > are two Windows implementation of font fallback - the one used by Chromium and > the one used by Safari. > > The one used by Chromium does not use font linking, Uniscribe, etc at all - it > uses various hard-coded lists of fonts to handle different ranges of characters. > So it doesn't have this problem, presumably. > > The one used by Safari does use font linking, in addition to Uniscribe (similar > to what we do in RenderText), but in preference to both of the above, it also > uses IMLangFontLink2 interface - which is basically, from my understanding, > embedding IE's text rendering engine that can combine different fonts together > to create a virtual font that combines them. Since I didn't see that code > special case this Uniscribe return value, presumably it may get the correct > result through. Sorry, that should have said "presumably it may get the correct result through IMLangFontLink2. "
Okay, just ensuring that we're not missing any obviously better approach; I hope we're not signing up to handle tons of weird font behaviors like this (mostly for your sanity). Consider my inline comments, and I think this'll be fine.
Actually, some code in UniscribeHelper.cpp seems relevant: http://code.google.com/searchframe#OAMlx_jo-ck/src/third_party/WebKit/Source/... It looks like what that code does is to iterate chars (instead of glyphs). It skips all zero-width chars and for each char, gets the glyph and check if: if (glyph == properties->wgDefault || (glyph == properties->wgInvalid && glyph != properties->wgBlank)) return true; I don't see how that code would handle this case. I get the part where it skips zero-width chars. But it's check for invalid glyphs (above) wouldn't catch this case. In this case, glyph != wgDefault (so first part wouldn't match), and glyph == wgBlank (so second part wouldn't match). So from that code, I don't think it would handle this case correctly. But perhaps higher level code (that picks which fonts to try), never ends up picking Meiryo for text where this problem would occur (I don't quite understand the architecture of how the different WebKit classes interact).
> perhaps higher level code (that picks which fonts to try), never ends up picking > Meiryo for text where this problem would occur (I don't quite understand the > architecture of how the different WebKit classes interact). That's possible, and yeah I remember the Webkit Uniscribe code being a little convoluted/disorganized. Does it make more sense to ensure the non-whitespace run chars don't end up wgBlank [and/or fZeroWidth] (instead of the current, opposite pattern)?
On 2012/05/03 04:26:26, msw wrote: > > perhaps higher level code (that picks which fonts to try), never ends up > picking > > Meiryo for text where this problem would occur (I don't quite understand the > > architecture of how the different WebKit classes interact). > > That's possible, and yeah I remember the Webkit Uniscribe code being a little > convoluted/disorganized. Does it make more sense to ensure the non-whitespace > run chars don't end up wgBlank [and/or fZeroWidth] (instead of the current, > opposite pattern)? It would be more efficient, potentially. In the pattern implemented by my patch, when we see a wgBlank and fZeroWidth glyph, we have to iterate log clusters, making it potentially an O(n^2) loop. However, that complexity only happens if we see wgBlank and fZeroWidth glyph - which I believe is quite rare in practice (except in the case of this Vista bug). I guess the other question is, if we iterate chars and not glyphs, is there a possibility of missing some glyphs? Could there be glyphs in the output that do not correspond to any chars (i.e. where no logical_clusters[ch] has that glyph index?). It seems unlikely, but I don't have the expertise to say for sure. Although, actually that would break the assumptions in this CL too. Perhaps Jungshik could offer some insight here. Looking at the UniscribeHelper.cpp history, it seems he actually touched UniscribeHelper::containsMissingGlyphs() last... My intuition is that I think it makes sense to rewrite it to iterate chars and not glyphs - since it seems like it does the job for WebKit and avoids the O(n^2) complexity. So I'll give it a shot tomorrow.
CL updated. Please take another look. http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc File ui/gfx/render_text_win.cc (right): http://codereview.chromium.org/10315007/diff/10002/ui/gfx/render_text_win.cc#... ui/gfx/render_text_win.cc:657: // On Vista, when trying to draw Simplified Chinese via font linking, On 2012/05/02 22:19:09, msw wrote: > Perhaps generalize this comment if there may be other plaform/font triggers. > Consider mentioning the bug. Done.
This is much more straightforward, thanks for taking the extra time. LGTM. http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.cc File ui/gfx/render_text_win.cc (right): http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.cc#... ui/gfx/render_text_win.cc:625: bool glyphs_missing = false; nit: move this down to usage; optionally nix and just do @ line 649: // Skip font substitution if there are no missing glyphs. // If |hr| is S_OK, there could still .... if (hr != USP_E_SCRIPT_NOT_IN_FONT && !HasMissingGlyphs(run)) break; http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.h File ui/gfx/render_text_win.h (right): http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.h#n... ui/gfx/render_text_win.h:102: bool HasMissingGlyphs(internal::TextRun* run) const; nit: make run const? thx for fixing First/LastSel... below.
http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.cc File ui/gfx/render_text_win.cc (right): http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.cc#... ui/gfx/render_text_win.cc:625: bool glyphs_missing = false; On 2012/05/03 17:33:34, msw wrote: > nit: move this down to usage; Done. http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.h File ui/gfx/render_text_win.h (right): http://codereview.chromium.org/10315007/diff/25002/ui/gfx/render_text_win.h#n... ui/gfx/render_text_win.h:102: bool HasMissingGlyphs(internal::TextRun* run) const; On 2012/05/03 17:33:34, msw wrote: > nit: make run const? thx for fixing First/LastSel... below. Can't unfortunately, because ScriptGetFontProperties() takes &run->script_cache as a non-const parameter.
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/asvitkine@chromium.org/10315007/21005
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/asvitkine@chromium.org/10315007/2005
Change committed as 135206 |