|
|
DescriptionAndroid URL formatting: Disable RTL URLs for Android 4.2 and below.
Android 4.2 and below has a problematic RTL implementation. Force
formatted URLs to display in LTR.
BUG=709417
Review-Url: https://codereview.chromium.org/2839843002
Cr-Commit-Position: refs/heads/master@{#467740}
Committed: https://chromium.googlesource.com/chromium/src/+/e1fa8fc5972327ce78225c7de95153c63d3e3bb3
Patch Set 1 #
Total comments: 4
Patch Set 2 : fix #
Total comments: 4
Patch Set 3 : make constant #Messages
Total messages: 25 (15 generated)
The CQ bit was checked by tommycli@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
tommycli@chromium.org changed reviewers: + mgiuca@chromium.org
mgiuca: PTAL, This CL contains a possible fix, but it's probably not quite right. I made some notes below asking some questions about Bidi control characters in unicode. https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... File components/url_formatter/url_formatter.cc (right): https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... components/url_formatter/url_formatter.cc:837: url_string = base::i18n::GetDisplayStringInLTRDirectionality(url_string); Hey, I'm not sure this is the correct thing to do. The other thing I investigated was modifying escape.cc as you suggested. I looked into modifying HasThreeByteBidiControlCharAtIndex. However, it didn't seem like an exact fit. The other thing I was considering was using base::i18n::StringContainsStrongRTLChars within escape.cc, but I wasn't sure that all RTL characters were three bytes (are they?) At that point I decided to throw something up and seek some feedback before proceeding further... thanks!
https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... File components/url_formatter/url_formatter.cc (right): https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... components/url_formatter/url_formatter.cc:837: url_string = base::i18n::GetDisplayStringInLTRDirectionality(url_string); On 2017/04/25 00:06:21, tommycli wrote: > Hey, I'm not sure this is the correct thing to do. > > The other thing I investigated was modifying escape.cc as you suggested. I > looked into modifying HasThreeByteBidiControlCharAtIndex. However, it didn't > seem like an exact fit. > > The other thing I was considering was using > base::i18n::StringContainsStrongRTLChars within escape.cc, but I wasn't sure > that all RTL characters were three bytes (are they?) It's probably the case that they are all three bytes, but you shouldn't make that assumption -- new characters are added all the time. Yeah escape.cc is annoyingly structured around the number of bytes representing a character, which invites bugs and is an irrelevant implementation detail when deciding what characters to use. > At that point I decided to throw something up and seek some feedback before > proceeding further... thanks! I don't think this will work (have you tested on Android 4.2?) This just wraps the string in LRE...PDF; we already do an equivalent thing and it *should* work, but relies on the renderer to behave correctly. If RTL is broken on Android 4.2 as is claimed, then we can't expect this to work.
https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... File components/url_formatter/url_formatter.cc (right): https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... components/url_formatter/url_formatter.cc:837: url_string = base::i18n::GetDisplayStringInLTRDirectionality(url_string); On 2017/04/26 00:01:05, Matt Giuca wrote: > On 2017/04/25 00:06:21, tommycli wrote: > > Hey, I'm not sure this is the correct thing to do. > > > > The other thing I investigated was modifying escape.cc as you suggested. I > > looked into modifying HasThreeByteBidiControlCharAtIndex. However, it didn't > > seem like an exact fit. > > > > The other thing I was considering was using > > base::i18n::StringContainsStrongRTLChars within escape.cc, but I wasn't sure > > that all RTL characters were three bytes (are they?) > > It's probably the case that they are all three bytes, but you shouldn't make > that assumption -- new characters are added all the time. > > Yeah escape.cc is annoyingly structured around the number of bytes representing > a character, which invites bugs and is an irrelevant implementation detail when > deciding what characters to use. > > > At that point I decided to throw something up and seek some feedback before > > proceeding further... thanks! > > I don't think this will work (have you tested on Android 4.2?) This just wraps > the string in LRE...PDF; we already do an equivalent thing and it *should* work, > but relies on the renderer to behave correctly. If RTL is broken on Android 4.2 > as is claimed, then we can't expect this to work. After we had a chat: It looks like this *is* the correct way to fix it. Currently, we fix this by using the ApiCompatibilityUtils.setTextDirection() API on Android, which accomplishes the same thing but doesn't work before 4.2. So it's correct to insert those characters instead. Perhaps in the future we can remove the use of setTextDirection and just go with this for all versions. But for now let's fix the bug and that can be a refactor. As discussed, my concern is that this changes the text in the Omnibox that is user-editable. If the user can copy the string and it includes the bidi mark characters, that is bad, and we need to find a way to insert those characters later in the pipeline (after the editable text field is updated, before rendering). The fact that this is the same function that removes "http://" indicates that it is indeed changing the user-editable text string (not just the display string).
Patchset #2 (id:20001) has been deleted
The CQ bit was checked by tommycli@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
matt: PTAL again, thanks! https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... File components/url_formatter/url_formatter.cc (right): https://codereview.chromium.org/2839843002/diff/1/components/url_formatter/ur... components/url_formatter/url_formatter.cc:837: url_string = base::i18n::GetDisplayStringInLTRDirectionality(url_string); On 2017/04/26 05:57:09, Matt Giuca wrote: > On 2017/04/26 00:01:05, Matt Giuca wrote: > > On 2017/04/25 00:06:21, tommycli wrote: > > > Hey, I'm not sure this is the correct thing to do. > > > > > > The other thing I investigated was modifying escape.cc as you suggested. I > > > looked into modifying HasThreeByteBidiControlCharAtIndex. However, it didn't > > > seem like an exact fit. > > > > > > The other thing I was considering was using > > > base::i18n::StringContainsStrongRTLChars within escape.cc, but I wasn't sure > > > that all RTL characters were three bytes (are they?) > > > > It's probably the case that they are all three bytes, but you shouldn't make > > that assumption -- new characters are added all the time. > > > > Yeah escape.cc is annoyingly structured around the number of bytes > representing > > a character, which invites bugs and is an irrelevant implementation detail > when > > deciding what characters to use. > > > > > At that point I decided to throw something up and seek some feedback before > > > proceeding further... thanks! > > > > I don't think this will work (have you tested on Android 4.2?) This just wraps > > the string in LRE...PDF; we already do an equivalent thing and it *should* > work, > > but relies on the renderer to behave correctly. If RTL is broken on Android > 4.2 > > as is claimed, then we can't expect this to work. > > After we had a chat: > > It looks like this *is* the correct way to fix it. Currently, we fix this by > using the ApiCompatibilityUtils.setTextDirection() API on Android, which > accomplishes the same thing but doesn't work before 4.2. So it's correct to > insert those characters instead. > > Perhaps in the future we can remove the use of setTextDirection and just go with > this for all versions. But for now let's fix the bug and that can be a refactor. > > As discussed, my concern is that this changes the text in the Omnibox that is > user-editable. If the user can copy the string and it includes the bidi mark > characters, that is bad, and we need to find a way to insert those characters > later in the pipeline (after the editable text field is updated, before > rendering). > > The fact that this is the same function that removes "http://" indicates that it > is indeed changing the user-editable text string (not just the display string). You're right, this does mess up the copy and pasted content. I have a new patch... https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... File chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java (right): https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java:766: formattedUrl = '\u200E' + formattedUrl; This I have confirmed does not negatively affect the copy and paste. This is because mOriginalUrlLocation is used for copy and paste, not the formatted URL. Also... it prevents android specific code from creeping into the platform neutral C++. One negative... there's no Android constants for '\u200E', so I just had to jam in in there... Overall I think this CL is better though. It also fixes alignment. I added a screenshot to the bug. Please take a look... i'm not 100% sure that the LRM character is the right one to use, but https://www.ietf.org/rfc/rfc3987.txt hints that it is...
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... File chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java (right): https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java:766: formattedUrl = '\u200E' + formattedUrl; On 2017/04/27 01:36:33, tommycli wrote: > This I have confirmed does not negatively affect the copy and paste. This is > because mOriginalUrlLocation is used for copy and paste, not the formatted URL. > > Also... it prevents android specific code from creeping into the platform > neutral C++. Yeah this looks like exactly the right spot. > One negative... there's no Android constants for '\u200E', so I just had to jam > in in there... Overall I think this CL is better though. Yeah. I would prefer a constant declaration even if it's just at the top of this class, so it's clear that '\u200e' is the LRM. > It also fixes alignment. I added a screenshot to the bug. Good! > Please take a look... i'm not 100% sure that the LRM character is the right one > to use, but https://www.ietf.org/rfc/rfc3987.txt hints that it is... That document says it should be "the same way as they would be if they were in a left-to-right embedding", so explicitly that's U+202A at the start and U+202C at the end. It later says that a LRM "may be sufficient". I think these are equivalent as long as it's applied to the whole paragraph. (i.e., LRE..PDF around an entire para is equivalent to LRM at the start of a para). The only difference would be if the string was concatenated to other strings; then it would be important to use LRE..PDF. I think it's fine either way. I think this is sufficiently clean that we make it the universal rule on Android (remove the setParagraphDirection API call and always use this). This can be done in a follow-up CL.
lgtm, with one nit to move the Unicode character into a named constant.
tedchoc@chromium.org changed reviewers: + tedchoc@chromium.org
drive by owners lgtm https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... File chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java (right): https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java:766: formattedUrl = '\u200E' + formattedUrl; On 2017/04/27 03:33:07, Matt Giuca wrote: > On 2017/04/27 01:36:33, tommycli wrote: > > This I have confirmed does not negatively affect the copy and paste. This is > > because mOriginalUrlLocation is used for copy and paste, not the formatted > URL. > > > > Also... it prevents android specific code from creeping into the platform > > neutral C++. > > Yeah this looks like exactly the right spot. > > > One negative... there's no Android constants for '\u200E', so I just had to > jam > > in in there... Overall I think this CL is better though. > > Yeah. I would prefer a constant declaration even if it's just at the top of this > class, so it's clear that '\u200e' is the LRM. Agreed...I didn't even know what this was before reading this patch. I'd be tempted to even include the ietf link you did (or mention the RFC by number) as I don't think many future code readers would understand the importance of the > > > It also fixes alignment. I added a screenshot to the bug. > > Good! > > > Please take a look... i'm not 100% sure that the LRM character is the right > one > > to use, but https://www.ietf.org/rfc/rfc3987.txt hints that it is... > > That document says it should be "the same way as they would be if they were in a > left-to-right embedding", so explicitly that's U+202A at the start and U+202C at > the end. It later says that a LRM "may be sufficient". > > I think these are equivalent as long as it's applied to the whole paragraph. > (i.e., LRE..PDF around an entire para is equivalent to LRM at the start of a > para). The only difference would be if the string was concatenated to other > strings; then it would be important to use LRE..PDF. I think it's fine either > way. > > I think this is sufficiently clean that we make it the universal rule on Android > (remove the setParagraphDirection API call and always use this). This can be > done in a follow-up CL.
The CQ bit was checked by tommycli@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from tedchoc@chromium.org, mgiuca@chromium.org Link to the patchset: https://codereview.chromium.org/2839843002/#ps60001 (title: "make constant")
CQ is trying da patch. Follow status at: https://chromium-cq-status.appspot.com/v2/patch-status/codereview.chromium.or...
https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... File chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java (right): https://codereview.chromium.org/2839843002/diff/40001/chrome/android/java/src... chrome/android/java/src/org/chromium/chrome/browser/omnibox/UrlBar.java:766: formattedUrl = '\u200E' + formattedUrl; On 2017/04/27 03:33:07, Matt Giuca wrote: > On 2017/04/27 01:36:33, tommycli wrote: > > This I have confirmed does not negatively affect the copy and paste. This is > > because mOriginalUrlLocation is used for copy and paste, not the formatted > URL. > > > > Also... it prevents android specific code from creeping into the platform > > neutral C++. > > Yeah this looks like exactly the right spot. > > > One negative... there's no Android constants for '\u200E', so I just had to > jam > > in in there... Overall I think this CL is better though. > > Yeah. I would prefer a constant declaration even if it's just at the top of this > class, so it's clear that '\u200e' is the LRM. > > > It also fixes alignment. I added a screenshot to the bug. > > Good! > > > Please take a look... i'm not 100% sure that the LRM character is the right > one > > to use, but https://www.ietf.org/rfc/rfc3987.txt hints that it is... > > That document says it should be "the same way as they would be if they were in a > left-to-right embedding", so explicitly that's U+202A at the start and U+202C at > the end. It later says that a LRM "may be sufficient". > > I think these are equivalent as long as it's applied to the whole paragraph. > (i.e., LRE..PDF around an entire para is equivalent to LRM at the start of a > para). The only difference would be if the string was concatenated to other > strings; then it would be important to use LRE..PDF. I think it's fine either > way. > > I think this is sufficiently clean that we make it the universal rule on Android > (remove the setParagraphDirection API call and always use this). This can be > done in a follow-up CL. Done. Great I made a constant at the top of the file, as well as included the ietf link in the comment. This is fairly clean -- but I'm not sure it's a complete replacement for higher API levels. The higher API levels do tricky stuff where they restore the text input direction to RTL when the input is focused for editing. This patch doesn't support that, and leaves the text LTR even during editing.
CQ is committing da patch. Bot data: {"patchset_id": 60001, "attempt_start_ts": 1493313699790010, "parent_rev": "8fdab62089307f5fe2307335dd1c71c4c6661867", "commit_rev": "e1fa8fc5972327ce78225c7de95153c63d3e3bb3"}
Message was sent while issue was closed.
Description was changed from ========== Android URL formatting: Disable RTL URLs for Android 4.2 and below. Android 4.2 and below has a problematic RTL implementation. Force formatted URLs to display in LTR. BUG=709417 ========== to ========== Android URL formatting: Disable RTL URLs for Android 4.2 and below. Android 4.2 and below has a problematic RTL implementation. Force formatted URLs to display in LTR. BUG=709417 Review-Url: https://codereview.chromium.org/2839843002 Cr-Commit-Position: refs/heads/master@{#467740} Committed: https://chromium.googlesource.com/chromium/src/+/e1fa8fc5972327ce78225c7de951... ==========
Message was sent while issue was closed.
Committed patchset #3 (id:60001) as https://chromium.googlesource.com/chromium/src/+/e1fa8fc5972327ce78225c7de951... |