OLD | NEW |
1 // Copyright (c) 2012, the Dart project authors. Please see the AUTHORS file | 1 // Copyright (c) 2012, the Dart project authors. Please see the AUTHORS file |
2 // for details. All rights reserved. Use of this source code is governed by a | 2 // for details. All rights reserved. Use of this source code is governed by a |
3 // BSD-style license that can be found in the LICENSE file. | 3 // BSD-style license that can be found in the LICENSE file. |
4 | 4 |
5 part of intl; | 5 part of intl; |
6 | 6 |
7 /** | 7 /// Bidi stands for Bi-directional text. According to |
8 * Bidi stands for Bi-directional text. | 8 /// [Wikipedia](http://en.wikipedia.org/wiki/Bi-directional_text): |
9 * According to [Wikipedia](http://en.wikipedia.org/wiki/Bi-directional_text): | 9 /// Bi-directional text is text containing text in both text directionalities, |
10 * Bi-directional text is text containing text in both text directionalities, | 10 /// both right-to-left (RTL) and left-to-right (LTR). It generally involves text |
11 * both right-to-left (RTL) and left-to-right (LTR). It generally involves text | 11 /// containing different types of alphabets, but may also refer to |
12 * containing different types of alphabets, but may also refer to boustrophedon, | 12 /// boustrophedon, which is changing text directionality in each row. |
13 * which is changing text directionality in each row. | 13 /// |
14 * | 14 /// Utility class for formatting display text in a potentially |
15 * Utility class for formatting display text in a potentially | 15 /// opposite-directionality context without garbling layout issues. Mostly a |
16 * opposite-directionality context without garbling layout issues. | 16 /// very "slimmed-down" and dart-ified port of the Closure Bidirectional |
17 * Mostly a very "slimmed-down" and dart-ified port of the Closure Bidirectional | 17 /// formatting library. If there is a utility in the Closure library (or ICU, or |
18 * formatting library. If there is a utility in the Closure library (or ICU, or | 18 /// elsewhere) that you would like this formatter to make available, please |
19 * elsewhere) that you would like this formatter to make available, please | 19 /// contact the Dart team. |
20 * contact the Dart team. | 20 /// |
21 * | 21 /// Provides the following functionality: |
22 * Provides the following functionality: | 22 /// |
23 * | 23 /// 1. *BiDi Wrapping* |
24 * 1. *BiDi Wrapping* | 24 /// When text in one language is mixed into a document in another, opposite- |
25 * When text in one language is mixed into a document in another, opposite- | 25 /// directionality language, e.g. when an English business name is embedded in a |
26 * directionality language, e.g. when an English business name is embedded in a | 26 /// Hebrew web page, both the inserted string and the text following it may be |
27 * Hebrew web page, both the inserted string and the text following it may be | 27 /// displayed incorrectly unless the inserted string is explicitly separated |
28 * displayed incorrectly unless the inserted string is explicitly separated | 28 /// from the surrounding text in a "wrapper" that declares its directionality at |
29 * from the surrounding text in a "wrapper" that declares its directionality at | 29 /// the start and then resets it back at the end. This wrapping can be done in |
30 * the start and then resets it back at the end. This wrapping can be done in | 30 /// HTML mark-up (e.g. a 'span dir=rtl' tag) or - only in contexts where mark-up |
31 * HTML mark-up (e.g. a 'span dir=rtl' tag) or - only in contexts where mark-up | 31 /// can not be used - in Unicode BiDi formatting codes (LRE|RLE and PDF). |
32 * can not be used - in Unicode BiDi formatting codes (LRE|RLE and PDF). | 32 /// Providing such wrapping services is the basic purpose of the BiDi formatter. |
33 * Providing such wrapping services is the basic purpose of the BiDi formatter. | 33 /// |
34 * | 34 /// 2. *Directionality estimation* |
35 * 2. *Directionality estimation* | 35 /// How does one know whether a string about to be inserted into surrounding |
36 * How does one know whether a string about to be inserted into surrounding | 36 /// text has the same directionality? Well, in many cases, one knows that this |
37 * text has the same directionality? Well, in many cases, one knows that this | 37 /// must be the case when writing the code doing the insertion, e.g. when a |
38 * must be the case when writing the code doing the insertion, e.g. when a | 38 /// localized message is inserted into a localized page. In such cases there is |
39 * localized message is inserted into a localized page. In such cases there is | 39 /// no need to involve the BiDi formatter at all. In the remaining cases, e.g. |
40 * no need to involve the BiDi formatter at all. In the remaining cases, e.g. | 40 /// when the string is user-entered or comes from a database, the language of |
41 * when the string is user-entered or comes from a database, the language of | 41 /// the string (and thus its directionality) is not known a priori, and must be |
42 * the string (and thus its directionality) is not known a priori, and must be | 42 /// estimated at run-time. The BiDi formatter does this automatically. |
43 * estimated at run-time. The BiDi formatter does this automatically. | 43 /// |
44 * | 44 /// 3. *Escaping* |
45 * 3. *Escaping* | 45 /// When wrapping plain text - i.e. text that is not already HTML or HTML- |
46 * When wrapping plain text - i.e. text that is not already HTML or HTML- | 46 /// escaped - in HTML mark-up, the text must first be HTML-escaped to prevent |
47 * escaped - in HTML mark-up, the text must first be HTML-escaped to prevent XSS | 47 /// XSS attacks and other nasty business. This of course is always true, but the |
48 * attacks and other nasty business. This of course is always true, but the | 48 /// escaping cannot be done after the string has already been wrapped in |
49 * escaping cannot be done after the string has already been wrapped in | 49 /// mark-up, so the BiDi formatter also serves as a last chance and includes |
50 * mark-up, so the BiDi formatter also serves as a last chance and includes | 50 /// escaping services. |
51 * escaping services. | 51 /// |
52 * | 52 /// Thus, in a single call, the formatter will escape the input string as |
53 * Thus, in a single call, the formatter will escape the input string as | 53 /// specified, determine its directionality, and wrap it as necessary. It is |
54 * specified, determine its directionality, and wrap it as necessary. It is | 54 /// then up to the caller to insert the return value in the output. |
55 * then up to the caller to insert the return value in the output. | |
56 */ | |
57 | 55 |
58 class BidiFormatter { | 56 class BidiFormatter { |
59 | 57 |
60 /** The direction of the surrounding text (the context). */ | 58 /// The direction of the surrounding text (the context). |
61 TextDirection contextDirection; | 59 TextDirection contextDirection; |
62 | 60 |
63 /** | 61 /// Indicates if we should always wrap the formatted text in a <span>. |
64 * Indicates if we should always wrap the formatted text in a <span>. | |
65 */ | |
66 bool _alwaysSpan; | 62 bool _alwaysSpan; |
67 | 63 |
68 /** | 64 /// Create a formatting object with a direction. If [alwaysSpan] is true we |
69 * Create a formatting object with a direction. If [alwaysSpan] is true we | 65 /// should always use a `span` tag, even when the input directionality is |
70 * should always use a `span` tag, even when the input directionality is | 66 /// neutral or matches the context, so that the DOM structure of the output |
71 * neutral or matches the context, so that the DOM structure of the output | 67 /// does not depend on the combination of directionalities. |
72 * does not depend on the combination of directionalities. | |
73 */ | |
74 BidiFormatter.LTR([alwaysSpan = false]) | 68 BidiFormatter.LTR([alwaysSpan = false]) |
75 : contextDirection = TextDirection.LTR, | 69 : contextDirection = TextDirection.LTR, |
76 _alwaysSpan = alwaysSpan; | 70 _alwaysSpan = alwaysSpan; |
77 BidiFormatter.RTL([alwaysSpan = false]) | 71 BidiFormatter.RTL([alwaysSpan = false]) |
78 : contextDirection = TextDirection.RTL, | 72 : contextDirection = TextDirection.RTL, |
79 _alwaysSpan = alwaysSpan; | 73 _alwaysSpan = alwaysSpan; |
80 BidiFormatter.UNKNOWN([alwaysSpan = false]) | 74 BidiFormatter.UNKNOWN([alwaysSpan = false]) |
81 : contextDirection = TextDirection.UNKNOWN, | 75 : contextDirection = TextDirection.UNKNOWN, |
82 _alwaysSpan = alwaysSpan; | 76 _alwaysSpan = alwaysSpan; |
83 | 77 |
84 /** Is true if the known context direction for this formatter is RTL. */ | 78 /// Is true if the known context direction for this formatter is RTL. |
85 bool get isRTL => contextDirection == TextDirection.RTL; | 79 bool get isRTL => contextDirection == TextDirection.RTL; |
86 | 80 |
87 /** | 81 /// Formats a string of a given (or estimated, if not provided) [direction] |
88 * Formats a string of a given (or estimated, if not provided) | 82 /// for use in HTML output of the context directionality, so an |
89 * [direction] for use in HTML output of the context directionality, so | 83 /// opposite-directionality string is neither garbled nor garbles what follows |
90 * an opposite-directionality string is neither garbled nor garbles what | 84 /// it. |
91 * follows it. | 85 /// |
92 * If the input string's directionality doesn't match the context | 86 /// If the input string's directionality doesn't match the context |
93 * directionality, we wrap it with a `span` tag and add a `dir` attribute | 87 /// directionality, we wrap it with a `span` tag and add a `dir` attribute |
94 * (either "dir=rtl" or "dir=ltr"). | 88 /// (either "dir=rtl" or "dir=ltr"). If alwaysSpan was true when constructing |
95 * If alwaysSpan was true when constructing the formatter, the input is always | 89 /// the formatter, the input is always wrapped with `span` tag, skipping the |
96 * wrapped with `span` tag, skipping the dir attribute when it's not needed. | 90 /// dir attribute when it's not needed. |
97 * | 91 /// |
98 * If [resetDir] is true and the overall directionality or the exit | 92 /// If [resetDir] is true and the overall directionality or the exit |
99 * directionality of [text] is opposite to the context directionality, | 93 /// directionality of [text] is opposite to the context directionality, |
100 * a trailing unicode BiDi mark matching the context directionality is | 94 /// a trailing unicode BiDi mark matching the context directionality is |
101 * appended (LRM or RLM). If [isHtml] is false, we HTML-escape the [text]. | 95 /// appended (LRM or RLM). If [isHtml] is false, we HTML-escape the [text]. |
102 */ | |
103 String wrapWithSpan(String text, | 96 String wrapWithSpan(String text, |
104 {bool isHtml: false, bool resetDir: true, TextDirection direction}) { | 97 {bool isHtml: false, bool resetDir: true, TextDirection direction}) { |
105 if (direction == null) direction = estimateDirection(text, isHtml: isHtml); | 98 if (direction == null) direction = estimateDirection(text, isHtml: isHtml); |
106 var result; | 99 var result; |
107 if (!isHtml) text = HTML_ESCAPE.convert(text); | 100 if (!isHtml) text = HTML_ESCAPE.convert(text); |
108 var directionChange = contextDirection.isDirectionChange(direction); | 101 var directionChange = contextDirection.isDirectionChange(direction); |
109 if (_alwaysSpan || directionChange) { | 102 if (_alwaysSpan || directionChange) { |
110 var spanDirection = ''; | 103 var spanDirection = ''; |
111 if (directionChange) { | 104 if (directionChange) { |
112 spanDirection = ' dir=${direction.spanText}'; | 105 spanDirection = ' dir=${direction.spanText}'; |
113 } | 106 } |
114 result = '<span$spanDirection>$text</span>'; | 107 result = '<span$spanDirection>$text</span>'; |
115 } else { | 108 } else { |
116 result = text; | 109 result = text; |
117 } | 110 } |
118 return result + (resetDir ? _resetDir(text, direction, isHtml) : ''); | 111 return result + (resetDir ? _resetDir(text, direction, isHtml) : ''); |
119 } | 112 } |
120 | 113 |
121 /** | 114 /// Format [text] of a known (if specified) or estimated [direction] for use |
122 * Format [text] of a known (if specified) or estimated [direction] for use | 115 /// in *plain-text* output of the context directionality, so an |
123 * in *plain-text* output of the context directionality, so an | 116 /// opposite-directionality text is neither garbled nor garbles what follows |
124 * opposite-directionality text is neither garbled nor garbles what follows | 117 /// it. Unlike wrapWithSpan, this makes use of unicode BiDi formatting |
125 * it. Unlike wrapWithSpan, this makes use of unicode BiDi formatting | 118 /// characters instead of spans for wrapping. The returned string would be |
126 * characters instead of spans for wrapping. The returned string would be | 119 /// RLE+text+PDF for RTL text, or LRE+text+PDF for LTR text. |
127 * RLE+text+PDF for RTL text, or LRE+text+PDF for LTR text. | 120 /// |
128 * | 121 /// If [resetDir] is true, and if the overall directionality or the exit |
129 * If [resetDir] is true, and if the overall directionality or the exit | 122 /// directionality of text are opposite to the context directionality, |
130 * directionality of text are opposite to the context directionality, | 123 /// a trailing unicode BiDi mark matching the context directionality is |
131 * a trailing unicode BiDi mark matching the context directionality is | 124 /// appended (LRM or RLM). |
132 * appended (LRM or RLM). | 125 /// |
133 * | 126 /// In HTML, the *only* valid use of this function is inside of elements that |
134 * In HTML, the *only* valid use of this function is inside of elements that | 127 /// do not allow markup, e.g. an 'option' tag. |
135 * do not allow markup, e.g. an 'option' tag. | 128 /// This function does *not* do HTML-escaping regardless of the value of |
136 * This function does *not* do HTML-escaping regardless of the value of | 129 /// [isHtml]. [isHtml] is used to designate if the text contains HTML (escaped |
137 * [isHtml]. [isHtml] is used to designate if the text contains HTML (escaped | 130 /// or unescaped). |
138 * or unescaped). | |
139 */ | |
140 String wrapWithUnicode(String text, | 131 String wrapWithUnicode(String text, |
141 {bool isHtml: false, bool resetDir: true, TextDirection direction}) { | 132 {bool isHtml: false, bool resetDir: true, TextDirection direction}) { |
142 if (direction == null) direction = estimateDirection(text, isHtml: isHtml); | 133 if (direction == null) direction = estimateDirection(text, isHtml: isHtml); |
143 var result = text; | 134 var result = text; |
144 if (contextDirection.isDirectionChange(direction)) { | 135 if (contextDirection.isDirectionChange(direction)) { |
145 var marker = direction == TextDirection.RTL ? Bidi.RLE : Bidi.LRE; | 136 var marker = direction == TextDirection.RTL ? Bidi.RLE : Bidi.LRE; |
146 result = "${marker}$text${Bidi.PDF}"; | 137 result = "${marker}$text${Bidi.PDF}"; |
147 } | 138 } |
148 return result + (resetDir ? _resetDir(text, direction, isHtml) : ''); | 139 return result + (resetDir ? _resetDir(text, direction, isHtml) : ''); |
149 } | 140 } |
150 | 141 |
151 /** | 142 /// Estimates the directionality of [text] using the best known |
152 * Estimates the directionality of [text] using the best known | 143 /// general-purpose method (using relative word counts). A |
153 * general-purpose method (using relative word counts). A | 144 /// TextDirection.UNKNOWN return value indicates completely neutral input. |
154 * TextDirection.UNKNOWN return value indicates completely neutral input. | 145 /// [isHtml] is true if [text] is HTML or HTML-escaped. |
155 * [isHtml] is true if [text] is HTML or HTML-escaped. | |
156 */ | |
157 TextDirection estimateDirection(String text, {bool isHtml: false}) { | 146 TextDirection estimateDirection(String text, {bool isHtml: false}) { |
158 return Bidi.estimateDirectionOfText(text, isHtml: isHtml); //TODO~!!! | 147 return Bidi.estimateDirectionOfText(text, isHtml: isHtml); //TODO~!!! |
159 } | 148 } |
160 | 149 |
161 /** | 150 /// Returns a unicode BiDi mark matching the surrounding context's [direction] |
162 * Returns a unicode BiDi mark matching the surrounding context's [direction] | 151 /// (not necessarily the direction of [text]). The function returns an LRM or |
163 * (not necessarily the direction of [text]). The function returns an LRM or | 152 /// RLM if the overall directionality or the exit directionality of [text] is |
164 * RLM if the overall directionality or the exit directionality of [text] is | 153 /// opposite the context directionality. Otherwise |
165 * opposite the context directionality. Otherwise | 154 /// it returns the empty string. [isHtml] is true if [text] is HTML or |
166 * it returns the empty string. [isHtml] is true if [text] is HTML or | 155 /// HTML-escaped. |
167 * HTML-escaped. | |
168 */ | |
169 String _resetDir(String text, TextDirection direction, bool isHtml) { | 156 String _resetDir(String text, TextDirection direction, bool isHtml) { |
170 // endsWithRtl and endsWithLtr are called only if needed (short-circuit). | 157 // endsWithRtl and endsWithLtr are called only if needed (short-circuit). |
171 if ((contextDirection == TextDirection.LTR && | 158 if ((contextDirection == TextDirection.LTR && |
172 (direction == TextDirection.RTL || | 159 (direction == TextDirection.RTL || |
173 Bidi.endsWithRtl(text, isHtml))) || | 160 Bidi.endsWithRtl(text, isHtml))) || |
174 (contextDirection == TextDirection.RTL && | 161 (contextDirection == TextDirection.RTL && |
175 (direction == TextDirection.LTR || | 162 (direction == TextDirection.LTR || |
176 Bidi.endsWithLtr(text, isHtml)))) { | 163 Bidi.endsWithLtr(text, isHtml)))) { |
177 return contextDirection == TextDirection.LTR ? Bidi.LRM : Bidi.RLM; | 164 return contextDirection == TextDirection.LTR ? Bidi.LRM : Bidi.RLM; |
178 } else { | 165 } else { |
179 return ''; | 166 return ''; |
180 } | 167 } |
181 } | 168 } |
182 } | 169 } |
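
The doc comments on `wrapWithSpan` describe three decisions: escape unless `isHtml`, add a `span` when the direction changes or `alwaysSpan` is set, and append a reset mark. The sketch below restates that flow as a standalone function so it can be read without the rest of the library. It is only an illustration under stated assumptions: `Dir`, `isDirectionChange`, and the top-level `wrapWithSpan` are hypothetical stand-ins for the library's `TextDirection` and `BidiFormatter`, the direction is passed in rather than estimated, and the reset mark is chosen from the overall direction only (the real `_resetDir` also inspects how the text ends).

```dart
import 'dart:convert' show htmlEscape;

/// Hypothetical stand-in for the library's TextDirection.
enum Dir { ltr, rtl, unknown }

const lrm = '\u200E'; // LEFT-TO-RIGHT MARK
const rlm = '\u200F'; // RIGHT-TO-LEFT MARK

/// Assumed semantics: a change exists when the other direction is known
/// and differs from the context direction.
bool isDirectionChange(Dir context, Dir other) =>
    other != Dir.unknown && context != other;

/// Simplified restatement of the wrapWithSpan flow shown in the diff.
String wrapWithSpan(String text, Dir context, Dir direction,
    {bool isHtml = false, bool alwaysSpan = false, bool resetDir = true}) {
  // Plain text is escaped before any markup is added around it.
  final body = isHtml ? text : htmlEscape.convert(text);
  final change = isDirectionChange(context, direction);

  var result = body;
  if (alwaysSpan || change) {
    // The dir attribute is emitted only when the direction really changes.
    final attr = direction == Dir.rtl ? 'rtl' : 'ltr';
    final dirAttr = change ? ' dir=$attr' : '';
    result = '<span$dirAttr>$body</span>';
  }

  // Simplification: only the overall direction decides the reset mark.
  var reset = '';
  if (resetDir) {
    if (context == Dir.ltr && direction == Dir.rtl) reset = lrm;
    if (context == Dir.rtl && direction == Dir.ltr) reset = rlm;
  }
  return result + reset;
}
```

With these assumptions, an RTL string in an LTR context is wrapped in a `span` with `dir=rtl` and followed by an LRM, while a same-direction string is only escaped unless `alwaysSpan` is set.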