|
|
Created:
4 years, 11 months ago by Yang Modified:
4 years, 11 months ago Reviewers:
erikcorry, Erik Corry, Dan Ehrenberg, mathias CC:
v8-reviews_googlegroups.com Base URL:
https://chromium.googlesource.com/v8/v8.git@master Target Ref:
refs/pending/heads/master Project:
v8 Visibility:
Public. |
Description[regexp] implement character classes for unicode regexps.
We divide character ranges into
- BMP, matched normally.
- non-BMP, matched as alternatives of surrogate pair ranges.
- lone surrogates, matched with lookaround assertion that its indeed lone.
R=erik.corry@gmail.com
BUG=v8:2952
LOG=N
Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04
Cr-Commit-Position: refs/heads/master@{#33432}
Committed: https://crrev.com/e709aa24c0c17abf684972fbb9e887731b20fd41
Cr-Commit-Position: refs/heads/master@{#33437}
Patch Set 1 #Patch Set 2 : simplify /./u a bit. #Patch Set 3 : use constants #Patch Set 4 : refactorings #Patch Set 5 : windows warning fix #Patch Set 6 : lookaround builder #
Total comments: 18
Patch Set 7 : rebase #Patch Set 8 : addressed comments #
Total comments: 4
Patch Set 9 : addressed comments on the test case. #Patch Set 10 : test cases #Patch Set 11 : rebase #
Total comments: 2
Patch Set 12 : fix #Patch Set 13 : more tests #
Created: 4 years, 11 months ago
Dependent Patchsets: Messages
Total messages: 54 (26 generated)
Description was changed from ========== [regexp] implement character classes for unicode regexps. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally - non-BMP, matched as surrogate pair - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ==========
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally - non-BMP, matched as surrogate pair - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ==========
On 2016/01/13 14:23:24, Yang wrote: ping
The CQ bit was checked by yangguo@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/60001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/60001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_win_nosnap_shared_compile_rel on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_win_nosnap_shared_compil...)
The CQ bit was checked by yangguo@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/80001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/80001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
littledan@chromium.org changed reviewers: + littledan@chromium.org - erik.corry@gmail.com
yangguo@chromium.org changed reviewers: + erik.corry@gmail.com
The CQ bit was checked by yangguo@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/100001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/100001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
erikcorry@google.com changed reviewers: + erikcorry@google.com
https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc File src/ostreams.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc#newcode65 src/ostreams.cc:65: if (c <= 0xffff) return PrintUC16(os, static_cast<uint16_t>(c), pred); kMaxUtf16CodeUnit https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc#newcode66 src/ostreams.cc:66: char buf[12]; Perhaps 13 so that if c.value is somehow a maximal 32 bit value outside the Unicode range we still have space for the final } https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:997: inline bool ignore_case() { return flags_ & JSRegExp::kIgnoreCase; } Do we allow implicit int->bool conversions? https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:4889: non_bmp_(nullptr) { What is going on here? Some explanation needed, I feel. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:4970: result->AddAlternative(GuardedAlternative(TextNode::SurrogatePair( This one we only need if from_t is not kTrailSurrogateStart. Proposal: if (from_t != kTrailSurrogateStart) { result->AddAlternative(... from_l++; } if (to_t != kTrailSurrogateEnd) { result->AddAlternative(... to_l--; } result->AddAlternative(... This replaces the else-if and the else clause. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:5039: ? NegativeLookbehindAndMatch(compiler, trail_surrogates, I don't think so. Even if you are reading backwards, it is the position after the match that needs to be checked. But in the backwards case you want to check it chronologically before you check the actual character (and move the current position). https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:5099: if (ranges->at(i).from() <= String::kMaxUtf16CodeUnit) break; For the one-byte case don't we want to have a lower limit than kMaxUtf16CodeUnit? https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.h File src/regexp/jsregexp.h (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.h#... src/regexp/jsregexp.h:721: static TextNode* CharacterRanges(Zone* zone, ZoneList<CharacterRange>* ranges, These two could use a 1-line comment explaining what they do. They are 'named constructors', so perhaps they should also have the word 'Create' in their names.
Patchset #8 (id:140001) has been deleted
Thanks for the comments. Hopefully addressed them sufficiently in the newest patch set. https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc File src/ostreams.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc#newcode65 src/ostreams.cc:65: if (c <= 0xffff) return PrintUC16(os, static_cast<uint16_t>(c), pred); On 2016/01/20 10:47:05, erikcorry wrote: > kMaxUtf16CodeUnit Done. https://codereview.chromium.org/1578253005/diff/100001/src/ostreams.cc#newcode66 src/ostreams.cc:66: char buf[12]; On 2016/01/20 10:47:05, erikcorry wrote: > Perhaps 13 so that if c.value is somehow a maximal 32 bit value outside the > Unicode range we still have space for the final } Done. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:997: inline bool ignore_case() { return flags_ & JSRegExp::kIgnoreCase; } On 2016/01/20 10:47:05, erikcorry wrote: > Do we allow implicit int->bool conversions? We do this else where, for example in src/compiler/machine-operator.h. But I guess it can't hurt to have it explicit. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:4889: non_bmp_(nullptr) { On 2016/01/20 10:47:05, erikcorry wrote: > What is going on here? Some explanation needed, I feel. Added a wall of comment. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:4970: result->AddAlternative(GuardedAlternative(TextNode::SurrogatePair( On 2016/01/20 10:47:05, erikcorry wrote: > This one we only need if from_t is not kTrailSurrogateStart. > > Proposal: > > if (from_t != kTrailSurrogateStart) { > result->AddAlternative(... > from_l++; > } > if (to_t != kTrailSurrogateEnd) { > result->AddAlternative(... > to_l--; > } > result->AddAlternative(... > > This replaces the else-if and the else clause. Done. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:5039: ? NegativeLookbehindAndMatch(compiler, trail_surrogates, On 2016/01/20 10:47:05, erikcorry wrote: > I don't think so. Even if you are reading backwards, it is the position after > the match that needs to be checked. But in the backwards case you want to check > it chronologically before you check the actual character (and move the current > position). I think it's the naming that's confusing. Inside the function, if the last parameter is true (read_backward), we first perform a negative lookahead for trailing surrogates, and then do a backward match for the lead surrogate. NegativeLookbehindAndMatch is actually to be understood as such in reading forward case. When reading backward, we would actually be doing a negative lookahead. I renamed it to be easier to understand, and elaborated in the comments. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:5099: if (ranges->at(i).from() <= String::kMaxUtf16CodeUnit) break; On 2016/01/20 10:47:05, erikcorry wrote: > For the one-byte case don't we want to have a lower limit than > kMaxUtf16CodeUnit? Actually this is not necessary since we do the limiting again in EmitCharClass, but then properly account for the one-byte case. Removed. https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.h File src/regexp/jsregexp.h (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.h#... src/regexp/jsregexp.h:721: static TextNode* CharacterRanges(Zone* zone, ZoneList<CharacterRange>* ranges, On 2016/01/20 10:47:05, erikcorry wrote: > These two could use a 1-line comment explaining what they do. They are 'named > constructors', so perhaps they should also have the word 'Create' in their > names. Done.
https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... File test/mjsunit/harmony/unicode-character-ranges.js (right): https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:12: function tests(expectation, string, subject) { string -> source or regexp_source https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:93: "\ud801\udc00AB\udc00AB\ud802\ud803AB"); This is saying that if we try to match lone-surrogate-followed by A it should match the \udc00 even though it is not lone?
https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... File test/mjsunit/harmony/unicode-character-ranges.js (right): https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:12: function tests(expectation, string, subject) { On 2016/01/20 13:48:51, erikcorry wrote: > string -> source or regexp_source Done. https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:93: "\ud801\udc00AB\udc00AB\ud802\ud803AB"); On 2016/01/20 13:48:51, erikcorry wrote: > This is saying that if we try to match lone-surrogate-followed by A it should > match the \udc00 even though it is not lone? This is just (a random test) to check that character ranges inside a lookbehind works. I added another one.
On 2016/01/20 14:04:14, Yang wrote: > https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... > File test/mjsunit/harmony/unicode-character-ranges.js (right): > > https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... > test/mjsunit/harmony/unicode-character-ranges.js:12: function tests(expectation, > string, subject) { > On 2016/01/20 13:48:51, erikcorry wrote: > > string -> source or regexp_source > > Done. > > https://codereview.chromium.org/1578253005/diff/160001/test/mjsunit/harmony/u... > test/mjsunit/harmony/unicode-character-ranges.js:93: > "\ud801\udc00AB\udc00AB\ud802\ud803AB"); > On 2016/01/20 13:48:51, erikcorry wrote: > > This is saying that if we try to match lone-surrogate-followed by A it should > > match the \udc00 even though it is not lone? > > This is just (a random test) to check that character ranges inside a lookbehind > works. I added another one. Added your proposed set of tests, and added a few random ones myself.
The CQ bit was checked by yangguo@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/200001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/200001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
LGTM https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:1716: static inline bool EmitAtomLetter(Isolate* isolate, This is not /u-ified. Does it get called in /ui mode?
LGTM https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:1716: static inline bool EmitAtomLetter(Isolate* isolate, This is not /u-ified. Does it get called in /ui mode?
https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/1578253005/diff/100001/src/regexp/jsregexp.cc... src/regexp/jsregexp.cc:1716: static inline bool EmitAtomLetter(Isolate* isolate, On 2016/01/21 10:58:56, erikcorry wrote: > This is not /u-ified. Does it get called in /ui mode? No. This only gets called if the TextNode element kind is an TextElement::ATOM as opposed to TextElement::CHAR_CLASS. However, in all the cases where unicode is relevant (surrogate pair of lone surrogates), we create a RegExpCharacterClass as a standalone term, which then desugars into whatever we need. TextNodes it desugars to are all created via TextNode::CharClass though, EmitCharClass eventually does the job of emitting code, which generates good code for singleton character classes as well. For case independent matching, which is handled by EmitAtomLetter, I have a separate CL prepared.
The CQ bit was checked by yangguo@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from erikcorry@google.com Link to the patchset: https://codereview.chromium.org/1578253005/#ps220001 (title: "rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/220001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/220001
Message was sent while issue was closed.
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ==========
Message was sent while issue was closed.
Committed patchset #11 (id:220001)
Message was sent while issue was closed.
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ==========
Message was sent while issue was closed.
Patchset 11 (id:??) landed as https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432}
Message was sent while issue was closed.
A revert of this CL (patchset #11 id:220001) has been created in https://codereview.chromium.org/1618753002/ by yangguo@chromium.org. The reason for reverting is: Compile failure on arm. https://build.chromium.org/p/client.v8/builders/V8%20Arm%20-%20debug%20builde....
Message was sent while issue was closed.
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ==========
mathias@qiwi.be changed reviewers: + mathias@qiwi.be
Some more functional tests: https://mathias.html5.org/tests/javascript/regexp/ (not exhaustive, just a bunch of examples from https://mathiasbynens.be/notes/es6-unicode-regex) https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... File test/mjsunit/harmony/unicode-character-ranges.js (right): https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:45: execs(null, "[^\u{12340}-\u{12345}]", "\u{12342}"); To test the equivalent regular expression using surrogate pair notation as well: execl(["\u{12342}"], /[\u{12340}-\u{12345}]/u, "\u{12342}"); execs(["\u{12342}"], "[\u{12340}-\u{12345}]", "\u{12342}"); execl(["\u{12342}"], /[^\ud808\udf40-\ud808\udf45]/u, "\u{12342}"); execs(["\u{12342}"], "[^\ud808\udf40-\ud808\udf45]", "\u{12342}"); execl(null, /[^\u{12340}-\u{12345}]/u, "\u{12342}"); execs(null, "[^\u{12340}-\u{12345}]", "\u{12342}"); execl(null, /[^\ud808\udf40-\ud808\udf45]/u, "\u{12342}"); execs(null, "[^\ud808\udf40-\ud808\udf45]", "\u{12342}"); (In `u` mode, surrogate pairs are treated as a single code point.) https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... test/mjsunit/harmony/unicode-character-ranges.js:50: execs(null, "[^\u{ff80}-\u{12345}]", "\u{ffff}"); Same here: execl(["\u{ffff}"], /[\u{ff80}-\u{12345}]/u, "\u{ffff}"); execs(["\u{ffff}"], "[\u{ff80}-\u{12345}]", "\u{ffff}"); execl(["\u{ffff}"], /[\u{ff80}-\ud808\udf45]/u, "\u{ffff}"); execs(["\u{ffff}"], "[\u{ff80}-\ud808\udf45]", "\u{ffff}"); execl(null, /[^\u{ff80}-\u{12345}]/u, "\u{ffff}"); execs(null, "[^\u{ff80}-\u{12345}]", "\u{ffff}"); execl(null, /[^\u{ff80}-\ud808\udf45]/u, "\u{ffff}"); execs(null, "[^\u{ff80}-\ud808\udf45]", "\u{ffff}");
The CQ bit was checked by yangguo@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/240001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/240001
The CQ bit was unchecked by yangguo@chromium.org
On 2016/01/21 12:44:37, mathias wrote: > Some more functional tests: https://mathias.html5.org/tests/javascript/regexp/ > (not exhaustive, just a bunch of examples from > https://mathiasbynens.be/notes/es6-unicode-regex) > > https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... > File test/mjsunit/harmony/unicode-character-ranges.js (right): > > https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... > test/mjsunit/harmony/unicode-character-ranges.js:45: execs(null, > "[^\u{12340}-\u{12345}]", "\u{12342}"); > To test the equivalent regular expression using surrogate pair notation as well: > > execl(["\u{12342}"], /[\u{12340}-\u{12345}]/u, "\u{12342}"); > execs(["\u{12342}"], "[\u{12340}-\u{12345}]", "\u{12342}"); > execl(["\u{12342}"], /[^\ud808\udf40-\ud808\udf45]/u, "\u{12342}"); > execs(["\u{12342}"], "[^\ud808\udf40-\ud808\udf45]", "\u{12342}"); These two seem to be wrong, with the "^" negation :) > execl(null, /[^\u{12340}-\u{12345}]/u, "\u{12342}"); > execs(null, "[^\u{12340}-\u{12345}]", "\u{12342}"); > execl(null, /[^\ud808\udf40-\ud808\udf45]/u, "\u{12342}"); > execs(null, "[^\ud808\udf40-\ud808\udf45]", "\u{12342}"); > > (In `u` mode, surrogate pairs are treated as a single code point.) > > https://codereview.chromium.org/1578253005/diff/220001/test/mjsunit/harmony/u... > test/mjsunit/harmony/unicode-character-ranges.js:50: execs(null, > "[^\u{ff80}-\u{12345}]", "\u{ffff}"); > Same here: > > execl(["\u{ffff}"], /[\u{ff80}-\u{12345}]/u, "\u{ffff}"); > execs(["\u{ffff}"], "[\u{ff80}-\u{12345}]", "\u{ffff}"); > execl(["\u{ffff}"], /[\u{ff80}-\ud808\udf45]/u, "\u{ffff}"); > execs(["\u{ffff}"], "[\u{ff80}-\ud808\udf45]", "\u{ffff}"); > execl(null, /[^\u{ff80}-\u{12345}]/u, "\u{ffff}"); > execs(null, "[^\u{ff80}-\u{12345}]", "\u{ffff}"); > execl(null, /[^\u{ff80}-\ud808\udf45]/u, "\u{ffff}"); > execs(null, "[^\u{ff80}-\ud808\udf45]", "\u{ffff}");
The CQ bit was checked by yangguo@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from erikcorry@google.com Link to the patchset: https://codereview.chromium.org/1578253005/#ps260001 (title: "more tests")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/1578253005/260001 View timeline at https://chromium-cq-status.appspot.com/patch-timeline/1578253005/260001
Message was sent while issue was closed.
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ==========
Message was sent while issue was closed.
Committed patchset #13 (id:260001)
Message was sent while issue was closed.
Description was changed from ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} ========== to ========== [regexp] implement character classes for unicode regexps. We divide character ranges into - BMP, matched normally. - non-BMP, matched as alternatives of surrogate pair ranges. - lone surrogates, matched with lookaround assertion that its indeed lone. R=erik.corry@gmail.com BUG=v8:2952 LOG=N Committed: https://crrev.com/ea820ad5fa282a323a86fe20e64f83ee67ba5f04 Cr-Commit-Position: refs/heads/master@{#33432} Committed: https://crrev.com/e709aa24c0c17abf684972fbb9e887731b20fd41 Cr-Commit-Position: refs/heads/master@{#33437} ==========
Message was sent while issue was closed.
Patchset 13 (id:??) landed as https://crrev.com/e709aa24c0c17abf684972fbb9e887731b20fd41 Cr-Commit-Position: refs/heads/master@{#33437} |