|
|
Description[regexp] Experimental support for regexp named captures
Named capture groups may be specified using the /(?<name>pattern)/u
syntax, with named backreferences specified as /\k<name>/u. They're
hidden behind the --harmony-regexp-named-captures flag, and are only
enabled for unicode regexps.
R=yangguo@chromium.org
BUG=
Committed: https://crrev.com/ae23436cbf21ff26b083f4d91b3322edc783d69a
Cr-Commit-Position: refs/heads/master@{#36986}
Patch Set 1 #Patch Set 2 : Proper fixed array cast #
Total comments: 2
Patch Set 3 : ZoneVector instead of Vector #Patch Set 4 : static_cast<int> #
Total comments: 40
Patch Set 5 : Address comments & disable property creation for now #Patch Set 6 : Remove invalid DCHECKs #
Total comments: 1
Patch Set 7 : Address comments, base on ZoneVector CL #Patch Set 8 : Base on patchset #2 #Patch Set 9 : Remove broken test #Patch Set 10 : Use NewStringFromTwoByte overload #Patch Set 11 : Rebase #
Depends on Patchset: Messages
Total messages: 58 (26 generated)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/1
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux_dbg_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_dbg_ng/builds/7056) v8_linux_dbg_ng_triggered on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_dbg_ng_triggered/b...)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/20001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
Hi Yang, please take another look at this. Group names are now parsed straight into Vector<const uc16>, which is then converted into String. Still not sure whether Vector is the way to go here. I have another version of this using ZoneVector which simplifies e.g. CaptureNameBuffer, but requires a final (cheap) conversion into Vector<> when constructing the String instances.
https://codereview.chromium.org/2050343002/diff/20001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/20001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:757: Vector<byte> backing_store_; Can't we simply use a ZoneList here? It already has all the on-demand expanding, and you can call ToConstVector to get a vector. It's zone-allocated so you won't need to dispose it later.
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/40001
https://codereview.chromium.org/2050343002/diff/20001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/20001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:757: Vector<byte> backing_store_; On 2016/06/10 15:43:27, Yang wrote: > Can't we simply use a ZoneList here? It already has all the on-demand expanding, > and you can call ToConstVector to get a vector. It's zone-allocated so you won't > need to dispose it later. Right. I switched the CL to my ZoneVector version.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_avx2_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng/buil...) v8_linux64_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng/builds/7181) v8_linux_arm64_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_arm64_rel_ng/build...)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/60001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2050343002/diff/60001/src/js/regexp.js File src/js/regexp.js (right): https://codereview.chromium.org/2050343002/diff/60001/src/js/regexp.js#newcod... src/js/regexp.js:165: %RegExpResultAttachNamedCaptures(result, JSREGEXP); This is going to be fairly expensive, and affects every regexp, even ones that has no named captures. I'm pretty sure this affects the Octane benchmark score, especially the regexp benchmark. We need a more efficient way to check whether the regexp has named captures at all, and only go into runtime if we do. I can think of a few alternatives, in descending order of what I think is best. - We implement this whole macro as turbofan stub. That's going to be fairly involved. - We check for kIrregexpCaptureNameMapIndex in regexp data in the generated code of %_RegExpConstructResult and call into %RegExpResultAttachNamedCaptures from there if necessary. - We use a hidden property using a private symbol as key to store whether we have named captures on the regexp. Loading from a property should be a lot cheaper thanks to inline caching and optimizations in optimized code. We could even store the fixed array for the property names there. - We change the RegExpExecStub to encode another element in MATCHINFO to indicate whether there is a named capture. - We can add another bit to regexp flags so that we can check upfront using %_RegExpFlags (see macros.py, REGEXP_GLOBAL). This is somewhat hacky, but easier to implement. How about we first land the rest of the CL without storing named properties to the result object? https://codereview.chromium.org/2050343002/diff/60001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/jsregexp.cc#... src/regexp/jsregexp.cc:423: value.is_null() ? nullptr : *value); let's store undefined or Smi::FromInt(0) instead. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:295: FLAG_harmony_regexp_named_captures) { I don't think this check is still necessary. We can just always Advance. If none of the flags below are set, we can just fall through and fail parsing. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:330: CreateNamedCaptureAtIndex(name, captures_started_ CHECK_FAILED); Can we simply attach the name to the parser state here and create the capture group later with the name at the place where we call GetCapture? That way we don't add another place where we create capture groups. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:531: // FALLTHROUGH I don't think we need all caps here :) Above we have "Fall through." as comment. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:731: ZoneVector<uc16>* backing_store_; Let's make this a non-dynamic member, like BytecodeArrayWriter::bytecodes_, and initialize it eagerly in the constructor's initializer list. In fact, we probably can avoid this whole class and just inline the surrogate encoding into ParseCaptureGroupName and define the ZoneVector there. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:786: named_captures_ = new (zone()) ZoneList<RegExpCapture*>(1, zone()); Let's make named_captures_ a non-dynamic member of RegExpParser. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:789: for (int i = 0; i < named_captures_->length(); i++) { You can use C++11 syntax here. for (const auto& named_capture : *named_captures_) https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:820: const int index = LookupCaptureGroupIndex(name); Let's not do this twice, here and in PatchNamedBackReferences. Let's always do this in the latter. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:830: named_back_references_ = Same here, let's make named_back_references_ a non-dynamic member. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:861: int RegExpParser::LookupCaptureGroupIndex(const ZoneVector<uc16>* name) { This can be inlined into PatchNamedBackReferences if this is only called from there. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:905: Vector<const uc16> vector(&(*capture->name())[0], Could we use a ZoneList for capture->name() instead of ZoneVector? They essentially do the same anyways. With ZoneList, you can simply use ToConstVector. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:916: void RegExpParser::FreeCaptureStrings() { Do we still need this and capture_strings_? https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... File src/runtime/runtime-regexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1022: if (regexp->TypeTag() != JSRegExp::IRREGEXP || We don't actually have to check this, right? We can assert this in the then-block, but for non-IRREGEXP the capture name map should not be a fixed array anyways. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1023: !regexp->DataAt(JSRegExp::kIrregexpCaptureNameMapIndex)->IsFixedArray()) Please add brackets around the then-block. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1034: const int ix = Smi::cast(captures->get(i + 1))->value(); Accessing captures like this is not GC-safe. You need to put captures into a handle. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1038: continue; // Skip conflicts. I don't think this can actually happen unless the property is on the array prototype. Let's not care about that and just always add the property. We can still change this once the spec takes shape.
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/80001
https://codereview.chromium.org/2050343002/diff/60001/src/js/regexp.js File src/js/regexp.js (right): https://codereview.chromium.org/2050343002/diff/60001/src/js/regexp.js#newcod... src/js/regexp.js:165: %RegExpResultAttachNamedCaptures(result, JSREGEXP); On 2016/06/13 10:54:52, Yang wrote: > This is going to be fairly expensive, and affects every regexp, even ones that > has no named captures. I'm pretty sure this affects the Octane benchmark score, > especially the regexp benchmark. > > We need a more efficient way to check whether the regexp has named captures at > all, and only go into runtime if we do. > > I can think of a few alternatives, in descending order of what I think is best. > - We implement this whole macro as turbofan stub. That's going to be fairly > involved. > - We check for kIrregexpCaptureNameMapIndex in regexp data in the generated code > of %_RegExpConstructResult and call into %RegExpResultAttachNamedCaptures from > there if necessary. I'm tending towards this as an initial approach. This would mean passing the JSRegExp into %_RegExpConstructResult for accessing the capture map though. > > How about we first land the rest of the CL without storing named properties to > the result object? Sounds good. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/jsregexp.cc File src/regexp/jsregexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/jsregexp.cc#... src/regexp/jsregexp.cc:423: value.is_null() ? nullptr : *value); On 2016/06/13 10:54:52, Yang wrote: > let's store undefined or Smi::FromInt(0) instead. Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:295: FLAG_harmony_regexp_named_captures) { On 2016/06/13 10:54:52, Yang wrote: > I don't think this check is still necessary. We can just always Advance. If none > of the flags below are set, we can just fall through and fail parsing. Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:330: CreateNamedCaptureAtIndex(name, captures_started_ CHECK_FAILED); On 2016/06/13 10:54:52, Yang wrote: > Can we simply attach the name to the parser state here and create the capture > group later with the name at the place where we call GetCapture? That way we > don't add another place where we create capture groups. Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:531: // FALLTHROUGH On 2016/06/13 10:54:53, Yang wrote: > I don't think we need all caps here :) > Above we have "Fall through." as comment. Looking at the rest of the file, I think we have every variation of capitalization and white space possible. There actually is a second instance of FALLTHROUGH right above this, which is what I tried to stay consistent with (didn't notice the others). I can go through in a follow-up commit and make sure we use a single style :) https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:731: ZoneVector<uc16>* backing_store_; On 2016/06/13 10:54:52, Yang wrote: > Let's make this a non-dynamic member, like BytecodeArrayWriter::bytecodes_, and > initialize it eagerly in the constructor's initializer list. > > In fact, we probably can avoid this whole class and just inline the surrogate > encoding into ParseCaptureGroupName and define the ZoneVector there. Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:786: named_captures_ = new (zone()) ZoneList<RegExpCapture*>(1, zone()); On 2016/06/13 10:54:52, Yang wrote: > Let's make named_captures_ a non-dynamic member of RegExpParser. Do you have an intuition about how much overhead initialization of named_captures_ and named_back_references_ adds to the case of regexps without named captures? The captures_ list itself is also dynamic, and I assumed it was due to performance reasons. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:789: for (int i = 0; i < named_captures_->length(); i++) { On 2016/06/13 10:54:52, Yang wrote: > You can use C++11 syntax here. > > for (const auto& named_capture : *named_captures_) Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:820: const int index = LookupCaptureGroupIndex(name); On 2016/06/13 10:54:53, Yang wrote: > Let's not do this twice, here and in PatchNamedBackReferences. Let's always do > this in the latter. We needed the index here to determine whether to call AddEmpty() or add a back reference node. But since we now store the name in state, we can skip this lookup. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:830: named_back_references_ = On 2016/06/13 10:54:53, Yang wrote: > Same here, let's make named_back_references_ a non-dynamic member. See above. This change is trivial, just want to make sure it's the right way to go :) https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:861: int RegExpParser::LookupCaptureGroupIndex(const ZoneVector<uc16>* name) { On 2016/06/13 10:54:52, Yang wrote: > This can be inlined into PatchNamedBackReferences if this is only called from > there. Done. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:905: Vector<const uc16> vector(&(*capture->name())[0], On 2016/06/13 10:54:53, Yang wrote: > Could we use a ZoneList for capture->name() instead of ZoneVector? They > essentially do the same anyways. With ZoneList, you can simply use > ToConstVector. I used ZoneVector because of mstarzinger's comment '[...] once we got rid of our own home-grown ZoneList [...]' in zone-containers. I'm fine with either container type, but is there a real benefit to switching to ZoneVector? What about adding ToConstVector to ZoneVector? https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:916: void RegExpParser::FreeCaptureStrings() { On 2016/06/13 10:54:52, Yang wrote: > Do we still need this and capture_strings_? No. Thanks, good catch. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... File src/runtime/runtime-regexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1022: if (regexp->TypeTag() != JSRegExp::IRREGEXP || On 2016/06/13 10:54:53, Yang wrote: > We don't actually have to check this, right? We can assert this in the > then-block, but for non-IRREGEXP the capture name map should not be a fixed > array anyways. I think we need to check, as ATOM JSRegExps have a shorter data array (see Factory::SetRegExpAtomData) and we'd have an out of bounds access. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1023: !regexp->DataAt(JSRegExp::kIrregexpCaptureNameMapIndex)->IsFixedArray()) On 2016/06/13 10:54:53, Yang wrote: > Please add brackets around the then-block. Done. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1034: const int ix = Smi::cast(captures->get(i + 1))->value(); On 2016/06/13 10:54:53, Yang wrote: > Accessing captures like this is not GC-safe. You need to put captures into a > handle. Done. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1038: continue; // Skip conflicts. On 2016/06/13 10:54:53, Yang wrote: > I don't think this can actually happen unless the property is on the array > prototype. Let's not care about that and just always add the property. We can > still change this once the spec takes shape. This happens for regexp properties 'index' and 'input'. There's two test cases covering this. If we remove this check, those properties could be overwritten.
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux_arm64_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_arm64_rel_ng/build...) v8_linux_arm64_rel_ng_triggered on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_arm64_rel_ng_trigg...)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/100001
lgtm https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:786: named_captures_ = new (zone()) ZoneList<RegExpCapture*>(1, zone()); On 2016/06/13 13:10:00, jgruber wrote: > On 2016/06/13 10:54:52, Yang wrote: > > Let's make named_captures_ a non-dynamic member of RegExpParser. > > Do you have an intuition about how much overhead initialization of > named_captures_ and named_back_references_ adds to the case of regexps without > named captures? > > The captures_ list itself is also dynamic, and I assumed it was due to > performance reasons. Not a lot. List takes 3 pointers, so dynamic allocation would take 1 or 4 depending on whether its initialized. If we don't allocate dynamically, we also save some code. I don't have strong opinion on this, and can be convinced either way. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:905: Vector<const uc16> vector(&(*capture->name())[0], On 2016/06/13 13:10:00, jgruber wrote: > On 2016/06/13 10:54:53, Yang wrote: > > Could we use a ZoneList for capture->name() instead of ZoneVector? They > > essentially do the same anyways. With ZoneList, you can simply use > > ToConstVector. > > I used ZoneVector because of mstarzinger's comment '[...] once we got rid of our > own home-grown ZoneList [...]' in zone-containers. > > I'm fine with either container type, but is there a real benefit to switching to > ZoneVector? What about adding ToConstVector to ZoneVector? I guess adding ToConstVector to ZoneVector also works. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... File src/runtime/runtime-regexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1038: continue; // Skip conflicts. On 2016/06/13 13:10:00, jgruber wrote: > On 2016/06/13 10:54:53, Yang wrote: > > I don't think this can actually happen unless the property is on the array > > prototype. Let's not care about that and just always add the property. We can > > still change this once the spec takes shape. > > This happens for regexp properties 'index' and 'input'. There's two test cases > covering this. If we remove this check, those properties could be overwritten. I see. Can we use JSReceiver::HasOwnProperty for this? https://codereview.chromium.org/2050343002/diff/100001/test/mjsunit/harmony/r... File test/mjsunit/harmony/regexp-named-captures.js (right): https://codereview.chromium.org/2050343002/diff/100001/test/mjsunit/harmony/r... test/mjsunit/harmony/regexp-named-captures.js:1: // Copyright 2015 the V8 project authors. All rights reserved. Can we also some parsing-related tests in cctest/test-regexp/RegExpParser?
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/120001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_avx2_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng/buil...) v8_linux64_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_rel_ng/builds/7225) v8_linux_arm64_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux_arm64_rel_ng/build...)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/140001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: Try jobs failed on following builders: v8_linux64_avx2_rel_ng on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng/buil...) v8_linux64_avx2_rel_ng_triggered on tryserver.v8 (JOB_FAILED, http://build.chromium.org/p/tryserver.v8/builders/v8_linux64_avx2_rel_ng_trig...)
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/160001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... File src/regexp/regexp-parser.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:786: named_captures_ = new (zone()) ZoneList<RegExpCapture*>(1, zone()); On 2016/06/13 13:38:00, Yang wrote: > On 2016/06/13 13:10:00, jgruber wrote: > > On 2016/06/13 10:54:52, Yang wrote: > > > Let's make named_captures_ a non-dynamic member of RegExpParser. > > > > Do you have an intuition about how much overhead initialization of > > named_captures_ and named_back_references_ adds to the case of regexps without > > named captures? > > > > The captures_ list itself is also dynamic, and I assumed it was due to > > performance reasons. > > Not a lot. List takes 3 pointers, so dynamic allocation would take 1 or 4 > depending on whether its initialized. If we don't allocate dynamically, we also > save some code. > > I don't have strong opinion on this, and can be convinced either way. Ok. I'll stick with dynamic lists for now, just to stay consistent to captures_ and to avoid object creation for patterns without named captures. https://codereview.chromium.org/2050343002/diff/60001/src/regexp/regexp-parse... src/regexp/regexp-parser.cc:905: Vector<const uc16> vector(&(*capture->name())[0], On 2016/06/13 13:38:00, Yang wrote: > On 2016/06/13 13:10:00, jgruber wrote: > > On 2016/06/13 10:54:53, Yang wrote: > > > Could we use a ZoneList for capture->name() instead of ZoneVector? They > > > essentially do the same anyways. With ZoneList, you can simply use > > > ToConstVector. > > > > I used ZoneVector because of mstarzinger's comment '[...] once we got rid of > our > > own home-grown ZoneList [...]' in zone-containers. > > > > I'm fine with either container type, but is there a real benefit to switching > to > > ZoneVector? What about adding ToConstVector to ZoneVector? > > I guess adding ToConstVector to ZoneVector also works. Done. https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... File src/runtime/runtime-regexp.cc (right): https://codereview.chromium.org/2050343002/diff/60001/src/runtime/runtime-reg... src/runtime/runtime-regexp.cc:1038: continue; // Skip conflicts. On 2016/06/13 13:38:00, Yang wrote: > On 2016/06/13 13:10:00, jgruber wrote: > > On 2016/06/13 10:54:53, Yang wrote: > > > I don't think this can actually happen unless the property is on the array > > > prototype. Let's not care about that and just always add the property. We > can > > > still change this once the spec takes shape. > > > > This happens for regexp properties 'index' and 'input'. There's two test cases > > covering this. If we remove this check, those properties could be overwritten. > > I see. Can we use JSReceiver::HasOwnProperty for this? Sounds good, will change this when we look at adding properties for named captures.
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/180001
The CQ bit was checked by jgruber@chromium.org to run a CQ dry run
Dry run: CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/200001
The CQ bit was unchecked by commit-bot@chromium.org
Dry run: This issue passed the CQ dry run.
The CQ bit was checked by jgruber@chromium.org
The patchset sent to the CQ was uploaded after l-g-t-m from yangguo@chromium.org Link to the patchset: https://codereview.chromium.org/2050343002/#ps200001 (title: "Rebase")
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/200001
The CQ bit was unchecked by jgruber@chromium.org
The CQ bit was checked by jgruber@chromium.org
CQ is trying da patch. Follow status at https://chromium-cq-status.appspot.com/patch-status/2050343002/200001
Message was sent while issue was closed.
Committed patchset #11 (id:200001)
Message was sent while issue was closed.
Description was changed from ========== [regexp] Experimental support for regexp named captures Named capture groups may be specified using the /(?<name>pattern)/u syntax, with named backreferences specified as /\k<name>/u. They're hidden behind the --harmony-regexp-named-captures flag, and are only enabled for unicode regexps. R=yangguo@chromium.org BUG= ========== to ========== [regexp] Experimental support for regexp named captures Named capture groups may be specified using the /(?<name>pattern)/u syntax, with named backreferences specified as /\k<name>/u. They're hidden behind the --harmony-regexp-named-captures flag, and are only enabled for unicode regexps. R=yangguo@chromium.org BUG= Committed: https://crrev.com/ae23436cbf21ff26b083f4d91b3322edc783d69a Cr-Commit-Position: refs/heads/master@{#36986} ==========
Message was sent while issue was closed.
Patchset 11 (id:??) landed as https://crrev.com/ae23436cbf21ff26b083f4d91b3322edc783d69a Cr-Commit-Position: refs/heads/master@{#36986} |