|
|
Chromium Code Reviews|
Created:
6 years, 4 months ago by srawlins Modified:
6 years, 3 months ago CC:
reviews_dartlang.org, vm-dev_dartlang.org Visibility:
Public. |
DescriptionAdd optional startIndex to String.replaceFirst
BUG= https://code.google.com/p/dart/issues/detail?id=3194
R=lrn@google.com
Committed: https://code.google.com/p/dart/source/detail?r=39490
Patch Set 1 #
Total comments: 26
Patch Set 2 : Update using allMatches's startIndex; more tests #
Total comments: 12
Patch Set 3 : Cleaner after lrn@'s feedback #
Total comments: 6
Patch Set 4 : Little fixes #
Messages
Total messages: 15 (0 generated)
Hi Bill, would you mind reviewing? (I chose you because you opened this issue, although it was more than 2 years ago :) If someone else should review, could you add them? Thanks!
I would like Lasse to review this, since he is making decisions about core library changes. If we do add this, then should we also add a start index to .allMatches, which is the fundamental method that the others are based on? I prefer named optional arguments, but if we use positional arguments in the core libraries, that is fine. My opinion is that this should be part of a larger change that adds startIndex to more search (and replace) methods, but it would be fine on its own, or as the first commit of a series of changes. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (left): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:424: StringBuffer buffer = new StringBuffer(); This could be made more efficient by moving the StringBuffer and prefix write inside the iterator.moveNext() check. If there is no match, just return the original string. It is too bad that allMatches doesn't allow a start index. https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... File tests/corelib/string_replace_test.dart (right): https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... tests/corelib/string_replace_test.dart:62: // Test negative startIndex Add tests for null and object as startIndex.
https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); Creating a substring, and then using allMatches, is far too inefficient. If we can't do it better, I'd prefer to not do it at all. Now, what I would have want, and probably can't have, is to add startIndex as an optional argument to Pattern.allMatches. It should, technically, be a non-breaking change, but it can still be hard to get changes to fundamental code through reviews.
https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); This implementation is actually wrong: A "^" must only match at the start of the string. By using a substring, it may now match at the startIndex instead: "abcdefabcdef".replaceFirst(new RegExp("^a"), "b", 6) should NOT replace the second "a".
https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:72: stringReplaceJS(receiver, replacer, to, [startIndex = 0]) { No need to make startIndex optional here. It's an internal function so we always control the number of arguments. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:84: return JS('String', r'# + #', prefix, result); This also looks too complicated. It might be necessary, but I'm not convinced yet :) https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:197: return stringReplaceJS(receiver, from, to, startIndex); Have we tried timing this against: var index = receiver.indexOf(from, startIndex); if (index < 0) return receiver; return receiver.substring(0, index) + to + receiver.substring(index + from.length);
https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:417: String replaceFirst(Pattern pattern, String replacement, [int startIndex = 0]) { long line. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:425: throw new RangeError.value(startIndex); You can use throw new RangeError.range(startIndex, 0, this.length); https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); And now we DO have start-index on allMatches, so this could be just: Iterator iterator = startIndex == 0 ? pattern.allMatches(this) : pattern.allMatches(this, startIndex); if (!iterator.moveNext()) return this; Match match = iterator.current; return "${this.substring(0, match.start)}$replacement${this.substring(match.end)}"; The special casing of startIndex == 0 is important to avoid breaking implementations of pattern that doesn't have the second argument. Existing programs using those would also not pass an extra argument to this method. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:84: return JS('String', r'# + #', prefix, result); It's also wrong due to the "^" problem. Maybe not modify this function, so it only works with startIndex = 0, and use a different function for the other case: replaceFirstJSRE(receiver, regexp, to, startIndex) { var match = regexp._execGlobal(receiver, startIndex); if (match == null) return receiver; var start = match.start; var end = match.end; return "${receiver.substring(0,start)}$to${receiver.substring(end)}"; } Maybe just inline this in the stringReplaceFirstUnchecked function's regexp-case. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:200: return stringReplaceJS(receiver, re, to, startIndex); Use this only if startIndex is zero, otherwise use the code above. https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... File tests/corelib/string_replace_test.dart (right): https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... tests/corelib/string_replace_test.dart:70: } Test with regexps too, and with strings that contains something that looks like a regexp, e.g., Expect.equals("aaax", "aaaa{3}".replaceFirst("a{3}", "x", 1)); Expect.equals("ax{3}", "aaaa{3}".replaceFirst( new RegExp("a{3}"), "x", 1));
Thanks so much, Lasse and Bill. This patch should be much higher quality, thanks largely to Lasse's prototypes and Pattern.allMatches patch. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (left): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:424: StringBuffer buffer = new StringBuffer(); On 2014/08/11 07:47:47, Bill Hesse wrote: > This could be made more efficient by moving the StringBuffer and prefix write > inside the iterator.moveNext() check. If there is no match, just return the > original string. It is too bad that allMatches doesn't allow a start index. Done. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dart File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:417: String replaceFirst(Pattern pattern, String replacement, [int startIndex = 0]) { On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > long line. Done. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:425: throw new RangeError.value(startIndex); On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > You can use > throw new RangeError.range(startIndex, 0, this.length); Done. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); On 2014/08/11 08:13:32, Lasse Reichstein Nielsen wrote: > Creating a substring, and then using allMatches, is far too inefficient. If we > can't do it better, I'd prefer to not do it at all. > > Now, what I would have want, and probably can't have, is to add startIndex as an > optional argument to Pattern.allMatches. > > It should, technically, be a non-breaking change, but it can still be hard to > get changes to fundamental code through reviews. Thanks for the patch to Pattern.allMatches! This is better now (no "String remaining"). https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); On 2014/08/11 09:37:11, Lasse Reichstein Nielsen wrote: > This implementation is actually wrong: A "^" must only match at the start of the > string. By using a substring, it may now match at the startIndex instead: > > "abcdefabcdef".replaceFirst(new RegExp("^a"), "b", 6) > > should NOT replace the second "a". Good catch. This implementation now depends on passing startIndex to Regexp.allMatches, so that should cover this case. I'm adding a test as well. https://codereview.chromium.org/462463003/diff/1/runtime/lib/string_patch.dar... runtime/lib/string_patch.dart:429: String remaining = this.substring(startIndex); On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > And now we DO have start-index on allMatches, so this could be just: > > Iterator iterator = > startIndex == 0 ? pattern.allMatches(this) > : pattern.allMatches(this, startIndex); > if (!iterator.moveNext()) return this; > Match match = iterator.current; > return "${this.substring(0, > match.start)}$replacement${this.substring(match.end)}"; > > The special casing of startIndex == 0 is important to avoid breaking > implementations of pattern that doesn't have the second argument. Existing > programs using those would also not pass an extra argument to this method. Done. Thanks very much for this cleanup! https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:72: stringReplaceJS(receiver, replacer, to, [startIndex = 0]) { On 2014/08/11 10:20:01, Lasse Reichstein Nielsen wrote: > No need to make startIndex optional here. It's an internal function so we always > control the number of arguments. Done. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:84: return JS('String', r'# + #', prefix, result); On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > It's also wrong due to the "^" problem. > > Maybe not modify this function, so it only works with startIndex = 0, and use a > different function for the other case: > > replaceFirstJSRE(receiver, regexp, > to, startIndex) { > var match = regexp._execGlobal(receiver, startIndex); > if (match == null) return receiver; > var start = match.start; > var end = match.end; > return "${receiver.substring(0,start)}$to${receiver.substring(end)}"; > } > > Maybe just inline this in the stringReplaceFirstUnchecked function's > regexp-case. Thanks! I'll leave this as a separate function for now. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:84: return JS('String', r'# + #', prefix, result); On 2014/08/11 10:20:01, Lasse Reichstein Nielsen wrote: > This also looks too complicated. It might be necessary, but I'm not convinced > yet :) > Acknowledged. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:197: return stringReplaceJS(receiver, from, to, startIndex); On 2014/08/11 10:20:01, Lasse Reichstein Nielsen wrote: > Have we tried timing this against: > var index = receiver.indexOf(from, startIndex); > if (index < 0) return receiver; > return receiver.substring(0, index) + to + receiver.substring(index + > from.length); Your indexOf alternative must definitely be faster now that we're splitting stringReplaceJS into a method with startIndex and one without: The second method always fires up the regex engine, so I'd say this is faster. https://codereview.chromium.org/462463003/diff/1/sdk/lib/_internal/lib/string... sdk/lib/_internal/lib/string_helper.dart:200: return stringReplaceJS(receiver, re, to, startIndex); On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > Use this only if startIndex is zero, otherwise use the code above. Done. https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... File tests/corelib/string_replace_test.dart (right): https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... tests/corelib/string_replace_test.dart:62: // Test negative startIndex On 2014/08/11 07:47:47, Bill Hesse wrote: > Add tests for null and object as startIndex. Done. https://codereview.chromium.org/462463003/diff/1/tests/corelib/string_replace... tests/corelib/string_replace_test.dart:70: } On 2014/08/13 08:35:09, Lasse Reichstein Nielsen wrote: > Test with regexps too, and with strings that contains something that looks like > a regexp, e.g., > Expect.equals("aaax", "aaaa{3}".replaceFirst("a{3}", "x", 1)); > Expect.equals("ax{3}", "aaaa{3}".replaceFirst( > new RegExp("a{3}"), "x", 1)); Done.
https://codereview.chromium.org/462463003/diff/20001/runtime/lib/string_patch... File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/20001/runtime/lib/string_patch... runtime/lib/string_patch.dart:433: startIndex == 0 ? pattern.allMatches(this).iterator Indent by four. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:80: stringReplaceJSRE(receiver, regexp, to, startIndex) { Name is nagging me. How about stringReplaceFirstRE ? https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:198: //return stringReplaceJS(receiver, from, to, startIndex); Remove commented-out line. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:201: return receiver.substring(0, index) + to + receiver.substring(index + from.length); Might want to use a String-interpolation like in stringReplaceJSRE. I don't actually know which is faster for dart2js. The interpolation is slightly faster in the VM because it doesn't optimize repeated string concatenations (yet). https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:203: var re = regExpGetNative(from); Inline the "re" value, so you don't compute it if startIndex > 0. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/core/string.dart File sdk/lib/core/string.dart (right): https://codereview.chromium.org/462463003/diff/20001/sdk/lib/core/string.dart... sdk/lib/core/string.dart:412: * is replaced with [to], optionally starting at [startIndex]: Not optionally. It always starts at startIndex, sometimes startIndex is 0.
Done. Thanks for the review! https://codereview.chromium.org/462463003/diff/20001/runtime/lib/string_patch... File runtime/lib/string_patch.dart (right): https://codereview.chromium.org/462463003/diff/20001/runtime/lib/string_patch... runtime/lib/string_patch.dart:433: startIndex == 0 ? pattern.allMatches(this).iterator On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Indent by four. Done. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:80: stringReplaceJSRE(receiver, regexp, to, startIndex) { On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Name is nagging me. > How about > stringReplaceFirstRE > ? Good suggestion. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:198: //return stringReplaceJS(receiver, from, to, startIndex); On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Remove commented-out line. Done. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:201: return receiver.substring(0, index) + to + receiver.substring(index + from.length); On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Might want to use a String-interpolation like in stringReplaceJSRE. > I don't actually know which is faster for dart2js. The interpolation is slightly > faster in the VM because it doesn't optimize repeated string concatenations > (yet). Done. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:203: var re = regExpGetNative(from); On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Inline the "re" value, so you don't compute it if startIndex > 0. Done. https://codereview.chromium.org/462463003/diff/20001/sdk/lib/core/string.dart File sdk/lib/core/string.dart (right): https://codereview.chromium.org/462463003/diff/20001/sdk/lib/core/string.dart... sdk/lib/core/string.dart:412: * is replaced with [to], optionally starting at [startIndex]: On 2014/08/19 07:26:45, Lasse Reichstein Nielsen wrote: > Not optionally. It always starts at startIndex, sometimes startIndex is 0. Done.
LGTM with lines broken and the last test fixed. https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:200: return '${receiver.substring(0, index)}$to${receiver.substring(index + from.length)}'; Long line. https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:203: : stringReplaceFirstRE(receiver, from, to, startIndex); Two long lines. https://codereview.chromium.org/462463003/diff/40001/tests/corelib/string_rep... File tests/corelib/string_replace_test.dart (right): https://codereview.chromium.org/462463003/diff/40001/tests/corelib/string_rep... tests/corelib/string_replace_test.dart:90: (e) => e is TypeError); This is only correct in checked mode on the VM. In unchecked mode it is an ArgumentError. I'd just drop the type check, and be happy that it throws.
Excellent! I'll commit it.
Message was sent while issue was closed.
Committed patchset #4 manually as 39490 (presubmit successful).
Message was sent while issue was closed.
Thanks much Lasse! https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... File sdk/lib/_internal/lib/string_helper.dart (right): https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:200: return '${receiver.substring(0, index)}$to${receiver.substring(index + from.length)}'; On 2014/08/20 08:58:53, Lasse Reichstein Nielsen wrote: > Long line. Done. https://codereview.chromium.org/462463003/diff/40001/sdk/lib/_internal/lib/st... sdk/lib/_internal/lib/string_helper.dart:203: : stringReplaceFirstRE(receiver, from, to, startIndex); On 2014/08/20 08:58:53, Lasse Reichstein Nielsen wrote: > Two long lines. Done. https://codereview.chromium.org/462463003/diff/40001/tests/corelib/string_rep... File tests/corelib/string_replace_test.dart (right): https://codereview.chromium.org/462463003/diff/40001/tests/corelib/string_rep... tests/corelib/string_replace_test.dart:90: (e) => e is TypeError); On 2014/08/20 08:58:53, Lasse Reichstein Nielsen wrote: > This is only correct in checked mode on the VM. In unchecked mode it is an > ArgumentError. > I'd just drop the type check, and be happy that it throws. Done.
Message was sent while issue was closed.
On 2014/08/22 11:27:40, Lasse Reichstein Nielsen wrote: > Committed patchset #4 manually as 39490 (presubmit successful). This commit caused a 3.4% regression in Tracer. Please revert and fix. Thanks, -Ivan
Message was sent while issue was closed.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
