Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(767)

Issue 19539: Irregexp: Added derived knowledge of whether we are at the start of input. (Closed)

Created:
11 years, 10 months ago by Lasse Reichstein
Modified:
9 years, 6 months ago
Reviewers:
Erik Corry
CC:
v8-dev
Visibility:
Public.

Description

Trace contains information about whether we know that we are at the start of input. Choice nodes may know that they are never not at the start of input. This can remove start_of_input assertions in cases where they are statically known to fail. The initial loop is unrolled once if the regexp might check for the start of input. Only the first iteration may be at the start, the following loop knows that it isn't.

Patch Set 1 #

Total comments: 14

Patch Set 2 : Addressed review comments #

Patch Set 3 : Fixed lint issues. #

Unified diffs Side-by-side diffs Delta from patch set Stats (+169 lines, -51 lines) Patch
M src/ast.h View 1 chunk +2 lines, -1 line 0 comments Download
M src/ast.cc View 1 chunk +7 lines, -1 line 0 comments Download
M src/flags.cc View 1 chunk +1 line, -0 lines 0 comments Download
M src/jsregexp.h View 1 19 chunks +51 lines, -20 lines 0 comments Download
M src/jsregexp.cc View 1 2 18 chunks +94 lines, -23 lines 0 comments Download
M src/parser.cc View 1 2 5 chunks +13 lines, -5 lines 0 comments Download
M src/regexp-macro-assembler-ia32.cc View 1 chunk +1 line, -1 line 0 comments Download

Messages

Total messages: 3 (0 generated)
Lasse Reichstein
Semi-tricky review.
11 years, 10 months ago (2009-02-02 14:53:05 UTC) #1
Erik Corry
Excellent stuff. I assume there are no regressions of performance or correctness and that it ...
11 years, 10 months ago (2009-02-02 21:22:01 UTC) #2
Lasse Reichstein
11 years, 10 months ago (2009-02-03 11:18:19 UTC) #3
No expected performance regressions. 
The not_at_start field is only on ChoiceNode

http://codereview.chromium.org/19539/diff/1/2
File src/ast.cc (right):

http://codereview.chromium.org/19539/diff/1/2#newcode257
Line 257: if (node->max_match() > 0) { return false; }
It will actually work for min_match too.
We currently compute whether the expression contain an initial anchor.  If we
change our goal to computing whether the expression contains an unguarded anchor
(i.e., one that must definitly be matched for the expression to match), then it
will be effectively the same, and we can even ignore the min_match check
entirely - however, on non-stupid expressions, this check will likely save some
time.

http://codereview.chromium.org/19539/diff/1/5
File src/jsregexp.cc (right):

http://codereview.chromium.org/19539/diff/1/5#newcode3156
Line 3156: bool all_omitted = true;
On 2009/02/02 21:22:01, Erik Corry wrote:
> I think you can omit this if you implement the suggestions below.

Done.

http://codereview.chromium.org/19539/diff/1/5#newcode3196
Line 3196: } else if (alt_gen->quick_check_details.cannot_match()) {
And I should. I think it's a bug that it doesn't do anything special if the last
case is omitted.

http://codereview.chromium.org/19539/diff/1/5#newcode3971
Line 3971: 0, new_max, is_greedy, body, compiler, on_success, not_at_start);
On 2009/02/02 21:22:01, Erik Corry wrote:
> not_at_start can be true here since we know that min > 0 and body can't be
> empty.

Done.

http://codereview.chromium.org/19539/diff/1/5#newcode4839
Line 4839: // Unroll once, and tell the loop that it's not at the start of the
input.
Ah, yes. Refactoring breaks comments.

http://codereview.chromium.org/19539/diff/1/6
File src/jsregexp.h (right):

http://codereview.chromium.org/19539/diff/1/6#newcode1199
Line 1199: cp_offset_ == 0 &&
On 2009/02/02 21:22:01, Erik Corry wrote:
> A trace is not trivial unless at_start_ is UNKNOWN.

Done.

http://codereview.chromium.org/19539/diff/1/7
File src/parser.cc (right):

http://codereview.chromium.org/19539/diff/1/7#newcode3611
Line 3611: new RegExpAssertion(RegExpAssertion::START_OF_LINE));
I'm not sure the extra overhead of unrolling the initial loop is worth avoiding
one extra CheckNotAtStart.
Unless there is a good reason to optimize "start of line" assertions, I prefer
to keep the concepts separate.

Powered by Google App Engine
This is Rietveld 408576698