Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(411)

Unified Diff: src/ast/ast.cc

Issue 1522353002: [regexp] break recursion in mutually recursive capture/back references. (Closed) Base URL: https://chromium.googlesource.com/v8/v8.git@master
Patch Set: lazify min/max match computation Created 5 years ago
Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.
Jump to:
View side-by-side diff with in-line comments
Download patch
« no previous file with comments | « src/ast/ast.h ('k') | src/parsing/parser.cc » ('j') | no next file with comments »
Expand Comments ('e') | Collapse Comments ('c') | Show Comments Hide Comments ('s')
Index: src/ast/ast.cc
diff --git a/src/ast/ast.cc b/src/ast/ast.cc
index 3c8eb1ff7f1c109e4e9ee83824bc551815d2f55b..d140a40407b3da105d62d9cfe5d20e7c78068591 100644
--- a/src/ast/ast.cc
+++ b/src/ast/ast.cc
@@ -926,6 +926,31 @@ bool RegExpCapture::IsAnchoredAtEnd() {
}
+int RegExpBackReference::max_match() {
+ switch (max_match_) {
+ case kRecursionMarker:
+ // The back reference references a capture that references this back
+ // reference, leading to a recursion, e.g. /(\2)(\1)/.
+ // Such captures cannot be satisfied and have the length 0.
erikcorry 2015/12/15 09:45:43 This may well just be a bug in a comment, but here
Yang 2015/12/15 09:52:18 You are right. It should be "Such back references
erikcorry 2015/12/15 10:18:18 My example regexp seems to contradict the revised
+ max_match_ = 0;
+ break;
+ case kUninitialized:
+ // The max match length of a back reference is the length of the capture.
+ // Mark this back reference to break possible recursion before
+ // descending into the capture.
+ max_match_ = kRecursionMarker;
+ // The capture has been parsed.
+ DCHECK_NOT_NULL(capture_->body());
+ max_match_ = capture_->max_match();
+ break;
+ default:
+ // max_match_ has already been found.
+ break;
+ }
+ return max_match_;
+}
+
+
// Convert regular expression trees to a simple sexp representation.
// This representation should be different from the input grammar
// in as many cases as possible, to make it more difficult for incorrect
@@ -1091,14 +1116,14 @@ std::ostream& RegExpTree::Print(std::ostream& os, Zone* zone) { // NOLINT
}
-RegExpDisjunction::RegExpDisjunction(ZoneList<RegExpTree*>* alternatives)
- : alternatives_(alternatives) {
- DCHECK(alternatives->length() > 1);
- RegExpTree* first_alternative = alternatives->at(0);
+void RegExpDisjunction::LazilyComputeMatchLengths() {
+ // We cannot compute this in the constructor because at the time of parsing
+ // back referenced captures have not yet been parsed yet.
+ RegExpTree* first_alternative = alternatives_->at(0);
min_match_ = first_alternative->min_match();
erikcorry 2015/12/15 09:45:43 This looks a bit dodgy. We are not done calculati
Yang 2015/12/15 09:52:18 I think this is fine. We can only have a recursion
max_match_ = first_alternative->max_match();
- for (int i = 1; i < alternatives->length(); i++) {
- RegExpTree* alternative = alternatives->at(i);
+ for (int i = 1; i < alternatives_->length(); i++) {
+ RegExpTree* alternative = alternatives_->at(i);
min_match_ = Min(min_match_, alternative->min_match());
max_match_ = Max(max_match_, alternative->max_match());
}
@@ -1113,13 +1138,14 @@ static int IncreaseBy(int previous, int increase) {
}
}
-RegExpAlternative::RegExpAlternative(ZoneList<RegExpTree*>* nodes)
- : nodes_(nodes) {
- DCHECK(nodes->length() > 1);
+
+void RegExpAlternative::LazilyComputeMatchLengths() {
+ // We cannot compute this in the constructor because at the time of parsing
+ // back referenced captures have not yet been parsed yet.
min_match_ = 0;
max_match_ = 0;
- for (int i = 0; i < nodes->length(); i++) {
- RegExpTree* node = nodes->at(i);
+ for (int i = 0; i < nodes_->length(); i++) {
+ RegExpTree* node = nodes_->at(i);
int node_min_match = node->min_match();
min_match_ = IncreaseBy(min_match_, node_min_match);
int node_max_match = node->max_match();
@@ -1128,6 +1154,18 @@ RegExpAlternative::RegExpAlternative(ZoneList<RegExpTree*>* nodes)
}
+void RegExpQuantifier::LazilyComputeMatchLengths() {
+ // We cannot compute this in the constructor because at the time of parsing
+ // back referenced captures have not yet been parsed yet.
+ min_match_ = min_ * body_->min_match();
+ if (max_ > 0 && body_->max_match() > kInfinity / max_) {
+ max_match_ = kInfinity;
+ } else {
+ max_match_ = max_ * body_->max_match();
+ }
+}
+
+
CaseClause::CaseClause(Zone* zone, Expression* label,
ZoneList<Statement*>* statements, int pos)
: Expression(zone, pos),
« no previous file with comments | « src/ast/ast.h ('k') | src/parsing/parser.cc » ('j') | no next file with comments »

Powered by Google App Engine
This is Rietveld 408576698