Chromium Code Reviews

Side by Side Diff: src/regexp-macro-assembler-ia32.cc

Issue 42441: Made regexp robust against changes to a string's implementation. (Closed)
Patch Set: Removed unused addition to memory.h Created 11 years, 9 months ago
OLD | NEW
1 // Copyright 2008 the V8 project authors. All rights reserved. 1 // Copyright 2008 the V8 project authors. All rights reserved.
2 // Redistribution and use in source and binary forms, with or without 2 // Redistribution and use in source and binary forms, with or without
3 // modification, are permitted provided that the following conditions are 3 // modification, are permitted provided that the following conditions are
4 // met: 4 // met:
5 // 5 //
6 // * Redistributions of source code must retain the above copyright 6 // * Redistributions of source code must retain the above copyright
7 // notice, this list of conditions and the following disclaimer. 7 // notice, this list of conditions and the following disclaimer.
8 // * Redistributions in binary form must reproduce the above 8 // * Redistributions in binary form must reproduce the above
9 // copyright notice, this list of conditions and the following 9 // copyright notice, this list of conditions and the following
10 // disclaimer in the documentation and/or other materials provided 10 // disclaimer in the documentation and/or other materials provided
11 // with the distribution. 11 // with the distribution.
12 // * Neither the name of Google Inc. nor the names of its 12 // * Neither the name of Google Inc. nor the names of its
13 // contributors may be used to endorse or promote products derived 13 // contributors may be used to endorse or promote products derived
14 // from this software without specific prior written permission. 14 // from this software without specific prior written permission.
15 // 15 //
16 // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 16 // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
17 // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 17 // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
18 // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 18 // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
19 // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 19 // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
20 // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 20 // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
21 // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 21 // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
22 // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 22 // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
23 // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 23 // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
24 // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 24 // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
25 // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 25 // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
26 // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 26 // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
27 27
28 #include <string.h>
29 #include "v8.h" 28 #include "v8.h"
30 #include "unicode.h" 29 #include "unicode.h"
31 #include "log.h" 30 #include "log.h"
32 #include "ast.h" 31 #include "ast.h"
33 #include "regexp-stack.h" 32 #include "regexp-stack.h"
34 #include "macro-assembler.h" 33 #include "macro-assembler.h"
35 #include "regexp-macro-assembler.h" 34 #include "regexp-macro-assembler.h"
36 #include "macro-assembler-ia32.h" 35 #include "macro-assembler-ia32.h"
37 #include "regexp-macro-assembler-ia32.h" 36 #include "regexp-macro-assembler-ia32.h"
38 37
39 namespace v8 { namespace internal { 38 namespace v8 { namespace internal {
40 39
41 /* 40 /*
42 * This assembler uses the following register assignment convention 41 * This assembler uses the following register assignment convention
43 * - edx : current character. Must be loaded using LoadCurrentCharacter 42 * - edx : current character. Must be loaded using LoadCurrentCharacter
44 * before using any of the dispatch methods. 43 * before using any of the dispatch methods.
45 * - edi : current position in input, as negative offset from end of string. 44 * - edi : current position in input, as negative offset from end of string.
46 * Please notice that this is the byte offset, not the character offset! 45 * Please notice that this is the byte offset, not the character offset!
47 * - esi : end of input (points to byte after last character in input). 46 * - esi : end of input (points to byte after last character in input).
48 * - ebp : points to the location above the registers on the stack, 47 * - ebp : frame pointer. Used to access arguments, local variables and
49 * as if by the "enter <register_count>" opcode. 48 * RegExp registers.
50 * - esp : points to tip of C stack. 49 * - esp : points to tip of C stack.
51 * - ecx : points to tip of backtrack stack 50 * - ecx : points to tip of backtrack stack
52 * 51 *
53 * The registers eax, ebx and ecx are free to use for computations. 52 * The registers eax, ebx and ecx are free to use for computations.
54 * 53 *
55 * Each call to a public method should retain this convention. 54 * Each call to a public method should retain this convention.
56 * The stack will have the following structure: 55 * The stack will have the following structure:
57 * - stack_area_top (High end of the memory area to use as 56 * - stack_area_top (High end of the memory area to use as
58 * backtracking stack) 57 * backtracking stack)
59 * - at_start (if 1, start at start of string, if 0, don't) 58 * - at_start (if 1, start at start of string, if 0, don't)
60 * - int* capture_array (int[num_saved_registers_], for output). 59 * - int* capture_array (int[num_saved_registers_], for output).
61 * - end of input (index of end of string, relative to *string_base) 60 * - end of input (Address of end of string)
62 * - start of input (index of first character in string, relative 61 * - start of input (Address of first character in string)
63 * to *string_base) 62 * - void* input_string (location of a handle containing the string)
64 * - void** string_base (location of a handle containing the string) 63 * --- frame alignment (if applicable) ---
65 * - return address 64 * - return address
66 * ebp-> - old ebp 65 * ebp-> - old ebp
67 * - backup of caller esi 66 * - backup of caller esi
68 * - backup of caller edi 67 * - backup of caller edi
69 * - backup of caller ebx 68 * - backup of caller ebx
69 * - Offset of location before start of input (effectively character
70 * position -1). Used to initialize capture registers to a non-position.
70 * - register 0 ebp[-4] (Only positions must be stored in the first 71 * - register 0 ebp[-4] (Only positions must be stored in the first
71 * - register 1 ebp[-8] num_saved_registers_ registers) 72 * - register 1 ebp[-8] num_saved_registers_ registers)
72 * - ... 73 * - ...
73 * 74 *
74 * The first num_saved_registers_ registers are initialized to point to 75 * The first num_saved_registers_ registers are initialized to point to
75 * "character -1" in the string (i.e., char_size() bytes before the first 76 * "character -1" in the string (i.e., char_size() bytes before the first
76 * character of the string). The remaining registers starts out as garbage. 77 * character of the string). The remaining registers starts out as garbage.
77 * 78 *
78 * The data up to the return address must be placed there by the calling 79 * The data up to the return address must be placed there by the calling
79 * code, e.g., by calling the code as cast to: 80 * code, e.g., by calling the code entry as cast to:
80 * bool (*match)(String** string_base, 81 * int (*match)(String* input_string,
81 * int start_offset, 82 * Address start,
82 * int end_offset, 83 * Address end,
83 * int* capture_output_array, 84 * int* capture_output_array,
84 * bool at_start, 85 * bool at_start,
85 * byte* stack_area_top) 86 * byte* stack_area_top)
86 */ 87 */
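
Note on the register convention above: the current position is kept as a negative byte offset from the end of the input, so the address of the current character is always end + offset, and the position reaches zero exactly when the input is exhausted. A minimal sketch of that arithmetic in plain C++, assuming nothing beyond the comment itself (InputCursor, MakeCursor and the member names are hypothetical and do not appear in V8):

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char byte;

// Hypothetical model of the convention: `end` plays the role of esi,
// `offset` the role of edi.
struct InputCursor {
  const byte* end;        // One past the last byte of the input (esi).
  std::ptrdiff_t offset;  // Negative byte offset from `end` (edi).

  // Address of the current byte: end + offset.
  const byte* current() const { return end + offset; }

  // The input is exhausted once the offset is no longer negative.
  bool at_end() const { return offset >= 0; }

  // Advance by `chars` characters of `char_size` bytes each
  // (char_size is 1 for ASCII input, 2 for UC16 input).
  void advance(int chars, int char_size) { offset += chars * char_size; }
};

// Builds a cursor positioned at `start` for the input [start, end).
inline InputCursor MakeCursor(const byte* start, const byte* end) {
  assert(start <= end);
  return InputCursor{end, start - end};
}
```

Because the offset is measured in bytes, a UC16 matcher moves it by 2 per character, which is the point of the "byte offset, not the character offset" warning in the comment.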
87 88
88 #define __ masm_-> 89 #define __ masm_->
89 90
90 RegExpMacroAssemblerIA32::RegExpMacroAssemblerIA32( 91 RegExpMacroAssemblerIA32::RegExpMacroAssemblerIA32(
91 Mode mode, 92 Mode mode,
92 int registers_to_save) 93 int registers_to_save)
93 : masm_(new MacroAssembler(NULL, kRegExpCodeSize)), 94 : masm_(new MacroAssembler(NULL, kRegExpCodeSize)),
94 constants_(kRegExpConstantsSize), 95 constants_(kRegExpConstantsSize),
95 mode_(mode), 96 mode_(mode),
(...skipping 71 matching lines...)
167 } 168 }
168 169
169 170
170 void RegExpMacroAssemblerIA32::CheckCharacterGT(uc16 limit, Label* on_greater) { 171 void RegExpMacroAssemblerIA32::CheckCharacterGT(uc16 limit, Label* on_greater) {
171 __ cmp(current_character(), limit); 172 __ cmp(current_character(), limit);
172 BranchOrBacktrack(greater, on_greater); 173 BranchOrBacktrack(greater, on_greater);
173 } 174 }
174 175
175 176
176 void RegExpMacroAssemblerIA32::CheckAtStart(Label* on_at_start) { 177 void RegExpMacroAssemblerIA32::CheckAtStart(Label* on_at_start) {
177 Label ok; 178 Label not_at_start;
178 // Did we start the match at the start of the string at all? 179 // Did we start the match at the start of the string at all?
179 __ cmp(Operand(ebp, kAtStart), Immediate(0)); 180 __ cmp(Operand(ebp, kAtStart), Immediate(0));
180 BranchOrBacktrack(equal, &ok); 181 BranchOrBacktrack(equal, &not_at_start);
181 // If we did, are we still at the start of the input? 182 // If we did, are we still at the start of the input?
182 __ mov(eax, Operand(ebp, kInputEndOffset)); 183 __ lea(eax, Operand(esi, edi, times_1, 0));
183 __ add(eax, Operand(edi)); 184 __ cmp(eax, Operand(ebp, kInputStart));
184 __ cmp(eax, Operand(ebp, kInputStartOffset));
185 BranchOrBacktrack(equal, on_at_start); 185 BranchOrBacktrack(equal, on_at_start);
186 __ bind(&ok); 186 __ bind(&not_at_start);
187 } 187 }
188 188
189 189
190 void RegExpMacroAssemblerIA32::CheckNotAtStart(Label* on_not_at_start) { 190 void RegExpMacroAssemblerIA32::CheckNotAtStart(Label* on_not_at_start) {
191 // Did we start the match at the start of the string at all? 191 // Did we start the match at the start of the string at all?
192 __ cmp(Operand(ebp, kAtStart), Immediate(0)); 192 __ cmp(Operand(ebp, kAtStart), Immediate(0));
193 BranchOrBacktrack(equal, on_not_at_start); 193 BranchOrBacktrack(equal, on_not_at_start);
194 // If we did, are we still at the start of the input? 194 // If we did, are we still at the start of the input?
195 __ mov(eax, Operand(ebp, kInputEndOffset)); 195 __ lea(eax, Operand(esi, edi, times_1, 0));
196 __ add(eax, Operand(edi)); 196 __ cmp(eax, Operand(ebp, kInputStart));
197 __ cmp(eax, Operand(ebp, kInputStartOffset));
198 BranchOrBacktrack(not_equal, on_not_at_start); 197 BranchOrBacktrack(not_equal, on_not_at_start);
199 } 198 }
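
Note on CheckAtStart/CheckNotAtStart: in the NEW column the frame holds real addresses, so the current address can be formed with a single lea from esi and edi and compared directly against the stored input start. The OLD column instead added edi to a stored end offset and compared offsets. A rough C++ equivalent of the new test, with illustrative parameter names that are not V8 identifiers:

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char byte;

// Sketch of the condition the generated code evaluates. The comments name
// the frame slot or register each parameter stands in for.
inline bool IsAtStart(bool match_began_at_string_start,  // frame slot kAtStart
                      const byte* input_start,           // frame slot kInputStart
                      const byte* input_end,             // esi
                      std::ptrdiff_t offset_from_end) {  // edi, never positive
  // If the match did not begin at the start of the string, the answer is
  // "no" regardless of the current position.
  if (!match_began_at_string_start) return false;
  // lea eax, [esi + edi*1 + 0] computes this address in one instruction.
  const byte* current = input_end + offset_from_end;
  return current == input_start;
}
```

CheckAtStart branches to on_at_start when this condition holds, and CheckNotAtStart branches to on_not_at_start when it does not.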
200 199
201 200
202 void RegExpMacroAssemblerIA32::CheckCharacterLT(uc16 limit, Label* on_less) { 201 void RegExpMacroAssemblerIA32::CheckCharacterLT(uc16 limit, Label* on_less) {
203 __ cmp(current_character(), limit); 202 __ cmp(current_character(), limit);
204 BranchOrBacktrack(less, on_less); 203 BranchOrBacktrack(less, on_less);
205 } 204 }
206 205
207 206
(...skipping 114 matching lines...)
322 __ add(Operand(esp), Immediate(kPointerSize)); 321 __ add(Operand(esp), Immediate(kPointerSize));
323 // Compute new value of character position after the matched part. 322 // Compute new value of character position after the matched part.
324 __ sub(edi, Operand(esi)); 323 __ sub(edi, Operand(esi));
325 } else { 324 } else {
326 ASSERT(mode_ == UC16); 325 ASSERT(mode_ == UC16);
327 // Save registers before calling C function. 326 // Save registers before calling C function.
328 __ push(esi); 327 __ push(esi);
329 __ push(edi); 328 __ push(edi);
330 __ push(backtrack_stackpointer()); 329 __ push(backtrack_stackpointer());
331 __ push(ebx); 330 __ push(ebx);
332 const int four_arguments = 4; 331
333 FrameAlign(four_arguments, ecx); 332 const int argument_count = 3;
333 FrameAlign(argument_count, ecx);
334 // Put arguments into allocated stack area, last argument highest on stack. 334 // Put arguments into allocated stack area, last argument highest on stack.
335 // Parameters are 335 // Parameters are
336 // UC16** buffer - really the String** of the input string 336 // Address byte_offset1 - Address captured substring's start.
337 // int byte_offset1 - byte offset from *buffer of start of capture 337 // Address byte_offset2 - Address of current character position.
338 // int byte_offset2 - byte offset from *buffer of current position
339 // size_t byte_length - length of capture in bytes(!) 338 // size_t byte_length - length of capture in bytes(!)
340 339
341 // Set byte_length. 340 // Set byte_length.
342 __ mov(Operand(esp, 3 * kPointerSize), ebx); 341 __ mov(Operand(esp, 2 * kPointerSize), ebx);
343 // Set byte_offset2. 342 // Set byte_offset2.
344 // Found by adding negative string-end offset of current position (edi) 343 // Found by adding negative string-end offset of current position (edi)
345 // to String** offset of end of string. 344 // to end of string.
346 __ mov(ecx, Operand(ebp, kInputEndOffset)); 345 __ add(edi, Operand(esi));
347 __ add(edi, Operand(ecx)); 346 __ mov(Operand(esp, 1 * kPointerSize), edi);
348 __ mov(Operand(esp, 2 * kPointerSize), edi);
349 // Set byte_offset1. 347 // Set byte_offset1.
350 // Start of capture, where edx already holds string-end negative offset. 348 // Start of capture, where edx already holds string-end negative offset.
351 __ add(edx, Operand(ecx)); 349 __ add(edx, Operand(esi));
352 __ mov(Operand(esp, 1 * kPointerSize), edx); 350 __ mov(Operand(esp, 0 * kPointerSize), edx);
353 // Set buffer. Original String** parameter to regexp code.
354 __ mov(eax, Operand(ebp, kInputBuffer));
355 __ mov(Operand(esp, 0 * kPointerSize), eax);
356 351
357 Address function_address = FUNCTION_ADDR(&CaseInsensitiveCompareUC16); 352 Address function_address = FUNCTION_ADDR(&CaseInsensitiveCompareUC16);
358 CallCFunction(function_address, four_arguments); 353 CallCFunction(function_address, argument_count);
359 // Pop original values before reacting on result value. 354 // Pop original values before reacting on result value.
360 __ pop(ebx); 355 __ pop(ebx);
361 __ pop(backtrack_stackpointer()); 356 __ pop(backtrack_stackpointer());
362 __ pop(edi); 357 __ pop(edi);
363 __ pop(esi); 358 __ pop(esi);
364 359
365 // Check if function returned non-zero for success or zero for failure. 360 // Check if function returned non-zero for success or zero for failure.
366 __ or_(eax, Operand(eax)); 361 __ or_(eax, Operand(eax));
367 BranchOrBacktrack(zero, on_no_match); 362 BranchOrBacktrack(zero, on_no_match);
368 // On success, increment position by length of capture. 363 // On success, increment position by length of capture.
(...skipping 254 matching lines...)
623 618
624 Handle<Object> RegExpMacroAssemblerIA32::GetCode(Handle<String> source) { 619 Handle<Object> RegExpMacroAssemblerIA32::GetCode(Handle<String> source) {
625 // Finalize code - write the entry point code now we know how many 620 // Finalize code - write the entry point code now we know how many
626 // registers we need. 621 // registers we need.
627 622
628 // Entry code: 623 // Entry code:
629 __ bind(&entry_label_); 624 __ bind(&entry_label_);
630 // Start new stack frame. 625 // Start new stack frame.
631 __ push(ebp); 626 __ push(ebp);
632 __ mov(ebp, esp); 627 __ mov(ebp, esp);
633 // Save callee-save registers. Order here should correspond to order of 628 // Save callee-save registers. Order here should correspond to order of
634 // kBackup_ebx etc. 629 // kBackup_ebx etc.
635 __ push(esi); 630 __ push(esi);
636 __ push(edi); 631 __ push(edi);
637 __ push(ebx); // Callee-save on MacOS. 632 __ push(ebx); // Callee-save on MacOS.
638 __ push(Immediate(0)); // Make room for "input start - 1" constant. 633 __ push(Immediate(0)); // Make room for "input start - 1" constant.
639 634
640 // Check if we have space on the stack for registers. 635 // Check if we have space on the stack for registers.
641 Label retry_stack_check; 636 Label retry_stack_check;
642 Label stack_limit_hit; 637 Label stack_limit_hit;
643 Label stack_ok; 638 Label stack_ok;
644 639
645 __ bind(&retry_stack_check); 640 __ bind(&retry_stack_check);
646 ExternalReference stack_guard_limit = 641 ExternalReference stack_guard_limit =
647 ExternalReference::address_of_stack_guard_limit(); 642 ExternalReference::address_of_stack_guard_limit();
648 __ mov(ecx, esp); 643 __ mov(ecx, esp);
649 __ sub(ecx, Operand::StaticVariable(stack_guard_limit)); 644 __ sub(ecx, Operand::StaticVariable(stack_guard_limit));
650 // Handle it if the stack pointer is already below the stack limit. 645 // Handle it if the stack pointer is already below the stack limit.
651 __ j(below_equal, &stack_limit_hit, not_taken); 646 __ j(below_equal, &stack_limit_hit, not_taken);
652 // Check if there is room for the variable number of registers above 647 // Check if there is room for the variable number of registers above
653 // the stack limit. 648 // the stack limit.
654 __ cmp(ecx, num_registers_ * kPointerSize); 649 __ cmp(ecx, num_registers_ * kPointerSize);
655 __ j(above_equal, &stack_ok, taken); 650 __ j(above_equal, &stack_ok, taken);
656 // Exit with exception. 651 // Exit with OutOfMemory exception. There is not enough space on the stack
652 // for our working registers.
657 __ mov(eax, EXCEPTION); 653 __ mov(eax, EXCEPTION);
658 __ jmp(&exit_label_); 654 __ jmp(&exit_label_);
659 655
660 __ bind(&stack_limit_hit); 656 __ bind(&stack_limit_hit);
661 int num_arguments = 2; 657 CallCheckStackGuardState(ebx);
662 FrameAlign(num_arguments, ebx);
663 __ mov(Operand(esp, 1 * kPointerSize), Immediate(masm_->CodeObject()));
664 __ lea(eax, Operand(esp, -kPointerSize));
665 __ mov(Operand(esp, 0 * kPointerSize), eax);
666 CallCFunction(FUNCTION_ADDR(&CheckStackGuardState), num_arguments);
667 __ or_(eax, Operand(eax)); 658 __ or_(eax, Operand(eax));
668 // If returned value is non-zero, the stack guard reports the actual 659 // If returned value is non-zero, we exit with the returned value as result.
669 // stack limit being hit and an exception has already been raised.
670 // Otherwise it was a preemption and we just check the limit again. 660 // Otherwise it was a preemption and we just check the limit again.
671 __ j(equal, &retry_stack_check); 661 __ j(equal, &retry_stack_check);
672 // Return value was non-zero. Exit with exception. 662 // Return value was non-zero. Exit with exception or retry.
673 __ mov(eax, EXCEPTION);
674 __ jmp(&exit_label_); 663 __ jmp(&exit_label_);
675 664
676 __ bind(&stack_ok); 665 __ bind(&stack_ok);
677 666
678 // Allocate space on stack for registers. 667 // Allocate space on stack for registers.
679 __ sub(Operand(esp), Immediate(num_registers_ * kPointerSize)); 668 __ sub(Operand(esp), Immediate(num_registers_ * kPointerSize));
680 // Load string length. 669 // Load string length.
681 __ mov(esi, Operand(ebp, kInputEndOffset)); 670 __ mov(esi, Operand(ebp, kInputEnd));
682 // Load input position. 671 // Load input position.
683 __ mov(edi, Operand(ebp, kInputStartOffset)); 672 __ mov(edi, Operand(ebp, kInputStart));
684 // Set up edi to be negative offset from string end. 673 // Set up edi to be negative offset from string end.
685 __ sub(edi, Operand(esi)); 674 __ sub(edi, Operand(esi));
686 // Set up esi to be end of string. First get location.
687 __ mov(edx, Operand(ebp, kInputBuffer));
688 // Dereference location to get string start.
689 __ mov(edx, Operand(edx, 0));
690 // Add start to length to complete esi setup.
691 __ add(esi, Operand(edx));
692 if (num_saved_registers_ > 0) { 675 if (num_saved_registers_ > 0) {
693 // Fill saved registers with initial value = start offset - 1 676 // Fill saved registers with initial value = start offset - 1
694 // Fill in stack push order, to avoid accessing across an unwritten 677 // Fill in stack push order, to avoid accessing across an unwritten
695 // page (a problem on Windows). 678 // page (a problem on Windows).
696 __ mov(ecx, kRegisterZero); 679 __ mov(ecx, kRegisterZero);
697 // Set eax to address of char before start of input 680 // Set eax to address of char before start of input
698 // (effectively string position -1). 681 // (effectively string position -1).
699 __ lea(eax, Operand(edi, -char_size())); 682 __ lea(eax, Operand(edi, -char_size()));
700 // Store this value in a local variable, for use when clearing 683 // Store this value in a local variable, for use when clearing
701 // position registers. 684 // position registers.
(...skipping 29 matching lines...)
731 __ jmp(&start_label_); 714 __ jmp(&start_label_);
732 715
733 716
734 // Exit code: 717 // Exit code:
735 if (success_label_.is_linked()) { 718 if (success_label_.is_linked()) {
736 // Save captures when successful. 719 // Save captures when successful.
737 __ bind(&success_label_); 720 __ bind(&success_label_);
738 if (num_saved_registers_ > 0) { 721 if (num_saved_registers_ > 0) {
739 // copy captures to output 722 // copy captures to output
740 __ mov(ebx, Operand(ebp, kRegisterOutput)); 723 __ mov(ebx, Operand(ebp, kRegisterOutput));
741 __ mov(ecx, Operand(ebp, kInputEndOffset)); 724 __ mov(ecx, Operand(ebp, kInputEnd));
742 __ sub(ecx, Operand(ebp, kInputStartOffset)); 725 __ sub(ecx, Operand(ebp, kInputStart));
743 for (int i = 0; i < num_saved_registers_; i++) { 726 for (int i = 0; i < num_saved_registers_; i++) {
744 __ mov(eax, register_location(i)); 727 __ mov(eax, register_location(i));
745 __ add(eax, Operand(ecx)); // Convert to index from start, not end. 728 __ add(eax, Operand(ecx)); // Convert to index from start, not end.
746 if (mode_ == UC16) { 729 if (mode_ == UC16) {
747 __ sar(eax, 1); // Convert byte index to character index. 730 __ sar(eax, 1); // Convert byte index to character index.
748 } 731 }
749 __ mov(Operand(ebx, i * kPointerSize), eax); 732 __ mov(Operand(ebx, i * kPointerSize), eax);
750 } 733 }
751 } 734 }
752 __ mov(eax, Immediate(SUCCESS)); 735 __ mov(eax, Immediate(SUCCESS));
(...skipping 21 matching lines...)
774 // Preempt-code 757 // Preempt-code
775 if (check_preempt_label_.is_linked()) { 758 if (check_preempt_label_.is_linked()) {
776 __ bind(&check_preempt_label_); 759 __ bind(&check_preempt_label_);
777 760
778 __ push(backtrack_stackpointer()); 761 __ push(backtrack_stackpointer());
779 __ push(edi); 762 __ push(edi);
780 763
781 Label retry; 764 Label retry;
782 765
783 __ bind(&retry); 766 __ bind(&retry);
784 int num_arguments = 2; 767 CallCheckStackGuardState(ebx);
785 FrameAlign(num_arguments, ebx); 768 __ or_(eax, Operand(eax));
786 __ mov(Operand(esp, 1 * kPointerSize), Immediate(masm_->CodeObject())); 769 // If returning non-zero, we should end execution with the given
787 __ lea(eax, Operand(esp, -kPointerSize)); 770 // result as return value.
788 __ mov(Operand(esp, 0 * kPointerSize), eax); 771 __ j(not_zero, &exit_label_);
789 CallCFunction(FUNCTION_ADDR(&CheckStackGuardState), num_arguments); 772 // Check if we are still preempted.
790 // Return value must be zero. We cannot have a stack overflow at
791 // this point, since we checked the stack on entry and haven't
792 // pushed anything since, that we haven't also popped again.
793
794 ExternalReference stack_guard_limit = 773 ExternalReference stack_guard_limit =
795 ExternalReference::address_of_stack_guard_limit(); 774 ExternalReference::address_of_stack_guard_limit();
796 // Check if we are still preempted.
797 __ cmp(esp, Operand::StaticVariable(stack_guard_limit)); 775 __ cmp(esp, Operand::StaticVariable(stack_guard_limit));
798 __ j(below_equal, &retry); 776 __ j(below_equal, &retry);
799 777
800 __ pop(edi); 778 __ pop(edi);
801 __ pop(backtrack_stackpointer()); 779 __ pop(backtrack_stackpointer());
802 // String might have moved: Recompute esi from scratch. 780 // String might have moved: Reload esi from frame.
803 __ mov(esi, Operand(ebp, kInputBuffer)); 781 __ mov(esi, Operand(ebp, kInputEnd));
804 __ mov(esi, Operand(esi, 0));
805 __ add(esi, Operand(ebp, kInputEndOffset));
806 SafeReturn(); 782 SafeReturn();
807 } 783 }
808 784
809 // Backtrack stack overflow code. 785 // Backtrack stack overflow code.
810 if (stack_overflow_label_.is_linked()) { 786 if (stack_overflow_label_.is_linked()) {
811 __ bind(&stack_overflow_label_); 787 __ bind(&stack_overflow_label_);
812 // Reached if the backtrack-stack limit has been hit. 788 // Reached if the backtrack-stack limit has been hit.
813 789
814 Label grow_failed; 790 Label grow_failed;
815 // Save registers before calling C function 791 // Save registers before calling C function
(...skipping 155 matching lines...)
971 } 947 }
972 948
973 949
974 void RegExpMacroAssemblerIA32::WriteStackPointerToRegister(int reg) { 950 void RegExpMacroAssemblerIA32::WriteStackPointerToRegister(int reg) {
975 __ mov(eax, backtrack_stackpointer()); 951 __ mov(eax, backtrack_stackpointer());
976 __ sub(eax, Operand(ebp, kStackHighEnd)); 952 __ sub(eax, Operand(ebp, kStackHighEnd));
977 __ mov(register_location(reg), eax); 953 __ mov(register_location(reg), eax);
978 } 954 }
979 955
980 956
981
982 RegExpMacroAssemblerIA32::Result RegExpMacroAssemblerIA32::Match( 957 RegExpMacroAssemblerIA32::Result RegExpMacroAssemblerIA32::Match(
983 Handle<Code> regexp_code, 958 Handle<Code> regexp_code,
984 Handle<String> subject, 959 Handle<String> subject,
985 int* offsets_vector, 960 int* offsets_vector,
986 int offsets_vector_length, 961 int offsets_vector_length,
987 int previous_index) { 962 int previous_index) {
963
964 ASSERT(subject->IsFlat());
965
966 // No allocations before calling the regexp, but we can't use
967 // AssertNoAllocation, since regexps might be preempted, and preemption code
Erik Corry 2009/03/20 12:45:42 s/preemption code/another thread/
968 // might do allocation.
969
970 String* subject_ptr = *subject;
988 // Character offsets into string. 971 // Character offsets into string.
989 int start_offset = previous_index; 972 int start_offset = previous_index;
990 int end_offset = subject->length(); 973 int end_offset = subject_ptr->length();
991 974
992 if (StringShape(*subject).IsCons()) { 975 if (StringShape(subject_ptr).IsCons()) {
993 subject = 976 subject_ptr = String::cast(ConsString::cast(subject_ptr)->first());
Erik Corry 2009/03/20 12:45:42 I don't think you need to cast this to a string.
994 Handle<String>(String::cast(ConsString::cast(*subject)->first())); 977 } else if (StringShape(subject_ptr).IsSliced()) {
995 } else if (StringShape(*subject).IsSliced()) { 978 SlicedString* slice = SlicedString::cast(subject_ptr);
996 SlicedString* slice = SlicedString::cast(*subject);
997 start_offset += slice->start(); 979 start_offset += slice->start();
998 end_offset += slice->start(); 980 end_offset += slice->start();
999 subject = Handle<String>(String::cast(slice->buffer())); 981 subject_ptr = String::cast(slice->buffer());
Erik Corry 2009/03/20 12:45:42 Or this
1000 } 982 }
1001 983
1002 // String is now either Sequential or External 984 // String is now either Sequential or External
1003 bool is_ascii = StringShape(*subject).IsAsciiRepresentation(); 985 bool is_ascii = StringShape(*subject).IsAsciiRepresentation();
1004 int char_size_shift = is_ascii ? 0 : 1; 986 int char_size_shift = is_ascii ? 0 : 1;
987 int char_length = end_offset - start_offset;
1005 988
1006 RegExpMacroAssemblerIA32::Result res; 989 const byte* input_start =
990 StringCharacterPosition(subject_ptr, start_offset);
991 int byte_length = char_length << char_size_shift;
992 const byte* input_end = input_start + byte_length;
993 RegExpMacroAssemblerIA32::Result res = Execute(*regexp_code,
994 subject_ptr,
995 start_offset,
996 input_start,
997 input_end,
998 offsets_vector,
999 previous_index == 0);
1007 1000
1008 if (StringShape(*subject).IsExternal()) { 1001 if (res == SUCCESS) {
1009 const byte* address;
1010 if (is_ascii) {
1011 ExternalAsciiString* ext = ExternalAsciiString::cast(*subject);
1012 address = reinterpret_cast<const byte*>(ext->resource()->data());
1013 } else {
1014 ExternalTwoByteString* ext = ExternalTwoByteString::cast(*subject);
1015 address = reinterpret_cast<const byte*>(ext->resource()->data());
1016 }
1017
1018 res = Execute(*regexp_code,
1019 const_cast<Address*>(&address),
1020 start_offset << char_size_shift,
1021 end_offset << char_size_shift,
1022 offsets_vector,
1023 previous_index == 0);
1024 } else { // Sequential string
1025 ASSERT(StringShape(*subject).IsSequential());
1026 Address char_address =
1027 is_ascii ? SeqAsciiString::cast(*subject)->GetCharsAddress()
1028 : SeqTwoByteString::cast(*subject)->GetCharsAddress();
1029 int byte_offset = char_address - reinterpret_cast<Address>(*subject);
1030 res = Execute(*regexp_code,
1031 reinterpret_cast<Address*>(subject.location()),
1032 byte_offset + (start_offset << char_size_shift),
1033 byte_offset + (end_offset << char_size_shift),
1034 offsets_vector,
1035 previous_index == 0);
1036 }
1037
1038 if (res == RegExpMacroAssemblerIA32::SUCCESS) {
1039 // Capture values are relative to start_offset only. 1002 // Capture values are relative to start_offset only.
1003 // Convert them to be relative to start of string.
1040 for (int i = 0; i < offsets_vector_length; i++) { 1004 for (int i = 0; i < offsets_vector_length; i++) {
1041 if (offsets_vector[i] >= 0) { 1005 if (offsets_vector[i] >= 0) {
1042 offsets_vector[i] += previous_index; 1006 offsets_vector[i] += previous_index;
1043 } 1007 }
1044 } 1008 }
1045 } 1009 }
1046 1010
1047 return res; 1011 return res;
1048 } 1012 }
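
Note on Match() in the NEW column: the character offsets are turned into a byte range before the call, and the capture values coming back are turned into character indices relative to the whole string. The sketch below restates both steps in standalone C++. It assumes that the address of character n is first_char + (n << char_size_shift), which is what StringCharacterPosition appears to provide but is not shown on this page. ByteRange, InputRange and CaptureToIndex are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char byte;

// Hypothetical stand-in for the input_start/input_end pair passed to Execute().
struct ByteRange {
  const byte* start;
  const byte* end;
};

// char_size_shift is 0 for one-byte strings and 1 for two-byte strings.
inline ByteRange InputRange(const byte* first_char, int start_offset,
                            int end_offset, int char_size_shift) {
  assert(start_offset <= end_offset);
  int char_length = end_offset - start_offset;
  const byte* start =
      first_char + (static_cast<std::ptrdiff_t>(start_offset) << char_size_shift);
  int byte_length = char_length << char_size_shift;
  return ByteRange{start, start + byte_length};
}

// Converts one capture register (a byte offset from the end of the input)
// into a character index relative to the start of the whole string.
// Division is used instead of the generated code's arithmetic shift; the two
// agree here because offsets are always multiples of the character size.
inline int CaptureToIndex(int offset_from_end, int input_byte_length,
                          int char_size_shift, int previous_index) {
  int char_size = 1 << char_size_shift;
  int from_start = (offset_from_end + input_byte_length) / char_size;
  // Negative values mark captures that did not participate; leave them alone.
  return from_start >= 0 ? from_start + previous_index : from_start;
}
```

For a two-byte string matched from previous_index 4 up to length 10, the input range is 12 bytes long, a capture at byte offset -8 from the end maps to character index 6, and the initial "character -1" value of -14 stays at -1.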
1049 1013
1050
1051 // Private methods: 1014 // Private methods:
1052 1015
1053
1054 static unibrow::Mapping<unibrow::Ecma262Canonicalize> canonicalize; 1016 static unibrow::Mapping<unibrow::Ecma262Canonicalize> canonicalize;
1055 1017
1056 RegExpMacroAssemblerIA32::Result RegExpMacroAssemblerIA32::Execute( 1018 RegExpMacroAssemblerIA32::Result RegExpMacroAssemblerIA32::Execute(
1057 Code* code, 1019 Code* code,
1058 Address* input, 1020 String* input,
1059 int start_offset, 1021 int start_offset,
1060 int end_offset, 1022 const byte* input_start,
1023 const byte* input_end,
1061 int* output, 1024 int* output,
1062 bool at_start) { 1025 bool at_start) {
1063 typedef int (*matcher)(Address*, int, int, int*, int, Address); 1026 typedef int (*matcher)(String*, int, const byte*,
1027 const byte*, int*, int, Address);
1064 matcher matcher_func = FUNCTION_CAST<matcher>(code->entry()); 1028 matcher matcher_func = FUNCTION_CAST<matcher>(code->entry());
1065 1029
1066 int at_start_val = at_start ? 1 : 0; 1030 int at_start_val = at_start ? 1 : 0;
1067 1031
1068 // Ensure that the minimum stack has been allocated. 1032 // Ensure that the minimum stack has been allocated.
1069 RegExpStack stack; 1033 RegExpStack stack;
1070 Address stack_top = RegExpStack::stack_top(); 1034 Address stack_top = RegExpStack::stack_top();
1071 1035
1072 int result = matcher_func(input, 1036 int result = matcher_func(input,
1073 start_offset, 1037 start_offset,
1074 end_offset, 1038 input_start,
1039 input_end,
1075 output, 1040 output,
1076 at_start_val, 1041 at_start_val,
1077 stack_top); 1042 stack_top);
1043 ASSERT(result <= SUCCESS);
1044 ASSERT(result >= RETRY);
1078 1045
1079 if (result < 0 && !Top::has_pending_exception()) { 1046 if (result == EXCEPTION && !Top::has_pending_exception()) {
1080 // We detected a stack overflow (on the backtrack stack) in RegExp code, 1047 // We detected a stack overflow (on the backtrack stack) in RegExp code,
1081 // but haven't created the exception yet. 1048 // but haven't created the exception yet.
1082 Top::StackOverflow(); 1049 Top::StackOverflow();
1083 } 1050 }
1084 return (result < 0) ? EXCEPTION : (result ? SUCCESS : FAILURE); 1051 return static_cast<Result>(result);
1085 } 1052 }
1086 1053
1087 1054
1088 int RegExpMacroAssemblerIA32::CaseInsensitiveCompareUC16(uc16** buffer, 1055 int RegExpMacroAssemblerIA32::CaseInsensitiveCompareUC16(Address byte_offset1,
1089 int byte_offset1, 1056 Address byte_offset2,
1090 int byte_offset2,
1091 size_t byte_length) { 1057 size_t byte_length) {
1092 // This function is not allowed to cause a garbage collection. 1058 // This function is not allowed to cause a garbage collection.
1093 // A GC might move the calling generated code and invalidate the 1059 // A GC might move the calling generated code and invalidate the
1094 // return address on the stack. 1060 // return address on the stack.
1095 ASSERT(byte_length % 2 == 0); 1061 ASSERT(byte_length % 2 == 0);
1096 Address buffer_address = reinterpret_cast<Address>(*buffer); 1062 uc16* substring1 = reinterpret_cast<uc16*>(byte_offset1);
1097 uc16* substring1 = reinterpret_cast<uc16*>(buffer_address + byte_offset1); 1063 uc16* substring2 = reinterpret_cast<uc16*>(byte_offset2);
1098 uc16* substring2 = reinterpret_cast<uc16*>(buffer_address + byte_offset2);
1099 size_t length = byte_length >> 1; 1064 size_t length = byte_length >> 1;
1100 1065
1101 for (size_t i = 0; i < length; i++) { 1066 for (size_t i = 0; i < length; i++) {
1102 unibrow::uchar c1 = substring1[i]; 1067 unibrow::uchar c1 = substring1[i];
1103 unibrow::uchar c2 = substring2[i]; 1068 unibrow::uchar c2 = substring2[i];
1104 if (c1 != c2) { 1069 if (c1 != c2) {
1105 canonicalize.get(c1, '\0', &c1); 1070 canonicalize.get(c1, '\0', &c1);
1106 if (c1 != c2) { 1071 if (c1 != c2) {
1107 canonicalize.get(c2, '\0', &c2); 1072 canonicalize.get(c2, '\0', &c2);
1108 if (c1 != c2) { 1073 if (c1 != c2) {
1109 return 0; 1074 return 0;
1110 } 1075 }
1111 } 1076 }
1112 } 1077 }
1113 } 1078 }
1114 return 1; 1079 return 1;
1115 } 1080 }
1116 1081
1117 1082
1083 void RegExpMacroAssemblerIA32::CallCheckStackGuardState(Register scratch) {
1084 int num_arguments = 3;
1085 FrameAlign(num_arguments, scratch);
1086 // RegExp code frame pointer.
1087 __ mov(Operand(esp, 2 * kPointerSize), ebp);
1088 // Code* of self.
1089 __ mov(Operand(esp, 1 * kPointerSize), Immediate(masm_->CodeObject()));
1090 // Next address on the stack (will be address of return address).
1091 __ lea(eax, Operand(esp, -kPointerSize));
1092 __ mov(Operand(esp, 0 * kPointerSize), eax);
1093 CallCFunction(FUNCTION_ADDR(&CheckStackGuardState), num_arguments);
1094 }
1095
1096
1097 // Helper function for reading a value out of a stack frame.
1098 template <typename T>
1099 static T& frame_entry(Address re_frame, int frame_offset) {
1100 return reinterpret_cast<T&>(Memory::int32_at(re_frame + frame_offset));
1101 }
1102
1103
1104 const byte* RegExpMacroAssemblerIA32::StringCharacterPosition(String* subject,
1105 int start_index) {
1106 // Not just flat, but ultra flat.
1107 ASSERT(subject->IsExternalString() || subject->IsSeqString());
1108 ASSERT(start_index >= 0);
1109 ASSERT(start_index <= subject->length());
1110 if (StringShape(subject).IsAsciiRepresentation()) {
1111 const byte* address;
1112 if (subject->IsExternalAsciiString()) {
1113 const char* data = ExternalAsciiString::cast(subject)->resource()->data();
1114 address = reinterpret_cast<const byte*>(data);
1115 } else {
1116 ASSERT(subject->IsSeqAsciiString());
1117 char* data = SeqAsciiString::cast(subject)->GetChars();
1118 address = reinterpret_cast<const byte*>(data);
1119 }
1120 return address + start_index;
1121 }
1122 const uc16* data;
1123 if (subject->IsExternalTwoByteString()) {
1124 data = ExternalTwoByteString::cast(subject)->resource()->data();
1125 } else {
1126 ASSERT(subject->IsSeqTwoByteString());
1127 data = SeqTwoByteString::cast(subject)->GetChars();
1128 }
1129 return reinterpret_cast<const byte*>(data + start_index);
1130 }
1131
1132
1118 int RegExpMacroAssemblerIA32::CheckStackGuardState(Address* return_address, 1133 int RegExpMacroAssemblerIA32::CheckStackGuardState(Address* return_address,
1119 Code* re_code) { 1134 Code* re_code,
1135 Address re_frame) {
1120 if (StackGuard::IsStackOverflow()) { 1136 if (StackGuard::IsStackOverflow()) {
1121 Top::StackOverflow(); 1137 Top::StackOverflow();
1122 return 1; 1138 return EXCEPTION;
1123 } 1139 }
1124 1140
1125 // If not real stack overflow the stack guard was used to interrupt 1141 // If not real stack overflow the stack guard was used to interrupt
1126 // execution for another purpose. 1142 // execution for another purpose.
1127 1143
1128 // Prepare for possible GC. 1144 // Prepare for possible GC.
1145 HandleScope handles;
1129 Handle<Code> code_handle(re_code); 1146 Handle<Code> code_handle(re_code);
1130 1147
1148 Handle<String> subject(frame_entry<String*>(re_frame, kInputString));
1149 // Current string.
1150 bool is_ascii = StringShape(*subject).IsAsciiRepresentation();
1151
1131 ASSERT(re_code->instruction_start() <= *return_address); 1152 ASSERT(re_code->instruction_start() <= *return_address);
1132 ASSERT(*return_address <= 1153 ASSERT(*return_address <=
1133 re_code->instruction_start() + re_code->instruction_size()); 1154 re_code->instruction_start() + re_code->instruction_size());
1134 1155
1135 Object* result = Execution::HandleStackGuardInterrupt(); 1156 Object* result = Execution::HandleStackGuardInterrupt();
1136 1157
1137 if (*code_handle != re_code) { // Return address no longer valid 1158 if (*code_handle != re_code) { // Return address no longer valid
1138 int delta = *code_handle - re_code; 1159 int delta = *code_handle - re_code;
1139 // Overwrite the return address on the stack. 1160 // Overwrite the return address on the stack.
1140 *return_address += delta; 1161 *return_address += delta;
1141 } 1162 }
1142 1163
1143 if (result->IsException()) { 1164 if (result->IsException()) {
1144 return 1; 1165 return EXCEPTION;
1145 } 1166 }
1167
1168 // String might have changed.
1169 if (StringShape(*subject).IsAsciiRepresentation() != is_ascii) {
1170 // If we changed between an ASCII and an UC16 string, the specialized
1171 // code cannot be used, and we need to restart regexp matching from
1172 // scratch (including, potentially, compiling a new version of the code).
1173 return RETRY;
1174 }
1175
1176 // Otherwise, the content of the string might have moved. It must still
1177 // be a sequential or external string with the same content.
1178 // Update the start and end pointers in the stack frame to the current
1179 // location (whether it has actually moved or not).
1180 ASSERT(StringShape(*subject).IsSequential() ||
1181 StringShape(*subject).IsExternal());
1182
1183 // The original start address of the characters to match.
1184 const byte* start_address = frame_entry<const byte*>(re_frame, kInputStart);
1185
1186 // Find the current start address of the same character at the current string
1187 // position.
1188 int start_index = frame_entry<int>(re_frame, kStartIndex);
1189 const byte* new_address = StringCharacterPosition(*subject, start_index);
1190
1191 if (start_address != new_address) {
1192 // If there is a difference, update start and end addresses in the
1193 // RegExp stack frame to match the new value.
1194 const byte* end_address = frame_entry<const byte* >(re_frame, kInputEnd);
1195 int byte_length = end_address - start_address;
1196 frame_entry<const byte*>(re_frame, kInputStart) = new_address;
1197 frame_entry<const byte*>(re_frame, kInputEnd) = new_address + byte_length;
1198 }
1199
1146 return 0; 1200 return 0;
1147 } 1201 }
1148 1202
1149 1203
1150 Address RegExpMacroAssemblerIA32::GrowStack(Address stack_pointer, 1204 Address RegExpMacroAssemblerIA32::GrowStack(Address stack_pointer,
1151 Address* stack_top) { 1205 Address* stack_top) {
1152 size_t size = RegExpStack::stack_capacity(); 1206 size_t size = RegExpStack::stack_capacity();
1153 Address old_stack_top = RegExpStack::stack_top(); 1207 Address old_stack_top = RegExpStack::stack_top();
1154 ASSERT(old_stack_top == *stack_top); 1208 ASSERT(old_stack_top == *stack_top);
1155 ASSERT(stack_pointer <= old_stack_top); 1209 ASSERT(stack_pointer <= old_stack_top);
(...skipping 103 matching lines...)
1259 __ j(above, &no_stack_overflow); 1313 __ j(above, &no_stack_overflow);
1260 1314
1261 SafeCall(&stack_overflow_label_); 1315 SafeCall(&stack_overflow_label_);
1262 1316
1263 __ bind(&no_stack_overflow); 1317 __ bind(&no_stack_overflow);
1264 } 1318 }
1265 } 1319 }
1266 1320
1267 1321
1268 void RegExpMacroAssemblerIA32::FrameAlign(int num_arguments, Register scratch) { 1322 void RegExpMacroAssemblerIA32::FrameAlign(int num_arguments, Register scratch) {
1269 // TODO(lrn): Since we no longer use the system stack arbitrarily, we 1323 // TODO(lrn): Since we no longer use the system stack arbitrarily (but we do
1270 // know the current stack alignment - esp points to the last regexp register. 1324 // use it, e.g., for SafeCall), we know the number of elements on the stack
1271 // We can do this simpler then. 1325 // since the last frame alignment. We might be able to do this simpler then.
1272 int frameAlignment = OS::ActivationFrameAlignment(); 1326 int frameAlignment = OS::ActivationFrameAlignment();
1273 if (frameAlignment != 0) { 1327 if (frameAlignment != 0) {
1274 // Make stack end at alignment and make room for num_arguments words 1328 // Make stack end at alignment and make room for num_arguments words
1275 // and the original value of esp. 1329 // and the original value of esp.
1276 __ mov(scratch, esp); 1330 __ mov(scratch, esp);
1277 __ sub(Operand(esp), Immediate((num_arguments + 1) * kPointerSize)); 1331 __ sub(Operand(esp), Immediate((num_arguments + 1) * kPointerSize));
1278 ASSERT(IsPowerOf2(frameAlignment)); 1332 ASSERT(IsPowerOf2(frameAlignment));
1279 __ and_(esp, -frameAlignment); 1333 __ and_(esp, -frameAlignment);
1280 __ mov(Operand(esp, num_arguments * kPointerSize), scratch); 1334 __ mov(Operand(esp, num_arguments * kPointerSize), scratch);
1281 } else { 1335 } else {
(...skipping 40 matching lines...)
1322 1376
1323 1377
1324 void RegExpMacroAssemblerIA32::LoadConstantBufferAddress(Register reg, 1378 void RegExpMacroAssemblerIA32::LoadConstantBufferAddress(Register reg,
1325 ArraySlice* buffer) { 1379 ArraySlice* buffer) {
1326 __ mov(reg, buffer->array()); 1380 __ mov(reg, buffer->array());
1327 __ add(Operand(reg), Immediate(buffer->base_offset())); 1381 __ add(Operand(reg), Immediate(buffer->base_offset()));
1328 } 1382 }
1329 1383
1330 #undef __ 1384 #undef __
1331 }} // namespace v8::internal 1385 }} // namespace v8::internal
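
Note on the pointer update at the end of CheckStackGuardState: the new code recomputes the address of the character at the start index and, if that address differs from the one stored in the frame, rewrites both the start and end pointers while keeping the byte length unchanged. The sketch below isolates that step outside of V8. It is only an illustration under stated assumptions: FrameSlots, CharacterPosition and RebaseInput are hypothetical names standing in for the kInputStart / kInputEnd / kStartIndex frame entries and for StringCharacterPosition, and the string is modelled as a plain byte buffer with a fixed character size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the three frame entries the patch reads and
// writes (kInputStart, kInputEnd, kStartIndex). Not a V8 type.
struct FrameSlots {
  const std::uint8_t* input_start;  // address of the character at start_index
  const std::uint8_t* input_end;    // one past the last byte of the input
  int start_index;                  // character index where matching started
};

// Hypothetical helper: address of character `index` in a flat buffer whose
// characters are `char_size` bytes wide (1 for ASCII, 2 for UC16).
inline const std::uint8_t* CharacterPosition(const std::uint8_t* chars,
                                             int index,
                                             int char_size) {
  assert(index >= 0);
  assert(char_size == 1 || char_size == 2);
  return chars + static_cast<std::ptrdiff_t>(index) * char_size;
}

// Recompute the start address from the buffer's current location. If it
// differs from the stored one, update start and end, preserving the byte
// length between them. Returns true when the slots were rewritten.
inline bool RebaseInput(FrameSlots* frame,
                        const std::uint8_t* current_chars,
                        int char_size) {
  const std::uint8_t* new_start =
      CharacterPosition(current_chars, frame->start_index, char_size);
  if (new_start == frame->input_start) {
    return false;  // The content did not move; nothing to do.
  }
  std::ptrdiff_t byte_length = frame->input_end - frame->input_start;
  frame->input_start = new_start;
  frame->input_end = new_start + byte_length;
  return true;
}
```

In this model a caller would invoke RebaseInput once after anything that may have relocated the buffer, passing the buffer's current base address. Deriving the end pointer from the saved byte length, rather than from the string length, mirrors what the patch does and avoids having to know how many characters follow the start index.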