Chromium Code Reviews

Side by Side Diff: source/common/ubidi.c

Issue 845603002: Update ICU to 54.1 step 1 (Closed) Base URL: https://chromium.googlesource.com/chromium/deps/icu.git@master
Patch Set: remove unusued directories Created 5 years, 11 months ago
OLD | NEW
1 /* 1 /*
2 ****************************************************************************** 2 ******************************************************************************
3 * 3 *
4 * Copyright (C) 1999-2013, International Business Machines 4 * Copyright (C) 1999-2014, International Business Machines
5 * Corporation and others. All Rights Reserved. 5 * Corporation and others. All Rights Reserved.
6 * 6 *
7 ****************************************************************************** 7 ******************************************************************************
8 * file name: ubidi.c 8 * file name: ubidi.c
9 * encoding: US-ASCII 9 * encoding: US-ASCII
10 * tab size: 8 (not used) 10 * tab size: 8 (not used)
11 * indentation:4 11 * indentation:4
12 * 12 *
13 * created on: 1999jul27 13 * created on: 1999jul27
14 * created by: Markus W. Scherer, updated by Matitiahu Allouche 14 * created by: Markus W. Scherer, updated by Matitiahu Allouche
15 * 15 *
16 */ 16 */
17 17
18 #include "cmemory.h" 18 #include "cmemory.h"
19 #include "unicode/utypes.h" 19 #include "unicode/utypes.h"
20 #include "unicode/ustring.h" 20 #include "unicode/ustring.h"
21 #include "unicode/uchar.h" 21 #include "unicode/uchar.h"
22 #include "unicode/ubidi.h" 22 #include "unicode/ubidi.h"
23 #include "unicode/utf16.h" 23 #include "unicode/utf16.h"
24 #include "ubidi_props.h" 24 #include "ubidi_props.h"
25 #include "ubidiimp.h" 25 #include "ubidiimp.h"
26 #include "uassert.h" 26 #include "uassert.h"
27 27
28 /* 28 /*
29 * General implementation notes: 29 * General implementation notes:
30 * 30 *
31 * Throughout the implementation, there are comments like (W2) that refer to 31 * Throughout the implementation, there are comments like (W2) that refer to
32 * rules of the BiDi algorithm in its version 5, in this example to the second 32 * rules of the BiDi algorithm, in this example to the second rule of the
33 * rule of the resolution of weak types. 33 * resolution of weak types.
34 * 34 *
35 * For handling surrogate pairs, where two UChar's form one "abstract" (or UTF-32) 35 * For handling surrogate pairs, where two UChar's form one "abstract" (or UTF-32)
36 * character according to UTF-16, the second UChar gets the directional property of 36 * character according to UTF-16, the second UChar gets the directional property of
37 * the entire character assigned, while the first one gets a BN, a boundary 37 * the entire character assigned, while the first one gets a BN, a boundary
38 * neutral, type, which is ignored by most of the algorithm according to 38 * neutral, type, which is ignored by most of the algorithm according to
39 * rule (X9) and the implementation suggestions of the BiDi algorithm. 39 * rule (X9) and the implementation suggestions of the BiDi algorithm.
40 * 40 *
41 * Later, adjustWSLevels() will set the level for each BN to that of the 41 * Later, adjustWSLevels() will set the level for each BN to that of the
42 * following character (UChar), which results in surrogate pairs getting the 42 * following character (UChar), which results in surrogate pairs getting the
43 * same level on each of their surrogates. 43 * same level on each of their surrogates.
44 * 44 *
45 * In a UTF-8 implementation, the same thing could be done: the last byte of 45 * In a UTF-8 implementation, the same thing could be done: the last byte of
46 * a multi-byte sequence would get the "real" property, while all previous 46 * a multi-byte sequence would get the "real" property, while all previous
47 * bytes of that sequence would get BN. 47 * bytes of that sequence would get BN.
48 * 48 *
49 * It is not possible to assign all those parts of a character the same real 49 * It is not possible to assign all those parts of a character the same real
50 * property because this would fail in the resolution of weak types with rules 50 * property because this would fail in the resolution of weak types with rules
51 * that look at immediately surrounding types. 51 * that look at immediately surrounding types.
52 * 52 *
53 * As a related topic, this implementation does not remove Boundary Neutral 53 * As a related topic, this implementation does not remove Boundary Neutral
54 * types from the input, but ignores them wherever this is relevant. 54 * types from the input, but ignores them wherever this is relevant.
55 * For example, the loop for the resolution of the weak types reads 55 * For example, the loop for the resolution of the weak types reads
56 * types until it finds a non-BN. 56 * types until it finds a non-BN.
57 * Also, explicit embedding codes are neither changed into BN nor removed. 57 * Also, explicit embedding codes are neither changed into BN nor removed.
58 * They are only treated the same way real BNs are. 58 * They are only treated the same way real BNs are.
59 * As stated before, adjustWSLevels() takes care of them at the end. 59 * As stated before, adjustWSLevels() takes care of them at the end.
60 * For the purpose of conformance, the levels of all these codes 60 * For the purpose of conformance, the levels of all these codes
61 * do not matter. 61 * do not matter.
62 * 62 *
63 * Note that this implementation never modifies the dirProps 63 * Note that this implementation modifies the dirProps
64 * after the initial setup, except for FSI which is changed to either 64 * after the initial setup, when applying X5c (replace FSI by LRI or RLI),
65 * LRI or RLI in getDirProps(), and paired brackets which may be changed 65 * X6, N0 (replace paired brackets by L or R).
66 * to L or R according to N0.
67 * 66 *
68 * 67 * In this implementation, the resolution of weak types (W1 to W6),
69 * In this implementation, the resolution of weak types (Wn), 68 * neutrals (N1 and N2), and the assignment of the resolved level (In)
70 * neutrals (Nn), and the assignment of the resolved level (In)
71 * are all done in one single loop, in resolveImplicitLevels(). 69 * are all done in one single loop, in resolveImplicitLevels().
72 * Changes of dirProp values are done on the fly, without writing 70 * Changes of dirProp values are done on the fly, without writing
73 * them back to the dirProps array. 71 * them back to the dirProps array.
74 * 72 *
75 * 73 *
76 * This implementation contains code that allows to bypass steps of the 74 * This implementation contains code that allows to bypass steps of the
77 * algorithm that are not needed on the specific paragraph 75 * algorithm that are not needed on the specific paragraph
78 * in order to speed up the most common cases considerably, 76 * in order to speed up the most common cases considerably,
79 * like text that is entirely LTR, or RTL text without numbers. 77 * like text that is entirely LTR, or RTL text without numbers.
80 * 78 *
(...skipping 26 matching lines...)
107 * If there are no White Space types in the paragraph, then 105 * If there are no White Space types in the paragraph, then
108 * (L1) is not necessary in adjustWSLevels(). 106 * (L1) is not necessary in adjustWSLevels().
109 */ 107 */
110 108
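The surrogate-pair convention described in the implementation notes above (the leading code unit gets BN, the trailing code unit carries the property of the whole character) can be sketched as follows. This is a minimal illustration and not the ICU code: `SK_BN`, `sketch_toy_class()`, and `sketch_assign_props()` are hypothetical names, and the toy classifier stands in for a real bidi-class lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the Boundary Neutral class; the value is an assumption. */
enum { SK_BN = 18 };

/* Toy classifier (assumption): supplementary characters are class 1, all others class 0. */
static uint8_t sketch_toy_class(uint32_t codePoint) {
    return codePoint >= 0x10000 ? 1 : 0;
}

/* Assign one property per UTF-16 code unit. For a well-formed surrogate pair,
 * the lead unit gets BN and the trail unit gets the class of the whole
 * character, so weak-type rules that look at neighbours see it only once. */
static void sketch_assign_props(const uint16_t *text, size_t length, uint8_t *props) {
    size_t i = 0;
    assert(text != NULL && props != NULL);
    while (i < length) {
        uint32_t c = text[i];
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < length &&
            text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            c = 0x10000 + ((c - 0xD800) << 10) + (uint32_t)(text[i + 1] - 0xDC00);
            props[i] = SK_BN;                   /* lead surrogate is ignored later */
            props[i + 1] = sketch_toy_class(c); /* trail surrogate carries the class */
            i += 2;
        } else {
            props[i] = sketch_toy_class(c);
            i++;
        }
    }
}
```

A later pass, like the adjustWSLevels() step mentioned in the notes, would then copy the level of the following code unit onto each BN so that both halves of a pair end up on the same level.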
111 /* to avoid some conditional statements, use tiny constant arrays */ 109 /* to avoid some conditional statements, use tiny constant arrays */
112 static const Flags flagLR[2]={ DIRPROP_FLAG(L), DIRPROP_FLAG(R) }; 110 static const Flags flagLR[2]={ DIRPROP_FLAG(L), DIRPROP_FLAG(R) };
113 static const Flags flagE[2]={ DIRPROP_FLAG(LRE), DIRPROP_FLAG(RLE) }; 111 static const Flags flagE[2]={ DIRPROP_FLAG(LRE), DIRPROP_FLAG(RLE) };
114 static const Flags flagO[2]={ DIRPROP_FLAG(LRO), DIRPROP_FLAG(RLO) }; 112 static const Flags flagO[2]={ DIRPROP_FLAG(LRO), DIRPROP_FLAG(RLO) };
115 113
116 #define DIRPROP_FLAG_LR(level) flagLR[(level)&1] 114 #define DIRPROP_FLAG_LR(level) flagLR[(level)&1]
117 #define DIRPROP_FLAG_E(level) flagE[(level)&1] 115 #define DIRPROP_FLAG_E(level) flagE[(level)&1]
118 #define DIRPROP_FLAG_O(level) flagO[(level)&1] 116 #define DIRPROP_FLAG_O(level) flagO[(level)&1]
119 117
120 #define DIR_FROM_STRONG(strong) ((strong)==L ? L : R) 118 #define DIR_FROM_STRONG(strong) ((strong)==L ? L : R)
121 119
120 #define NO_OVERRIDE(level) ((level)&~UBIDI_LEVEL_OVERRIDE)
121
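The two helpers above rely on simple bit tricks: the low bit of an embedding level selects the direction (even is LTR, odd is RTL), and NO_OVERRIDE() masks off the override bit before levels are compared. A standalone sketch follows, assuming the override bit is 0x80 and using hypothetical `SKETCH_*` names rather than the ICU definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ICU definitions; the values are assumptions. */
enum { SKETCH_L = 0, SKETCH_R = 1 };
#define SKETCH_FLAG(dir) (1UL << (dir))
#define SKETCH_LEVEL_OVERRIDE 0x80u

/* Indexed by the parity of the level, which avoids a conditional. */
static const uint32_t sketchFlagLR[2] = { SKETCH_FLAG(SKETCH_L), SKETCH_FLAG(SKETCH_R) };

/* Even levels are left-to-right, odd levels are right-to-left. */
static uint32_t sketch_flag_for_level(int level) {
    assert(level >= 0);
    return sketchFlagLR[level & 1];
}

/* Strip the override bit so that two levels can be compared numerically. */
static uint8_t sketch_no_override(uint8_t level) {
    return (uint8_t)(level & ~SKETCH_LEVEL_OVERRIDE);
}
```

With these definitions, a level stored as 0x81 (level 1 with the override bit set) compares as 1, which is the kind of comparison bracketProcessBoundary() performs on the new side of this diff.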
122 /* UBiDi object management -------------------------------------------------- */ 122 /* UBiDi object management -------------------------------------------------- */
123 123
124 U_CAPI UBiDi * U_EXPORT2 124 U_CAPI UBiDi * U_EXPORT2
125 ubidi_open(void) 125 ubidi_open(void)
126 { 126 {
127 UErrorCode errorCode=U_ZERO_ERROR; 127 UErrorCode errorCode=U_ZERO_ERROR;
128 return ubidi_openSized(0, 0, &errorCode); 128 return ubidi_openSized(0, 0, &errorCode);
129 } 129 }
130 130
131 U_CAPI UBiDi * U_EXPORT2 131 U_CAPI UBiDi * U_EXPORT2
(...skipping 264 matching lines...)
396 return result; 396 return result;
397 } 397 }
398 398
399 /* 399 /*
400 * Check that there are enough entries in the array pointed to by pBiDi->paras 400 * Check that there are enough entries in the array pointed to by pBiDi->paras
401 */ 401 */
402 static UBool 402 static UBool
403 checkParaCount(UBiDi *pBiDi) { 403 checkParaCount(UBiDi *pBiDi) {
404 int32_t count=pBiDi->paraCount; 404 int32_t count=pBiDi->paraCount;
405 if(pBiDi->paras==pBiDi->simpleParas) { 405 if(pBiDi->paras==pBiDi->simpleParas) {
406 if(count<=SIMPLE_PARAS_SIZE) 406 if(count<=SIMPLE_PARAS_COUNT)
407 return TRUE; 407 return TRUE;
408 if(!getInitialParasMemory(pBiDi, SIMPLE_PARAS_SIZE * 2)) 408 if(!getInitialParasMemory(pBiDi, SIMPLE_PARAS_COUNT * 2))
409 return FALSE; 409 return FALSE;
410 pBiDi->paras=pBiDi->parasMemory; 410 pBiDi->paras=pBiDi->parasMemory;
411 uprv_memcpy(pBiDi->parasMemory, pBiDi->simpleParas, SIMPLE_PARAS_SIZE * sizeof(Para)); 411 uprv_memcpy(pBiDi->parasMemory, pBiDi->simpleParas, SIMPLE_PARAS_COUNT * sizeof(Para));
412 return TRUE; 412 return TRUE;
413 } 413 }
414 if(!getInitialParasMemory(pBiDi, count * 2)) 414 if(!getInitialParasMemory(pBiDi, count * 2))
415 return FALSE; 415 return FALSE;
416 pBiDi->paras=pBiDi->parasMemory; 416 pBiDi->paras=pBiDi->parasMemory;
417 return TRUE; 417 return TRUE;
418 } 418 }
419 419
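checkParaCount() above follows a common pattern: a small array embedded in the object serves the usual case, and the data moves to heap memory only when the count outgrows it. The sketch below shows that pattern in isolation. All names (`SketchParas`, `sketch_ensure()`, `SKETCH_SIMPLE_COUNT`) are hypothetical, and `realloc` stands in for ICU's own memory helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_SIMPLE_COUNT 4   /* hypothetical size, not the ICU value */

typedef struct { int limit; int level; } SketchPara;

typedef struct {
    SketchPara simple[SKETCH_SIMPLE_COUNT]; /* inline storage for the common case */
    SketchPara *heap;                       /* grown storage, NULL until needed */
    size_t heapCount;                       /* capacity of heap, in entries */
    SketchPara *paras;                      /* points at simple or at heap */
    int count;                              /* entries currently required */
} SketchParas;

static void sketch_init(SketchParas *p) {
    memset(p, 0, sizeof *p);
    p->paras = p->simple;
}

/* Return 1 if p->paras has room for p->count entries, 0 if allocation fails.
 * As in the reviewed function, the count is assumed to grow one entry at a time. */
static int sketch_ensure(SketchParas *p) {
    size_t needed;
    SketchPara *grown;
    assert(p->count >= 0);
    if (p->paras == p->simple) {
        if (p->count <= SKETCH_SIMPLE_COUNT)
            return 1;
        needed = SKETCH_SIMPLE_COUNT * 2;
    } else {
        if ((size_t)p->count <= p->heapCount)
            return 1;
        needed = (size_t)p->count * 2;
    }
    grown = realloc(p->heap, needed * sizeof(SketchPara));
    if (grown == NULL)
        return 0;
    if (p->paras == p->simple)              /* first move: copy the inline entries */
        memcpy(grown, p->simple, sizeof p->simple);
    p->heap = grown;
    p->heapCount = needed;
    p->paras = grown;
    return 1;
}
```

Doubling the capacity on each growth keeps the number of reallocations logarithmic in the final paragraph count.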
420 /* 420 /*
421 * Get the directional properties for the text, calculate the flags bit-set, and 421 * Get the directional properties for the text, calculate the flags bit-set, and
422 * determine the paragraph level if necessary (in pBiDi->paras[i].level). 422 * determine the paragraph level if necessary (in pBiDi->paras[i].level).
423 * FSI initiators are also resolved and their dirProp replaced with LRI or RLI. 423 * FSI initiators are also resolved and their dirProp replaced with LRI or RLI.
424 * When encountering an FSI, it is initially replaced with an LRI, which is the
425 * default. Only if a strong R or AL is found within its scope will the LRI be
426 * replaced by an RLI.
424 */ 427 */
425 static UBool 428 static UBool
426 getDirProps(UBiDi *pBiDi) { 429 getDirProps(UBiDi *pBiDi) {
427 const UChar *text=pBiDi->text; 430 const UChar *text=pBiDi->text;
428 DirProp *dirProps=pBiDi->dirPropsMemory; /* pBiDi->dirProps is const */ 431 DirProp *dirProps=pBiDi->dirPropsMemory; /* pBiDi->dirProps is const */
429 432
430 int32_t i=0, originalLength=pBiDi->originalLength; 433 int32_t i=0, originalLength=pBiDi->originalLength;
431 Flags flags=0; /* collect all directionalities in the text */ 434 Flags flags=0; /* collect all directionalities in the text */
432 UChar32 uchar; 435 UChar32 uchar;
433 DirProp dirProp=0, defaultParaLevel=0; /* initialize to avoid compiler warnings */ 436 DirProp dirProp=0, defaultParaLevel=0; /* initialize to avoid compiler warnings */
(...skipping 67 matching lines...)
501 } 504 }
502 if(removeBiDiControls && IS_BIDI_CONTROL_CHAR(uchar)) 505 if(removeBiDiControls && IS_BIDI_CONTROL_CHAR(uchar))
503 controlCount++; 506 controlCount++;
504 if(dirProp==L) { 507 if(dirProp==L) {
505 if(state==SEEKING_STRONG_FOR_PARA) { 508 if(state==SEEKING_STRONG_FOR_PARA) {
506 pBiDi->paras[pBiDi->paraCount-1].level=0; 509 pBiDi->paras[pBiDi->paraCount-1].level=0;
507 state=NOT_SEEKING_STRONG; 510 state=NOT_SEEKING_STRONG;
508 } 511 }
509 else if(state==SEEKING_STRONG_FOR_FSI) { 512 else if(state==SEEKING_STRONG_FOR_FSI) {
510 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) { 513 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) {
511 dirProps[isolateStartStack[stackLast]]=LRI; 514 /* no need for next statement, already set by default */
515 /* dirProps[isolateStartStack[stackLast]]=LRI; */
512 flags|=DIRPROP_FLAG(LRI); 516 flags|=DIRPROP_FLAG(LRI);
513 } 517 }
514 state=LOOKING_FOR_PDI; 518 state=LOOKING_FOR_PDI;
515 } 519 }
516 lastStrong=L; 520 lastStrong=L;
517 continue; 521 continue;
518 } 522 }
519 if(dirProp==R || dirProp==AL) { 523 if(dirProp==R || dirProp==AL) {
520 if(state==SEEKING_STRONG_FOR_PARA) { 524 if(state==SEEKING_STRONG_FOR_PARA) {
521 pBiDi->paras[pBiDi->paraCount-1].level=1; 525 pBiDi->paras[pBiDi->paraCount-1].level=1;
(...skipping 10 matching lines...)
532 if(dirProp==AL) 536 if(dirProp==AL)
533 lastArabicPos=i-1; 537 lastArabicPos=i-1;
534 continue; 538 continue;
535 } 539 }
536 if(dirProp>=FSI && dirProp<=RLI) { /* FSI, LRI or RLI */ 540 if(dirProp>=FSI && dirProp<=RLI) { /* FSI, LRI or RLI */
537 stackLast++; 541 stackLast++;
538 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) { 542 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) {
539 isolateStartStack[stackLast]=i-1; 543 isolateStartStack[stackLast]=i-1;
540 previousStateStack[stackLast]=state; 544 previousStateStack[stackLast]=state;
541 } 545 }
542 if(dirProp==FSI) 546 if(dirProp==FSI) {
547 dirProps[i-1]=LRI; /* default if no strong char */
543 state=SEEKING_STRONG_FOR_FSI; 548 state=SEEKING_STRONG_FOR_FSI;
549 }
544 else 550 else
545 state=LOOKING_FOR_PDI; 551 state=LOOKING_FOR_PDI;
546 continue; 552 continue;
547 } 553 }
548 if(dirProp==PDI) { 554 if(dirProp==PDI) {
549 if(state==SEEKING_STRONG_FOR_FSI) { 555 if(state==SEEKING_STRONG_FOR_FSI) {
550 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) { 556 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) {
551 dirProps[isolateStartStack[stackLast]]=LRI; 557 /* no need for next statement, already set by default */
558 /* dirProps[isolateStartStack[stackLast]]=LRI; */
552 flags|=DIRPROP_FLAG(LRI); 559 flags|=DIRPROP_FLAG(LRI);
553 } 560 }
554 } 561 }
555 if(stackLast>=0) { 562 if(stackLast>=0) {
556 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL) 563 if(stackLast<=UBIDI_MAX_EXPLICIT_LEVEL)
557 state=previousStateStack[stackLast]; 564 state=previousStateStack[stackLast];
558 stackLast--; 565 stackLast--;
559 } 566 }
560 continue; 567 continue;
561 } 568 }
(...skipping 22 matching lines...)
584 state=NOT_SEEKING_STRONG; 591 state=NOT_SEEKING_STRONG;
585 } 592 }
586 stackLast=-1; 593 stackLast=-1;
587 } 594 }
588 continue; 595 continue;
589 } 596 }
590 } 597 }
591 /* Ignore still open isolate sequences with overflow */ 598 /* Ignore still open isolate sequences with overflow */
592 if(stackLast>UBIDI_MAX_EXPLICIT_LEVEL) { 599 if(stackLast>UBIDI_MAX_EXPLICIT_LEVEL) {
593 stackLast=UBIDI_MAX_EXPLICIT_LEVEL; 600 stackLast=UBIDI_MAX_EXPLICIT_LEVEL;
594 if(dirProps[previousStateStack[UBIDI_MAX_EXPLICIT_LEVEL]]!=FSI) 601 state=SEEKING_STRONG_FOR_FSI; /* to be on the safe side */
595 state=LOOKING_FOR_PDI;
596 } 602 }
597 /* Resolve direction of still unresolved open FSI sequences */ 603 /* Resolve direction of still unresolved open FSI sequences */
598 while(stackLast>=0) { 604 while(stackLast>=0) {
599 if(state==SEEKING_STRONG_FOR_FSI) { 605 if(state==SEEKING_STRONG_FOR_FSI) {
600 dirProps[isolateStartStack[stackLast]]=LRI; 606 /* no need for next statement, already set by default */
607 /* dirProps[isolateStartStack[stackLast]]=LRI; */
601 flags|=DIRPROP_FLAG(LRI); 608 flags|=DIRPROP_FLAG(LRI);
609 break;
602 } 610 }
603 state=previousStateStack[stackLast]; 611 state=previousStateStack[stackLast];
604 stackLast--; 612 stackLast--;
605 } 613 }
606 /* When streaming, ignore text after the last paragraph separator */ 614 /* When streaming, ignore text after the last paragraph separator */
607 if(pBiDi->reorderingOptions & UBIDI_OPTION_STREAMING) { 615 if(pBiDi->reorderingOptions & UBIDI_OPTION_STREAMING) {
608 if(pBiDi->length<originalLength) 616 if(pBiDi->length<originalLength)
609 pBiDi->paraCount--; 617 pBiDi->paraCount--;
610 } else { 618 } else {
611 pBiDi->paras[pBiDi->paraCount-1].limit=originalLength; 619 pBiDi->paras[pBiDi->paraCount-1].limit=originalLength;
(...skipping 48 matching lines...)
660 encountered strong character, since these will be needed to resolve 668 encountered strong character, since these will be needed to resolve
661 the level of paired brackets. */ 669 the level of paired brackets. */
662 670
663 static void 671 static void
664 bracketInit(UBiDi *pBiDi, BracketData *bd) { 672 bracketInit(UBiDi *pBiDi, BracketData *bd) {
665 bd->pBiDi=pBiDi; 673 bd->pBiDi=pBiDi;
666 bd->isoRunLast=0; 674 bd->isoRunLast=0;
667 bd->isoRuns[0].start=0; 675 bd->isoRuns[0].start=0;
668 bd->isoRuns[0].limit=0; 676 bd->isoRuns[0].limit=0;
669 bd->isoRuns[0].level=GET_PARALEVEL(pBiDi, 0); 677 bd->isoRuns[0].level=GET_PARALEVEL(pBiDi, 0);
670 bd->isoRuns[0].lastStrong=bd->isoRuns[0].contextDir=GET_PARALEVEL(pBiDi, 0)&1; 678 bd->isoRuns[0].lastStrong=bd->isoRuns[0].lastBase=bd->isoRuns[0].contextDir=GET_PARALEVEL(pBiDi, 0)&1;
671 bd->isoRuns[0].lastStrongPos=bd->isoRuns[0].contextPos=0; 679 bd->isoRuns[0].contextPos=0;
672 if(pBiDi->openingsMemory) { 680 if(pBiDi->openingsMemory) {
673 bd->openings=pBiDi->openingsMemory; 681 bd->openings=pBiDi->openingsMemory;
674 bd->openingsCount=pBiDi->openingsSize / sizeof(Opening); 682 bd->openingsCount=pBiDi->openingsSize / sizeof(Opening);
675 } else { 683 } else {
676 bd->openings=bd->simpleOpenings; 684 bd->openings=bd->simpleOpenings;
677 bd->openingsCount=SIMPLE_OPENINGS_SIZE; 685 bd->openingsCount=SIMPLE_OPENINGS_COUNT;
678 } 686 }
679 bd->isNumbersSpecial=bd->pBiDi->reorderingMode==UBIDI_REORDER_NUMBERS_SPECIAL || 687 bd->isNumbersSpecial=bd->pBiDi->reorderingMode==UBIDI_REORDER_NUMBERS_SPECIAL ||
680 bd->pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL; 688 bd->pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL;
681 } 689 }
682 690
683 /* paragraph boundary */ 691 /* paragraph boundary */
684 static void 692 static void
685 bracketProcessB(BracketData *bd, UBiDiLevel level) { 693 bracketProcessB(BracketData *bd, UBiDiLevel level) {
686 bd->isoRunLast=0; 694 bd->isoRunLast=0;
687 bd->isoRuns[0].limit=0; 695 bd->isoRuns[0].limit=0;
688 bd->isoRuns[0].level=level; 696 bd->isoRuns[0].level=level;
689 bd->isoRuns[0].lastStrong=bd->isoRuns[0].contextDir=level&1; 697 bd->isoRuns[0].lastStrong=bd->isoRuns[0].lastBase=bd->isoRuns[0].contextDir=level&1;
690 bd->isoRuns[0].lastStrongPos=bd->isoRuns[0].contextPos=0; 698 bd->isoRuns[0].contextPos=0;
691 } 699 }
692 700
693 /* LRE, LRO, RLE, RLO, PDF */ 701 /* LRE, LRO, RLE, RLO, PDF */
694 static void 702 static void
695 bracketProcessBoundary(BracketData *bd, int32_t lastCcPos, 703 bracketProcessBoundary(BracketData *bd, int32_t lastCcPos,
696 UBiDiLevel contextLevel, UBiDiLevel embeddingLevel) { 704 UBiDiLevel contextLevel, UBiDiLevel embeddingLevel) {
697 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast]; 705 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
698 DirProp *dirProps=bd->pBiDi->dirProps; 706 DirProp *dirProps=bd->pBiDi->dirProps;
699 if(DIRPROP_FLAG(dirProps[lastCcPos])&MASK_ISO) /* after an isolate */ 707 if(DIRPROP_FLAG(dirProps[lastCcPos])&MASK_ISO) /* after an isolate */
700 return; 708 return;
701 if((embeddingLevel&~UBIDI_LEVEL_OVERRIDE)> 709 if(NO_OVERRIDE(embeddingLevel)>NO_OVERRIDE(contextLevel)) /* not a PDF */
702 (contextLevel&~UBIDI_LEVEL_OVERRIDE)) /* not a PDF */
703 contextLevel=embeddingLevel; 710 contextLevel=embeddingLevel;
704 pLastIsoRun->limit=pLastIsoRun->start; 711 pLastIsoRun->limit=pLastIsoRun->start;
705 pLastIsoRun->level=embeddingLevel; 712 pLastIsoRun->level=embeddingLevel;
706 pLastIsoRun->lastStrong=pLastIsoRun->contextDir=contextLevel&1; 713 pLastIsoRun->lastStrong=pLastIsoRun->lastBase=pLastIsoRun->contextDir=contextLevel&1;
707 pLastIsoRun->lastStrongPos=pLastIsoRun->contextPos=lastCcPos; 714 pLastIsoRun->contextPos=lastCcPos;
708 } 715 }
709 716
710 /* LRI or RLI */ 717 /* LRI or RLI */
711 static void 718 static void
712 bracketProcessLRI_RLI(BracketData *bd, UBiDiLevel level) { 719 bracketProcessLRI_RLI(BracketData *bd, UBiDiLevel level) {
713 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast]; 720 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
714 int16_t lastLimit; 721 int16_t lastLimit;
722 pLastIsoRun->lastBase=ON;
715 lastLimit=pLastIsoRun->limit; 723 lastLimit=pLastIsoRun->limit;
716 bd->isoRunLast++; 724 bd->isoRunLast++;
717 pLastIsoRun++; 725 pLastIsoRun++;
718 pLastIsoRun->start=pLastIsoRun->limit=lastLimit; 726 pLastIsoRun->start=pLastIsoRun->limit=lastLimit;
719 pLastIsoRun->level=level; 727 pLastIsoRun->level=level;
720 pLastIsoRun->lastStrong=pLastIsoRun->contextDir=level&1; 728 pLastIsoRun->lastStrong=pLastIsoRun->lastBase=pLastIsoRun->contextDir=level&1;
721 pLastIsoRun->lastStrongPos=pLastIsoRun->contextPos=0; 729 pLastIsoRun->contextPos=0;
722 } 730 }
723 731
724 /* PDI */ 732 /* PDI */
725 static void 733 static void
726 bracketProcessPDI(BracketData *bd) { 734 bracketProcessPDI(BracketData *bd) {
735 IsoRun *pLastIsoRun;
727 bd->isoRunLast--; 736 bd->isoRunLast--;
737 pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
738 pLastIsoRun->lastBase=ON;
728 } 739 }
729 740
730 /* newly found opening bracket: create an openings entry */ 741 /* newly found opening bracket: create an openings entry */
731 static UBool /* return TRUE if success */ 742 static UBool /* return TRUE if success */
732 bracketAddOpening(BracketData *bd, UChar match, int32_t position) { 743 bracketAddOpening(BracketData *bd, UChar match, int32_t position) {
733 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast]; 744 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
734 Opening *pOpening; 745 Opening *pOpening;
735 if(pLastIsoRun->limit>=bd->openingsCount) { /* no available new entry */ 746 if(pLastIsoRun->limit>=bd->openingsCount) { /* no available new entry */
736 UBiDi *pBiDi=bd->pBiDi; 747 UBiDi *pBiDi=bd->pBiDi;
737 if(!getInitialOpeningsMemory(pBiDi, pLastIsoRun->limit * 2)) 748 if(!getInitialOpeningsMemory(pBiDi, pLastIsoRun->limit * 2))
738 return FALSE; 749 return FALSE;
739 if(bd->openings==bd->simpleOpenings) 750 if(bd->openings==bd->simpleOpenings)
740 uprv_memcpy(pBiDi->openingsMemory, bd->simpleOpenings, 751 uprv_memcpy(pBiDi->openingsMemory, bd->simpleOpenings,
741 SIMPLE_OPENINGS_SIZE * sizeof(Opening)); 752 SIMPLE_OPENINGS_COUNT * sizeof(Opening));
742 bd->openings=pBiDi->openingsMemory; /* may have changed */ 753 bd->openings=pBiDi->openingsMemory; /* may have changed */
743 bd->openingsCount=pBiDi->openingsSize / sizeof(Opening); 754 bd->openingsCount=pBiDi->openingsSize / sizeof(Opening);
744 } 755 }
745 pOpening=&bd->openings[pLastIsoRun->limit]; 756 pOpening=&bd->openings[pLastIsoRun->limit];
746 pOpening->position=position; 757 pOpening->position=position;
747 pOpening->match=match; 758 pOpening->match=match;
748 pOpening->contextDir=pLastIsoRun->contextDir; 759 pOpening->contextDir=pLastIsoRun->contextDir;
749 pOpening->contextPos=pLastIsoRun->contextPos; 760 pOpening->contextPos=pLastIsoRun->contextPos;
750 pOpening->flags=0; 761 pOpening->flags=0;
751 pLastIsoRun->limit++; 762 pLastIsoRun->limit++;
(...skipping 11 matching lines...)
763 for(k=openingIndex+1, qOpening=&bd->openings[k]; k<pLastIsoRun->limit; k++, qOpening++) { 774 for(k=openingIndex+1, qOpening=&bd->openings[k]; k<pLastIsoRun->limit; k++, qOpening++) {
764 if(qOpening->match>=0) /* not an N0c match */ 775 if(qOpening->match>=0) /* not an N0c match */
765 continue; 776 continue;
766 if(newPropPosition<qOpening->contextPos) 777 if(newPropPosition<qOpening->contextPos)
767 break; 778 break;
768 if(newPropPosition>=qOpening->position) 779 if(newPropPosition>=qOpening->position)
769 continue; 780 continue;
770 if(newProp==qOpening->contextDir) 781 if(newProp==qOpening->contextDir)
771 break; 782 break;
772 openingPosition=qOpening->position; 783 openingPosition=qOpening->position;
773 dirProps[openingPosition]=dirProps[newPropPosition]; 784 dirProps[openingPosition]=newProp;
774 closingPosition=-(qOpening->match); 785 closingPosition=-(qOpening->match);
775 dirProps[closingPosition]= newProp; /* can never be AL */ 786 dirProps[closingPosition]=newProp;
776 qOpening->match=0; /* prevent further changes */ 787 qOpening->match=0; /* prevent further changes */
777 fixN0c(bd, k, openingPosition, newProp); 788 fixN0c(bd, k, openingPosition, newProp);
778 fixN0c(bd, k, closingPosition, newProp); 789 fixN0c(bd, k, closingPosition, newProp);
779 } 790 }
780 } 791 }
781 792
793 /* process closing bracket */
794 static DirProp /* return L or R if N0b or N0c, ON if N0d */
795 bracketProcessClosing(BracketData *bd, int32_t openIdx, int32_t position) {
796 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
797 Opening *pOpening, *qOpening;
798 UBiDiDirection direction;
799 UBool stable;
800 DirProp newProp;
801 pOpening=&bd->openings[openIdx];
802 direction=pLastIsoRun->level&1;
803 stable=TRUE; /* assume stable until proved otherwise */
804
805 /* The stable flag is set when brackets are paired and their
806 level is resolved and cannot be changed by what will be
807 found later in the source string.
808 An unstable match can occur only when applying N0c, where
809 the resolved level depends on the preceding context, and
810 this context may be affected by text occurring later.
811 Example: RTL paragraph containing: abc[(latin) HEBREW]
812 When the closing parenthesis is encountered, it appears
813 that N0c1 must be applied since 'abc' sets an opposite
814 direction context and both parentheses receive level 2.
815 However, when the closing square bracket is processed,
816 N0b applies because of 'HEBREW' being included within the
817 brackets, thus the square brackets are treated like R and
818 receive level 1. However, this changes the preceding
819 context of the opening parenthesis, and it now appears
820 that N0c2 must be applied to the parentheses rather than
821 N0c1. */
822
823 if((direction==0 && pOpening->flags&FOUND_L) ||
824 (direction==1 && pOpening->flags&FOUND_R)) { /* N0b */
825 newProp=direction;
826 }
827 else if(pOpening->flags&(FOUND_L|FOUND_R)) { /* N0c */
828 /* it is stable if there is no containing pair or in
829 conditions too complicated and not worth checking */
830 stable=(openIdx==pLastIsoRun->start);
831 if(direction!=pOpening->contextDir)
832 newProp=pOpening->contextDir; /* N0c1 */
833 else
834 newProp=direction; /* N0c2 */
835 } else {
836 /* forget this and any brackets nested within this pair */
837 pLastIsoRun->limit=openIdx;
838 return ON; /* N0d */
839 }
840 bd->pBiDi->dirProps[pOpening->position]=newProp;
841 bd->pBiDi->dirProps[position]=newProp;
842 /* Update nested N0c pairs that may be affected */
843 fixN0c(bd, openIdx, pOpening->position, newProp);
844 if(stable) {
845 pLastIsoRun->limit=openIdx; /* forget any brackets nested within this pair */
846 /* remove lower located synonyms if any */
847 while(pLastIsoRun->limit>pLastIsoRun->start &&
848 bd->openings[pLastIsoRun->limit-1].position==pOpening->position)
849 pLastIsoRun->limit--;
850 } else {
851 int32_t k;
852 pOpening->match=-position;
853 /* neutralize lower located synonyms if any */
854 k=openIdx-1;
855 while(k>=pLastIsoRun->start &&
856 bd->openings[k].position==pOpening->position)
857 bd->openings[k--].match=0;
858 /* neutralize any unmatched opening between the current pair;
859 this will also neutralize higher located synonyms if any */
860 for(k=openIdx+1; k<pLastIsoRun->limit; k++) {
861 qOpening=&bd->openings[k];
862 if(qOpening->position>=position)
863 break;
864 if(qOpening->match>0)
865 qOpening->match=0;
866 }
867 }
868 return newProp;
869 }
870
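The decision at the core of bracketProcessClosing() above, choosing between N0b, N0c1, N0c2, and N0d, can be isolated into a small pure function. The sketch below mirrors the branches shown in the new code but leaves out the stability tracking and the openings stack. The names and constant values (`SK_L`, `SK_R`, `SK_ON`, `SK_FOUND_L`, `SK_FOUND_R`, `sketch_resolve_pair()`) are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-ins; the numeric values are assumptions. */
enum { SK_L = 0, SK_R = 1, SK_ON = 10 };
#define SK_FOUND_L 1u   /* a strong L was seen inside the bracket pair */
#define SK_FOUND_R 2u   /* a strong R was seen inside the bracket pair */

/* Resolve the type of a matched bracket pair.
 *   embeddingDir: direction of the current embedding level (SK_L or SK_R)
 *   flags:        strong types found between the brackets
 *   contextDir:   direction established before the opening bracket
 * Returns SK_L or SK_R when N0b or N0c applies, SK_ON for N0d. */
static int sketch_resolve_pair(int embeddingDir, unsigned flags, int contextDir) {
    unsigned sameDirFlag;
    assert(embeddingDir == SK_L || embeddingDir == SK_R);
    sameDirFlag = (embeddingDir == SK_L) ? SK_FOUND_L : SK_FOUND_R;
    if (flags & sameDirFlag)
        return embeddingDir;                /* N0b: strong type matching the embedding */
    if (flags & (SK_FOUND_L | SK_FOUND_R)) {
        if (embeddingDir != contextDir)
            return contextDir;              /* N0c1: opposite context before the pair */
        return embeddingDir;                /* N0c2: otherwise use the embedding direction */
    }
    return SK_ON;                           /* N0d: no strong type inside, leave neutral */
}
```

In the example from the comment above (RTL paragraph, `abc[(latin) HEBREW]`), the parentheses first resolve through N0c1 to L because `abc` sets an opposite context. Once the square brackets resolve to R through N0b, the context before the opening parenthesis changes, and the same function returns R through N0c2.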
782 /* handle strong characters, digits and candidates for closing brackets */ 871 /* handle strong characters, digits and candidates for closing brackets */
783 static UBool /* return TRUE if success */ 872 static UBool /* return TRUE if success */
784 bracketProcessChar(BracketData *bd, int32_t position, DirProp dirProp) { 873 bracketProcessChar(BracketData *bd, int32_t position) {
785 IsoRun *pLastIsoRun; 874 IsoRun *pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
786 Opening *pOpening, *qOpening; 875 DirProp *dirProps, dirProp, newProp;
787 DirProp *dirProps, newProp; 876 UBiDiLevel level;
788 UBiDiDirection direction;
789 uint16_t flag;
790 int32_t i, k;
791 UBool stable;
792 UChar c, match;
793 dirProps=bd->pBiDi->dirProps; 877 dirProps=bd->pBiDi->dirProps;
794 if(DIRPROP_FLAG(dirProp)&MASK_STRONG_EN_AN) { /* L, R, AL, EN or AN */ 878 dirProp=dirProps[position];
795 pLastIsoRun=&bd->isoRuns[bd->isoRunLast]; 879 if(dirProp==ON) {
796 /* AN after R or AL becomes R or AL; after L or L+AN, it is kept as-is */ 880 UChar c, match;
797 if(dirProp==AN && (pLastIsoRun->lastStrong==R || pLastIsoRun->lastStrong==AL)) 881 int32_t idx;
798 dirProp=pLastIsoRun->lastStrong; 882 /* First see if it is a matching closing bracket. Hopefully, this is
799 /* EN after L or L+AN becomes L; after R or AL, it becomes R or AL */ 883 more efficient than checking if it is a closing bracket at all */
800 if(dirProp==EN) { 884 c=bd->pBiDi->text[position];
801 if(pLastIsoRun->lastStrong==L || pLastIsoRun->lastStrong==AN) { 885 for(idx=pLastIsoRun->limit-1; idx>=pLastIsoRun->start; idx--) {
802 dirProp=L; 886 if(bd->openings[idx].match!=c)
803 if(!bd->isNumbersSpecial) 887 continue;
804 dirProps[position]=ENL; 888 /* We have a match */
805 } 889 newProp=bracketProcessClosing(bd, idx, position);
806 else { 890 if(newProp==ON) { /* N0d */
807 dirProp=pLastIsoRun->lastStrong; /* may be R or AL */ 891 c=0; /* prevent handling as an opening */
808 if(!bd->isNumbersSpecial) 892 break;
809 dirProps[position]= dirProp==AL ? AN : ENR; 893 }
810 } 894 pLastIsoRun->lastBase=ON;
811 }
812 pLastIsoRun->lastStrong=dirProp;
813 pLastIsoRun->contextDir=DIR_FROM_STRONG(dirProp);
814 pLastIsoRun->lastStrongPos=pLastIsoRun->contextPos=position;
815 if(dirProp==AL || dirProp==AN)
816 dirProp=R;
817 flag=DIRPROP_FLAG(dirProp);
818 /* strong characters found after an unmatched opening bracket
819 must be noted for possibly applying N0b */
820 for(i=pLastIsoRun->start; i<pLastIsoRun->limit; i++)
821 bd->openings[i].flags|=flag;
822 return TRUE;
823 }
824 if(dirProp!=ON)
825 return TRUE;
826 /* First see if it is a matching closing bracket. Hopefully, this is more
827 efficient than checking if it is a closing bracket at all */
828 c=bd->pBiDi->text[position];
829 pLastIsoRun=&bd->isoRuns[bd->isoRunLast];
830 for(i=pLastIsoRun->limit-1; i>=pLastIsoRun->start; i--) {
831 if(bd->openings[i].match!=c)
832 continue;
833 /* We have a match */
834 pOpening=&bd->openings[i];
835 direction=pLastIsoRun->level&1;
836 stable=TRUE; /* assume stable until proved otherwise */
837
838 /* The stable flag is set when brackets are paired and their
839 level is resolved and cannot be changed by what will be
840 found later in the source string.
841 An unstable match can occur only when applying N0c, where
842 the resolved level depends on the preceding context, and
843 this context may be affected by text occurring later.
844 Example: RTL paragraph containing: abc[(latin) HEBREW]
845 When the closing parenthesis is encountered, it appears
846 that N0c1 must be applied since 'abc' sets an opposite
847 direction context and both parentheses receive level 2.
848 However, when the closing square bracket is processed,
849 N0b applies because of 'HEBREW' being included within the
850 brackets, thus the square brackets are treated like R and
851 receive level 1. However, this changes the preceding
852 context of the opening parenthesis, and it now appears
853 that N0c2 must be applied to the parentheses rather than
854 N0c1. */
855
856 if((direction==0 && pOpening->flags&FOUND_L) ||
857 (direction==1 && pOpening->flags&FOUND_R)) { /* N0b */
858 newProp=direction;
859 }
860 else if(pOpening->flags&(FOUND_L|FOUND_R)) { /* N0c */
861 if(direction!=pOpening->contextDir) {
862 newProp=pOpening->contextDir; /* N0c1 */
863 /* it is stable if there is no preceding text or in
864 conditions too complicated and not worth checking */
865 stable=(i==pLastIsoRun->start);
866 }
867 else
868 newProp=direction; /* N0c2 */
869 }
870 else {
871 newProp=BN; /* N0d */
872 }
873 if(newProp!=BN) {
874 dirProps[pOpening->position]=newProp;
875 dirProps[position]=newProp;
876 pLastIsoRun->contextDir=newProp; 895 pLastIsoRun->contextDir=newProp;
877 pLastIsoRun->contextPos=position; 896 pLastIsoRun->contextPos=position;
878 } 897 level=bd->pBiDi->levels[position];
879 /* Update nested N0c pairs that may be affected */ 898 if(level&UBIDI_LEVEL_OVERRIDE) { /* X4, X5 */
880 if(newProp==direction) 899 uint16_t flag;
881 fixN0c(bd, i, pOpening->position, newProp); 900 int32_t i;
882 if(stable) { 901 newProp=level&1;
883 pLastIsoRun->limit=i; /* forget any brackets nested within this pair */ 902 pLastIsoRun->lastStrong=newProp;
884 /* remove lower located synonyms if any */ 903 flag=DIRPROP_FLAG(newProp);
885 while(pLastIsoRun->limit>pLastIsoRun->start && 904 for(i=pLastIsoRun->start; i<idx; i++)
886 bd->openings[pLastIsoRun->limit-1].position==pOpening->position) 905 bd->openings[i].flags|=flag;
887 pLastIsoRun->limit--; 906 /* matching brackets are not overridden by LRO/RLO */
907 bd->pBiDi->levels[position]&=~UBIDI_LEVEL_OVERRIDE;
908 }
909 /* matching brackets are not overridden by LRO/RLO */
910 bd->pBiDi->levels[bd->openings[idx].position]&=~UBIDI_LEVEL_OVERRIDE;
911 return TRUE;
912 }
913 /* We get here only if the ON character is not a matching closing
914 bracket or it is a case of N0d */
915 /* Now see if it is an opening bracket */
916 if(c)
917 match=u_getBidiPairedBracket(c); /* get the matching char */
918 else
919 match=0;
920 if(match!=c && /* has a matching char */
921 ubidi_getPairedBracketType(bd->pBiDi->bdp, c)==U_BPT_OPEN) { /* opening bracket */
922 /* special case: process synonyms
923 create an opening entry for each synonym */
924 if(match==0x232A) { /* RIGHT-POINTING ANGLE BRACKET */
925 if(!bracketAddOpening(bd, 0x3009, position))
926 return FALSE;
927 }
928 else if(match==0x3009) { /* RIGHT ANGLE BRACKET */
929 if(!bracketAddOpening(bd, 0x232A, position))
930 return FALSE;
931 }
932 if(!bracketAddOpening(bd, match, position))
933 return FALSE;
934 }
935 }
936 level=bd->pBiDi->levels[position];
937 if(level&UBIDI_LEVEL_OVERRIDE) { /* X4, X5 */
938 newProp=level&1;
939 if(dirProp!=S && dirProp!=WS && dirProp!=ON)
940 dirProps[position]=newProp;
941 pLastIsoRun->lastBase=newProp;
942 pLastIsoRun->lastStrong=newProp;
943 pLastIsoRun->contextDir=newProp;
944 pLastIsoRun->contextPos=position;
945 }
946 else if(dirProp<=R || dirProp==AL) {
947 newProp=DIR_FROM_STRONG(dirProp);
948 pLastIsoRun->lastBase=dirProp;
949 pLastIsoRun->lastStrong=dirProp;
950 pLastIsoRun->contextDir=newProp;
951 pLastIsoRun->contextPos=position;
952 }
953 else if(dirProp==EN) {
954 pLastIsoRun->lastBase=EN;
955 if(pLastIsoRun->lastStrong==L) {
956 newProp=L; /* W7 */
957 if(!bd->isNumbersSpecial)
958 dirProps[position]=ENL;
959 pLastIsoRun->contextDir=L;
960 pLastIsoRun->contextPos=position;
888 } 961 }
889 else { 962 else {
890 pOpening->match=-position; 963 newProp=R; /* N0 */
891 /* neutralize lower located synonyms if any */ 964 if(pLastIsoRun->lastStrong==AL)
892 k=i-1; 965 dirProps[position]=AN; /* W2 */
893 while(k>=pLastIsoRun->start && 966 else
894 bd->openings[k].position==pOpening->position) 967 dirProps[position]=ENR;
895 bd->openings[k--].match=0; 968 pLastIsoRun->contextDir=R;
896 /* neutralize any unmatched opening between the current pair; 969 pLastIsoRun->contextPos=position;
897 this will also neutralize higher located synonyms if any */ 970 }
898 for(k=i+1; k<pLastIsoRun->limit; k++) { 971 }
899 qOpening=&bd->openings[k]; 972 else if(dirProp==AN) {
900 if(qOpening->position>=position) 973 newProp=R; /* N0 */
901 break; 974 pLastIsoRun->lastBase=AN;
902 if(qOpening->match>0) 975 pLastIsoRun->contextDir=R;
903 qOpening->match=0; 976 pLastIsoRun->contextPos=position;
904 } 977 }
905 } 978 else if(dirProp==NSM) {
906 return TRUE; 979 /* if the last real char was ON, change NSM to ON so that it
907 } 980 will stay ON even if the last real char is a bracket which
908 /* We get here only if the ON character was not a matching closing bracket */ 981 may be changed to L or R */
909 /* Now see if it is an opening bracket */ 982 newProp=pLastIsoRun->lastBase;
910 match=u_getBidiPairedBracket(c); /* get the matching char */ 983 if(newProp==ON)
911 if(match==c) /* if no matching char */ 984 dirProps[position]=newProp;
912 return TRUE; 985 }
913 if(ubidi_getPairedBracketType(bd->pBiDi->bdp, c)!=U_BPT_OPEN) 986 else {
914 return TRUE; /* not an opening bracket */ 987 newProp=dirProp;
915 /* special case: process synonyms 988 pLastIsoRun->lastBase=dirProp;
916 create an opening entry for each synonym */ 989 }
917 if(match==0x232A) { /* RIGHT-POINTING ANGLE BRACKET */ 990 if(newProp<=R || newProp==AL) {
918 if(!bracketAddOpening(bd, 0x3009, position)) 991 int32_t i;
919 return FALSE; 992 uint16_t flag=DIRPROP_FLAG(DIR_FROM_STRONG(newProp));
920 } 993 for(i=pLastIsoRun->start; i<pLastIsoRun->limit; i++)
921 else if(match==0x3009) { /* RIGHT ANGLE BRACKET */ 994 if(position>bd->openings[i].position)
922 if(!bracketAddOpening(bd, 0x232A, position)) 995 bd->openings[i].flags|=flag;
923 return FALSE; 996 }
924 } 997 return TRUE;
925 return bracketAddOpening(bd, match, position);
926 } 998 }
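Reviewer note: the pair-resolution branch on the old side (old lines 856-872) is easier to follow in isolation. Below is a minimal sketch of that decision only, not ICU's code. The names `sk_resolve_pair`, `SK_FOUND_L`, `SK_FOUND_R` and `SK_NONE` are hypothetical stand-ins for the internal `FOUND_L`/`FOUND_R` flags and the N0d "leave neutral" outcome. It ignores the `stable` bookkeeping and the synonym handling.

```c
#include <assert.h>

/* Hypothetical stand-ins for ICU's internal values (assumptions, not the
   real definitions): directions are 0 for L and 1 for R. */
enum { SK_L = 0, SK_R = 1, SK_NONE = -1 };
enum { SK_FOUND_L = 1 << 0, SK_FOUND_R = 1 << 1 };

/*
 * Decide the direction given to a matched bracket pair.
 *   embeddingDir: direction of the embedding level (level & 1)
 *   flags:        strong types seen between the two brackets
 *   contextDir:   direction of the strong context before the opening bracket
 * Returns SK_L or SK_R, or SK_NONE when the pair stays neutral (N0d).
 */
int sk_resolve_pair(int embeddingDir, unsigned flags, int contextDir) {
    if ((embeddingDir == SK_L && (flags & SK_FOUND_L)) ||
        (embeddingDir == SK_R && (flags & SK_FOUND_R))) {
        return embeddingDir;                 /* N0b: strong type matches embedding */
    }
    if (flags & (SK_FOUND_L | SK_FOUND_R)) { /* only the opposite type inside */
        if (contextDir != embeddingDir) {
            return contextDir;               /* N0c1: follow preceding context */
        }
        return embeddingDir;                 /* N0c2: fall back to embedding */
    }
    return SK_NONE;                          /* N0d: no strong type inside */
}

/* Usage example: an LTR run with only R text inside the brackets. */
void sk_resolve_pair_examples(void) {
    assert(sk_resolve_pair(SK_L, SK_FOUND_R, SK_R) == SK_R); /* N0c1 */
    assert(sk_resolve_pair(SK_L, SK_FOUND_R, SK_L) == SK_L); /* N0c2 */
}
```

In the patch this decision moves into `bracketProcessClosing()`, which the new side calls at line 889; the sketch only mirrors the old inline form.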
927 999
928 /* perform (X1)..(X9) ------------------------------------------------------- */ 1000 /* perform (X1)..(X9) ------------------------------------------------------- */
929 1001
930 /* determine if the text is mixed-directional or single-directional */ 1002 /* determine if the text is mixed-directional or single-directional */
931 static UBiDiDirection 1003 static UBiDiDirection
932 directionFromFlags(UBiDi *pBiDi) { 1004 directionFromFlags(UBiDi *pBiDi) {
933 Flags flags=pBiDi->flags; 1005 Flags flags=pBiDi->flags;
934 /* if the text contains AN and neutrals, then some neutrals may become RTL */ 1006 /* if the text contains AN and neutrals, then some neutrals may become RTL */
935 if(!(flags&MASK_RTL || ((flags&DIRPROP_FLAG(AN)) && (flags&MASK_POSSIBLE_N)))) { 1007 if(!(flags&MASK_RTL || ((flags&DIRPROP_FLAG(AN)) && (flags&MASK_POSSIBLE_N)))) {
(...skipping 37 matching lines...)
973 * on the other hand, this saves another loop to reset these codes, 1045 * on the other hand, this saves another loop to reset these codes,
974 * or saves making and modifying a copy of dirProps[]. 1046 * or saves making and modifying a copy of dirProps[].
975 * 1047 *
976 * 1048 *
977 * Note that (Pn) and (Xn) changed significantly from version 4 of the BiDi algorithm. 1049 * Note that (Pn) and (Xn) changed significantly from version 4 of the BiDi algorithm.
978 * 1050 *
979 * 1051 *
980 * Handling the stack of explicit levels (Xn): 1052 * Handling the stack of explicit levels (Xn):
981 * 1053 *
982 * With the BiDi stack of explicit levels, as pushed with each 1054 * With the BiDi stack of explicit levels, as pushed with each
983 * LRE, RLE, LRO, RLO, LRI, RLI and FSO and popped with each PDF and PDI, 1055 * LRE, RLE, LRO, RLO, LRI, RLI and FSI and popped with each PDF and PDI,
984 * the explicit level must never exceed UBIDI_MAX_EXPLICIT_LEVEL. 1056 * the explicit level must never exceed UBIDI_MAX_EXPLICIT_LEVEL.
985 * 1057 *
986 * In order to have a correct push-pop semantics even in the case of overflows, 1058 * In order to have a correct push-pop semantics even in the case of overflows,
987 * overflow counters and a valid isolate counter are used as described in UAX#9 1059 * overflow counters and a valid isolate counter are used as described in UAX#9
988 * section 3.3.2 "Explicit Levels and Directions". 1060 * section 3.3.2 "Explicit Levels and Directions".
989 * 1061 *
990 * This implementation assumes that UBIDI_MAX_EXPLICIT_LEVEL is odd. 1062 * This implementation assumes that UBIDI_MAX_EXPLICIT_LEVEL is odd.
1063 *
1064 * Returns normally the direction; -1 if there was a memory shortage
1065 *
991 */ 1066 */
992 static UBiDiDirection 1067 static UBiDiDirection
993 resolveExplicitLevels(UBiDi *pBiDi, UErrorCode *pErrorCode) { 1068 resolveExplicitLevels(UBiDi *pBiDi, UErrorCode *pErrorCode) {
994 DirProp *dirProps=pBiDi->dirProps; 1069 DirProp *dirProps=pBiDi->dirProps;
995 UBiDiLevel *levels=pBiDi->levels; 1070 UBiDiLevel *levels=pBiDi->levels;
996 const UChar *text=pBiDi->text; 1071 const UChar *text=pBiDi->text;
997 1072
998 int32_t i=0, length=pBiDi->length; 1073 int32_t i=0, length=pBiDi->length;
999 Flags flags=pBiDi->flags; /* collect all directionalities in the text */ 1074 Flags flags=pBiDi->flags; /* collect all directionalities in the text */
1000 DirProp dirProp; 1075 DirProp dirProp;
(...skipping 36 matching lines...)
1037 for(paraIndex=0; paraIndex<pBiDi->paraCount; paraIndex++) { 1112 for(paraIndex=0; paraIndex<pBiDi->paraCount; paraIndex++) {
1038 if(paraIndex==0) 1113 if(paraIndex==0)
1039 start=0; 1114 start=0;
1040 else 1115 else
1041 start=pBiDi->paras[paraIndex-1].limit; 1116 start=pBiDi->paras[paraIndex-1].limit;
1042 limit=pBiDi->paras[paraIndex].limit; 1117 limit=pBiDi->paras[paraIndex].limit;
1043 level=pBiDi->paras[paraIndex].level; 1118 level=pBiDi->paras[paraIndex].level;
1044 for(i=start; i<limit; i++) { 1119 for(i=start; i<limit; i++) {
1045 levels[i]=level; 1120 levels[i]=level;
1046 dirProp=dirProps[i]; 1121 dirProp=dirProps[i];
1122 if(dirProp==BN)
1123 continue;
1047 if(dirProp==B) { 1124 if(dirProp==B) {
1048 if((i+1)<length) { 1125 if((i+1)<length) {
1049 if(text[i]==CR && text[i+1]==LF) 1126 if(text[i]==CR && text[i+1]==LF)
1050 continue; /* skip CR when followed by LF */ 1127 continue; /* skip CR when followed by LF */
1051 bracketProcessB(&bracketData, level); 1128 bracketProcessB(&bracketData, level);
1052 } 1129 }
1053 continue; 1130 continue;
1054 } 1131 }
1055 if(!bracketProcessChar(&bracketData, i, dirProp)) { 1132 if(!bracketProcessChar(&bracketData, i)) {
1056 *pErrorCode=U_MEMORY_ALLOCATION_ERROR; 1133 *pErrorCode=U_MEMORY_ALLOCATION_ERROR;
1057 return UBIDI_LTR; 1134 return UBIDI_LTR;
1058 } 1135 }
1059 } 1136 }
1060 } 1137 }
1061 return direction; 1138 return direction;
1062 } 1139 }
1063 { 1140 {
1064 /* continue to perform (Xn) */ 1141 /* continue to perform (Xn) */
1065 1142
1066 /* (X1) level is set for all codes, embeddingLevel keeps track of the push/pop operations */ 1143 /* (X1) level is set for all codes, embeddingLevel keeps track of the push/pop operations */
1067 /* both variables may carry the UBIDI_LEVEL_OVERRIDE flag to indicate the override status */ 1144 /* both variables may carry the UBIDI_LEVEL_OVERRIDE flag to indicate the override status */
1068 UBiDiLevel embeddingLevel=level, newLevel; 1145 UBiDiLevel embeddingLevel=level, newLevel;
1069 UBiDiLevel previousLevel=level; /* previous level for regular (not CC) characters */ 1146 UBiDiLevel previousLevel=level; /* previous level for regular (not CC) characters */
1070 int32_t lastCcPos=0; /* index of last effective LRx,RLx, PDx */ 1147 int32_t lastCcPos=0; /* index of last effective LRx,RLx, PDx */
1071 1148
1149 /* The following stack remembers the embedding level and the ISOLATE flag of level runs.
1150 stackLast points to its current entry. */
1072 uint16_t stack[UBIDI_MAX_EXPLICIT_LEVEL+2]; /* we never push anything >=UBIDI_MAX_EXPLICIT_LEVEL 1151 uint16_t stack[UBIDI_MAX_EXPLICIT_LEVEL+2]; /* we never push anything >=UBIDI_MAX_EXPLICIT_LEVEL
1073 but we need one more entry as base */ 1152 but we need one more entry as base */
1074 uint32_t stackLast=0; 1153 uint32_t stackLast=0;
1075 int32_t overflowIsolateCount=0; 1154 int32_t overflowIsolateCount=0;
1076 int32_t overflowEmbeddingCount=0; 1155 int32_t overflowEmbeddingCount=0;
1077 int32_t validIsolateCount=0; 1156 int32_t validIsolateCount=0;
1078 BracketData bracketData; 1157 BracketData bracketData;
1079 bracketInit(pBiDi, &bracketData); 1158 bracketInit(pBiDi, &bracketData);
1080 stack[0]=level; /* initialize base entry to para level, no override, no isolate */ 1159 stack[0]=level; /* initialize base entry to para level, no override, no isolate */
1081 1160
1082 /* recalculate the flags */ 1161 /* recalculate the flags */
1083 flags=0; 1162 flags=0;
1084 1163
1085 for(i=0; i<length; ++i) { 1164 for(i=0; i<length; ++i) {
1086 dirProp=dirProps[i]; 1165 dirProp=dirProps[i];
1087 switch(dirProp) { 1166 switch(dirProp) {
1088 case LRE: 1167 case LRE:
1089 case RLE: 1168 case RLE:
1090 case LRO: 1169 case LRO:
1091 case RLO: 1170 case RLO:
1092 /* (X2, X3, X4, X5) */ 1171 /* (X2, X3, X4, X5) */
1093 flags|=DIRPROP_FLAG(BN); 1172 flags|=DIRPROP_FLAG(BN);
1173 levels[i]=previousLevel;
1094 if (dirProp==LRE || dirProp==LRO) 1174 if (dirProp==LRE || dirProp==LRO)
1095 newLevel=(UBiDiLevel)((embeddingLevel+2)&~(UBIDI_LEVEL_OVERRIDE|1)); /* least greater even level */ 1175 /* least greater even level */
1176 newLevel=(UBiDiLevel)((embeddingLevel+2)&~(UBIDI_LEVEL_OVERRIDE|1));
1096 else 1177 else
1097 newLevel=(UBiDiLevel)(((embeddingLevel&~UBIDI_LEVEL_OVERRIDE)+1)|1); /* least greater odd level */ 1178 /* least greater odd level */
1179 newLevel=(UBiDiLevel)((NO_OVERRIDE(embeddingLevel)+1)|1);
1098 if(newLevel<=UBIDI_MAX_EXPLICIT_LEVEL && overflowIsolateCount==0 && 1180 if(newLevel<=UBIDI_MAX_EXPLICIT_LEVEL && overflowIsolateCount==0 &&
1099 overflowEmbeddingCount==0) { 1181 overflowEmbeddingCount==0) {
1100 lastCcPos=i; 1182 lastCcPos=i;
1101 embeddingLevel=newLevel; 1183 embeddingLevel=newLevel;
1102 if(dirProp==LRO || dirProp==RLO) 1184 if(dirProp==LRO || dirProp==RLO)
1103 embeddingLevel|=UBIDI_LEVEL_OVERRIDE; 1185 embeddingLevel|=UBIDI_LEVEL_OVERRIDE;
1104 stackLast++; 1186 stackLast++;
1105 stack[stackLast]=embeddingLevel; 1187 stack[stackLast]=embeddingLevel;
1106 /* we don't need to set UBIDI_LEVEL_OVERRIDE off for LRE and RLE 1188 /* we don't need to set UBIDI_LEVEL_OVERRIDE off for LRE and RLE
1107 since this has already been done for newLevel which is 1189 since this has already been done for newLevel which is
1108 the source for embeddingLevel. 1190 the source for embeddingLevel.
1109 */ 1191 */
1110 } else { 1192 } else {
1111 dirProps[i]|=IGNORE_CC;
1112 if(overflowIsolateCount==0) 1193 if(overflowIsolateCount==0)
1113 overflowEmbeddingCount++; 1194 overflowEmbeddingCount++;
1114 } 1195 }
1115 break; 1196 break;
1116 case PDF: 1197 case PDF:
1117 /* (X7) */ 1198 /* (X7) */
1118 flags|=DIRPROP_FLAG(BN); 1199 flags|=DIRPROP_FLAG(BN);
1200 levels[i]=previousLevel;
1119 /* handle all the overflow cases first */ 1201 /* handle all the overflow cases first */
1120 if(overflowIsolateCount) { 1202 if(overflowIsolateCount) {
1121 dirProps[i]|=IGNORE_CC;
1122 break; 1203 break;
1123 } 1204 }
1124 if(overflowEmbeddingCount) { 1205 if(overflowEmbeddingCount) {
1125 dirProps[i]|=IGNORE_CC;
1126 overflowEmbeddingCount--; 1206 overflowEmbeddingCount--;
1127 break; 1207 break;
1128 } 1208 }
1129 if(stackLast>0 && stack[stackLast]<ISOLATE) { /* not an isolate entry */ 1209 if(stackLast>0 && stack[stackLast]<ISOLATE) { /* not an isolate entry */
1130 lastCcPos=i; 1210 lastCcPos=i;
1131 stackLast--; 1211 stackLast--;
1132 embeddingLevel=(UBiDiLevel)stack[stackLast]; 1212 embeddingLevel=(UBiDiLevel)stack[stackLast];
1133 } else 1213 }
1134 dirProps[i]|=IGNORE_CC;
1135 break; 1214 break;
1136 case LRI: 1215 case LRI:
1137 case RLI: 1216 case RLI:
1138 if(embeddingLevel!=previousLevel) { 1217 flags|=(DIRPROP_FLAG(ON)|DIRPROP_FLAG_LR(embeddingLevel));
1218 levels[i]=NO_OVERRIDE(embeddingLevel);
1219 if(NO_OVERRIDE(embeddingLevel)!=NO_OVERRIDE(previousLevel)) {
1139 bracketProcessBoundary(&bracketData, lastCcPos, 1220 bracketProcessBoundary(&bracketData, lastCcPos,
1140 previousLevel, embeddingLevel); 1221 previousLevel, embeddingLevel);
1141 previousLevel=embeddingLevel; 1222 flags|=DIRPROP_FLAG_MULTI_RUNS;
1142 } 1223 }
1224 previousLevel=embeddingLevel;
1143 /* (X5a, X5b) */ 1225 /* (X5a, X5b) */
1144 flags|= DIRPROP_FLAG(ON) | DIRPROP_FLAG(BN) | DIRPROP_FLAG_LR(embeddingLevel);
1145 level=embeddingLevel;
1146 if(dirProp==LRI) 1226 if(dirProp==LRI)
1147 newLevel=(UBiDiLevel)((embeddingLevel+2)&~(UBIDI_LEVEL_OVERRIDE|1)); /* least greater even level */ 1227 /* least greater even level */
1228 newLevel=(UBiDiLevel)((embeddingLevel+2)&~(UBIDI_LEVEL_OVERRIDE|1));
1148 else 1229 else
1149 newLevel=(UBiDiLevel)(((embeddingLevel&~UBIDI_LEVEL_OVERRIDE)+1)|1); /* least greater odd level */ 1230 /* least greater odd level */
1231 newLevel=(UBiDiLevel)((NO_OVERRIDE(embeddingLevel)+1)|1);
1150 if(newLevel<=UBIDI_MAX_EXPLICIT_LEVEL && overflowIsolateCount==0 && 1232 if(newLevel<=UBIDI_MAX_EXPLICIT_LEVEL && overflowIsolateCount==0 &&
1151 overflowEmbeddingCount==0) { 1233 overflowEmbeddingCount==0) {
1234 flags|=DIRPROP_FLAG(dirProp);
1152 lastCcPos=i; 1235 lastCcPos=i;
1153 previousLevel=embeddingLevel;
1154 validIsolateCount++; 1236 validIsolateCount++;
1155 if(validIsolateCount>pBiDi->isolateCount) 1237 if(validIsolateCount>pBiDi->isolateCount)
1156 pBiDi->isolateCount=validIsolateCount; 1238 pBiDi->isolateCount=validIsolateCount;
1157 embeddingLevel=newLevel; 1239 embeddingLevel=newLevel;
1240 /* we can increment stackLast without checking because newLevel
1241 will exceed UBIDI_MAX_EXPLICIT_LEVEL before stackLast overflows */
1158 stackLast++; 1242 stackLast++;
1159 stack[stackLast]=embeddingLevel+ISOLATE; 1243 stack[stackLast]=embeddingLevel+ISOLATE;
1160 bracketProcessLRI_RLI(&bracketData, embeddingLevel); 1244 bracketProcessLRI_RLI(&bracketData, embeddingLevel);
1161 } else { 1245 } else {
1162 dirProps[i]|=IGNORE_CC; 1246 /* make it WS so that it is handled by adjustWSLevels() */
1247 dirProps[i]=WS;
1163 overflowIsolateCount++; 1248 overflowIsolateCount++;
1164 } 1249 }
1165 break; 1250 break;
1166 case PDI: 1251 case PDI:
1167 if(embeddingLevel!=previousLevel) { 1252 if(NO_OVERRIDE(embeddingLevel)!=NO_OVERRIDE(previousLevel)) {
1168 bracketProcessBoundary(&bracketData, lastCcPos, 1253 bracketProcessBoundary(&bracketData, lastCcPos,
1169 previousLevel, embeddingLevel); 1254 previousLevel, embeddingLevel);
1255 flags|=DIRPROP_FLAG_MULTI_RUNS;
1170 } 1256 }
1171 /* (X6a) */ 1257 /* (X6a) */
1172 if(overflowIsolateCount) { 1258 if(overflowIsolateCount) {
1173 dirProps[i]|=IGNORE_CC;
1174 overflowIsolateCount--; 1259 overflowIsolateCount--;
1260 /* make it WS so that it is handled by adjustWSLevels() */
1261 dirProps[i]=WS;
1175 } 1262 }
1176 else if(validIsolateCount) { 1263 else if(validIsolateCount) {
1264 flags|=DIRPROP_FLAG(PDI);
1177 lastCcPos=i; 1265 lastCcPos=i;
1178 overflowEmbeddingCount=0; 1266 overflowEmbeddingCount=0;
1179 while(stack[stackLast]<ISOLATE) /* pop embedding entries */ 1267 while(stack[stackLast]<ISOLATE) /* pop embedding entries */
1180 stackLast--; /* until the last isolate entry */ 1268 stackLast--; /* until the last isolate entry */
1181 stackLast--; /* pop also the last isolate entry */ 1269 stackLast--; /* pop also the last isolate entry */
1182 validIsolateCount--; 1270 validIsolateCount--;
1183 bracketProcessPDI(&bracketData); 1271 bracketProcessPDI(&bracketData);
1184 } else 1272 } else
1185 dirProps[i]|=IGNORE_CC; 1273 /* make it WS so that it is handled by adjustWSLevels() */
1274 dirProps[i]=WS;
1186 embeddingLevel=(UBiDiLevel)stack[stackLast]&~ISOLATE; 1275 embeddingLevel=(UBiDiLevel)stack[stackLast]&~ISOLATE;
1187 previousLevel=level=embeddingLevel; 1276 flags|=(DIRPROP_FLAG(ON)|DIRPROP_FLAG_LR(embeddingLevel));
1188 flags|= DIRPROP_FLAG(ON) | DIRPROP_FLAG(BN) | DIRPROP_FLAG_LR(embeddingLevel); 1277 previousLevel=embeddingLevel;
1278 levels[i]=NO_OVERRIDE(embeddingLevel);
1189 break; 1279 break;
1190 case B: 1280 case B:
1191 level=GET_PARALEVEL(pBiDi, i); 1281 flags|=DIRPROP_FLAG(B);
1282 levels[i]=GET_PARALEVEL(pBiDi, i);
1192 if((i+1)<length) { 1283 if((i+1)<length) {
1193 if(text[i]==CR && text[i+1]==LF) 1284 if(text[i]==CR && text[i+1]==LF)
1194 break; /* skip CR when followed by LF */ 1285 break; /* skip CR when followed by LF */
1195 overflowEmbeddingCount=overflowIsolateCount=0; 1286 overflowEmbeddingCount=overflowIsolateCount=0;
1196 validIsolateCount=0; 1287 validIsolateCount=0;
1197 stackLast=0; 1288 stackLast=0;
1198 stack[0]=level; /* initialize base entry to para level, no override, no isolate */
1199 previousLevel=embeddingLevel=GET_PARALEVEL(pBiDi, i+1); 1289 previousLevel=embeddingLevel=GET_PARALEVEL(pBiDi, i+1);
1290 stack[0]=embeddingLevel; /* initialize base entry to para level, no override, no isolate */
1200 bracketProcessB(&bracketData, embeddingLevel); 1291 bracketProcessB(&bracketData, embeddingLevel);
1201 } 1292 }
1202 flags|=DIRPROP_FLAG(B);
1203 break; 1293 break;
1204 case BN: 1294 case BN:
1205 /* BN, LRE, RLE, and PDF are supposed to be removed (X9) */ 1295 /* BN, LRE, RLE, and PDF are supposed to be removed (X9) */
1206 /* they will get their levels set correctly in adjustWSLevels() */ 1296 /* they will get their levels set correctly in adjustWSLevels() */
1297 levels[i]=previousLevel;
1207 flags|=DIRPROP_FLAG(BN); 1298 flags|=DIRPROP_FLAG(BN);
1208 break; 1299 break;
1209 default: 1300 default:
1210 /* all other types get the "real" level */ 1301 /* all other types are normal characters and get the "real" level */
1211 level=embeddingLevel; 1302 if(NO_OVERRIDE(embeddingLevel)!=NO_OVERRIDE(previousLevel)) {
1212 if(embeddingLevel!=previousLevel) {
1213 bracketProcessBoundary(&bracketData, lastCcPos, 1303 bracketProcessBoundary(&bracketData, lastCcPos,
1214 previousLevel, embeddingLevel); 1304 previousLevel, embeddingLevel);
1215 previousLevel=embeddingLevel; 1305 flags|=DIRPROP_FLAG_MULTI_RUNS;
1306 if(embeddingLevel&UBIDI_LEVEL_OVERRIDE)
1307 flags|=DIRPROP_FLAG_O(embeddingLevel);
1308 else
1309 flags|=DIRPROP_FLAG_E(embeddingLevel);
1216 } 1310 }
1217 if(level&UBIDI_LEVEL_OVERRIDE) 1311 previousLevel=embeddingLevel;
1218 flags|=DIRPROP_FLAG_LR(level); 1312 levels[i]=embeddingLevel;
1219 else 1313 if(!bracketProcessChar(&bracketData, i))
1220 flags|=DIRPROP_FLAG(dirProp);
1221 if(!bracketProcessChar(&bracketData, i, dirProp))
1222 return -1; 1314 return -1;
1315 /* the dirProp may have been changed in bracketProcessChar() */
1316 flags|=DIRPROP_FLAG(dirProps[i]);
1223 break; 1317 break;
1224 } 1318 }
1225
1226 /*
1227 * We need to set reasonable levels even on BN codes and
1228 * explicit codes because we will later look at same-level runs (X10 ).
1229 */
1230 levels[i]=level;
1231 if(i>0 && levels[i-1]!=level) {
1232 flags|=DIRPROP_FLAG_MULTI_RUNS;
1233 if(level&UBIDI_LEVEL_OVERRIDE)
1234 flags|=DIRPROP_FLAG_O(level);
1235 else
1236 flags|=DIRPROP_FLAG_E(level);
1237 }
1238 if(DIRPROP_FLAG(dirProp)&MASK_ISO)
1239 level=embeddingLevel;
1240 } 1319 }
1241 if(flags&MASK_EMBEDDING) { 1320 if(flags&MASK_EMBEDDING)
1242 flags|=DIRPROP_FLAG_LR(pBiDi->paraLevel); 1321 flags|=DIRPROP_FLAG_LR(pBiDi->paraLevel);
1243 } 1322 if(pBiDi->orderParagraphsLTR && (flags&DIRPROP_FLAG(B)))
1244 if(pBiDi->orderParagraphsLTR && (flags&DIRPROP_FLAG(B))) {
1245 flags|=DIRPROP_FLAG(L); 1323 flags|=DIRPROP_FLAG(L);
1246 }
1247
1248 /* subsequently, ignore the explicit codes and BN (X9) */
1249
1250 /* again, determine if the text is mixed-directional or single-directional */ 1324 /* again, determine if the text is mixed-directional or single-directional */
1251 pBiDi->flags=flags; 1325 pBiDi->flags=flags;
1252 direction=directionFromFlags(pBiDi); 1326 direction=directionFromFlags(pBiDi);
1253 } 1327 }
1254 return direction; 1328 return direction;
1255 } 1329 }
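Reviewer note: the "least greater even/odd level" expressions in `resolveExplicitLevels()` are compact enough to check by hand. Here is a minimal sketch of the same arithmetic. It assumes the override flag is bit 7 (`0x80`) and a maximum explicit depth of 125; both constants and all `sk_` names are hypothetical stand-ins, so check them against `ubidi.h` before relying on them.

```c
#include <assert.h>
#include <stdint.h>

#define SK_LEVEL_OVERRIDE     0x80u /* assumed value of the override flag bit */
#define SK_MAX_EXPLICIT_LEVEL 125u  /* assumed maximum explicit level */

/* Strip the override flag, in the spirit of the NO_OVERRIDE() macro
   used on the new side of the diff. */
uint8_t sk_no_override(uint8_t level) {
    return (uint8_t)(level & ~SK_LEVEL_OVERRIDE);
}

/* Least even level greater than `level` (LRE, LRO, LRI). The mask clears
   both the override flag and the low bit in one step. */
uint8_t sk_next_even(uint8_t level) {
    return (uint8_t)((level + 2u) & ~(SK_LEVEL_OVERRIDE | 1u));
}

/* Least odd level greater than `level` (RLE, RLO, RLI). */
uint8_t sk_next_odd(uint8_t level) {
    return (uint8_t)((sk_no_override(level) + 1u) | 1u);
}

/* A push is valid only when the new level fits; otherwise the caller
   would bump an overflow counter instead of pushing. */
int sk_level_fits(uint8_t newLevel) {
    return newLevel <= SK_MAX_EXPLICIT_LEVEL;
}

/* Usage example: from level 1 with override set, LRE goes to 2, RLE to 3. */
void sk_level_examples(void) {
    uint8_t current = (uint8_t)(1u | SK_LEVEL_OVERRIDE);
    assert(sk_next_even(current) == 2);
    assert(sk_next_odd(current) == 3);
}
```

Under these assumptions, an embedding opened at level 124 or 125 yields 126 or 127, fails the fit check, and is counted as an overflow rather than pushed.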
1256 1330
1257 /* 1331 /*
1258 * Use a pre-specified embedding levels array: 1332 * Use a pre-specified embedding levels array:
1259 * 1333 *
(...skipping 37 matching lines...)
1297 flags|=DIRPROP_FLAG_E(level)|DIRPROP_FLAG(dirProp); 1371 flags|=DIRPROP_FLAG_E(level)|DIRPROP_FLAG(dirProp);
1298 } 1372 }
1299 if((level<GET_PARALEVEL(pBiDi, i) && 1373 if((level<GET_PARALEVEL(pBiDi, i) &&
1300 !((0==level)&&(dirProp==B))) || 1374 !((0==level)&&(dirProp==B))) ||
1301 (UBIDI_MAX_EXPLICIT_LEVEL<level)) { 1375 (UBIDI_MAX_EXPLICIT_LEVEL<level)) {
1302 /* level out of bounds */ 1376 /* level out of bounds */
1303 *pErrorCode=U_ILLEGAL_ARGUMENT_ERROR; 1377 *pErrorCode=U_ILLEGAL_ARGUMENT_ERROR;
1304 return UBIDI_LTR; 1378 return UBIDI_LTR;
1305 } 1379 }
1306 } 1380 }
1307 if(flags&MASK_EMBEDDING) { 1381 if(flags&MASK_EMBEDDING)
1308 flags|=DIRPROP_FLAG_LR(pBiDi->paraLevel); 1382 flags|=DIRPROP_FLAG_LR(pBiDi->paraLevel);
1309 }
1310
1311 /* determine if the text is mixed-directional or single-directional */ 1383 /* determine if the text is mixed-directional or single-directional */
1312 pBiDi->flags=flags; 1384 pBiDi->flags=flags;
1313 return directionFromFlags(pBiDi); 1385 return directionFromFlags(pBiDi);
1314 } 1386 }
1315 1387
1316 /****************************************************************** 1388 /******************************************************************
1317 The Properties state machine table 1389 The Properties state machine table
1318 ******************************************************************* 1390 *******************************************************************
1319 1391
1320 All table cells are 8 bits: 1392 All table cells are 8 bits:
(...skipping 79 matching lines...)
1400 /*16 AL:S */ { s(1,1), s(1,2), s(1,6), s(1,6), s(1,8), 16 ,s(1,17), s(1,8), s(1,8), s(1,8), 16 , s(1,8), s(1,3),s(1,18),s(1,21), DirProp_S }, 1472 /*16 AL:S */ { s(1,1), s(1,2), s(1,6), s(1,6), s(1,8), 16 ,s(1,17), s(1,8), s(1,8), s(1,8), 16 , s(1,8), s(1,3),s(1,18),s(1,21), DirProp_S },
1401 /*17 B */ { s(1,1), s(1,2), s(1,4), s(1,5), s(1,7),s(1,15), 17 , s(1,7), s(1,9), s(1,7), 17 , s(1,7), s(1,3),s(1,18),s(1,21), DirProp_B }, 1473 /*17 B */ { s(1,1), s(1,2), s(1,4), s(1,5), s(1,7),s(1,15), 17 , s(1,7), s(1,9), s(1,7), 17 , s(1,7), s(1,3),s(1,18),s(1,21), DirProp_B },
1402 /*18 ENL */ { s(1,1), s(1,2), 18 , s(1,5), s(1,7),s(1,15),s(1,17),s(2,19), 20 ,s(2,19), 18 , 18 , s(1,3), 18 , 21 , DirProp_L }, 1474 /*18 ENL */ { s(1,1), s(1,2), 18 , s(1,5), s(1,7),s(1,15),s(1,17),s(2,19), 20 ,s(2,19), 18 , 18 , s(1,3), 18 , 21 , DirProp_L },
1403 /*19 ENL+ES/CS */ { s(3,1), s(3,2), 18 , s(3,5), s(4,7),s(3,15),s(3,17), s(4,7),s(4,14), s(4,7), 19 , s(4,7), s(3,3), 18 , 21 , DirProp_L }, 1475 /*19 ENL+ES/CS */ { s(3,1), s(3,2), 18 , s(3,5), s(4,7),s(3,15),s(3,17), s(4,7),s(4,14), s(4,7), 19 , s(4,7), s(3,3), 18 , 21 , DirProp_L },
1404 /*20 ENL+ET */ { s(1,1), s(1,2), 18 , s(1,5), s(1,7),s(1,15),s(1,17), s(1,7), 20 , s(1,7), 20 , 20 , s(1,3), 18 , 21 , DirProp_L }, 1476 /*20 ENL+ET */ { s(1,1), s(1,2), 18 , s(1,5), s(1,7),s(1,15),s(1,17), s(1,7), 20 , s(1,7), 20 , 20 , s(1,3), 18 , 21 , DirProp_L },
1405 /*21 ENR */ { s(1,1), s(1,2), 21 , s(1,5), s(1,7),s(1,15),s(1,17),s(2,22), 23 ,s(2,22), 21 , 21 , s(1,3), 18 , 21 , DirProp_AN }, 1477 /*21 ENR */ { s(1,1), s(1,2), 21 , s(1,5), s(1,7),s(1,15),s(1,17),s(2,22), 23 ,s(2,22), 21 , 21 , s(1,3), 18 , 21 , DirProp_AN },
1406 /*22 ENR+ES/CS */ { s(3,1), s(3,2), 21 , s(3,5), s(4,7),s(3,15),s(3,17), s(4,7),s(4,14), s(4,7), 22 , s(4,7), s(3,3), 18 , 21 , DirProp_AN }, 1478 /*22 ENR+ES/CS */ { s(3,1), s(3,2), 21 , s(3,5), s(4,7),s(3,15),s(3,17), s(4,7),s(4,14), s(4,7), 22 , s(4,7), s(3,3), 18 , 21 , DirProp_AN },
1407 /*23 ENR+ET */ { s(1,1), s(1,2), 21 , s(1,5), s(1,7),s(1,15),s(1,17), s(1,7), 23 , s(1,7), 23 , 23 , s(1,3), 18 , 21 , DirProp_AN } 1479 /*23 ENR+ET */ { s(1,1), s(1,2), 21 , s(1,5), s(1,7),s(1,15),s(1,17), s(1,7), 23 , s(1,7), 23 , 23 , s(1,3), 18 , 21 , DirProp_AN }
1408 }; 1480 };
1409 1481
1410 /* we must undef macro s because the levels table have a different 1482 /* we must undef macro s because the levels tables have a different
1411 * structure (4 bits for action and 4 bits for next state. 1483 * structure (4 bits for action and 4 bits for next state.
1412 */ 1484 */
1413 #undef s 1485 #undef s
1414 1486
1415 /****************************************************************** 1487 /******************************************************************
1416 The levels state machine tables 1488 The levels state machine tables
1417 ******************************************************************* 1489 *******************************************************************
1418 1490
1419 All table cells are 8 bits: 1491 All table cells are 8 bits:
1420 bits 0..3: next state 1492 bits 0..3: next state
(...skipping 58 matching lines...)
1479 one or more following sequences are received. For instance, 1551 one or more following sequences are received. For instance,
1480 ON following an R sequence within an even-level paragraph. 1552 ON following an R sequence within an even-level paragraph.
1481 If the following sequence is R, the ON sequence will be 1553 If the following sequence is R, the ON sequence will be
1482 assigned basic run level+1, and so will the R sequence. 1554 assigned basic run level+1, and so will the R sequence.
1483 4) S is generally handled like ON, since its level will be fixed 1555 4) S is generally handled like ON, since its level will be fixed
1484 to paragraph level in adjustWSLevels(). 1556 to paragraph level in adjustWSLevels().
1485 1557
1486 */ 1558 */
1487 1559
1488 static const ImpTab impTabL_DEFAULT = /* Even paragraph level */ 1560 static const ImpTab impTabL_DEFAULT = /* Even paragraph level */
1489 /* In this table, conditional sequences receive the higher possible level 1561 /* In this table, conditional sequences receive the lower possible level
1490 until proven otherwise. 1562 until proven otherwise.
1491 */ 1563 */
1492 { 1564 {
1493 /* L , R , EN , AN , ON , S , B , Res */ 1565 /* L , R , EN , AN , ON , S , B , Res */
1494 /* 0 : init */ { 0 , 1 , 0 , 2 , 0 , 0 , 0 , 0 }, 1566 /* 0 : init */ { 0 , 1 , 0 , 2 , 0 , 0 , 0 , 0 },
1495 /* 1 : R */ { 0 , 1 , 3 , 3 , s(1,4), s(1,4), 0 , 1 }, 1567 /* 1 : R */ { 0 , 1 , 3 , 3 , s(1,4), s(1,4), 0 , 1 },
1496 /* 2 : AN */ { 0 , 1 , 0 , 2 , s(1,5), s(1,5), 0 , 2 }, 1568 /* 2 : AN */ { 0 , 1 , 0 , 2 , s(1,5), s(1,5), 0 , 2 },
1497 /* 3 : R+EN/AN */ { 0 , 1 , 3 , 3 , s(1,4), s(1,4), 0 , 2 }, 1569 /* 3 : R+EN/AN */ { 0 , 1 , 3 , 3 , s(1,4), s(1,4), 0 , 2 },
1498 /* 4 : R+ON */ { s(2,0), 1 , 3 , 3 , 4 , 4 , s(2,0), 1 }, 1570 /* 4 : R+ON */ { 0 , s(2,1), s(3,3), s(3,3), 4 , 4 , 0 , 0 },
1499 /* 5 : AN+ON */ { s(2,0), 1 , s(2,0), 2 , 5 , 5 , s(2,0), 1 } 1571 /* 5 : AN+ON */ { 0 , s(2,1), 0 , s(3,2), 5 , 5 , 0 , 0 }
1500 }; 1572 };
1501 static const ImpTab impTabR_DEFAULT = /* Odd paragraph level */ 1573 static const ImpTab impTabR_DEFAULT = /* Odd paragraph level */
1502 /* In this table, conditional sequences receive the lower possible level 1574 /* In this table, conditional sequences receive the lower possible level
1503 until proven otherwise. 1575 until proven otherwise.
1504 */ 1576 */
1505 { 1577 {
1506 /* L , R , EN , AN , ON , S , B , Res */ 1578 /* L , R , EN , AN , ON , S , B , Res */
1507 /* 0 : init */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 0 }, 1579 /* 0 : init */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 0 },
1508 /* 1 : L */ { 1 , 0 , 1 , 3 , s(1,4), s(1,4), 0 , 1 }, 1580 /* 1 : L */ { 1 , 0 , 1 , 3 , s(1,4), s(1,4), 0 , 1 },
1509 /* 2 : EN/AN */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 1 }, 1581 /* 2 : EN/AN */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 1 },
1510 /* 3 : L+AN */ { 1 , 0 , 1 , 3 , 5 , 5 , 0 , 1 }, 1582 /* 3 : L+AN */ { 1 , 0 , 1 , 3 , 5 , 5 , 0 , 1 },
1511 /* 4 : L+ON */ { s(2,1), 0 , s(2,1), 3 , 4 , 4 , 0 , 0 }, 1583 /* 4 : L+ON */ { s(2,1), 0 , s(2,1), 3 , 4 , 4 , 0 , 0 },
1512 /* 5 : L+AN+ON */ { 1 , 0 , 1 , 3 , 5 , 5 , 0 , 0 } 1584 /* 5 : L+AN+ON */ { 1 , 0 , 1 , 3 , 5 , 5 , 0 , 0 }
1513 }; 1585 };
1514 static const ImpAct impAct0 = {0,1,2,3,4,5,6}; 1586 static const ImpAct impAct0 = {0,1,2,3,4};
1515 static const ImpTabPair impTab_DEFAULT = {{&impTabL_DEFAULT, 1587 static const ImpTabPair impTab_DEFAULT = {{&impTabL_DEFAULT,
1516 &impTabR_DEFAULT}, 1588 &impTabR_DEFAULT},
1517 {&impAct0, &impAct0}}; 1589 {&impAct0, &impAct0}};
1518 1590
1519 static const ImpTab impTabL_NUMBERS_SPECIAL = /* Even paragraph level */ 1591 static const ImpTab impTabL_NUMBERS_SPECIAL = /* Even paragraph level */
1520 /* In this table, conditional sequences receive the higher possible level 1592 /* In this table, conditional sequences receive the lower possible level
1521 until proven otherwise. 1593 until proven otherwise.
1522 */ 1594 */
1523 { 1595 {
1524 /* L , R , EN , AN , ON , S , B , Res */ 1596 /* L , R , EN , AN , ON , S , B , Res */
1525 /* 0 : init */ { 0 , 2 , 1 , 1 , 0 , 0 , 0 , 0 }, 1597 /* 0 : init */ { 0 , 2 , s(1,1), s(1,1), 0 , 0 , 0 , 0 },
1526 /* 1 : L+EN/AN */ { 0 , 2 , 1 , 1 , 0 , 0 , 0 , 2 }, 1598 /* 1 : L+EN/AN */ { 0 , s(4,2), 1 , 1 , 0 , 0 , 0 , 0 },
1527 /* 2 : R */ { 0 , 2 , 4 , 4 , s(1,3), 0 , 0 , 1 }, 1599 /* 2 : R */ { 0 , 2 , 4 , 4 , s(1,3), s(1,3), 0 , 1 },
1528 /* 3 : R+ON */ { s(2,0), 2 , 4 , 4 , 3 , 3 , s(2,0), 1 }, 1600 /* 3 : R+ON */ { 0 , s(2,2), s(3,4), s(3,4), 3 , 3 , 0 , 0 },
1529 /* 4 : R+EN/AN */ { 0 , 2 , 4 , 4 , s(1,3), s(1,3), 0 , 2 } 1601 /* 4 : R+EN/AN */ { 0 , 2 , 4 , 4 , s(1,3), s(1,3), 0 , 2 }
1530 }; 1602 };
1531 static const ImpTabPair impTab_NUMBERS_SPECIAL = {{&impTabL_NUMBERS_SPECIAL, 1603 static const ImpTabPair impTab_NUMBERS_SPECIAL = {{&impTabL_NUMBERS_SPECIAL,
1532 &impTabR_DEFAULT}, 1604 &impTabR_DEFAULT},
1533 {&impAct0, &impAct0}}; 1605 {&impAct0, &impAct0}};
1534 1606
1535 static const ImpTab impTabL_GROUP_NUMBERS_WITH_R = 1607 static const ImpTab impTabL_GROUP_NUMBERS_WITH_R =
1536 /* In this table, EN/AN+ON sequences receive levels as if associated with R 1608 /* In this table, EN/AN+ON sequences receive levels as if associated with R
1537 until proven that there is L or sor/eor on both sides. AN is handled like EN. 1609 until proven that there is L or sor/eor on both sides. AN is handled like EN.
1538 */ 1610 */
1539 { 1611 {
1540 /* L , R , EN , AN , ON , S , B , Res */ 1612 /* L , R , EN , AN , ON , S , B , Res */
(...skipping 60 matching lines...)
1601 { 1673 {
1602 /* L , R , EN , AN , ON , S , B , Res */ 1674 /* L , R , EN , AN , ON , S , B , Res */
1603 /* 0 : init */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 0 }, 1675 /* 0 : init */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 0 },
1604 /* 1 : L */ { 1 , 0 , 1 , 2 , s(1,3), s(1,3), 0 , 1 }, 1676 /* 1 : L */ { 1 , 0 , 1 , 2 , s(1,3), s(1,3), 0 , 1 },
1605 /* 2 : EN/AN */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 1 }, 1677 /* 2 : EN/AN */ { 1 , 0 , 2 , 2 , 0 , 0 , 0 , 1 },
1606 /* 3 : L+ON */ { s(2,1), s(3,0), 6 , 4 , 3 , 3 , s(3,0), 0 }, 1678 /* 3 : L+ON */ { s(2,1), s(3,0), 6 , 4 , 3 , 3 , s(3,0), 0 },
1607 /* 4 : L+ON+AN */ { s(2,1), s(3,0), 6 , 4 , 5 , 5 , s(3,0), 3 }, 1679 /* 4 : L+ON+AN */ { s(2,1), s(3,0), 6 , 4 , 5 , 5 , s(3,0), 3 },
1608 /* 5 : L+AN+ON */ { s(2,1), s(3,0), 6 , 4 , 5 , 5 , s(3,0), 2 }, 1680 /* 5 : L+AN+ON */ { s(2,1), s(3,0), 6 , 4 , 5 , 5 , s(3,0), 2 },
1609 /* 6 : L+ON+EN */ { s(2,1), s(3,0), 6 , 4 , 3 , 3 , s(3,0), 1 } 1681 /* 6 : L+ON+EN */ { s(2,1), s(3,0), 6 , 4 , 3 , 3 , s(3,0), 1 }
1610 }; 1682 };
1611 static const ImpAct impAct1 = {0,1,11,12}; 1683 static const ImpAct impAct1 = {0,1,13,14};
1612 /* FOOD FOR THOUGHT: in LTR table below, check case "JKL 123abc" 1684 /* FOOD FOR THOUGHT: in LTR table below, check case "JKL 123abc"
1613 */ 1685 */
1614 static const ImpTabPair impTab_INVERSE_LIKE_DIRECT = { 1686 static const ImpTabPair impTab_INVERSE_LIKE_DIRECT = {
1615 {&impTabL_DEFAULT, 1687 {&impTabL_DEFAULT,
1616 &impTabR_INVERSE_LIKE_DIRECT}, 1688 &impTabR_INVERSE_LIKE_DIRECT},
1617 {&impAct0, &impAct1}}; 1689 {&impAct0, &impAct1}};
1618 1690
1619 static const ImpTab impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS = 1691 static const ImpTab impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS =
1620 /* The case handled in this table is (visually): R EN L 1692 /* The case handled in this table is (visually): R EN L
1621 */ 1693 */
(...skipping 14 matching lines...)
1636 { 1708 {
1637 /* L , R , EN , AN , ON , S , B , Res */ 1709 /* L , R , EN , AN , ON , S , B , Res */
1638 /* 0 : init */ { s(1,3), 0 , 1 , 1 , 0 , 0 , 0 , 0 }, 1710 /* 0 : init */ { s(1,3), 0 , 1 , 1 , 0 , 0 , 0 , 0 },
1639 /* 1 : R+EN/AN */ { s(2,3), 0 , 1 , 1 , 2 , s(4,0), 0 , 1 }, 1711 /* 1 : R+EN/AN */ { s(2,3), 0 , 1 , 1 , 2 , s(4,0), 0 , 1 },
1640 /* 2 : R+EN/AN+ON */ { s(2,3), 0 , 1 , 1 , 2 , s(4,0), 0 , 0 }, 1712 /* 2 : R+EN/AN+ON */ { s(2,3), 0 , 1 , 1 , 2 , s(4,0), 0 , 0 },
1641 /* 3 : L */ { 3 , 0 , 3 , s(3,6), s(1,4), s(4,0), 0 , 1 }, 1713 /* 3 : L */ { 3 , 0 , 3 , s(3,6), s(1,4), s(4,0), 0 , 1 },
1642 /* 4 : L+ON */ { s(5,3), s(4,0), 5 , s(3,6), 4 , s(4,0), s(4,0), 0 }, 1714 /* 4 : L+ON */ { s(5,3), s(4,0), 5 , s(3,6), 4 , s(4,0), s(4,0), 0 },
1643 /* 5 : L+ON+EN */ { s(5,3), s(4,0), 5 , s(3,6), 4 , s(4,0), s(4,0), 1 }, 1715 /* 5 : L+ON+EN */ { s(5,3), s(4,0), 5 , s(3,6), 4 , s(4,0), s(4,0), 1 },
1644 /* 6 : L+AN */ { s(5,3), s(4,0), 6 , 6 , 4 , s(4,0), s(4,0), 3 } 1716 /* 6 : L+AN */ { s(5,3), s(4,0), 6 , 6 , 4 , s(4,0), s(4,0), 3 }
1645 }; 1717 };
1646 static const ImpAct impAct2 = {0,1,7,8,9,10}; 1718 static const ImpAct impAct2 = {0,1,2,5,6,7,8};
1719 static const ImpAct impAct3 = {0,1,9,10,11,12};
1647 static const ImpTabPair impTab_INVERSE_LIKE_DIRECT_WITH_MARKS = { 1720 static const ImpTabPair impTab_INVERSE_LIKE_DIRECT_WITH_MARKS = {
1648 {&impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS, 1721 {&impTabL_INVERSE_LIKE_DIRECT_WITH_MARKS,
1649 &impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS}, 1722 &impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS},
1650 {&impAct0, &impAct2}}; 1723 {&impAct2, &impAct3}};
1651 1724
1652 static const ImpTabPair impTab_INVERSE_FOR_NUMBERS_SPECIAL = { 1725 static const ImpTabPair impTab_INVERSE_FOR_NUMBERS_SPECIAL = {
1653 {&impTabL_NUMBERS_SPECIAL, 1726 {&impTabL_NUMBERS_SPECIAL,
1654 &impTabR_INVERSE_LIKE_DIRECT}, 1727 &impTabR_INVERSE_LIKE_DIRECT},
1655 {&impAct0, &impAct1}}; 1728 {&impAct0, &impAct1}};
1656 1729
1657 static const ImpTab impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS = 1730 static const ImpTab impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS =
1658 /* The case handled in this table is (visually): R EN L 1731 /* The case handled in this table is (visually): R EN L
1659 */ 1732 */
1660 { 1733 {
1661 /* L , R , EN , AN , ON , S , B , R es */ 1734 /* L , R , EN , AN , ON , S , B , R es */
1662 /* 0 : init */ { 0 , s(6,2), 1 , 1 , 0 , 0 , 0 , 0 }, 1735 /* 0 : init */ { 0 , s(6,2), 1 , 1 , 0 , 0 , 0 , 0 },
1663 /* 1 : L+EN/AN */ { 0 , s(6,2), 1 , 1 , 0 , s(3,0), 0 , 4 }, 1736 /* 1 : L+EN/AN */ { 0 , s(6,2), 1 , 1 , 0 , s(3,0), 0 , 4 },
1664 /* 2 : R */ { 0 , s(6,2), s(5,4), s(5,4), s(1,3), s(3,0), 0 , 3 }, 1737 /* 2 : R */ { 0 , s(6,2), s(5,4), s(5,4), s(1,3), s(3,0), 0 , 3 },
1665 /* 3 : R+ON */ { s(3,0), s(4,2), s(5,4), s(5,4), 3 , s(3,0), s(3,0), 3 }, 1738 /* 3 : R+ON */ { s(3,0), s(4,2), s(5,4), s(5,4), 3 , s(3,0), s(3,0), 3 },
1666 /* 4 : R+EN/AN */ { s(3,0), s(4,2), 4 , 4 , s(1,3), s(3,0), s(3,0), 4 } 1739 /* 4 : R+EN/AN */ { s(3,0), s(4,2), 4 , 4 , s(1,3), s(3,0), s(3,0), 4 }
1667 }; 1740 };
1668 static const ImpTabPair impTab_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS = { 1741 static const ImpTabPair impTab_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS = {
1669 {&impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS, 1742 {&impTabL_INVERSE_FOR_NUMBERS_SPECIAL_WITH_MARKS,
1670 &impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS}, 1743 &impTabR_INVERSE_LIKE_DIRECT_WITH_MARKS},
1671 {&impAct0, &impAct2}}; 1744 {&impAct2, &impAct3}};
1672 1745
1673 #undef s 1746 #undef s
1674 1747
1675 typedef struct { 1748 typedef struct {
1676 const ImpTab * pImpTab; /* level table pointer */ 1749 const ImpTab * pImpTab; /* level table pointer */
1677 const ImpAct * pImpAct; /* action map array */ 1750 const ImpAct * pImpAct; /* action map array */
1678 int32_t startON; /* start of ON sequence */ 1751 int32_t startON; /* start of ON sequence */
1679 int32_t startL2EN; /* start of level 2 sequence */ 1752 int32_t startL2EN; /* start of level 2 sequence */
1680 int32_t lastStrongRTL; /* index of last found R or AL */ 1753 int32_t lastStrongRTL; /* index of last found R or AL */
1681 int32_t state; /* current state */ 1754 int32_t state; /* current state */
(...skipping 36 matching lines...)
1718 } 1791 }
1719 else pInsertPoints->capacity*=2; 1792 else pInsertPoints->capacity*=2;
1720 } 1793 }
1721 point.pos=pos; 1794 point.pos=pos;
1722 point.flag=flag; 1795 point.flag=flag;
1723 pInsertPoints->points[pInsertPoints->size]=point; 1796 pInsertPoints->points[pInsertPoints->size]=point;
1724 pInsertPoints->size++; 1797 pInsertPoints->size++;
1725 #undef FIRSTALLOC 1798 #undef FIRSTALLOC
1726 } 1799 }
1727 1800
1801 static void
1802 setLevelsOutsideIsolates(UBiDi *pBiDi, int32_t start, int32_t limit, UBiDiLevel level)
1803 {
1804 DirProp *dirProps=pBiDi->dirProps, dirProp;
1805 UBiDiLevel *levels=pBiDi->levels;
1806 int32_t isolateCount=0, k;
1807 for(k=start; k<limit; k++) {
1808 dirProp=dirProps[k];
1809 if(dirProp==PDI)
1810 isolateCount--;
1811 if(isolateCount==0)
1812 levels[k]=level;
1813 if(dirProp==LRI || dirProp==RLI)
1814 isolateCount++;
1815 }
1816 }
1817
1728 /* perform rules (Wn), (Nn), and (In) on a run of the text ------------------ */ 1818 /* perform rules (Wn), (Nn), and (In) on a run of the text ------------------ */
1729 1819
1730 /* 1820 /*
1731 * This implementation of the (Wn) rules applies all rules in one pass. 1821 * This implementation of the (Wn) rules applies all rules in one pass.
1732 * In order to do so, it needs a look-ahead of typically 1 character 1822 * In order to do so, it needs a look-ahead of typically 1 character
1733 * (except for W5: sequences of ET) and keeps track of changes 1823 * (except for W5: sequences of ET) and keeps track of changes
1734 * in a rule Wp that affect a later Wq (p<q). 1824 * in a rule Wp that affect a later Wq (p<q).
1735 * 1825 *
1736 * The (Nn) and (In) rules are also performed in that same single loop, 1826 * The (Nn) and (In) rules are also performed in that same single loop,
1737 * but effectively one iteration behind for white space. 1827 * but effectively one iteration behind for white space.
(...skipping 23 matching lines...)
1761 if(actionSeq) { 1851 if(actionSeq) {
1762 switch(actionSeq) { 1852 switch(actionSeq) {
1763 case 1: /* init ON seq */ 1853 case 1: /* init ON seq */
1764 pLevState->startON=start0; 1854 pLevState->startON=start0;
1765 break; 1855 break;
1766 1856
1767 case 2: /* prepend ON seq to current seq */ 1857 case 2: /* prepend ON seq to current seq */
1768 start=pLevState->startON; 1858 start=pLevState->startON;
1769 break; 1859 break;
1770 1860
1771 case 3: /* L or S after possible relevant EN/AN */ 1861 case 3: /* EN/AN after R+ON */
1862 level=pLevState->runLevel+1;
1863 setLevelsOutsideIsolates(pBiDi, pLevState->startON, start0, level);
1864 break;
1865
1866 case 4: /* EN/AN before R for NUMBERS_SPECIAL */
1867 level=pLevState->runLevel+2;
1868 setLevelsOutsideIsolates(pBiDi, pLevState->startON, start0, level);
1869 break;
1870
1871 case 5: /* L or S after possible relevant EN/AN */
1772 /* check if we had EN after R/AL */ 1872 /* check if we had EN after R/AL */
1773 if (pLevState->startL2EN >= 0) { 1873 if (pLevState->startL2EN >= 0) {
1774 addPoint(pBiDi, pLevState->startL2EN, LRM_BEFORE); 1874 addPoint(pBiDi, pLevState->startL2EN, LRM_BEFORE);
1775 } 1875 }
1776 pLevState->startL2EN=-1; /* not within previous if since could also be -2 */ 1876 pLevState->startL2EN=-1; /* not within previous if since could also be -2 */
1777 /* check if we had any relevant EN/AN after R/AL */ 1877 /* check if we had any relevant EN/AN after R/AL */
1778 pInsertPoints=&(pBiDi->insertPoints); 1878 pInsertPoints=&(pBiDi->insertPoints);
1779 if ((pInsertPoints->capacity == 0) || 1879 if ((pInsertPoints->capacity == 0) ||
1780 (pInsertPoints->size <= pInsertPoints->confirmed)) 1880 (pInsertPoints->size <= pInsertPoints->confirmed))
1781 { 1881 {
(...skipping 20 matching lines...)
1802 /* mark insert points as confirmed */ 1902 /* mark insert points as confirmed */
1803 pInsertPoints->confirmed=pInsertPoints->size; 1903 pInsertPoints->confirmed=pInsertPoints->size;
1804 pLevState->lastStrongRTL=-1; 1904 pLevState->lastStrongRTL=-1;
1805 if (_prop == DirProp_S) /* add LRM before S */ 1905 if (_prop == DirProp_S) /* add LRM before S */
1806 { 1906 {
1807 addPoint(pBiDi, start0, LRM_BEFORE); 1907 addPoint(pBiDi, start0, LRM_BEFORE);
1808 pInsertPoints->confirmed=pInsertPoints->size; 1908 pInsertPoints->confirmed=pInsertPoints->size;
1809 } 1909 }
1810 break; 1910 break;
1811 1911
1812 case 4: /* R/AL after possible relevant EN/AN */ 1912 case 6: /* R/AL after possible relevant EN/AN */
1813 /* just clean up */ 1913 /* just clean up */
1814 pInsertPoints=&(pBiDi->insertPoints); 1914 pInsertPoints=&(pBiDi->insertPoints);
1815 if (pInsertPoints->capacity > 0) 1915 if (pInsertPoints->capacity > 0)
1816 /* remove all non confirmed insert points */ 1916 /* remove all non confirmed insert points */
1817 pInsertPoints->size=pInsertPoints->confirmed; 1917 pInsertPoints->size=pInsertPoints->confirmed;
1818 pLevState->startON=-1; 1918 pLevState->startON=-1;
1819 pLevState->startL2EN=-1; 1919 pLevState->startL2EN=-1;
1820 pLevState->lastStrongRTL=limit - 1; 1920 pLevState->lastStrongRTL=limit - 1;
1821 break; 1921 break;
1822 1922
1823 case 5: /* EN/AN after R/AL + possible cont */ 1923 case 7: /* EN/AN after R/AL + possible cont */
1824 /* check for real AN */ 1924 /* check for real AN */
1825 if ((_prop == DirProp_AN) && (pBiDi->dirProps[start0] == AN) && 1925 if ((_prop == DirProp_AN) && (pBiDi->dirProps[start0] == AN) &&
1826 (pBiDi->reorderingMode!=UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL)) 1926 (pBiDi->reorderingMode!=UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL))
1827 { 1927 {
1828 /* real AN */ 1928 /* real AN */
1829 if (pLevState->startL2EN == -1) /* if no relevant EN already found */ 1929 if (pLevState->startL2EN == -1) /* if no relevant EN already found */
1830 { 1930 {
1831 /* just note the righmost digit as a strong RTL */ 1931 /* just note the righmost digit as a strong RTL */
1832 pLevState->lastStrongRTL=limit - 1; 1932 pLevState->lastStrongRTL=limit - 1;
1833 break; 1933 break;
1834 } 1934 }
1835 if (pLevState->startL2EN >= 0) /* after EN, no AN */ 1935 if (pLevState->startL2EN >= 0) /* after EN, no AN */
1836 { 1936 {
1837 addPoint(pBiDi, pLevState->startL2EN, LRM_BEFORE); 1937 addPoint(pBiDi, pLevState->startL2EN, LRM_BEFORE);
1838 pLevState->startL2EN=-2; 1938 pLevState->startL2EN=-2;
1839 } 1939 }
1840 /* note AN */ 1940 /* note AN */
1841 addPoint(pBiDi, start0, LRM_BEFORE); 1941 addPoint(pBiDi, start0, LRM_BEFORE);
1842 break; 1942 break;
1843 } 1943 }
1844 /* if first EN/AN after R/AL */ 1944 /* if first EN/AN after R/AL */
1845 if (pLevState->startL2EN == -1) { 1945 if (pLevState->startL2EN == -1) {
1846 pLevState->startL2EN=start0; 1946 pLevState->startL2EN=start0;
1847 } 1947 }
1848 break; 1948 break;
1849 1949
1850 case 6: /* note location of latest R/AL */ 1950 case 8: /* note location of latest R/AL */
1851 pLevState->lastStrongRTL=limit - 1; 1951 pLevState->lastStrongRTL=limit - 1;
1852 pLevState->startON=-1; 1952 pLevState->startON=-1;
1853 break; 1953 break;
1854 1954
1855 case 7: /* L after R+ON/EN/AN */ 1955 case 9: /* L after R+ON/EN/AN */
1856 /* include possible adjacent number on the left */ 1956 /* include possible adjacent number on the left */
1857 for (k=start0-1; k>=0 && !(levels[k]&1); k--); 1957 for (k=start0-1; k>=0 && !(levels[k]&1); k--);
1858 if(k>=0) { 1958 if(k>=0) {
1859 addPoint(pBiDi, k, RLM_BEFORE); /* add RLM before */ 1959 addPoint(pBiDi, k, RLM_BEFORE); /* add RLM before */
1860 pInsertPoints=&(pBiDi->insertPoints); 1960 pInsertPoints=&(pBiDi->insertPoints);
1861 pInsertPoints->confirmed=pInsertPoints->size; /* confirm it */ 1961 pInsertPoints->confirmed=pInsertPoints->size; /* confirm it */
1862 } 1962 }
1863 pLevState->startON=start0; 1963 pLevState->startON=start0;
1864 break; 1964 break;
1865 1965
1866 case 8: /* AN after L */ 1966 case 10: /* AN after L */
1867 /* AN numbers between L text on both sides may be trouble. */ 1967 /* AN numbers between L text on both sides may be trouble. */
1868 /* tentatively bracket with LRMs; will be confirmed if followed by L */ 1968 /* tentatively bracket with LRMs; will be confirmed if followed by L */
1869 addPoint(pBiDi, start0, LRM_BEFORE); /* add LRM before */ 1969 addPoint(pBiDi, start0, LRM_BEFORE); /* add LRM before */
1870 addPoint(pBiDi, start0, LRM_AFTER); /* add LRM after */ 1970 addPoint(pBiDi, start0, LRM_AFTER); /* add LRM after */
1871 break; 1971 break;
1872 1972
1873 case 9: /* R after L+ON/EN/AN */ 1973 case 11: /* R after L+ON/EN/AN */
1874 /* false alert, infirm LRMs around previous AN */ 1974 /* false alert, infirm LRMs around previous AN */
1875 pInsertPoints=&(pBiDi->insertPoints); 1975 pInsertPoints=&(pBiDi->insertPoints);
1876 pInsertPoints->size=pInsertPoints->confirmed; 1976 pInsertPoints->size=pInsertPoints->confirmed;
1877 if (_prop == DirProp_S) /* add RLM before S */ 1977 if (_prop == DirProp_S) /* add RLM before S */
1878 { 1978 {
1879 addPoint(pBiDi, start0, RLM_BEFORE); 1979 addPoint(pBiDi, start0, RLM_BEFORE);
1880 pInsertPoints->confirmed=pInsertPoints->size; 1980 pInsertPoints->confirmed=pInsertPoints->size;
1881 } 1981 }
1882 break; 1982 break;
1883 1983
1884 case 10: /* L after L+ON/AN */ 1984 case 12: /* L after L+ON/AN */
1885 level=pLevState->runLevel + addLevel; 1985 level=pLevState->runLevel + addLevel;
1886 for(k=pLevState->startON; k<start0; k++) { 1986 for(k=pLevState->startON; k<start0; k++) {
1887 if (levels[k]<level) 1987 if (levels[k]<level)
1888 levels[k]=level; 1988 levels[k]=level;
1889 } 1989 }
1890 pInsertPoints=&(pBiDi->insertPoints); 1990 pInsertPoints=&(pBiDi->insertPoints);
1891 pInsertPoints->confirmed=pInsertPoints->size; /* confirm inserts */ 1991 pInsertPoints->confirmed=pInsertPoints->size; /* confirm inserts */
1892 pLevState->startON=start0; 1992 pLevState->startON=start0;
1893 break; 1993 break;
1894 1994
1895 case 11: /* L after L+ON+EN/AN/ON */ 1995 case 13: /* L after L+ON+EN/AN/ON */
1896 level=pLevState->runLevel; 1996 level=pLevState->runLevel;
1897 for(k=start0-1; k>=pLevState->startON; k--) { 1997 for(k=start0-1; k>=pLevState->startON; k--) {
1898 if(levels[k]==level+3) { 1998 if(levels[k]==level+3) {
1899 while(levels[k]==level+3) { 1999 while(levels[k]==level+3) {
1900 levels[k--]-=2; 2000 levels[k--]-=2;
1901 } 2001 }
1902 while(levels[k]==level) { 2002 while(levels[k]==level) {
1903 k--; 2003 k--;
1904 } 2004 }
1905 } 2005 }
1906 if(levels[k]==level+2) { 2006 if(levels[k]==level+2) {
1907 levels[k]=level; 2007 levels[k]=level;
1908 continue; 2008 continue;
1909 } 2009 }
1910 levels[k]=level+1; 2010 levels[k]=level+1;
1911 } 2011 }
1912 break; 2012 break;
1913 2013
1914 case 12: /* R after L+ON+EN/AN/ON */ 2014 case 14: /* R after L+ON+EN/AN/ON */
1915 level=pLevState->runLevel+1; 2015 level=pLevState->runLevel+1;
1916 for(k=start0-1; k>=pLevState->startON; k--) { 2016 for(k=start0-1; k>=pLevState->startON; k--) {
1917 if(levels[k]>level) { 2017 if(levels[k]>level) {
1918 levels[k]-=2; 2018 levels[k]-=2;
1919 } 2019 }
1920 } 2020 }
1921 break; 2021 break;
1922 2022
1923 default: /* we should never get here */ 2023 default: /* we should never get here */
1924 U_ASSERT(FALSE); 2024 U_ASSERT(FALSE);
1925 break; 2025 break;
1926 } 2026 }
1927 } 2027 }
1928 if((addLevel) || (start < start0)) { 2028 if((addLevel) || (start < start0)) {
1929 level=pLevState->runLevel + addLevel; 2029 level=pLevState->runLevel + addLevel;
1930 if(start>=pLevState->runStart) { 2030 if(start>=pLevState->runStart) {
1931 for(k=start; k<limit; k++) { 2031 for(k=start; k<limit; k++) {
1932 levels[k]=level; 2032 levels[k]=level;
1933 } 2033 }
1934 } else { 2034 } else {
1935 DirProp *dirProps=pBiDi->dirProps, dirProp; 2035 setLevelsOutsideIsolates(pBiDi, start, limit, level);
1936 int32_t isolateCount=0;
1937 for(k=start; k<limit; k++) {
1938 dirProp=dirProps[k];
1939 if(dirProp==PDI)
1940 isolateCount--;
1941 if(isolateCount==0)
1942 levels[k]=level;
1943 if(dirProp==LRI || dirProp==RLI)
1944 isolateCount++;
1945 }
1946 } 2036 }
1947 } 2037 }
1948 } 2038 }
1949 2039
1950 /** 2040 /**
1951 * Returns the directionality of the last strong character at the end of the prologue, if any. 2041 * Returns the directionality of the last strong character at the end of the prologue, if any.
1952 * Requires prologue!=null. 2042 * Requires prologue!=null.
1953 */ 2043 */
1954 static DirProp 2044 static DirProp
1955 lastL_R_AL(UBiDi *pBiDi) { 2045 lastL_R_AL(UBiDi *pBiDi) {
(...skipping 70 matching lines...)
2026 * This would need a different properties state table (at least different 2116 * This would need a different properties state table (at least different
2027 * actions) and different levels state tables (maybe very similar to the 2117 * actions) and different levels state tables (maybe very similar to the
2028 * LTR corresponding ones. 2118 * LTR corresponding ones.
2029 */ 2119 */
2030 inverseRTL=(UBool) 2120 inverseRTL=(UBool)
2031 ((start<pBiDi->lastArabicPos) && (GET_PARALEVEL(pBiDi, start) & 1) && 2121 ((start<pBiDi->lastArabicPos) && (GET_PARALEVEL(pBiDi, start) & 1) &&
2032 (pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_LIKE_DIRECT || 2122 (pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_LIKE_DIRECT ||
2033 pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL)); 2123 pBiDi->reorderingMode==UBIDI_REORDER_INVERSE_FOR_NUMBERS_SPECIAL));
2034 2124
2035 /* initialize for property and levels state tables */ 2125 /* initialize for property and levels state tables */
2036 levState.startON=-1;
2037 levState.startL2EN=-1; /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */ 2126 levState.startL2EN=-1; /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */
2038 levState.lastStrongRTL=-1; /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */ 2127 levState.lastStrongRTL=-1; /* used for INVERSE_LIKE_DIRECT_WITH_MARKS */
2039 levState.runStart=start; 2128 levState.runStart=start;
2040 levState.runLevel=pBiDi->levels[start]; 2129 levState.runLevel=pBiDi->levels[start];
2041 levState.pImpTab=(const ImpTab*)((pBiDi->pImpTabPair)->pImpTab)[levState.runLevel&1]; 2130 levState.pImpTab=(const ImpTab*)((pBiDi->pImpTabPair)->pImpTab)[levState.runLevel&1];
2042 levState.pImpAct=(const ImpAct*)((pBiDi->pImpTabPair)->pImpAct)[levState.runLevel&1]; 2131 levState.pImpAct=(const ImpAct*)((pBiDi->pImpTabPair)->pImpAct)[levState.runLevel&1];
2043 if(start==0 && pBiDi->proLength>0) { 2132 if(start==0 && pBiDi->proLength>0) {
2044 DirProp lastStrong=lastL_R_AL(pBiDi); 2133 DirProp lastStrong=lastL_R_AL(pBiDi);
2045 if(lastStrong!=DirProp_ON) { 2134 if(lastStrong!=DirProp_ON) {
2046 sor=lastStrong; 2135 sor=lastStrong;
2047 } 2136 }
2048 } 2137 }
2049 /* The isolates[] entries contain enough information to 2138 /* The isolates[] entries contain enough information to
2050 resume the bidi algorithm in the same state as it was 2139 resume the bidi algorithm in the same state as it was
2051 when it was interrupted by an isolate sequence. */ 2140 when it was interrupted by an isolate sequence. */
2052 if(dirProps[start]==PDI) { 2141 if(dirProps[start]==PDI) {
2142 levState.startON=pBiDi->isolates[pBiDi->isolateCount].startON;
2053 start1=pBiDi->isolates[pBiDi->isolateCount].start1; 2143 start1=pBiDi->isolates[pBiDi->isolateCount].start1;
2054 stateImp=pBiDi->isolates[pBiDi->isolateCount].stateImp; 2144 stateImp=pBiDi->isolates[pBiDi->isolateCount].stateImp;
2055 levState.state=pBiDi->isolates[pBiDi->isolateCount].state; 2145 levState.state=pBiDi->isolates[pBiDi->isolateCount].state;
2056 pBiDi->isolateCount--; 2146 pBiDi->isolateCount--;
2057 } else { 2147 } else {
2148 levState.startON=-1;
2058 start1=start; 2149 start1=start;
2059 if(dirProps[start]==NSM) 2150 if(dirProps[start]==NSM)
2060 stateImp = 1 + sor; 2151 stateImp = 1 + sor;
2061 else 2152 else
2062 stateImp=0; 2153 stateImp=0;
2063 levState.state=0; 2154 levState.state=0;
2064 processPropertySeq(pBiDi, &levState, sor, start, start); 2155 processPropertySeq(pBiDi, &levState, sor, start, start);
2065 } 2156 }
2066 start2=start; 2157 start2=start; /* to make Java compiler happy */
2067 2158
2068 for(i=start; i<=limit; i++) { 2159 for(i=start; i<=limit; i++) {
2069 if(i>=limit) { 2160 if(i>=limit) {
2070 if(limit>start) { 2161 int32_t k;
2071 dirProp=pBiDi->dirProps[limit-1]; 2162 for(k=limit-1; k>start&&(DIRPROP_FLAG(dirProps[k])&MASK_BN_EXPLICIT); k--);
2072 if(dirProp==LRI || dirProp==RLI) 2163 dirProp=dirProps[k];
2073 break; /* no forced closing for sequence ending with LRI/RLI */ 2164 if(dirProp==LRI || dirProp==RLI)
2074 } 2165 break; /* no forced closing for sequence ending with LRI/RLI */
2075 gprop=eor; 2166 gprop=eor;
2076 } else { 2167 } else {
2077 DirProp prop, prop1; 2168 DirProp prop, prop1;
2078 prop=PURE_DIRPROP(dirProps[i]); 2169 prop=dirProps[i];
2170 if(prop==B) {
2171 pBiDi->isolateCount=-1; /* current isolates stack entry == none */
2172 }
2079 if(inverseRTL) { 2173 if(inverseRTL) {
2080 if(prop==AL) { 2174 if(prop==AL) {
2081 /* AL before EN does not make it AN */ 2175 /* AL before EN does not make it AN */
2082 prop=R; 2176 prop=R;
2083 } else if(prop==EN) { 2177 } else if(prop==EN) {
2084 if(nextStrongPos<=i) { 2178 if(nextStrongPos<=i) {
2085 /* look for next strong char (L/R/AL) */ 2179 /* look for next strong char (L/R/AL) */
2086 int32_t j; 2180 int32_t j;
2087 nextStrongProp=R; /* set default */ 2181 nextStrongProp=R; /* set default */
2088 nextStrongPos=limit; 2182 nextStrongPos=limit;
(...skipping 49 matching lines...)
2138 } 2232 }
2139 2233
2140 /* flush possible pending sequence, e.g. ON */ 2234 /* flush possible pending sequence, e.g. ON */
2141 if(limit==pBiDi->length && pBiDi->epiLength>0) { 2235 if(limit==pBiDi->length && pBiDi->epiLength>0) {
2142 DirProp firstStrong=firstL_R_AL_EN_AN(pBiDi); 2236 DirProp firstStrong=firstL_R_AL_EN_AN(pBiDi);
2143 if(firstStrong!=DirProp_ON) { 2237 if(firstStrong!=DirProp_ON) {
2144 eor=firstStrong; 2238 eor=firstStrong;
2145 } 2239 }
2146 } 2240 }
2147 2241
2148 dirProp=dirProps[limit-1]; 2242 /* look for the last char not a BN or LRE/RLE/LRO/RLO/PDF */
2243 for(i=limit-1; i>start&&(DIRPROP_FLAG(dirProps[i])&MASK_BN_EXPLICIT); i--);
2244 dirProp=dirProps[i];
2149 if((dirProp==LRI || dirProp==RLI) && limit<pBiDi->length) { 2245 if((dirProp==LRI || dirProp==RLI) && limit<pBiDi->length) {
2150 pBiDi->isolateCount++; 2246 pBiDi->isolateCount++;
2151 pBiDi->isolates[pBiDi->isolateCount].stateImp=stateImp; 2247 pBiDi->isolates[pBiDi->isolateCount].stateImp=stateImp;
2152 pBiDi->isolates[pBiDi->isolateCount].state=levState.state; 2248 pBiDi->isolates[pBiDi->isolateCount].state=levState.state;
2153 pBiDi->isolates[pBiDi->isolateCount].start1=start1; 2249 pBiDi->isolates[pBiDi->isolateCount].start1=start1;
2250 pBiDi->isolates[pBiDi->isolateCount].startON=levState.startON;
2154 } 2251 }
2155 else 2252 else
2156 processPropertySeq(pBiDi, &levState, eor, limit, limit); 2253 processPropertySeq(pBiDi, &levState, eor, limit, limit);
2157 } 2254 }
2158 2255
2159 /* perform (L1) and (X9) ---------------------------------------------------- */ 2256 /* perform (L1) and (X9) ---------------------------------------------------- */
2160 2257
2161 /* 2258 /*
2162 * Reset the embedding levels for some non-graphic characters (L1). 2259 * Reset the embedding levels for some non-graphic characters (L1).
2163 * This function also sets appropriate levels for BN, and 2260 * This function also sets appropriate levels for BN, and
2164 * explicit embedding types that are supposed to have been removed 2261 * explicit embedding types that are supposed to have been removed
2165 * from the paragraph in (X9). 2262 * from the paragraph in (X9).
2166 */ 2263 */
2167 static void 2264 static void
2168 adjustWSLevels(UBiDi *pBiDi) { 2265 adjustWSLevels(UBiDi *pBiDi) {
2169 const DirProp *dirProps=pBiDi->dirProps; 2266 const DirProp *dirProps=pBiDi->dirProps;
2170 UBiDiLevel *levels=pBiDi->levels; 2267 UBiDiLevel *levels=pBiDi->levels;
2171 int32_t i; 2268 int32_t i;
2172 2269
2173 if(pBiDi->flags&MASK_WS) { 2270 if(pBiDi->flags&MASK_WS) {
2174 UBool orderParagraphsLTR=pBiDi->orderParagraphsLTR; 2271 UBool orderParagraphsLTR=pBiDi->orderParagraphsLTR;
2175 Flags flag; 2272 Flags flag;
2176 2273
2177 i=pBiDi->trailingWSStart; 2274 i=pBiDi->trailingWSStart;
2178 while(i>0) { 2275 while(i>0) {
2179 /* reset a sequence of WS/BN before eop and B/S to the paragraph paraLevel */ 2276 /* reset a sequence of WS/BN before eop and B/S to the paragraph paraLevel */
2180 while(i>0 && (flag=DIRPROP_FLAG(PURE_DIRPROP(dirProps[--i])))&MASK_WS) { 2277 while(i>0 && (flag=DIRPROP_FLAG(dirProps[--i]))&MASK_WS) {
2181 if(orderParagraphsLTR&&(flag&DIRPROP_FLAG(B))) { 2278 if(orderParagraphsLTR&&(flag&DIRPROP_FLAG(B))) {
2182 levels[i]=0; 2279 levels[i]=0;
2183 } else { 2280 } else {
2184 levels[i]=GET_PARALEVEL(pBiDi, i); 2281 levels[i]=GET_PARALEVEL(pBiDi, i);
2185 } 2282 }
2186 } 2283 }
2187 2284
2188 /* reset BN to the next character's paraLevel until B/S, which restarts above loop */ 2285 /* reset BN to the next character's paraLevel until B/S, which restarts above loop */
2189 /* here, i+1 is guaranteed to be <length */ 2286 /* here, i+1 is guaranteed to be <length */
2190 while(i>0) { 2287 while(i>0) {
2191 flag=DIRPROP_FLAG(PURE_DIRPROP(dirProps[--i])); 2288 flag=DIRPROP_FLAG(dirProps[--i]);
2192 if(flag&MASK_BN_EXPLICIT) { 2289 if(flag&MASK_BN_EXPLICIT) {
2193 levels[i]=levels[i+1]; 2290 levels[i]=levels[i+1];
2194 } else if(orderParagraphsLTR&&(flag&DIRPROP_FLAG(B))) { 2291 } else if(orderParagraphsLTR&&(flag&DIRPROP_FLAG(B))) {
2195 levels[i]=0; 2292 levels[i]=0;
2196 break; 2293 break;
2197 } else if(flag&MASK_B_S) { 2294 } else if(flag&MASK_B_S) {
2198 levels[i]=GET_PARALEVEL(pBiDi, i); 2295 levels[i]=GET_PARALEVEL(pBiDi, i);
2199 break; 2296 break;
2200 } 2297 }
2201 } 2298 }
(...skipping 224 matching lines...)
2426 pBiDi->reorderingMode=UBIDI_REORDER_RUNS_ONLY; 2523 pBiDi->reorderingMode=UBIDI_REORDER_RUNS_ONLY;
2427 } 2524 }
2428 2525
2429 /* ubidi_setPara ------------------------------------------------------------ */ 2526 /* ubidi_setPara ------------------------------------------------------------ */
2430 2527
2431 U_CAPI void U_EXPORT2 2528 U_CAPI void U_EXPORT2
2432 ubidi_setPara(UBiDi *pBiDi, const UChar *text, int32_t length, 2529 ubidi_setPara(UBiDi *pBiDi, const UChar *text, int32_t length,
2433 UBiDiLevel paraLevel, UBiDiLevel *embeddingLevels, 2530 UBiDiLevel paraLevel, UBiDiLevel *embeddingLevels,
2434 UErrorCode *pErrorCode) { 2531 UErrorCode *pErrorCode) {
2435 UBiDiDirection direction; 2532 UBiDiDirection direction;
2533 DirProp *dirProps;
2436 2534
2437 /* check the argument values */ 2535 /* check the argument values */
2438 RETURN_VOID_IF_NULL_OR_FAILING_ERRCODE(pErrorCode); 2536 RETURN_VOID_IF_NULL_OR_FAILING_ERRCODE(pErrorCode);
2439 if(pBiDi==NULL || text==NULL || length<-1 || 2537 if(pBiDi==NULL || text==NULL || length<-1 ||
2440 (paraLevel>UBIDI_MAX_EXPLICIT_LEVEL && paraLevel<UBIDI_DEFAULT_LTR)) { 2538 (paraLevel>UBIDI_MAX_EXPLICIT_LEVEL && paraLevel<UBIDI_DEFAULT_LTR)) {
2441 *pErrorCode=U_ILLEGAL_ARGUMENT_ERROR; 2539 *pErrorCode=U_ILLEGAL_ARGUMENT_ERROR;
2442 return; 2540 return;
2443 } 2541 }
2444 2542
2445 if(length==-1) { 2543 if(length==-1) {
(...skipping 58 matching lines...)
2504 if(getDirPropsMemory(pBiDi, length)) { 2602 if(getDirPropsMemory(pBiDi, length)) {
2505 pBiDi->dirProps=pBiDi->dirPropsMemory; 2603 pBiDi->dirProps=pBiDi->dirPropsMemory;
2506 if(!getDirProps(pBiDi)) { 2604 if(!getDirProps(pBiDi)) {
2507 *pErrorCode=U_MEMORY_ALLOCATION_ERROR; 2605 *pErrorCode=U_MEMORY_ALLOCATION_ERROR;
2508 return; 2606 return;
2509 } 2607 }
2510 } else { 2608 } else {
2511 *pErrorCode=U_MEMORY_ALLOCATION_ERROR; 2609 *pErrorCode=U_MEMORY_ALLOCATION_ERROR;
2512 return; 2610 return;
2513 } 2611 }
2612 dirProps=pBiDi->dirProps;
2514 /* the processed length may have changed if UBIDI_OPTION_STREAMING */ 2613 /* the processed length may have changed if UBIDI_OPTION_STREAMING */
2515 length= pBiDi->length; 2614 length= pBiDi->length;
2516 pBiDi->trailingWSStart=length; /* the levels[] will reflect the WS run */ 2615 pBiDi->trailingWSStart=length; /* the levels[] will reflect the WS run */
2517 2616
2518 /* are explicit levels specified? */ 2617 /* are explicit levels specified? */
2519 if(embeddingLevels==NULL) { 2618 if(embeddingLevels==NULL) {
2520 /* no: determine explicit levels according to the (Xn) rules */\ 2619 /* no: determine explicit levels according to the (Xn) rules */\
2521 if(getLevelsMemory(pBiDi, length)) { 2620 if(getLevelsMemory(pBiDi, length)) {
2522 pBiDi->levels=pBiDi->levelsMemory; 2621 pBiDi->levels=pBiDi->levelsMemory;
2523 direction=resolveExplicitLevels(pBiDi, pErrorCode); 2622 direction=resolveExplicitLevels(pBiDi, pErrorCode);
2524 if(U_FAILURE(*pErrorCode)) { 2623 if(U_FAILURE(*pErrorCode)) {
2525 return; 2624 return;
2526 } 2625 }
2527 } else { 2626 } else {
2528 *pErrorCode=U_MEMORY_ALLOCATION_ERROR; 2627 *pErrorCode=U_MEMORY_ALLOCATION_ERROR;
2529 return; 2628 return;
2530 } 2629 }
2531 } else { 2630 } else {
2532 /* set BN for all explicit codes, check that all levels are 0 or paraLevel..UBIDI_MAX_EXPLICIT_LEVEL */ 2631 /* set BN for all explicit codes, check that all levels are 0 or paraLevel..UBIDI_MAX_EXPLICIT_LEVEL */
2533 pBiDi->levels=embeddingLevels; 2632 pBiDi->levels=embeddingLevels;
2534 direction=checkExplicitLevels(pBiDi, pErrorCode); 2633 direction=checkExplicitLevels(pBiDi, pErrorCode);
2535 if(U_FAILURE(*pErrorCode)) { 2634 if(U_FAILURE(*pErrorCode)) {
2536 return; 2635 return;
2537 } 2636 }
2538 } 2637 }
2539 2638
2540 /* allocate isolate memory */ 2639 /* allocate isolate memory */
2541 if(pBiDi->isolateCount<=SIMPLE_ISOLATES_SIZE) 2640 if(pBiDi->isolateCount<=SIMPLE_ISOLATES_COUNT)
2542 pBiDi->isolates=pBiDi->simpleIsolates; 2641 pBiDi->isolates=pBiDi->simpleIsolates;
2543 else 2642 else
2544 if(pBiDi->isolateCount<=pBiDi->isolatesSize) 2643 if((int32_t)(pBiDi->isolateCount*sizeof(Isolate))<=pBiDi->isolatesSize)
2545 pBiDi->isolates=pBiDi->isolatesMemory; 2644 pBiDi->isolates=pBiDi->isolatesMemory;
2546 else { 2645 else {
2547 if(getInitialIsolatesMemory(pBiDi, pBiDi->isolateCount)) { 2646 if(getInitialIsolatesMemory(pBiDi, pBiDi->isolateCount)) {
2548 pBiDi->isolates=pBiDi->isolatesMemory; 2647 pBiDi->isolates=pBiDi->isolatesMemory;
2549 } else { 2648 } else {
2550 *pErrorCode=U_MEMORY_ALLOCATION_ERROR; 2649 *pErrorCode=U_MEMORY_ALLOCATION_ERROR;
2551 return; 2650 return;
2552 } 2651 }
2553 } 2652 }
2554 pBiDi->isolateCount=-1; /* current isolates stack entry == none */ 2653 pBiDi->isolateCount=-1; /* current isolates stack entry == none */
2555 2654
2556 /* 2655 /*
2557 * The steps after (X9) in the UBiDi algorithm are performed only if 2656 * The steps after (X9) in the UBiDi algorithm are performed only if
2558 * the paragraph text has mixed directionality! 2657 * the paragraph text has mixed directionality!
2559 */ 2658 */
2560 pBiDi->direction=direction; 2659 pBiDi->direction=direction;
2561 switch(direction) { 2660 switch(direction) {
2562 case UBIDI_LTR: 2661 case UBIDI_LTR:
2563 /* make sure paraLevel is even */
2564 pBiDi->paraLevel=(UBiDiLevel)((pBiDi->paraLevel+1)&~1);
2565
2566 /* all levels are implicitly at paraLevel (important for ubidi_getLevels()) */ 2662 /* all levels are implicitly at paraLevel (important for ubidi_getLevels()) */
2567 pBiDi->trailingWSStart=0; 2663 pBiDi->trailingWSStart=0;
2568 break; 2664 break;
2569 case UBIDI_RTL: 2665 case UBIDI_RTL:
2570 /* make sure paraLevel is odd */
2571 pBiDi->paraLevel|=1;
2572
2573 /* all levels are implicitly at paraLevel (important for ubidi_getLevels()) */ 2666 /* all levels are implicitly at paraLevel (important for ubidi_getLevels()) */
2574 pBiDi->trailingWSStart=0; 2667 pBiDi->trailingWSStart=0;
2575 break; 2668 break;
2576 default: 2669 default:
2577 /* 2670 /*
2578 * Choose the right implicit state table 2671 * Choose the right implicit state table
2579 */ 2672 */
2580 switch(pBiDi->reorderingMode) { 2673 switch(pBiDi->reorderingMode) {
2581 case UBIDI_REORDER_DEFAULT: 2674 case UBIDI_REORDER_DEFAULT:
2582 pBiDi->pImpTabPair=&impTab_DEFAULT; 2675 pBiDi->pImpTabPair=&impTab_DEFAULT;
(...skipping 57 matching lines...)
2640 } else { 2733 } else {
2641 eor=GET_LR_FROM_LEVEL(level); 2734 eor=GET_LR_FROM_LEVEL(level);
2642 } 2735 }
2643 2736
2644 do { 2737 do {
2645 /* determine start and limit of the run (end points just behind the run) */ 2738 /* determine start and limit of the run (end points just behind the run) */
2646 2739
2647 /* the values for this run's start are the same as for the previous run's end */ 2740 /* the values for this run's start are the same as for the previous run's end */
2648 start=limit; 2741 start=limit;
2649 level=nextLevel; 2742 level=nextLevel;
2650 if((start>0) && (pBiDi->dirProps[start-1]==B)) { 2743 if((start>0) && (dirProps[start-1]==B)) {
2651 /* except if this is a new paragraph, then set sor = para level */ 2744 /* except if this is a new paragraph, then set sor = para level */
2652 sor=GET_LR_FROM_LEVEL(GET_PARALEVEL(pBiDi, start)); 2745 sor=GET_LR_FROM_LEVEL(GET_PARALEVEL(pBiDi, start));
2653 } else { 2746 } else {
2654 sor=eor; 2747 sor=eor;
2655 } 2748 }
2656 2749
2657 /* search for the limit of this run */ 2750 /* search for the limit of this run */
2658 while(++limit<length && levels[limit]==level) {} 2751 while((++limit<length) &&
2752 ((levels[limit]==level) ||
2753 (DIRPROP_FLAG(dirProps[limit])&MASK_BN_EXPLICIT))) {}
2659 2754
2660 /* get the correct level of the next run */ 2755 /* get the correct level of the next run */
2661 if(limit<length) { 2756 if(limit<length) {
2662 nextLevel=levels[limit]; 2757 nextLevel=levels[limit];
2663 } else { 2758 } else {
2664 nextLevel=GET_PARALEVEL(pBiDi, length-1); 2759 nextLevel=GET_PARALEVEL(pBiDi, length-1);
2665 } 2760 }
2666 2761
2667 /* determine eor from max(level, nextLevel); sor is last run's eor */ 2762 /* determine eor from max(level, nextLevel); sor is last run's eor */
2668 if((level&~UBIDI_LEVEL_OVERRIDE)<(nextLevel&~UBIDI_LEVEL_OVERRIDE)) { 2763 if(NO_OVERRIDE(level)<NO_OVERRIDE(nextLevel)) {
2669 eor=GET_LR_FROM_LEVEL(nextLevel); 2764 eor=GET_LR_FROM_LEVEL(nextLevel);
2670 } else { 2765 } else {
2671 eor=GET_LR_FROM_LEVEL(level); 2766 eor=GET_LR_FROM_LEVEL(level);
2672 } 2767 }
2673 2768
2674 /* if the run consists of overridden directional types, then there 2769 /* if the run consists of overridden directional types, then there
2675 are no implicit types to be resolved */ 2770 are no implicit types to be resolved */
2676 if(!(level&UBIDI_LEVEL_OVERRIDE)) { 2771 if(!(level&UBIDI_LEVEL_OVERRIDE)) {
2677 resolveImplicitLevels(pBiDi, start, limit, sor, eor); 2772 resolveImplicitLevels(pBiDi, start, limit, sor, eor);
2678 } else { 2773 } else {
(...skipping 24 matching lines...)
2703 int32_t i, j, start, last; 2798 int32_t i, j, start, last;
2704 UBiDiLevel level; 2799 UBiDiLevel level;
2705 DirProp dirProp; 2800 DirProp dirProp;
2706 for(i=0; i<pBiDi->paraCount; i++) { 2801 for(i=0; i<pBiDi->paraCount; i++) {
2707 last=(pBiDi->paras[i].limit)-1; 2802 last=(pBiDi->paras[i].limit)-1;
2708 level=pBiDi->paras[i].level; 2803 level=pBiDi->paras[i].level;
2709 if(level==0) 2804 if(level==0)
2710 continue; /* LTR paragraph */ 2805 continue; /* LTR paragraph */
2711 start= i==0 ? 0 : pBiDi->paras[i-1].limit; 2806 start= i==0 ? 0 : pBiDi->paras[i-1].limit;
2712 for(j=last; j>=start; j--) { 2807 for(j=last; j>=start; j--) {
2713 dirProp=pBiDi->dirProps[j]; 2808 dirProp=dirProps[j];
2714 if(dirProp==L) { 2809 if(dirProp==L) {
2715 if(j<last) { 2810 if(j<last) {
2716 while(pBiDi->dirProps[last]==B) { 2811 while(dirProps[last]==B) {
2717 last--; 2812 last--;
2718 } 2813 }
2719 } 2814 }
2720 addPoint(pBiDi, last, RLM_BEFORE); 2815 addPoint(pBiDi, last, RLM_BEFORE);
2721 break; 2816 break;
2722 } 2817 }
2723 if(DIRPROP_FLAG(dirProp) & MASK_R_AL) { 2818 if(DIRPROP_FLAG(dirProp) & MASK_R_AL) {
2724 break; 2819 break;
2725 } 2820 }
2726 } 2821 }
(...skipping 181 matching lines...)
2908 if( pBiDi->fnClassCallback == NULL || 3003 if( pBiDi->fnClassCallback == NULL ||
2909 (dir = (*pBiDi->fnClassCallback)(pBiDi->coClassCallback, c)) == U_BIDI_CLASS_DEFAULT ) 3004 (dir = (*pBiDi->fnClassCallback)(pBiDi->coClassCallback, c)) == U_BIDI_CLASS_DEFAULT )
2910 { 3005 {
2911 dir = ubidi_getClass(pBiDi->bdp, c); 3006 dir = ubidi_getClass(pBiDi->bdp, c);
2912 } 3007 }
2913 if(dir >= U_CHAR_DIRECTION_COUNT) { 3008 if(dir >= U_CHAR_DIRECTION_COUNT) {
2914 dir = ON; 3009 dir = ON;
2915 } 3010 }
2916 return dir; 3011 return dir;
2917 } 3012 }
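
Note on the run-limit change at new lines 2751-2753: the old loop ended a level run at the first index whose level differed, while the new loop also steps over characters flagged by MASK_BN_EXPLICIT, so a BN or explicit code with a different level no longer splits the run. A minimal standalone sketch of that search follows. It is not ICU code: find_run_limit, the Level typedef and the ignorable[] array are hypothetical stand-ins, with ignorable[i] != 0 playing the role of DIRPROP_FLAG(dirProps[i])&MASK_BN_EXPLICIT.

```c
#include <stdint.h>

typedef uint8_t Level;   /* hypothetical stand-in for UBiDiLevel */

/*
 * Hypothetical helper, not part of ICU: returns the limit (one past the
 * end) of the level run that begins at `start`.
 * ignorable[i] != 0 stands in for
 *     DIRPROP_FLAG(dirProps[i])&MASK_BN_EXPLICIT
 * in the patched ubidi_setPara().
 */
static int32_t find_run_limit(const Level *levels, const uint8_t *ignorable,
                              int32_t start, int32_t length) {
    Level level = levels[start];
    int32_t limit = start;
    /* keep going while the level matches or the character is ignorable */
    while ((++limit < length) &&
           ((levels[limit] == level) || ignorable[limit])) {}
    return limit;
}
```

With levels {0, 0, 1, 0, 1} and only index 2 marked ignorable, the run starting at 0 has limit 4. With nothing marked ignorable, which corresponds to the old condition, the same run stops at 2.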