WIP: Improve performance of Character::isCJKIdeographOrSymbol by using a trie
This is a re-land of https://codereview.chromium.org/1541393003/
This patch is another effort to make Character::isCJKIdeographOrSymbol
faster.
The previous CL[1] made it faster by ~90% for codepoints below U+2020,
but codepoints above U+2020 were not as fast. This CL makes all
codepoints faster, as fast as ICU functions.
          Before    After  Improvement  ICU
All         2569 =>   292          88%  298
ASCII         68 =>    68           0%  160
Han         2958 =>   263          91%  344
Hira         258 =>    11          95%   14
Arabic        37 =>    32          13%   44
* The number of code points and iterations varies by row.
The previous CL[1] showed that binary search is not as fast as ICU
functions such as uscript_getScript(). This patch changes to use
UTrie2, which is the data structure ICU property functions use.
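To illustrate why a trie-style table can beat binary search over ranges, here is a minimal sketch of a two-stage lookup table. It is not ICU's UTrie2 layout and not the code in this CL; the class CodePointSet, the helper buildIllustrativeSet(), and the 256-entry block size are hypothetical names and choices made for this example. The ranges are also illustrative only (U+2763..U+2764 from this CL, plus the CJK Unified Ideographs block U+4E00..U+9FFF), not Blink's actual table.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified two-stage table. A lookup costs two array reads
// regardless of how many ranges were added, whereas binary search over
// ranges costs O(log n) comparisons with hard-to-predict branches.
class CodePointSet {
public:
    static constexpr uint32_t kMaxCodePoint = 0x10FFFF;
    static constexpr uint32_t kBlockShift = 8;
    static constexpr uint32_t kBlockSize = 1u << kBlockShift; // 256 entries
    static constexpr uint32_t kBlockMask = kBlockSize - 1;

    CodePointSet()
        : m_index((kMaxCodePoint >> kBlockShift) + 1, 0)
        , m_data(kBlockSize, 0) // Block 0 is the shared all-false block.
    {
    }

    // Marks every code point in [first, last] as a member of the set.
    void addRange(uint32_t first, uint32_t last)
    {
        assert(first <= last && last <= kMaxCodePoint);
        for (uint32_t cp = first; cp <= last; ++cp) {
            uint32_t& block = m_index[cp >> kBlockShift];
            if (block == 0) {
                // First member in this block: allocate a private data block.
                block = static_cast<uint32_t>(m_data.size() / kBlockSize);
                m_data.resize(m_data.size() + kBlockSize, 0);
            }
            m_data[block * kBlockSize + (cp & kBlockMask)] = 1;
        }
    }

    // Stage 1 selects a data block; stage 2 reads the value inside it.
    bool contains(uint32_t cp) const
    {
        if (cp > kMaxCodePoint)
            return false;
        uint32_t block = m_index[cp >> kBlockShift];
        return m_data[block * kBlockSize + (cp & kBlockMask)] != 0;
    }

private:
    std::vector<uint32_t> m_index; // High bits of code point -> block number.
    std::vector<uint8_t> m_data;   // Concatenated 256-entry blocks.
};

// Illustrative ranges only; the real table in Blink covers many more.
CodePointSet buildIllustrativeSet()
{
    CodePointSet set;
    set.addRange(0x2763, 0x2764); // Added in this CL.
    set.addRange(0x4E00, 0x9FFF); // CJK Unified Ideographs block.
    return set;
}
```

With this sketch, buildIllustrativeSet().contains(0x4E00) takes the same two reads as contains('A'), which hits the shared empty block. A real UTrie2 additionally compacts identical blocks and is typically built once ahead of time and frozen.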
In addition, in this patch:
* U+2763 and U+2764 are added as requested by drott@.
* Character::isUprightInMixedVertical() was switched to UTrie2 too.
* Character::isCJKIdeograph() was removed because it is no longer used.
[1] https://codereview.chromium.org/1545073002
BUG=571943
Committed: https://crrev.com/9db08cc3e1979d3f974f8486420fb65bc8ccf247
Cr-Commit-Position: refs/heads/master@{#372207}
Patch Set 1
Patch Set 2: asan/msan/tsan build fix