In event path calculation, it's expected that we'd have a lot of queries to check if given two tree scopes in the tree of trees are in ancestor/descendant relationships.
In the current implementation (TreeScope::isOlderSiblingOrInclusiveAncestor), this query takes O(N), where N is the height of the tree of trees.
That causes the entire calculation of event.path for each node, on which even.path would be possibly called, to take O(MN^2).
(M: The number of the nodes in event path, N: The number of the involved TreeScopes)
That's the reason why event dispatching doesn't scale well once the height of tree of trees becomes too high.
Pre-calculating event.path for each node would dominate event dispatching time.
To reduce the order of computational cost, I've implemented O(1) query algorithm, instead of the current O(N) query time algorithm. The algorithm is:
1. Pre-process each TreeScope by depth-first-search and make them have numbers of 'preVisit' and 'postVisit'.
Pre-processing takes O(N) in time and needs additional O(N) space. That shouldn't matter in this case.
2. By using these numbers, we can answer the query of isInclusiveAncestorOf() in O(1) by checking the range of numbers as follows:
preVisit <= other.preVist && other.postVisit <= postVisit
Furthermore, we get an additional, but very significant bonus by this data structures which are used in O(1) algorithm.
Now we can delay the actual calculation of event.path for each nodes.
We can calculate event.path *on demand* because we can answer the query of ancestor-descendant relationships only by checking these pre-visit and post-visit numbers.
The structure of tree of trees at the beginning of the event dispatching would be *preserved* with these numbers.
This is a huge win from the view of the performance of event dispatching.
The following are the results of the benchmarks for event dispatching. The height of node trees in each benchmark is about 50.
The major difference is the height of the tree of trees.
(The number of involved treeScope is small.)
- Without this CL: 581.9 runs/s
- With this CL: 697.2 runs/s (1.2 times faster)
(The number of involved treeScope is about 10)
- Without this CL: 5105.9 runs/s
- With this CL: 12448.1 runs/s (2.4 times faster)
(This is a new test and extreme case. The number of involved treeScope is about 200. That means event.path contains about 10000 (= 50 * 200) nodes)
- Without this CL: 1.64 runs/s
- With this CL: 708.75 runs/s (432 times faster. Now event dispatching can scale well in shadow trees)
The relevant specs are:
TEST=No behavior changes