|
|
Created:
6 years, 10 months ago by maheshkk Modified:
6 years, 10 months ago CC:
blink-reviews, krit, fs, ed+blinkwatch_opera.com, gyuyoung.kim_webkit.org, Stephen Chennney Base URL:
https://chromium.googlesource.com/chromium/blink.git@master Visibility:
Public. |
DescriptionSimplify SVGSVGElement::getElementById method
Simplify SVGSVGElement::getElementByID method code and
also by using getAllElementsById which caches elements,
we can avoid looking for matching element SVG subtree everytime.
Committed: https://src.chromium.org/viewvc/blink?view=rev&revision=167890
Patch Set 1 #
Total comments: 2
Patch Set 2 : using containsMultipleElementsWithId for single tag use case #
Total comments: 3
Patch Set 3 : incorporate review comments #
Total comments: 1
Messages
Total messages: 29 (0 generated)
PTAL
LGTM https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... File Source/core/svg/SVGSVGElement.cpp (right): https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... Source/core/svg/SVGSVGElement.cpp:772: for (Vector<Element*>::iterator it = elements.begin(); it != elements.end(); ++it) { I'm not sure how much it matters here, but I typically see the end() moved into a local variable like so: Vector<Element*>::iterator end = elements.end(); for (Vector<Element*>::iterator it = elements.begin(); it != end; ++it) { I don't feel strongly about this though.
On 2014/02/24 18:39:43, pdr wrote: > LGTM > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > File Source/core/svg/SVGSVGElement.cpp (right): > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > Source/core/svg/SVGSVGElement.cpp:772: for (Vector<Element*>::iterator it = > elements.begin(); it != elements.end(); ++it) { > I'm not sure how much it matters here, but I typically see the end() moved into > a local variable like so: > > Vector<Element*>::iterator end = elements.end(); > for (Vector<Element*>::iterator it = elements.begin(); it != end; ++it) { > > I don't feel strongly about this though. Thanks for the quick review pdr. I had similar feeling of moving end() to a variable. Will do that and hit commit box.
Hi Mahesh, On 2014/02/24 18:46:08, maheshkk wrote: > On 2014/02/24 18:39:43, pdr wrote: > > LGTM > > > > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > > File Source/core/svg/SVGSVGElement.cpp (right): > > > > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > > Source/core/svg/SVGSVGElement.cpp:772: for (Vector<Element*>::iterator it = > > elements.begin(); it != elements.end(); ++it) { > > I'm not sure how much it matters here, but I typically see the end() moved > into > > a local variable like so: > > > > Vector<Element*>::iterator end = elements.end(); > > for (Vector<Element*>::iterator it = elements.begin(); it != end; ++it) { > > > > I don't feel strongly about this though. > > Thanks for the quick review pdr. I had similar feeling of moving end() to a > variable. Will do that and hit commit box. Only nit from me, the CL Description seems wrong, you probably wanted to write Simplify.
https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... File Source/core/svg/SVGSVGElement.cpp (right): https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... Source/core/svg/SVGSVGElement.cpp:770: Vector<Element*> elements = treeScope().getAllElementsById(id); This may cause a performance regression in the non-duplicate id case, which is the most common case. Your CL causes some unnecessary overhead in the single-id case (Vector creation, some checks in getAllElementsById). It would be safer to use TreeScope::containsMultipleElementsWithId() and use getElementById() if it returns true (fast path). Then fallback to using getAllElementsById() instead of traversing.
On 2014/02/24 20:12:41, Chris Dumez wrote: > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > File Source/core/svg/SVGSVGElement.cpp (right): > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > Source/core/svg/SVGSVGElement.cpp:770: Vector<Element*> elements = > treeScope().getAllElementsById(id); > This may cause a performance regression in the non-duplicate id case, which is > the most common case. Your CL causes some unnecessary overhead in the single-id > case (Vector creation, some checks in getAllElementsById). > > It would be safer to use TreeScope::containsMultipleElementsWithId() and use > getElementById() if it returns true (fast path). Then fallback to using > getAllElementsById() instead of traversing. Thanks for the comments Chris. Now code uses containsMultipleElementsWithId to identify best case scenario and call getElementById instead. PTAL
https://codereview.chromium.org/178323002/diff/100001/Source/core/svg/SVGSVGE... File Source/core/svg/SVGSVGElement.cpp (right): https://codereview.chromium.org/178323002/diff/100001/Source/core/svg/SVGSVGE... Source/core/svg/SVGSVGElement.cpp:774: } else { you need to return 0 before the else. And then there is no need for an else case any more. https://codereview.chromium.org/178323002/diff/100001/Source/core/svg/SVGSVGE... Source/core/svg/SVGSVGElement.cpp:775: Vector<Element*> elements = treeScope().getAllElementsById(id); Should be a const Vector<Element*>& to avoid copying the vector... https://codereview.chromium.org/178323002/diff/100001/Source/core/svg/SVGSVGE... Source/core/svg/SVGSVGElement.cpp:778: Element* element = *it; You don't really need this assignment, you can use it directly below.
Also please s/Simply/Simplify everywhere in your changelog :)
On 2014/02/24 22:24:09, Chris Dumez wrote: > Also please s/Simply/Simplify everywhere in your changelog :) Done.
LGTM but please make sure pdr is happy with this as well before landing.
https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... File Source/core/svg/SVGSVGElement.cpp (right): https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... Source/core/svg/SVGSVGElement.cpp:771: Element* element = treeScope().getElementById(id); Isn't this introducing an extra lookup for the common case? (one for containsMultipleElementsWithId() and another one for getElementById()) Whereas before, the fast path only took one lookup: getElementById().
On 2014/02/24 23:47:24, Florin Malita wrote: > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > File Source/core/svg/SVGSVGElement.cpp (right): > > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > Source/core/svg/SVGSVGElement.cpp:771: Element* element = > treeScope().getElementById(id); > Isn't this introducing an extra lookup for the common case? > > (one for containsMultipleElementsWithId() and another one for getElementById()) > > Whereas before, the fast path only took one lookup: getElementById(). Florin, yes it is an extra lookup, However containsMultipleElementsWithId() is map lookup and always O(1). IMO adding getAllElementsById() much better solution than existing subtree tree traversal solution. Let me know if I can improve this solution.
On 2014/02/24 20:12:41, Chris Dumez wrote: > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > File Source/core/svg/SVGSVGElement.cpp (right): > > https://codereview.chromium.org/178323002/diff/1/Source/core/svg/SVGSVGElemen... > Source/core/svg/SVGSVGElement.cpp:770: Vector<Element*> elements = > treeScope().getAllElementsById(id); > This may cause a performance regression in the non-duplicate id case, which is > the most common case. Your CL causes some unnecessary overhead in the single-id > case (Vector creation, some checks in getAllElementsById). > > It would be safer to use TreeScope::containsMultipleElementsWithId() and use > getElementById() if it returns true (fast path). Then fallback to using > getAllElementsById() instead of traversing. I think we probably improved perf (reducing malloc at the cost of an additional hashmap lookup for every call) but, without data, I can't say. A negative of this approach is that we've made an infrequently-called function (only via JS, no?) more complex than it was before. LGTM because it's likely performance improved. If you'd like to go the extra mile, I'd be curious if perf improved on a small test benchmark.
On 2014/02/24 23:54:58, maheshkk wrote: > On 2014/02/24 23:47:24, Florin Malita wrote: > > > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > > File Source/core/svg/SVGSVGElement.cpp (right): > > > > > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > > Source/core/svg/SVGSVGElement.cpp:771: Element* element = > > treeScope().getElementById(id); > > Isn't this introducing an extra lookup for the common case? > > > > (one for containsMultipleElementsWithId() and another one for > getElementById()) > > > > Whereas before, the fast path only took one lookup: getElementById(). > > Florin, yes it is an extra lookup, However containsMultipleElementsWithId() is > map lookup and always O(1). Well, it's constant but not free :) Then there's the key hash cost to consider (not sure whether we're caching that, maybe we are). > IMO adding getAllElementsById() much better solution than existing subtree tree > traversal solution. It certainly looks better, but note that now we're always checking isDescendant(), whereas before we didn't need to do that. Since isDescendant() does an O(log(N)) ancestor crawl, I'm not sure this is really saving much perf-wise. But it does look better :) > Let me know if I can improve this solution. Why do we need a special case for !containsMultipleElementsWithId()/getElementById() at all? I think we can simply get all ID hits and iterate searching for a descendant, no? Element* SVGSVGElement::getElementById(const AtomicString& id) const { // If duplicate IDs are there, return the first descendant of the svg element. const Vector<Element*>& elements = treeScope().getAllElementsById(id); Vector<Element*>::const_iterator end = elements.end(); for (Vector<Element*>::const_iterator it = elements.begin(); it != end; ++it) { if ((*it)->isDescendantOf(this)) return *it; } return 0; }
On 2014/02/25 00:18:15, Florin Malita wrote: > On 2014/02/24 23:54:58, maheshkk wrote: > > On 2014/02/24 23:47:24, Florin Malita wrote: > > > > > > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > > > File Source/core/svg/SVGSVGElement.cpp (right): > > > > > > > > > https://codereview.chromium.org/178323002/diff/120001/Source/core/svg/SVGSVGE... > > > Source/core/svg/SVGSVGElement.cpp:771: Element* element = > > > treeScope().getElementById(id); > > > Isn't this introducing an extra lookup for the common case? > > > > > > (one for containsMultipleElementsWithId() and another one for > > getElementById()) > > > > > > Whereas before, the fast path only took one lookup: getElementById(). > > > > Florin, yes it is an extra lookup, However containsMultipleElementsWithId() is > > map lookup and always O(1). > > Well, it's constant but not free :) Then there's the key hash cost to consider > (not sure whether we're caching that, maybe we are). > > > IMO adding getAllElementsById() much better solution than existing subtree > tree > > traversal solution. > > It certainly looks better, but note that now we're always checking > isDescendant(), whereas before we didn't need to do that. Since isDescendant() > does an O(log(N)) ancestor crawl, I'm not sure this is really saving much > perf-wise. But it does look better :) > > > Let me know if I can improve this solution. > > Why do we need a special case for > !containsMultipleElementsWithId()/getElementById() at all? I think we can simply > get all ID hits and iterate searching for a descendant, no? > > Element* SVGSVGElement::getElementById(const AtomicString& id) const > { > // If duplicate IDs are there, return the first descendant of the svg > element. > const Vector<Element*>& elements = treeScope().getAllElementsById(id); > Vector<Element*>::const_iterator end = elements.end(); > for (Vector<Element*>::const_iterator it = elements.begin(); it != end; > ++it) { > if ((*it)->isDescendantOf(this)) > return *it; > } > > return 0; > } Mahesh actually did this in his original patch. I advised against it because I did not want to take the risk of making the common case (no duplicate id) slower due to calling a more complex method and constructing needlessly a Vector in this case. The approach I proposed (and in the lastest version of the CL) is the one I used in SelectorQuery.cpp. I proposed that one because this particular SelectorQuery code is well covered by performance tests so we know it performs well. As you mention, there is still an extra hash lookup in the regular case, but this did not cause any trouble in the SelectorQuery code. However, using getAllElementsById() in the multiple id case had a significant performance impact in SelectorQuery (which is why I introduced getAllElementsById() in the first place). We could do something different here than in SelectorQuery but then we better have this covered by performance tests (not sure we do currently).
On 2014/02/25 00:27:48, Chris Dumez wrote: > Mahesh actually did this in his original patch. I advised against it because I > did not want to take the risk of making the common case (no duplicate id) slower > due to calling a more complex method and constructing needlessly a Vector in > this case. Oh, the joy of missing context. Don't let me hold up this CL, we can always tweak it later. LGTM. > As you mention, there is still an extra hash lookup in the regular case, but > this did not cause any trouble in the SelectorQuery code. However, using > getAllElementsById() in the multiple id case had a significant performance > impact in SelectorQuery (which is why I introduced getAllElementsById() in the > first place). Something doesn't click for me, help me understand: if getAllElementsById() is expensive (presumably on first call for a given ID, when building the cached vector), why isn't containsMultipleElementsWithId() just as expensive? Looking at DocumentOrderedMap::containsMultiple(), it seems it's not really building the vector and only looking at already cached values - but then, to turn the question around, is containsMultipleElementsWithId() really giving the correct answer?
On 2014/02/25 00:41:26, Florin Malita wrote: > Something doesn't click for me, help me understand: if getAllElementsById() is > expensive (presumably on first call for a given ID, when building the cached > vector), why isn't containsMultipleElementsWithId() just as expensive? Looking > at DocumentOrderedMap::containsMultiple(), it seems it's not really building the > vector and only looking at already cached values - but then, to turn the > question around, is containsMultipleElementsWithId() really giving the correct > answer? Never mind, I see it now: containsMultiple() uses a separately managed count which doesn't rely on the cached orderedList vector.
On 2014/02/25 00:41:26, Florin Malita wrote: > Something doesn't click for me, help me understand: if getAllElementsById() is > expensive (presumably on first call for a given ID, when building the cached > vector), why isn't containsMultipleElementsWithId() just as expensive? Looking > at DocumentOrderedMap::containsMultiple(), it seems it's not really building the > vector and only looking at already cached values - but then, to turn the > question around, is containsMultipleElementsWithId() really giving the correct > answer? What I am saying is that getAllElementsById() is more costly for the more general case (no duplicate id) than getElementById(). There are several reasons for this: 1. getAllElementsById() needs to construct a Vector (at least the first time it is called after cache invalidation) 2. getAllElementsById() is a bit more complex than getElementById() some extra checks that we don't really need if there is no duplicate id 3. We need to iterate over the returned Vector instead of dealing with a simple Element*. Sure, the for loop will break quite fast but we will still unnecessarily evaluate the end-loop condition twice. The reason containsMultipleElementsWithId() isn't very expensive is that it is inlined and it does: - A simple hash lookup - An integer comparison (count > 1). containsMultipleElementsWithId() does not cause any tree traversal because the "count" of Elements with a given id is always valid. What is lazily created is the Vector containing all the Elements with a given id (See DocumentOrderedMap.*).
On 2014/02/25 00:58:39, Chris Dumez wrote: > On 2014/02/25 00:41:26, Florin Malita wrote: > > Something doesn't click for me, help me understand: if getAllElementsById() is > > expensive (presumably on first call for a given ID, when building the cached > > vector), why isn't containsMultipleElementsWithId() just as expensive? Looking > > at DocumentOrderedMap::containsMultiple(), it seems it's not really building > the > > vector and only looking at already cached values - but then, to turn the > > question around, is containsMultipleElementsWithId() really giving the correct > > answer? > > What I am saying is that getAllElementsById() is more costly for the more > general case (no duplicate id) than getElementById(). There are several reasons > for this: > 1. getAllElementsById() needs to construct a Vector (at least the first time it > is called after cache invalidation) > 2. getAllElementsById() is a bit more complex than getElementById() some extra > checks that we don't really need if there is no duplicate id > 3. We need to iterate over the returned Vector instead of dealing with a simple > Element*. Sure, the for loop will break quite fast but we will still > unnecessarily evaluate the end-loop condition twice. > > The reason containsMultipleElementsWithId() isn't very expensive is that it is > inlined and it does: > - A simple hash lookup > - An integer comparison (count > 1). > > containsMultipleElementsWithId() does not cause any tree traversal because the > "count" of Elements with a given id is always valid. What is lazily created is > the Vector containing all the Elements with a given id (See > DocumentOrderedMap.*). Oh, and I forgot to mention that when DocumentOrderedMap::getElementById() is called, its implementation does not construct a Vector with one element. We have an Element pointer for this fast / common use case and we lazily populate the Vector only when getAllElementsById() is called. This is why the values in the DocumentOrderedMap hashmap looks like: struct MapEntry { Element* element; // Cache for getElementById unsigned count; // Maintained to be valid at all times Vector<Element*> orderedList; // Cache for getAllElementsById };
On 2014/02/25 01:02:53, Chris Dumez wrote: > Oh, and I forgot to mention that when DocumentOrderedMap::getElementById() is > called, its implementation does not construct a Vector with one element. We have > an Element pointer for this fast / common use case and we lazily populate the > Vector only when getAllElementsById() is called. This is why the values in the > DocumentOrderedMap hashmap looks like: > struct MapEntry { > Element* element; // Cache for getElementById > unsigned count; // Maintained to be valid at all times > Vector<Element*> orderedList; // Cache for getAllElementsById > }; Yes, I was assuming containsMultiple() would use the same info as getAllElementsById(), but I see that's not the case - thanks for explaining. It seems we could further optimize the fast path down to a single lookup with something like DocumentOrderedMap::getUniqueElementById(), which would return MapEntry::element iff count == 1. Certainly outside the scope of this cl though.
The CQ bit was checked by mahesh.kk@samsung.com
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mahesh.kk@samsung.com/178323002/120001
Thanks you all for the review! I will commit this now and will continue experimenting for better API. Thanks!
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mahesh.kk@samsung.com/178323002/120001
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mahesh.kk@samsung.com/178323002/120001
The CQ bit was unchecked by phajdan.jr@chromium.org
The CQ bit was checked by phajdan.jr@chromium.org
CQ is trying da patch. Follow status at https://chromium-status.appspot.com/cq/mahesh.kk@samsung.com/178323002/120001
Message was sent while issue was closed.
Change committed as 167890 |