java/org/chromium/distiller/webdocument/WebText.java - Issue 1230583006: Fix for keeping lists structure

Issue 1230583006: Fix for keeping lists structure (Closed) Base URL: https://github.com/chromium/dom-distiller.git@master

Patch Set: Classes were documented. Created 5 years, 4 months ago

Index: java/org/chromium/distiller/webdocument/WebText.java

diff --git a/java/org/chromium/distiller/webdocument/WebText.java b/java/org/chromium/distiller/webdocument/WebText.java
index ae81ee6f97f2d9ca41ddfa00be521cf7646ab7bb..9a0fa04620694be7fd52d4e25279993a9036c383 100644
--- a/java/org/chromium/distiller/webdocument/WebText.java
+++ b/java/org/chromium/distiller/webdocument/WebText.java
@@ -69,10 +69,16 @@ public class WebText extends WebElement {
         DomUtil.stripIds(clonedRoot);
         DomUtil.stripFontColorAttributes(clonedRoot);
+        // Since LI Tag is being wrapped by a pair of {@link WebTag}s,
+        // we only need to get the innerHTML, otherwise
+        // LI tag would be duplicated.
+        Element elementClonedRoot = Element.as(clonedRoot);
         if (textOnly) {
-            return Element.as(clonedRoot).getInnerText();
+            return elementClonedRoot.getInnerText();
+        } else if (elementClonedRoot.getTagName().equals("LI")) {

wychen 2015/08/05 19:46:15 UL and OL might need to use innerHTML as well. Eve

Marcelo Correa 2015/08/05 20:31:24 Yes, you are right. We might need to prevent that.

wychen 2015/08/06 00:04:52 It would be cleaner to aggregate the tag list in o

Marcelo Correa 2015/08/06 01:17:59 That makes two of us :P

+            return elementClonedRoot.getInnerHTML();
         }
-        return Element.as(clonedRoot).getString();
+        return elementClonedRoot.getString();
     }

     public List<Node> getTextNodes() {
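
For illustration only, here is a minimal, self-contained sketch of the idea discussed in the comments above: when an element is already wrapped by separate start/end tag markers, emit only its inner HTML so the tag is not duplicated, and keep the affected tag names in a single set. This is not the code from the patch. The class name, the `WRAPPED_TAGS` set, and both helper methods are hypothetical, and plain strings stand in for the GWT `Element` API. Including UL and OL follows the reviewer's remark that they "might need" the same treatment, so treat that as an assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not part of dom-distiller.
final class ListTagSerializationSketch {
    // Assumption: tags that are emitted separately by a wrapping pair of
    // tag markers, so their own outer tag must not be emitted again.
    static final Set<String> WRAPPED_TAGS =
            new HashSet<>(Arrays.asList("LI", "UL", "OL"));

    // Stand-in for an element's full serialization: tag plus content.
    static String outerHtml(String tagName, String innerHtml) {
        String lower = tagName.toLowerCase(Locale.ROOT);
        return "<" + lower + ">" + innerHtml + "</" + lower + ">";
    }

    // Returns only the inner HTML for wrapped tags, the full element otherwise.
    static String serialize(String tagName, String innerHtml) {
        if (WRAPPED_TAGS.contains(tagName.toUpperCase(Locale.ROOT))) {
            return innerHtml;
        }
        return outerHtml(tagName, innerHtml);
    }

    public static void main(String[] args) {
        // The wrapping markers supply <li>...</li>, so only the content is emitted.
        System.out.println("<li>" + serialize("LI", "first item") + "</li>");
        // A paragraph is not wrapped elsewhere, so it keeps its own tag.
        System.out.println(serialize("P", "some text"));
    }
}
```

Keeping the tag names in one set means a later change (for example, adding or removing UL and OL) touches a single line rather than a chain of `else if` branches.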