Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(158)

Issue 2287113005: Skip unrecognized iframes (Closed)

Created:
4 years, 3 months ago by wychen
Modified:
4 years, 3 months ago
Reviewers:
mdjones
Base URL:
git@github.com:chromium/dom-distiller.git@innertext
Target Ref:
refs/heads/master
Visibility:
Public.

Description

Skip unrecognized iframes IFrames are usually useless unless they are recognized by embed extractors. The result is mostly good. The exception is documented at: http://crbug.com/641678 Score changes: https://x20web.corp.google.com/~wychen/domdistillerscore/noiframe/knowledge.html Average precision: 0.959 → 0.961 https://x20web.corp.google.com/~wychen/domdistillerscore/noiframe/multi-page.html Average precision: 0.746 → 0.746+ https://x20web.corp.google.com/~wychen/domdistillerscore/noiframe/page-links.html Average precision: 0.919 → 0.920 https://x20web.corp.google.com/~wychen/domdistillerscore/noiframe/reader-golden.html Average precision: 0.945 → 0.946 The recall is not changed because less things are extracted. BUG=641678 R=mdjones@chromium.org Committed: 50efabe82fa19b3bb9d44e8812907d447d9746a5

Patch Set 1 #

Total comments: 2

Patch Set 2 : address comment #

Unified diffs Side-by-side diffs Delta from patch set Stats (+14 lines, -0 lines) Patch
M java/org/chromium/distiller/webdocument/DomConverter.java View 1 1 chunk +1 line, -0 lines 0 comments Download
M javatests/org/chromium/distiller/ContentExtractorTest.java View 1 chunk +13 lines, -0 lines 0 comments Download

Depends on Patchset:

Dependent Patchsets:

Messages

Total messages: 7 (2 generated)
wychen
PTAL
4 years, 3 months ago (2016-08-29 18:46:00 UTC) #2
mdjones
https://codereview.chromium.org/2287113005/diff/1/java/org/chromium/distiller/webdocument/DomConverter.java File java/org/chromium/distiller/webdocument/DomConverter.java (right): https://codereview.chromium.org/2287113005/diff/1/java/org/chromium/distiller/webdocument/DomConverter.java#newcode103 java/org/chromium/distiller/webdocument/DomConverter.java:103: if (e.getTagName().equals("IFRAME")) { Can we just add "IFRAME" to ...
4 years, 3 months ago (2016-08-29 20:58:22 UTC) #3
wychen
https://codereview.chromium.org/2287113005/diff/1/java/org/chromium/distiller/webdocument/DomConverter.java File java/org/chromium/distiller/webdocument/DomConverter.java (right): https://codereview.chromium.org/2287113005/diff/1/java/org/chromium/distiller/webdocument/DomConverter.java#newcode103 java/org/chromium/distiller/webdocument/DomConverter.java:103: if (e.getTagName().equals("IFRAME")) { On 2016/08/29 20:58:22, mdjones wrote: > ...
4 years, 3 months ago (2016-08-29 21:21:09 UTC) #4
mdjones
lgtm
4 years, 3 months ago (2016-08-29 21:32:52 UTC) #5
wychen
4 years, 3 months ago (2016-08-29 21:38:02 UTC) #7
Message was sent while issue was closed.
Committed patchset #2 (id:20001) manually as
50efabe82fa19b3bb9d44e8812907d447d9746a5 (presubmit successful).

Powered by Google App Engine
This is Rietveld 408576698