Chromium Code Reviews
chromiumcodereview-hr@appspot.gserviceaccount.com (chromiumcodereview-hr) | Please choose your nickname with Settings | Help | Chromium Project | Gerrit Changes | Sign out
(201)

Issue 1178633002: implement parser for new pagination algorithm (Closed)

Created:
5 years, 6 months ago by kuan
Modified:
5 years, 1 month ago
Reviewers:
wychen
Base URL:
https://github.com/chromium/dom-distiller.git@master
Target Ref:
refs/heads/master
Visibility:
Public.

Description

implement parser for new pagination algorithm PageParameterParser - parses a document to collect groups of adjacent plain text numbers and outlinks with digital anchor text - calls PageParameterDetector to detect pagination URLs - add tests. ParsedUrl - add fns to get and set hash - add test. BUG=464123 R=wychen@chromium.org Committed: 36a509ca42a49285ad4a770b38a8558f102c3337

Patch Set 1 #

Total comments: 2

Patch Set 2 : addr chris's comments #

Total comments: 13

Patch Set 3 : addr chris's comments, fixes for dataset urls #

Patch Set 4 : addr chris's comments, fixes for dataset #

Total comments: 18

Patch Set 5 : addr wychen's comments #

Unified diffs Side-by-side diffs Delta from patch set Stats (+701 lines, -4 lines) Patch
M java/org/chromium/distiller/MonotonicPageInfosGroups.java View 1 2 chunks +10 lines, -4 lines 0 comments Download
A java/org/chromium/distiller/PageParameterParser.java View 1 2 3 4 1 chunk +357 lines, -0 lines 0 comments Download
M java/org/chromium/distiller/ParsedUrl.java View 4 chunks +23 lines, -0 lines 0 comments Download
A javatests/org/chromium/distiller/PageParameterParserTest.java View 1 2 3 4 1 chunk +299 lines, -0 lines 0 comments Download
M javatests/org/chromium/distiller/ParsedUrlTest.java View 1 chunk +12 lines, -0 lines 0 comments Download

Messages

Total messages: 15 (2 generated)
kuan
5 years, 6 months ago (2015-06-10 21:01:32 UTC) #2
cjhopman
https://codereview.chromium.org/1178633002/diff/1/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/1/java/org/chromium/distiller/PageParameterParser.java#newcode88 java/org/chromium/distiller/PageParameterParser.java:88: NodeList<Element> allLinks = root.getElementsByTagName("A"); What do you think of ...
5 years, 6 months ago (2015-06-26 00:35:25 UTC) #3
kuan
i've addressed all comments in patch set 2. ptal. thx. https://codereview.chromium.org/1178633002/diff/1/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/1/java/org/chromium/distiller/PageParameterParser.java#newcode88 ...
5 years, 5 months ago (2015-07-13 22:57:04 UTC) #4
cjhopman
https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java#newcode105 java/org/chromium/distiller/PageParameterParser.java:105: Node parentWrapper = null; What's this parent wrapper? I ...
5 years, 4 months ago (2015-07-29 01:07:53 UTC) #5
kuan
i've addressed all comments in patch set 3, which also includes fixes for urls in ...
5 years, 4 months ago (2015-07-30 16:47:01 UTC) #6
cjhopman
https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java#newcode105 java/org/chromium/distiller/PageParameterParser.java:105: Node parentWrapper = null; On 2015/07/30 16:47:00, kuan wrote: ...
5 years, 4 months ago (2015-08-04 21:58:41 UTC) #7
kuan
https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/20001/java/org/chromium/distiller/PageParameterParser.java#newcode105 java/org/chromium/distiller/PageParameterParser.java:105: Node parentWrapper = null; On 2015/08/04 21:58:41, cjhopman wrote: ...
5 years, 4 months ago (2015-08-04 22:38:37 UTC) #8
kuan
i've addressed all comments in patch set 4, which also includes fixes for urls in ...
5 years, 4 months ago (2015-08-11 19:09:39 UTC) #9
kuan
chris's gone. wei-yin, pls help take a look. thx!
5 years, 4 months ago (2015-08-18 21:47:44 UTC) #11
wychen
https://codereview.chromium.org/1178633002/diff/60001/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://codereview.chromium.org/1178633002/diff/60001/java/org/chromium/distiller/PageParameterParser.java#newcode50 java/org/chromium/distiller/PageParameterParser.java:50: * Parses the document to collect outlinks with digital ...
5 years, 3 months ago (2015-09-21 23:08:03 UTC) #12
kuan
i've addressed all comments in patch set 5. ptal. thx. https://chromiumcodereview.appspot.com/1178633002/diff/60001/java/org/chromium/distiller/PageParameterParser.java File java/org/chromium/distiller/PageParameterParser.java (right): https://chromiumcodereview.appspot.com/1178633002/diff/60001/java/org/chromium/distiller/PageParameterParser.java#newcode50 ...
5 years, 2 months ago (2015-10-02 15:59:17 UTC) #13
wychen
lgtm
5 years, 2 months ago (2015-10-05 19:04:47 UTC) #14
kuan
5 years, 2 months ago (2015-10-05 20:14:47 UTC) #15
Message was sent while issue was closed.
Committed patchset #5 (id:80001) manually as
36a509ca42a49285ad4a770b38a8558f102c3337 (presubmit successful).

Powered by Google App Engine
This is Rietveld 408576698