Chromium Code Reviews

Issue 1620043002: Add scripts for distillability modelling (Closed)

Created: 4 years, 11 months ago by wychen
Modified: 4 years, 10 months ago
Reviewers: mdjones
Base URL: git@github.com:chromium/dom-distiller.git@master
Target Ref: refs/heads/master
Visibility: Public

Description

Add scripts for distillability modeling

The model produced from these scripts was added to Chrome here:
http://crrev.com/8587df371e24c3003b743a3bb0fc87ad77152804

This is based on cjhopman's work here:
http://crrev.com/1289123002/

Highlights of improvements:
* Feature extraction
  - Add new features that are effective and also cheap to compute in Blink
  - Remove expensive features from the derived features
* Crawling
  - Merge feature extraction and screenshot-taking scripts
  - Enable concurrent crawling for parallelization
  - Add MHTML archiving for reproducibility
* Labeling server
  - Automatically advance to the next unrated item
  - Prefetch screenshots of next item and next unrated item
  - Use location.hash to access individual entries. If not specified, randomly pick one.
  - Restore state from all the archive files
  - Fix HTTP expiration of files
  - Improve color scheme, show decoded URLs, show rating stats, etc.
* Add documentation

R=mdjones@chromium.org

Committed: 72998ef3164652dd7aeab7e211dfe100dcd46378
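
Two of the labeling-server behaviors named in the description (selecting an entry from location.hash or at random, and advancing to the next unrated item) can be illustrated with a minimal sketch. This is not the code from index.js or server.py in this patch; the function names, the "#<index>" fragment format, and the use of None for an unrated entry are all assumptions made for illustration.

```python
import random
from typing import Optional, Sequence


def pick_entry(fragment: str, total: int,
               rng: Optional[random.Random] = None) -> int:
    """Return the entry index named by a URL fragment, else a random index.

    Assumes fragments look like "#12" (hypothetical format) and total > 0.
    """
    rng = rng or random.Random()
    text = fragment.lstrip("#")
    # Accept the fragment only if it is a valid in-range decimal index.
    if text.isdecimal() and int(text) < total:
        return int(text)
    # No usable fragment: fall back to a random entry.
    return rng.randrange(total)


def next_unrated(ratings: Sequence[Optional[int]],
                 current: int) -> Optional[int]:
    """Return the first unrated index after `current`, wrapping around once.

    An entry counts as unrated when its rating is None (an assumption).
    Returns None when every entry has already been rated.
    """
    total = len(ratings)
    for step in range(1, total + 1):
        candidate = (current + step) % total
        if ratings[candidate] is None:
            return candidate
    return None
```

The wrap-around search visits the current entry last, so the rater is only returned to the same item if it is the sole unrated one left.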

Patch Set 1

Patch Set 2: update

Patch Set 3: set upstream patchset, identical to patch set 2

Stats: +595 lines, -1227 lines

Status  Delta from patch set  Chunks     Added  Removed  Comments  Path
D       1 2                   1 chunk    +0     -99      0         calculate_derived_features.py
D       1 2                   1 chunk    +0     -24      0         extract_features.js
D       1 2                   1 chunk    +0     -1       0         foo/index
D       1 2                   1 chunk    +0     -3       0         foo/server.conf
D       1 2                   1 chunk    +0     -187     0         foo/server.py
D       1 2                   0 chunks   +-1    --1      0         foo/test.css
D       1 2                   1 chunk    +0     -164     0         foo/test.html
D       1 2                   1 chunk    +0     -301     0         foo/test.js
D       1 2                   1 chunk    +0     -120     0         get_features.py
D       1 2                   1 chunk    +0     -124     0         get_screenshots.py
A       1                     1 chunk    +105   -0       0         heuristics/distillable/README.md
A +     1 2                   4 chunks   +35    -25      0         heuristics/distillable/calculate_derived_features.py
A       1                     1 chunk    +98    -0       0         heuristics/distillable/extract_features.js
A       1                     1 chunk    +201   -0       0         heuristics/distillable/get_screenshots.py
A +     1 2                   5 chunks   +42    -14      0         heuristics/distillable/index.html
A +     1 2                   11 chunks  +88    -13      0         heuristics/distillable/index.js
A +     1 2                   4 chunks   +26    -6       0         heuristics/distillable/server.py
A +     1 2                   1 chunk    +1     -1       0         heuristics/distillable/write_features_csv.py
D       1 2                   1 chunk    +0     -54      0         quick_score.py
D       1 2                   1 chunk    +0     -60      0         write_features_csv.py
D       1 2                   1 chunk    +0     -32      0         write_html.py

Messages

Total messages: 14 (11 generated)
wychen
PTAL
4 years, 11 months ago (2016-01-26 02:28:53 UTC) #4
mdjones
Did a quick scan and lgtm. At some point I would add documentation for some ...
4 years, 10 months ago (2016-02-03 16:53:36 UTC) #5
wychen
4 years, 10 months ago (2016-02-11 04:38:45 UTC) #14
Message was sent while issue was closed.
Committed patchset #3 (id:60001) manually as
72998ef3164652dd7aeab7e211dfe100dcd46378 (presubmit successful).