Description
Update distillability modeling scripts to predict long articles
The model produced from these scripts was added to Chrome here:
https://crrev.com/e9ef74ab7411ca08359015167b2c2bc1b566f95b
Highlights of the improvements:
* Generate labels according to the distilled length
* Feature extraction and selection
  - Derive features on the fly for scalability
  - Output features by group or one by one
  - Support mobile emulation
  - Support native feature extraction
  - Support using MHTML as input
* Add sanity-checking tools
* Add documentation
BUG=610944
R=mdjones@chromium.org
Committed: f8f3308f99ec3dcfa83420b304dec3cc083c9008
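
As a rough illustration of "generate labels according to the distilled length", the sketch below marks a page as a long article when its distilled text reaches a length cutoff. It is not taken from the scripts in this change: the function names, the `distilled` field, and the 1000-character threshold are all hypothetical, and the real scripts may measure length differently.

```python
# Hypothetical cutoff; the threshold actually used by the scripts is not stated here.
LONG_ARTICLE_MIN_CHARS = 1000


def label_long_article(distilled_text, min_chars=LONG_ARTICLE_MIN_CHARS):
    """Return 1 if the distilled text is at least min_chars long, else 0."""
    if not distilled_text:
        # Nothing was distilled, so the page cannot be a long article.
        return 0
    return 1 if len(distilled_text.strip()) >= min_chars else 0


def label_entries(entries, min_chars=LONG_ARTICLE_MIN_CHARS):
    """Return copies of the entries with a 'label' key added.

    Each entry is assumed to be a dict with an optional 'distilled' string.
    """
    return [
        dict(entry, label=label_long_article(entry.get("distilled"), min_chars))
        for entry in entries
    ]
```

Deriving the label from the distilled output, rather than from manual annotation alone, lets the training labels be regenerated whenever the cutoff changes.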
Patch Set 1
Patch Set 2: fix load-mhtml, add dev mode arguments
Patch Set 3: update docs