| Index: utils/apidoc/mdn/README.txt
| diff --git a/utils/apidoc/mdn/README.txt b/utils/apidoc/mdn/README.txt
| deleted file mode 100644
| index 1666752f6fe8deb7efa8035130da1e3b238d241f..0000000000000000000000000000000000000000
| --- a/utils/apidoc/mdn/README.txt
| +++ /dev/null
| @@ -1,76 +0,0 @@
| -***** Current status
| -
| -Currently the pipeline runs all the way through, but the resulting
| -database.json has an empty members[] list for every type. Most entries
| -are skipped with "Suspect title"; some fail with ".pageText not found".
| -
| -This currently works only on Linux; OS X (or other platforms) will need
| -minor path changes.
| -
| -You will need a reasonably modern node.js installed.
| -0.5.9 is too old; 0.8.8 is new enough.
| -
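| -As a sanity check, a guard along these lines can fail fast; this
| -snippet is illustrative, not part of these scripts, and assumes the
| -cutoff sits at 0.8:
| -
| -    // refuse to run on node versions known to be too old
| -    var parts = process.versions.node.split('.').map(Number);
| -    if (parts[0] === 0 && parts[1] < 8) {
| -      throw new Error('node ' + process.version + ' is too old; use 0.8.x or later');
| -    }
| -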
| -I needed to add my own "DumpRenderTree_resources/missingImage.gif",
| -for some reason.
| -
| -For the reasons above, we're currently just using the checked-in
| -database.json from Feb 2012, but it has some bogus entries. In
| -particular, the one for UnknownElement would inject irrelevant German
| -text into our docs. So a hack in apidoc.dart (_mdnTypeNamesToSkip)
| -works around this.
| -
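| -The workaround is just a skip list keyed on type name. In JavaScript
| -terms the idea looks roughly like this (the helper name is hypothetical;
| -the real list is _mdnTypeNamesToSkip in apidoc.dart, and only the
| -UnknownElement entry is confirmed by the text above):
| -
| -    // types whose checked-in database.json entries are bogus
| -    var mdnTypeNamesToSkip = ['UnknownElement'];
| -
| -    function shouldSkipMdnEntry(typeName) {  // hypothetical helper
| -      return mdnTypeNamesToSkip.indexOf(typeName) !== -1;
| -    }
| -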
| -***** Overview
| -
| -Here's a rough walkthrough of how this works. The ultimate output file is
| -database.filtered.json.
| -
| -full_run.sh executes all of the scripts in the correct order.
| -
| -search.js
| -- read data/domTypes.json
| -- for each dom type:
| -  - search for the type's MDN page via www.googleapis.com
| -  - write the search results to output/search/<type>.json
| -  . this is a list of search results and urls to pages
| -  . a rough sketch of one iteration follows
| -
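| -A minimal sketch of one iteration, assuming node's https module and a
| -Google Custom Search style endpoint (the endpoint, credentials, and
| -query format are assumptions, not necessarily what search.js uses):
| -
| -    var fs = require('fs');
| -    var https = require('https');
| -
| -    // search for one dom type and save the raw response; KEY and CX are
| -    // placeholders for whatever credentials the real script uses
| -    function searchType(type, done) {
| -      var url = 'https://www.googleapis.com/customsearch/v1?key=KEY&cx=CX' +
| -          '&q=' + encodeURIComponent(type + ' site:developer.mozilla.org');
| -      https.get(url, function(res) {
| -        var body = '';
| -        res.on('data', function(chunk) { body += chunk; });
| -        res.on('end', function() {
| -          fs.writeFileSync('output/search/' + type + '.json', body);
| -          done();
| -        });
| -      });
| -    }
| -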
| -crawl.js
| -- read data/domTypes.json
| -- for each dom type:
| -  - for each result in output/search/<type>.json:
| -    - try to scrape that result's cached MDN page from
| -      webcache.googleusercontent.com
| -    - write the mdn page to output/crawl/<type><index of result>.html
| -- write output/crawl/cache.json
| -  . it maps types -> search result page urls and titles
| -  . a rough sketch of the scrape step follows
| -
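| -A minimal sketch of the scrape step for one search result, assuming
| -node's http module (the web-cache URL format is an assumption):
| -
| -    var fs = require('fs');
| -    var http = require('http');
| -
| -    // fetch one result's page through Google's web cache and save it
| -    function crawlResult(type, index, resultUrl, done) {
| -      var cacheUrl = 'http://webcache.googleusercontent.com/search?q=cache:' +
| -          encodeURIComponent(resultUrl);
| -      http.get(cacheUrl, function(res) {
| -        var body = '';
| -        res.on('data', function(chunk) { body += chunk; });
| -        res.on('end', function() {
| -          fs.writeFileSync('output/crawl/' + type + index + '.html', body);
| -          done();
| -        });
| -      }).on('error', function(e) { done(e); });
| -    }
| -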
| -extract.sh
| -- compile extract.dart to js
| -- run extractRunner.js
| -  - read data/domTypes.json
| -  - read output/crawl/cache.json
| -  - read data/dartIdl.json
| -  - for each scraped search result page:
| -    - create a cleaned-up html page in output/extract/<type><index>.html that
| -      contains the scraped content + a script tag that includes extract.dart.js.
| -    - create an args file in output/extract/<type><index>.html.json with some
| -      data on how that file should be processed
| -    - invoke DumpRenderTree on that file
| -    - when that returns, parse the console output and add it to database.json
| -    - add any errors to output/errors.json
| -  - save output/database.json
| -  . the DumpRenderTree step is sketched below
| -
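| -A minimal sketch of the DumpRenderTree step, assuming node's
| -child_process module (the binary name on PATH and the console-output
| -format are assumptions; the real parsing may differ):
| -
| -    var child_process = require('child_process');
| -
| -    // run DumpRenderTree on one extract page and fold the result into
| -    // the in-memory database; assumes the page printed a JSON object
| -    function processPage(htmlPath, database, errors, done) {
| -      child_process.execFile('DumpRenderTree', [htmlPath],
| -          function(err, stdout, stderr) {
| -        if (err) {
| -          errors.push({page: htmlPath, error: String(err)});
| -        } else {
| -          var result = JSON.parse(stdout);
| -          database[result.type] = database[result.type] || [];
| -          database[result.type].push(result);
| -        }
| -        done();
| -      });
| -    }
| -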
| -extract.dart
| -- xhr output/extract/<type><index>.html.json
| -- all sorts of shenanigans to actually pull the content out of the html
| -- build a JSON object with the results
| -- do a postMessage with that object so extractRunner.js can pull it out
| -  . the handshake is sketched below
| -
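| -Since extract.dart runs in the page as compiled JavaScript, the
| -handshake has roughly this shape in plain JavaScript (the args-file
| -path construction and the result fields are assumptions):
| -
| -    // synchronous XHR for the args file that sits next to the page
| -    var xhr = new XMLHttpRequest();
| -    xhr.open('GET', window.location.pathname + '.json', false);
| -    xhr.send();
| -    var args = JSON.parse(xhr.responseText);
| -
| -    // ... scrape the DOM as directed by args ...
| -    var result = {type: args.type, members: [], pageText: document.body.textContent};
| -
| -    // hand the result to the harness so it shows up in the console output
| -    window.postMessage(JSON.stringify(result), '*');
| -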
| -- run postProcess.dart
| -  - go through the results for each type looking for the best match
| -  - write output/database.html
| -  - write output/examples.html
| -  - write output/obsolete.html
| -  - write output/database.filtered.json, which holds the best matches
| -  . the best-match filter is sketched below
| -
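| -In JavaScript terms the best-match filter has roughly this shape
| -(postProcess.dart is Dart, and score() here is a stand-in heuristic,
| -not the real ranking):
| -
| -    var fs = require('fs');
| -
| -    // stand-in for the real matching heuristics in postProcess.dart
| -    function score(candidate) {
| -      return candidate.pageText ? candidate.pageText.length : 0;
| -    }
| -
| -    var database = JSON.parse(fs.readFileSync('output/database.json', 'utf8'));
| -    var filtered = {};
| -    Object.keys(database).forEach(function(type) {
| -      var best = null;
| -      database[type].forEach(function(candidate) {
| -        if (best === null || score(candidate) > score(best)) best = candidate;
| -      });
| -      if (best !== null) filtered[type] = best;
| -    });
| -    fs.writeFileSync('output/database.filtered.json', JSON.stringify(filtered));
| -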
| -***** Process for updating database.json using these scripts
| -
| -TODO(eub) when I get the scripts to work all the way through.