Index: utils/apidoc/mdn/README.txt
diff --git a/utils/apidoc/mdn/README.txt b/utils/apidoc/mdn/README.txt
deleted file mode 100644
index 1666752f6fe8deb7efa8035130da1e3b238d241f..0000000000000000000000000000000000000000
--- a/utils/apidoc/mdn/README.txt
+++ /dev/null
@@ -1,76 +0,0 @@
-***** Current status |
- |
-Currently the pipeline runs all the way through, but the generated
-database.json has all members[] lists empty. Most entries are skipped
-with the error "Suspect title"; some fail with ".pageText not found".
- |
-This currently only works on Linux; OS X (or other platforms) will need
-minor path changes.
- |
-You will need a reasonably modern node.js installed: 0.5.9 is too old;
-0.8.8 is recent enough.
- |
-I needed to add my own "DumpRenderTree_resources/missingImage.gif", |
-for some reason. |
- |
-For the reasons above, we're currently just using the checked-in |
-database.json from Feb 2012, but it has some bogus entries. In |
-particular, the one for UnknownElement would inject irrelevant German |
-text into our docs. So a hack in apidoc.dart (_mdnTypeNamesToSkip) |
-works around this. |
- |
-***** Overview |
- |
-Here's a rough walkthrough of how this works. The ultimate output file is |
-database.filtered.json. |
- |
-full_run.sh executes all of the scripts in the correct order. |
- |
-search.js
-- read data/domTypes.json
-- for each dom type:
-  - search for the type's MDN page via www.googleapis.com
-  - write the search results to output/search/<type>.json
-    . each file is a list of search results and urls to candidate pages
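-
-A minimal sketch of the shape of search.js, assuming Node's plain https
-module; the endpoint path and query parameters here are illustrative
-guesses, not necessarily what the script actually sends:
-
-  var fs = require('fs');
-  var https = require('https');
-
-  // data/domTypes.json is assumed here to be a JSON array of type names.
-  var domTypes = JSON.parse(fs.readFileSync('data/domTypes.json', 'utf8'));
-
-  domTypes.forEach(function(type) {
-    // Hypothetical search endpoint and parameters.
-    var path = '/customsearch/v1?q=' + encodeURIComponent('MDN ' + type);
-    https.get({host: 'www.googleapis.com', path: path}, function(res) {
-      var body = '';
-      res.on('data', function(chunk) { body += chunk; });
-      res.on('end', function() {
-        // One file per type: the raw list of search results and urls.
-        fs.writeFileSync('output/search/' + type + '.json', body);
-      });
-    });
-  });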
- |
-crawl.js
-- read data/domTypes.json
-- for each dom type:
-  - read output/search/<type>.json
-  - for each result in that file:
-    - try to scrape that result's cached MDN page from
-      webcache.googleusercontent.com
-    - write the mdn page to output/crawl/<type><index of result>.html
-- write output/crawl/cache.json
-  . it maps types -> search result page urls and titles
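-
-Sketched in the same style (the cache-URL scheme is an assumption about
-how Google's web cache is addressed; the real script may differ):
-
-  var fs = require('fs');
-  var http = require('http');
-
-  var domTypes = JSON.parse(fs.readFileSync('data/domTypes.json', 'utf8'));
-  var cache = {};  // type -> [{url, title}, ...], dumped to cache.json
-
-  domTypes.forEach(function(type) {
-    var results = JSON.parse(
-        fs.readFileSync('output/search/' + type + '.json', 'utf8'));
-    cache[type] = results;
-    results.forEach(function(result, i) {
-      // Hypothetical cache-URL scheme.
-      var path = '/search?q=cache:' + encodeURIComponent(result.url);
-      http.get({host: 'webcache.googleusercontent.com', path: path},
-               function(res) {
-        var body = '';
-        res.on('data', function(chunk) { body += chunk; });
-        res.on('end', function() {
-          fs.writeFileSync('output/crawl/' + type + i + '.html', body);
-        });
-      });
-    });
-  });
-
-  // Written eagerly here for brevity; the real script would need to
-  // wait for all the fetches to finish first.
-  fs.writeFileSync('output/crawl/cache.json', JSON.stringify(cache));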
- |
-extract.sh
-- compile extract.dart to js
-- run extractRunner.js
-  - read data/domTypes.json
-  - read output/crawl/cache.json
-  - read data/dartIdl.json
-  - for each scraped search result page:
-    - create a cleaned-up html page in output/extract/<type><index>.html
-      that contains the scraped content plus a script tag that includes
-      extract.dart.js
-    - create an args file in output/extract/<type><index>.html.json with
-      some data on how that file should be processed
-    - invoke DumpRenderTree on that file
-    - when that returns, parse the console output and add the result to
-      database.json
-    - add any errors to output/errors.json
-  - save output/database.json
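-
-The DumpRenderTree round trip in extractRunner.js presumably amounts to
-something like the following (the binary name, the console-message
-framing, and the payload markers are all assumptions):
-
-  var childProcess = require('child_process');
-
-  // Run DumpRenderTree on one prepared page and recover whatever
-  // extract.dart.js reported via the console.
-  function processPage(htmlPath, callback) {
-    childProcess.exec('DumpRenderTree ' + htmlPath,
-        function(err, stdout, stderr) {
-      if (err) return callback(err);
-      // DumpRenderTree echoes console output as "CONSOLE MESSAGE: ..."
-      // lines; pull the JSON payload back out of stdout.
-      var match = stdout.match(/CONSOLE MESSAGE:.*?(\{.*\})/);
-      callback(null, match ? JSON.parse(match[1]) : null);
-    });
-  }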
- |
-extract.dart
-- XHR the matching output/extract/<type><index>.html.json args file
-- all sorts of shenanigans to actually pull the content out of the html
-- build a JSON object with the results
-- do a postMessage with that object so extractRunner.js can pull it out
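-
-In plain-JavaScript terms, the handoff at the end amounts to something
-like this (the payload fields are illustrative only, not the script's
-actual schema):
-
-  // Inside the scraped page, at the end of the compiled extract.dart.js:
-  var result = {type: 'Element', members: [], srcUrl: location.href};
-  window.postMessage(JSON.stringify(result), '*');
-
-extractRunner.js then recovers the payload from the DumpRenderTree
-console output described above, presumably via a listener in the
-harness page that console.log()s the message.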
- |
-postProcess.dart
-- go through the results for each type looking for the best match
-- write output/database.html
-- write output/examples.html
-- write output/obsolete.html
-- write output/database.filtered.json, which contains the best matches
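-
-For orientation, a database.filtered.json entry plausibly looks
-something like the following; the field names are guesses based on the
-members[] and .pageText mentions above, not a documented schema:
-
-  {
-    "Element": {
-      "srcUrl": "https://developer.mozilla.org/...",
-      "pageText": "...",
-      "members": [
-        {"name": "addEventListener", "help": "...", "obsolete": false}
-      ]
-    }
-  }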
- |
-***** Process for updating database.json using these scripts. |
- |
-TODO(eub): document this once I get the scripts to work correctly all
-the way through.